GapMind for Amino acid biosynthesis

 

Alignments for a candidate for carB in Caulobacter crescentus NA1000

Align carbamoyl-phosphate synthase (glutamine-hydrolysing) (EC 6.3.5.5) (characterized)
to candidate CCNA_02994 CCNA_02994 carbamoyl-phosphate synthase large chain

Query= BRENDA::P00968
         (1073 letters)



>FitnessBrowser__Caulo:CCNA_02994
          Length = 1099

 Score = 1269 bits (3283), Expect = 0.0
 Identities = 675/1099 (61%), Positives = 812/1099 (73%), Gaps = 33/1099 (3%)

Query: 1    MPKRTDIKSILILGAGPIVIGQACEFDYSGAQACKALREEGYRVILVNSNPATIMTDPEM 60
            MPKRTDI SILI+GAGPIVIGQACEFDYSG QACKALR EGYR+ILVNSNPATIMTDP++
Sbjct: 1    MPKRTDISSILIIGAGPIVIGQACEFDYSGVQACKALRAEGYRIILVNSNPATIMTDPDV 60

Query: 61   ADATYIEPIHWEVVRKIIEKERPDAVLPTMGGQTALNCALELERQGVLEEFGVTMIGATA 120
            ADATYIEPI  E+V KIIEKERPDA+LPTMGGQTALN AL LE  G L +FGV MIGA A
Sbjct: 61   ADATYIEPITPEMVAKIIEKERPDALLPTMGGQTALNTALALEADGTLAKFGVEMIGAKA 120

Query: 121  DAIDKAEDRRRFDVAMKKIGLETARSGIAHTMEEALAVAADVGFPCIIRPSFTMGGSGGG 180
            + IDKAEDR++F  AM K+GLE+ RS   HT++EA+     VG P IIRPSFT+ G+GGG
Sbjct: 121  EVIDKAEDRQKFRDAMDKLGLESPRSRACHTLDEAMEGLEFVGLPAIIRPSFTLAGTGGG 180

Query: 181  IAYNREEFEEICARGLDLSPTKELLIDESLIGWKEYEMEVVRDKNDNCIIVCSIENFDAM 240
            IAYN EEF+EI  RGLDLSPT E+L++ES++GWKEYEMEVVRDK DNCIIVCSIEN D M
Sbjct: 181  IAYNVEEFKEIVERGLDLSPTTEVLVEESVLGWKEYEMEVVRDKADNCIIVCSIENIDPM 240

Query: 241  GIHTGDSITVAPAQTLTDKEYQIMRNASMAVLREIGVETGGSNVQFAVNPKNGRLIVIEM 300
            G+HTGDSITVAPA TLTDKEYQ MR AS+AVLREIGVETGGSNVQFAVNP +GR++VIEM
Sbjct: 241  GVHTGDSITVAPALTLTDKEYQWMRAASIAVLREIGVETGGSNVQFAVNPADGRMVVIEM 300

Query: 301  NPRVSRSSALASKATGFPIAKVAAKLAVGYTLDELMNDITGGRTPASFEPSIDYVVTKIP 360
            NPRVSRSSALASKATGFPIAKVAA+LAVGYTLDEL NDITGG TPASFEPSIDYVVTKIP
Sbjct: 301  NPRVSRSSALASKATGFPIAKVAARLAVGYTLDELKNDITGGATPASFEPSIDYVVTKIP 360

Query: 361  RFNFEKFAGANDRLTTQMKSVGEVMAIGRTQQESLQKALRGLEVGATGFDPK--VSLDDP 418
            RF FEK+ G+   LTT MKSVGEVMAIGRT +ES+QKALRGLE G +GFD       DDP
Sbjct: 361  RFAFEKYPGSEPLLTTAMKSVGEVMAIGRTFKESVQKALRGLETGLSGFDEVEIAGADDP 420

Query: 419  E-ALTKIRRELKDAGADRIWYIADAFRAGLSVDGVFNLTNIDRWFLVQIEELVRLEEKVA 477
            +     + R L     DR+  IA AFR GL+VD V    + + WFL QI E+VR E  V 
Sbjct: 421  DNGKEAVIRALGVPTPDRLRVIAQAFRHGLTVDEVNAACSYEPWFLRQIAEIVRQEGWVK 480

Query: 478  EVGITGLNADFLRQLKRKGFADARLAKLAGVREAEIRKLRDQYDLHPVYKRVDTCAAEFA 537
              G+    A   R+LK +GF+DARLAKL    E  +R  R   ++ PV+KR+D+CA EF 
Sbjct: 481  AGGLP-QTAQGFRELKAQGFSDARLAKLTASTEKAVRAARQALNVRPVFKRIDSCAGEFL 539

Query: 538  TDTAYMYSTYE-------EECEANPSTDREKIMVLGGGPNRIGQGIEFDYCCVHASLALR 590
              T YMYSTYE        +CE++PS   +K ++LGGGPNRIGQGIEFDYCC HA+ AL 
Sbjct: 540  ASTPYMYSTYEFGALGQIPQCESDPSA-AKKAVILGGGPNRIGQGIEFDYCCCHAAFALD 598

Query: 591  EDGYETIMVNCNPETVSTDYDTSDRLYFEPVTLEDVLEIVRIEKPK----GVIVQYGGQT 646
            + G E+IMVNCNPETVSTDYDTSDRLYFEP+T EDVLE++ +E  K    GVIVQ+GGQT
Sbjct: 599  QIGVESIMVNCNPETVSTDYDTSDRLYFEPLTAEDVLELLHVEMSKGTLAGVIVQFGGQT 658

Query: 647  PLKLARALEAAGVPVIGTSPDAIDRAEDRERFQHAVERLKLKQPANATVTAIEMAVEKAK 706
            PLKLA ALE AGVP++GTSPDAID AEDRERFQ  +  L + QP NA   + + A  +  
Sbjct: 659  PLKLAHALEEAGVPILGTSPDAIDLAEDRERFQQLLNGLNIAQPENAIARSWDEARAEGD 718

Query: 707  EIGYPLVVRPSYVLGGRAMEIVYDEADLRRYFQTAVSVSNDAPVLLDHFLDDAVEVDVDA 766
            +IG+P V+RPSYVLGGR MEI+ D   + RY   A  +S + P+LLDH+L  A EVDVDA
Sbjct: 719  KIGFPFVMRPSYVLGGRGMEIIRDHEAMERYIAGAGEISLEHPILLDHYLSRATEVDVDA 778

Query: 767  ICDGEMVLIGGIMEHIEQAGVHSGDSACSLPAYTLSQEIQDVMRQQVQKLAFELQVRGLM 826
            +CDG+ V + G++EHIE+AGVHSGDSACS+P ++L  E  + +++Q  ++A  L VRGLM
Sbjct: 779  LCDGKDVFVAGVLEHIEEAGVHSGDSACSMPPFSLKAETVEELKRQTVQMALALNVRGLM 838

Query: 827  NVQFAVK-----NNEVYLIEVNPRAARTVPFVSKATGVPLAKVAARVMAGKSLAEQGVTK 881
            NVQFA++     N  +Y++EVNPRA+RTVPFV+K  G P+A +AA++MAG+SLA  G+ K
Sbjct: 839  NVQFAIEEPHSDNPRIYVLEVNPRASRTVPFVAKTIGQPVAAIAAKIMAGESLASFGL-K 897

Query: 882  EVIPPYYSVKEVVLPFNKFPGVDPLLGPEMRSTGEVM----------GVGRTFAEAFAKA 931
            +V   + +VKE V PF +F GVD +LGPEMRSTGEVM          G+G  FA AFAK+
Sbjct: 898  DVPYDHIAVKEAVFPFARFAGVDTVLGPEMRSTGEVMGLDWKRDGETGMGPAFARAFAKS 957

Query: 932  QLGSNSTMKKHGRALLSVREGDKERVVDLAAKLLKQGFELDATHGTAIVLGEAGINPRLV 991
            QLG    +   G A +SV+E DK  +V+    L   GF++ +T GT   L   G+    V
Sbjct: 958  QLGGGVKLPTKGTAFVSVKESDKPWIVEPVKLLQAAGFKVLSTEGTQAYLAAQGVQVEHV 1017

Query: 992  NKVHEGRPHIQDRIKNGEYTYIINTTSGRRAIEDSRVIRRSALQYKVHYDTTLNGGFATA 1051
             KV EGRPHI D +KNG    + NTT G++A+EDS  IRR+AL  KV Y TT  G  A A
Sbjct: 1018 KKVLEGRPHIVDVMKNGGVQLVFNTTEGKQALEDSFEIRRTALMMKVPYYTTSAGALAAA 1077

Query: 1052 MALNADATEKVISVQEMHA 1070
             A+ A A  + + V+ + +
Sbjct: 1078 QAI-AGAPAEALDVRPLQS 1095


Lambda     K      H
   0.318    0.135    0.383 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 3085
Number of extensions: 143
Number of successful extensions: 19
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1073
Length of database: 1099
Length adjustment: 46
Effective length of query: 1027
Effective length of database: 1053
Effective search space:  1081431
Effective search space used:  1081431
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 58 (26.9 bits)
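The summary statistics above can be cross-checked by hand. The following is a minimal sketch, not part of GapMind or BLAST itself: the function names are hypothetical, and it assumes the standard Karlin-Altschul conversion from raw score to bit score, and that the reported percentages are truncated rather than rounded (which is what the numbers above suggest).

```python
import math

def effective_search_space(query_len, db_len, length_adjustment, num_seqs=1):
    """Return (effective query length, effective database length, search space).

    Assumes the length adjustment is subtracted once from the query and
    once per database sequence, as the report above implies.
    """
    eff_query = query_len - length_adjustment
    eff_db = db_len - num_seqs * length_adjustment
    return eff_query, eff_db, eff_query * eff_db

def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to bits: (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)

def truncated_percent(count, total):
    """Integer percentage, truncated toward zero (assumed BLAST convention)."""
    return 100 * count // total

# Values copied from the report above.
eff_q, eff_db, space = effective_search_space(1073, 1099, 46)
print(eff_q, eff_db, space)                      # effective lengths and search space
print(round(bit_score(3283, 0.267, 0.0410)))     # gapped Lambda and K, raw score 3283
print(truncated_percent(675, 1099),              # identities
      truncated_percent(812, 1099),              # positives
      truncated_percent(33, 1099))               # gaps
```

Because Lambda and K are printed to only three significant figures, the recomputed bit score agrees with the reported 1269 bits only to within rounding.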

Align candidate CCNA_02994 CCNA_02994 (carbamoyl-phosphate synthase large chain)
to HMM TIGR01369 (carB: carbamoyl-phosphate synthase, large subunit (EC 6.3.5.5))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/TIGR01369.hmm
# target sequence database:        /tmp/gapView.23314.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01369  [M=1052]
Accession:   TIGR01369
Description: CPSaseII_lrg: carbamoyl-phosphate synthase, large subunit
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
          0 1439.4   0.0          0 1439.2   0.0    1.0  1  lcl|FitnessBrowser__Caulo:CCNA_02994  CCNA_02994 carbamoyl-phosphate s


Domain annotation for each sequence (and alignments):
>> lcl|FitnessBrowser__Caulo:CCNA_02994  CCNA_02994 carbamoyl-phosphate synthase large chain
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 ! 1439.2   0.0         0         0       1    1051 [.       2    1079 ..       2    1080 .. 0.95

  Alignments for each domain:
  == domain 1  score: 1439.2 bits;  conditional E-value: 0
                             TIGR01369    1 pkredikkvlviGsGpivigqAaEFDYsGsqalkalkeegievvLvnsniAtvmtdeeladkvYiePltve 71  
                                            pkr+di+++l+iG+GpivigqA+EFDYsG qa+kal+ eg++++Lvnsn+At+mtd+++ad++YieP+t+e
  lcl|FitnessBrowser__Caulo:CCNA_02994    2 PKRTDISSILIIGAGPIVIGQACEFDYSGVQACKALRAEGYRIILVNSNPATIMTDPDVADATYIEPITPE 72  
                                            689******************************************************************** PP

                             TIGR01369   72 avekiiekErpDailltlGGqtaLnlaveleekGvLekygvkllGtkveaikkaedRekFkealkeineev 142 
                                            +v+kiiekErpDa+l+t+GGqtaLn a+ le  G L+k+gv+++G+k e+i+kaedR+kF++a++++++e 
  lcl|FitnessBrowser__Caulo:CCNA_02994   73 MVAKIIEKERPDALLPTMGGQTALNTALALEADGTLAKFGVEMIGAKAEVIDKAEDRQKFRDAMDKLGLES 143 
                                            *********************************************************************** PP

                             TIGR01369  143 akseivesveealeaaeeigyPvivRaaftlgGtGsgiaeneeelkelvekalkaspikqvlvekslagwk 213 
                                            ++s+++++ +ea+e  e +g+P i+R++ftl+GtG+gia+n ee+ke+ve++l++sp+++vlve+s+ gwk
  lcl|FitnessBrowser__Caulo:CCNA_02994  144 PRSRACHTLDEAMEGLEFVGLPAIIRPSFTLAGTGGGIAYNVEEFKEIVERGLDLSPTTEVLVEESVLGWK 214 
                                            *********************************************************************** PP

                             TIGR01369  214 EiEyEvvRDskdnciivcniEnlDplGvHtGdsivvaPsqtLtdkeyqllRdaslkiirelgvege.cnvq 283 
                                            E+E+EvvRD++dnciivc+iEn+Dp+GvHtGdsi+vaP+ tLtdkeyq +R as++++re+gve++ +nvq
  lcl|FitnessBrowser__Caulo:CCNA_02994  215 EYEMEVVRDKADNCIIVCSIENIDPMGVHTGDSITVAPALTLTDKEYQWMRAASIAVLREIGVETGgSNVQ 285 
                                            ****************************************************************988**** PP

                             TIGR01369  284 faldPeskryvviEvnpRvsRssALAskAtGyPiAkvaaklavGysLdelkndvtk.etvAsfEPslDYvv 353 
                                            fa++P + r+vviE+npRvsRssALAskAtG+PiAkvaa+lavGy+Ldelknd+t+  t+AsfEPs+DYvv
  lcl|FitnessBrowser__Caulo:CCNA_02994  286 FAVNPADGRMVVIEMNPRVSRSSALASKATGFPIAKVAARLAVGYTLDELKNDITGgATPASFEPSIDYVV 356 
                                            *******************************************************878************* PP

                             TIGR01369  354 vkiPrwdldkfekvdrklgtqmksvGEvmaigrtfeealqkalrsleekllglklkeke.....aesdeel 419 
                                            +kiPr++++k+ + +  l+t mksvGEvmaigrtf+e++qkalr le++l+g+++ e +      + +e++
  lcl|FitnessBrowser__Caulo:CCNA_02994  357 TKIPRFAFEKYPGSEPLLTTAMKSVGEVMAIGRTFKESVQKALRGLETGLSGFDEVEIAgaddpDNGKEAV 427 
                                            ****************************************************7665443100114556778 PP

                             TIGR01369  420 eealkkpndrRlfaiaealrrgvsveevyeltkidrffleklkklvelekeleeeklkelkkellkkakkl 490 
                                             +al  p+++Rl +ia+a+r+g++v+ev+ ++ ++ +fl++++++v+ e  ++   l   +++ ++++k++
  lcl|FitnessBrowser__Caulo:CCNA_02994  428 IRALGVPTPDRLRVIAQAFRHGLTVDEVNAACSYEPWFLRQIAEIVRQEGWVKAGGLP-QTAQGFRELKAQ 497 
                                            899999**********************************************977776.77899******* PP

                             TIGR01369  491 GfsdeqiaklvkvseaevrklrkelgivpvvkrvDtvaaEfeaktpYlYstyeee......kddvevtekk 555 
                                            Gfsd+++akl+ ++e++vr++r++l++ pv+kr+D +a+Ef a+tpY+Ystye        + +++ +  k
  lcl|FitnessBrowser__Caulo:CCNA_02994  498 GFSDARLAKLTASTEKAVRAARQALNVRPVFKRIDSCAGEFLASTPYMYSTYEFGalgqipQCESDPSAAK 568 
                                            ****************************************************9765555555566667779 PP

                             TIGR01369  556 kvlvlGsGpiRigqgvEFDycavhavlalreagyktilinynPEtvstDydiadrLyFeeltvedvldiie 626 
                                            k ++lG+Gp+Rigqg+EFDyc+ ha+ al + g ++i++n+nPEtvstDyd++drLyFe+lt edvl++++
  lcl|FitnessBrowser__Caulo:CCNA_02994  569 KAVILGGGPNRIGQGIEFDYCCCHAAFALDQIGVESIMVNCNPETVSTDYDTSDRLYFEPLTAEDVLELLH 639 
                                            *********************************************************************99 PP

                             TIGR01369  627 kekve....gvivqlgGqtalnlakeleeagvkilGtsaesidraEdRekFsklldelgikqpkgkeatsv 693 
                                             e  +    gvivq+gGqt+l+la++leeagv+ilGts+++id aEdRe+F++ll+ l+i qp++++a+s 
  lcl|FitnessBrowser__Caulo:CCNA_02994  640 VEMSKgtlaGVIVQFGGQTPLKLAHALEEAGVPILGTSPDAIDLAEDRERFQQLLNGLNIAQPENAIARSW 710 
                                            8865422228************************************************************* PP

                             TIGR01369  694 eeakeiakeigyPvlvRpsyvlgGrameiveneeeleryleeavevskekPvlidkyledavEvdvDavad 764 
                                            +ea+   ++ig+P ++RpsyvlgGr+mei++++e +ery+  a e+s e+P+l+d+yl+ a+EvdvDa++d
  lcl|FitnessBrowser__Caulo:CCNA_02994  711 DEARAEGDKIGFPFVMRPSYVLGGRGMEIIRDHEAMERYIAGAGEISLEHPILLDHYLSRATEVDVDALCD 781 
                                            *********************************************************************** PP

                             TIGR01369  765 geevliagileHiEeaGvHsGDstlvlppqklseevkkkikeivkkiakelkvkGllniqfvvkd.....e 830 
                                            g++v++ag+leHiEeaGvHsGDs++++pp +l++e+++++k+++ ++a +l+v+Gl+n+qf++++      
  lcl|FitnessBrowser__Caulo:CCNA_02994  782 GKDVFVAGVLEHIEEAGVHSGDSACSMPPFSLKAETVEELKRQTVQMALALNVRGLMNVQFAIEEphsdnP 852 
                                            **************************************************************986432224 PP

                             TIGR01369  831 evyviEvnvRasRtvPfvskalgvplvklavkvllgkkleelekgvkkekksklvavkaavfsfsklagvd 901 
                                            ++yv+Evn+RasRtvPfv+k++g p++++a+k+++g++l++ +    k+  ++++avk+avf+f+++agvd
  lcl|FitnessBrowser__Caulo:CCNA_02994  853 RIYVLEVNPRASRTVPFVAKTIGQPVAAIAAKIMAGESLASFGL---KDVPYDHIAVKEAVFPFARFAGVD 920 
                                            69***************************************886...99999******************* PP

                             TIGR01369  902 vvlgpemkstGEvmgigrd..........leeallkallaskakikkkgsvllsvkdkdkeellelakkla 962 
                                             vlgpem+stGEvmg++ +          +++a++k++l  + k+++kg++++svk++dk  ++e +k l+
  lcl|FitnessBrowser__Caulo:CCNA_02994  921 TVLGPEMRSTGEVMGLDWKrdgetgmgpaFARAFAKSQLGGGVKLPTKGTAFVSVKESDKPWIVEPVKLLQ 991 
                                            **************9763211222222226788889999999***************************** PP

                             TIGR01369  963 ekglkvyategtakvleeagikaevvlkvseeaekilellkeeeielvinltskkkkaaekgykirreave 1033
                                            ++g+kv++tegt+++l+ +g+++e+v+kv e +++i++++k++ ++lv+n+t+ +k+a e+++ irr+a++
  lcl|FitnessBrowser__Caulo:CCNA_02994  992 AAGFKVLSTEGTQAYLAAQGVQVEHVKKVLEGRPHIVDVMKNGGVQLVFNTTE-GKQALEDSFEIRRTALM 1061
                                            **************************************************997.888999*********** PP

                             TIGR01369 1034 ykvplvteletaeallea 1051
                                            +kvp+ t+ ++a a+++a
  lcl|FitnessBrowser__Caulo:CCNA_02994 1062 MKVPYYTTSAGALAAAQA 1079
                                            *********999988877 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (1052 nodes)
Target sequences:                          1  (1099 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.10u 0.04s 00:00:00.14 Elapsed: 00:00:00.12
# Mc/sec: 9.17
//
[ok]
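To see how much of the TIGR01369 model and of the candidate the single domain hit spans, the domain row from the hmmsearch output above can be parsed directly. This is a rough sketch with a hypothetical helper name; it assumes the whitespace-separated column order shown in the domain table header, and the row string is copied from this report.

```python
def parse_domain_row(row):
    """Parse one row of the hmmsearch per-domain table into a dict.

    Assumes the column order shown in the report: #, flag, score, bias,
    c-Evalue, i-Evalue, hmmfrom, hmm to, bounds, alifrom, ali to, bounds,
    envfrom, env to, bounds, acc.
    """
    f = row.split()
    return {
        "score": float(f[2]),
        "hmm_from": int(f[6]),
        "hmm_to": int(f[7]),
        "ali_from": int(f[9]),
        "ali_to": int(f[10]),
        "env_from": int(f[12]),
        "env_to": int(f[13]),
        "acc": float(f[15]),
    }

row = "   1 ! 1439.2   0.0         0         0       1    1051 [.       2    1079 ..       2    1080 .. 0.95"
dom = parse_domain_row(row)

model_length = 1052      # M=1052 from the "Query:" line
sequence_length = 1099   # residues searched

# Fraction of the HMM and of the candidate covered by the aligned region.
model_coverage = (dom["hmm_to"] - dom["hmm_from"] + 1) / model_length
sequence_coverage = (dom["ali_to"] - dom["ali_from"] + 1) / sequence_length
print(f"model coverage {model_coverage:.3f}, sequence coverage {sequence_coverage:.3f}")
```

Here the hit covers nearly the whole model (positions 1 to 1051 of 1052) and most of the candidate protein.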

This GapMind analysis is from Aug 03 2021. The underlying query database was built on Aug 03 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
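As an illustration of the 50% coverage cutoff, here is a small sketch. It is not GapMind's code: the function names are hypothetical, and it assumes that "coverage" means the fraction of the characterized (query) protein spanned by the alignment. The coordinates are taken from the BLAST alignment on this page (query positions 1 to 1070 of 1073).

```python
def alignment_coverage(start, end, length):
    """Fraction of a sequence spanned by an alignment (1-based, inclusive)."""
    return (end - start + 1) / length

def meets_coverage_cutoff(coverage, cutoff=0.5):
    """True if the hit covers at least the cutoff fraction of the sequence."""
    return coverage >= cutoff

# Query P00968 is aligned from position 1 to 1070 out of 1073 residues.
cov = alignment_coverage(1, 1070, 1073)
print(f"{cov:.3f}", meets_coverage_cutoff(cov))
```

This only checks the coverage condition; the additional criteria that separate high- and medium-confidence candidates are not modeled here.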

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

For more information, see the 2019 paper on GapMind for amino acid biosynthesis or the 2022 paper on GapMind for carbon sources, view the source code, or see the changes to amino acid biosynthesis since publication.

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory