GapMind for Amino acid biosynthesis

 

Alignments for a candidate for carB in Bacteroides thetaiotaomicron VPI-5482

Align Carbamoyl-phosphate synthase arginine-specific large chain; Arginine-specific carbamoyl-phosphate synthetase, ammonia chain; EC 6.3.5.5 (characterized)
to candidate 350085 BT0557 carbamyl phosphate synthetase (NCBI ptt file)

Query= SwissProt::P03965
         (1118 letters)



>FitnessBrowser__Btheta:350085
          Length = 1075

 Score = 1102 bits (2850), Expect = 0.0
 Identities = 574/1097 (52%), Positives = 760/1097 (69%), Gaps = 35/1097 (3%)

Query: 26   EGVNSVLVIGSGGLSIGQAGEFDYSGSQAIKALKEDNKFTILVNPNIATNQTSHSLADKI 85
            E +  VL++GSG L IG+AGEFDYSGSQA+KALKE+   TIL+NPNIAT QTS  +AD+I
Sbjct: 3    ENIKKVLLLGSGALKIGEAGEFDYSGSQALKALKEEGIETILINPNIATVQTSEGVADQI 62

Query: 86   YYLPVTPEYITYIIELERPDAILLTFGGQTGLNCGVALDESGVLAKYNVKVLGTPIKTLI 145
            Y+LPVTP ++  +I+ E+P+ I+L FGGQT LNCGVAL + G+L KYNVKVLGTP++ ++
Sbjct: 63   YFLPVTPYFVEKVIQKEKPEGIMLAFGGQTALNCGVALYKEGILEKYNVKVLGTPVQAIM 122

Query: 146  TSEDRDLFASALKDINIPIAESFACETVDEALEAAERVKYPVIVRSAYALGGLGSGFANN 205
             +EDR+LF   L +IN+   +S A E  ++A  AA+ + YPVIVR+AYALGGLGSGF +N
Sbjct: 123  DTEDRELFVQKLNEINVKTIKSEAVENAEDARRAAKELGYPVIVRAAYALGGLGSGFCDN 182

Query: 206  ASEMKELAAQSLSLAPQILVEKSLKGWKEVEYEVVRDRVGNCITVCNMENFDPLGVHTGD 265
              ++  L  ++ S +PQ+LVEKSL+GWKEVEYEVVRDR  NCITVCNMENFDPLG+HTG+
Sbjct: 183  EEQLDVLVEKAFSFSPQVLVEKSLRGWKEVEYEVVRDRFDNCITVCNMENFDPLGIHTGE 242

Query: 266  SMVFAPSQTLSDEEFHMLRSAAIKIIRHLGVIGECNVQYALQPDGLDYRVIEVNARLSRS 325
            S+V APSQTL++ E+H LR  AI+IIRH+G++GECNVQYA  P+  DYRVIEVNARLSRS
Sbjct: 243  SIVIAPSQTLTNSEYHKLRELAIRIIRHIGIVGECNVQYAFDPESEDYRVIEVNARLSRS 302

Query: 326  SALASKATGYPLAYTAAKIGLGYTLPELPNPITKTTVANFEPSLDYIVAKIPKWDLSKFQ 385
            SALASKATGYPLA+ AAK+GLGY L +L N +TKTT A FEP+LDY+V KIP+WDL KF 
Sbjct: 303  SALASKATGYPLAFVAAKLGLGYGLFDLKNSVTKTTSAFFEPALDYVVCKIPRWDLGKFH 362

Query: 386  YVDRSIGSSMKSVGEVMAIGRNYEEAFQKALRQVDPSLLGFQGSTEFG-DQLDEALRTPT 444
             VD+ +GSSMKSVGEVMAIGR +EEA QK LR +   + GF  + E     +D+ALR PT
Sbjct: 363  GVDKELGSSMKSVGEVMAIGRTFEEAIQKGLRMIGQGMHGFVENKELVIPDIDKALREPT 422

Query: 445  DRRVLAIGQALIHENYTVERVNELSKIDKWFLYKCMNIVNIYKEL----ESVKSLSDLSK 500
            D+R+  I +A     YT+++V+EL+KIDKWFL K MNI+   +E+     + K ++DL  
Sbjct: 423  DKRIFVISKA-FRAGYTIDQVHELTKIDKWFLQKLMNIMKTSEEMHEWGNNHKQIADLPV 481

Query: 501  DLLQRAKKLGFSDKQIAVTINKHASTNINELEIRSLRKTLGIIPFVKRIDTLAAEFPAQT 560
            +LL++AK  GFSD QIA  I          L +R  RK  GI+P VK+IDTLAAE+PAQT
Sbjct: 482  ELLRKAKVQGFSDFQIARAIGYEGDMENGSLYVRKYRKAAGILPVVKQIDTLAAEYPAQT 541

Query: 561  NYLYTTYNATKNDVEF--NENGMLVLGSGVYRIGSSVEFDWCAVNTAKTLRDQGKKTIMI 618
            NYLY TY+   NDV +  +   ++VLGSG YRIGSSVEFDWC V    T+R +G +++MI
Sbjct: 542  NYLYLTYSGVANDVHYLGDHKSIVVLGSGAYRIGSSVEFDWCGVQALNTIRKEGWRSVMI 601

Query: 619  NYNPETVSTDFDEVDRLYFEELSYERVMDIYELEQSEGCIISVGGQLPQNIALKLYDNGC 678
            NYNPETVSTD+D  DRLYF+EL++ERVMDI ELE   G I+S GGQ+P N+AL+L     
Sbjct: 602  NYNPETVSTDYDMCDRLYFDELTFERVMDILELENPHGVIVSTGGQIPNNLALRLDAQNI 661

Query: 679  NIMGTNPNDIDRAENRHKFSSILDSIDVDQPEWSELTSVEEAKLFASKVNYPVLIRPSYV 738
            +I+GT+   ID AE+R KFS++LD I VDQP W ELTS+E+   F  +V +PVL+RPSYV
Sbjct: 662  HILGTSAQSIDNAEDREKFSAMLDRIGVDQPRWRELTSLEDINEFVDEVGFPVLVRPSYV 721

Query: 739  LSGAAMSVVNNEEELKAKLTLASDVSPDHPVVMSKFIEGAQEIDVDAVAYNGNVLVHAIS 798
            LSGAAM+V +N+EEL+  L LA++VS  HPVV+S+FIE A+E+++DAVA NG ++ +AIS
Sbjct: 722  LSGAAMNVCSNQEELERFLKLAANVSKKHPVVVSQFIEHAKEVEMDAVAQNGEIIAYAIS 781

Query: 799  EHVENAGVHSGDASLVLPPQHLSDDVKIALKDIADKVAKAWKITGPFNMQIIKDGEHTLK 858
            EH+E AGVHSGDA++  PPQ L  +    +K I+ ++AKA  I+GPFN+Q +   ++ +K
Sbjct: 782  EHIEFAGVHSGDATIQFPPQKLYVETVRRIKRISREIAKALNISGPFNIQYLAK-DNDIK 840

Query: 859  VIECNIRASRSFPFVSKVLGVNFIEIAVKAFLGGDIVPKPVDLMLNKKYDYVATKVPQFS 918
            VIECN+RASRSFPFVSKVL +NFIE+A K  LG  +     +L    + DYV  K  QFS
Sbjct: 841  VIECNLRASRSFPFVSKVLKINFIELATKVMLGLPVEKPEKNLF---ELDYVGIKASQFS 897

Query: 919  FTRLAGADPFLGVEMASTGEVASFGRDLIESYWTAIQSTMNFHVPLPPSGILFGGDTSRE 978
            F RL  ADP LGV+MASTGEV   G D   +    +++ ++    +P   IL    T ++
Sbjct: 898  FNRLQKADPVLGVDMASTGEVGCIGSD---TSCAVLKAMLSVGYRIPKKNILLSTGTMKQ 954

Query: 979  Y--LGQVASIVATIGYRIYTTNETTKTYLQEHIKEKNAKVSLIKFPKNDKR-KLRELFQE 1035
               +   A ++   GY+++ T  T KT+ +  I+      +L+ +P  +   +  E+   
Sbjct: 955  KADMMDAARMLVNKGYKLFATGGTHKTFAENGIES-----TLVYWPSEEGHPQALEMLHN 1009

Query: 1036 YDIKAVFNLASKRAESTDDVDYIMRRNAIDFAIPLFNEPQTALLFAKCLKAKIAEKIKIL 1095
             +I  V N+         D  Y +RR AID  +PL    + A  F         + I I 
Sbjct: 1010 KEIDMVVNIPKNLTAGELDNGYKIRRAAIDLNVPLITNARLASAFINAFCTMTVDDIAI- 1068

Query: 1096 ESHDVIVPPEVRSWDEF 1112
                       +SW+E+
Sbjct: 1069 -----------KSWEEY 1074


Lambda     K      H
   0.317    0.134    0.380 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 2754
Number of extensions: 115
Number of successful extensions: 18
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1118
Length of database: 1075
Length adjustment: 46
Effective length of query: 1072
Effective length of database: 1029
Effective search space:  1103088
Effective search space used:  1103088
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 58 (26.9 bits)
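
The headline numbers in the BLAST report above can be cross-checked from its own statistics. The sketch below is not part of GapMind; it assumes the standard Karlin-Altschul conversion from raw score to bits, (lambda * S - ln K) / ln 2, using the gapped Lambda and K values, and it assumes the effective lengths are the reported lengths minus the reported length adjustment. All variable and function names are illustrative.

```python
import math

# Values copied from the BLAST report above.
raw_score = 2850
lambda_gapped = 0.267
k_gapped = 0.0410
identities, positives, gaps, aln_len = 574, 760, 35, 1097
query_len, db_len, length_adjustment = 1118, 1075, 46


def bit_score(raw, lam, k):
    """Convert a raw score to bits: (lambda * S - ln K) / ln 2."""
    return (lam * raw - math.log(k)) / math.log(2)


def percent(part, whole):
    """Percentage of alignment columns, as a float."""
    return 100.0 * part / whole


bits = bit_score(raw_score, lambda_gapped, k_gapped)

# Effective lengths: reported length minus the length adjustment.
eff_query = query_len - length_adjustment
eff_db = db_len - length_adjustment
search_space = eff_query * eff_db

print(f"bit score        ~ {bits:.1f}")
print(f"identity         ~ {percent(identities, aln_len):.1f}%")
print(f"positives        ~ {percent(positives, aln_len):.1f}%")
print(f"gaps             ~ {percent(gaps, aln_len):.1f}%")
print(f"effective space  = {eff_query} x {eff_db} = {search_space}")
```

Under these assumptions the recomputed bit score agrees with the reported 1102 bits to within rounding, and the product of the effective lengths matches the reported effective search space.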

Align candidate 350085 BT0557 (carbamyl phosphate synthetase (NCBI ptt file))
to HMM TIGR01369 (carB: carbamoyl-phosphate synthase, large subunit (EC 6.3.5.5))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/TIGR01369.hmm
# target sequence database:        /tmp/gapView.11614.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01369  [M=1052]
Accession:   TIGR01369
Description: CPSaseII_lrg: carbamoyl-phosphate synthase, large subunit
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                          Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                          -----------
          0 1577.2   0.4          0 1577.0   0.4    1.0  1  lcl|FitnessBrowser__Btheta:350085  BT0557 carbamyl phosphate synthe


Domain annotation for each sequence (and alignments):
>> lcl|FitnessBrowser__Btheta:350085  BT0557 carbamyl phosphate synthetase (NCBI ptt file)
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 ! 1577.0   0.4         0         0       3    1051 ..       2    1057 ..       1    1058 [. 0.98

  Alignments for each domain:
  == domain 1  score: 1577.0 bits;  conditional E-value: 0
                          TIGR01369    3 redikkvlviGsGpivigqAaEFDYsGsqalkalkeegievvLvnsniAtvmtdeeladkvYiePltveaveki 76  
                                         +e+ikkvl++GsG+++ig+A+EFDYsGsqalkalkeegie++L+n+niAtv+t+e +ad++Y++P+t+++vek+
  lcl|FitnessBrowser__Btheta:350085    2 KENIKKVLLLGSGALKIGEAGEFDYSGSQALKALKEEGIETILINPNIATVQTSEGVADQIYFLPVTPYFVEKV 75  
                                         789*********************************************************************** PP

                          TIGR01369   77 iekErpDailltlGGqtaLnlaveleekGvLekygvkllGtkveaikkaedRekFkealkeineevakseives 150 
                                         i+kE+p++i+l++GGqtaLn++v l ++G+Leky+vk+lGt+v+ai+++edRe+F ++l+ein++++kse+ve+
  lcl|FitnessBrowser__Btheta:350085   76 IQKEKPEGIMLAFGGQTALNCGVALYKEGILEKYNVKVLGTPVQAIMDTEDRELFVQKLNEINVKTIKSEAVEN 149 
                                         ************************************************************************** PP

                          TIGR01369  151 veealeaaeeigyPvivRaaftlgGtGsgiaeneeelkelvekalkaspikqvlvekslagwkEiEyEvvRDsk 224 
                                          e+a +aa+e+gyPvivRaa++lgG Gsg+++nee+l  lveka++ s  +qvlveksl gwkE+EyEvvRD++
  lcl|FitnessBrowser__Btheta:350085  150 AEDARRAAKELGYPVIVRAAYALGGLGSGFCDNEEQLDVLVEKAFSFS--PQVLVEKSLRGWKEVEYEVVRDRF 221 
                                         ************************************************..9*********************** PP

                          TIGR01369  225 dnciivcniEnlDplGvHtGdsivvaPsqtLtdkeyqllRdaslkiirelgvegecnvqfaldPeskryvviEv 298 
                                         dnci+vcn+En+DplG+HtG+siv+aPsqtLt++ey++lR+ +++iir++g++gecnvq+a dPes++y+viEv
  lcl|FitnessBrowser__Btheta:350085  222 DNCITVCNMENFDPLGIHTGESIVIAPSQTLTNSEYHKLRELAIRIIRHIGIVGECNVQYAFDPESEDYRVIEV 295 
                                         ************************************************************************** PP

                          TIGR01369  299 npRvsRssALAskAtGyPiAkvaaklavGysLdelkndvtketvAsfEPslDYvvvkiPrwdldkfekvdrklg 372 
                                         n+R+sRssALAskAtGyP+A vaakl++Gy+L++lkn+vtk+t+A+fEP+lDYvv+kiPrwdl kf++vd++lg
  lcl|FitnessBrowser__Btheta:350085  296 NARLSRSSALASKATGYPLAFVAAKLGLGYGLFDLKNSVTKTTSAFFEPALDYVVCKIPRWDLGKFHGVDKELG 369 
                                         ************************************************************************** PP

                          TIGR01369  373 tqmksvGEvmaigrtfeealqkalrsleekllglklkekeaesdeeleealkkpndrRlfaiaealrrgvsvee 446 
                                         ++mksvGEvmaigrtfeea+qk+lr++ ++++g+ ++++ +++d +  +al++p+d+R+f+i +a+r+g+++++
  lcl|FitnessBrowser__Btheta:350085  370 SSMKSVGEVMAIGRTFEEAIQKGLRMIGQGMHGFVENKELVIPDID--KALREPTDKRIFVISKAFRAGYTIDQ 441 
                                         ************************************9999999988..9************************* PP

                          TIGR01369  447 vyeltkidrffleklkklvelekeleee.....klkelkkellkkakklGfsdeqiaklvkvseae......vr 509 
                                         v+eltkid++fl+kl++++++++e++e      ++ +l+ ell+kak +Gfsd qia++++ + +       vr
  lcl|FitnessBrowser__Btheta:350085  442 VHELTKIDKWFLQKLMNIMKTSEEMHEWgnnhkQIADLPVELLRKAKVQGFSDFQIARAIGYEGDMengslyVR 515 
                                         *************************9764433355599**********************965444456667** PP

                          TIGR01369  510 klrkelgivpvvkrvDtvaaEfeaktpYlYstyeeekddvevtek.kkvlvlGsGpiRigqgvEFDycavhavl 582 
                                         k+rk++gi+pvvk++Dt+aaE++a+t+YlY+ty++  +dv++  + k+++vlGsG++Rig++vEFD+c+v+a++
  lcl|FitnessBrowser__Btheta:350085  516 KYRKAAGILPVVKQIDTLAAEYPAQTNYLYLTYSGVANDVHYLGDhKSIVVLGSGAYRIGSSVEFDWCGVQALN 589 
                                         ***************************************998765599************************** PP

                          TIGR01369  583 alreagyktilinynPEtvstDydiadrLyFeeltvedvldiiekekvegvivqlgGqtalnlakeleeagvki 656 
                                         ++r++g ++++inynPEtvstDyd++drLyF+elt+e+v+di+e e+++gviv++gGq+++nla +l++++++i
  lcl|FitnessBrowser__Btheta:350085  590 TIRKEGWRSVMINYNPETVSTDYDMCDRLYFDELTFERVMDILELENPHGVIVSTGGQIPNNLALRLDAQNIHI 663 
                                         ************************************************************************** PP

                          TIGR01369  657 lGtsaesidraEdRekFsklldelgikqpkgkeatsveeakeiakeigyPvlvRpsyvlgGrameiveneeele 730 
                                         lGtsa+sid+aEdRekFs++ld++g++qp+++e+ts+e+++e+++e+g+PvlvRpsyvl+G+am++++n+eele
  lcl|FitnessBrowser__Btheta:350085  664 LGTSAQSIDNAEDREKFSAMLDRIGVDQPRWRELTSLEDINEFVDEVGFPVLVRPSYVLSGAAMNVCSNQEELE 737 
                                         ************************************************************************** PP

                          TIGR01369  731 ryleeavevskekPvlidkyledavEvdvDavadgeevliagileHiEeaGvHsGDstlvlppqklseevkkki 804 
                                         r+l+ a++vsk++Pv++++++e+a+Ev++Dava+++e+++++i+eHiE aGvHsGD+t+++ppqkl  e++++i
  lcl|FitnessBrowser__Btheta:350085  738 RFLKLAANVSKKHPVVVSQFIEHAKEVEMDAVAQNGEIIAYAISEHIEFAGVHSGDATIQFPPQKLYVETVRRI 811 
                                         ************************************************************************** PP

                          TIGR01369  805 keivkkiakelkvkGllniqfvvkdeevyviEvnvRasRtvPfvskalgvplvklavkvllgkkleelekgvkk 878 
                                         k+i+++iak+l+++G++niq+++kd++++viE+n+RasR++Pfvsk+l++++++la+kv+lg  +e+ ek   +
  lcl|FitnessBrowser__Btheta:350085  812 KRISREIAKALNISGPFNIQYLAKDNDIKVIECNLRASRSFPFVSKVLKINFIELATKVMLGLPVEKPEK---N 882 
                                         *******************************************************************665...8 PP

                          TIGR01369  879 ekksklvavkaavfsfsklagvdvvlgpemkstGEvmgigrdleeallkallaskakikkkgsvllsvkdkdke 952 
                                           + ++v++ka++fsf++l+++d+vlg++m+stGEv +ig+d++ a+lka+l+++++i+kk+ +l++++ k+k 
  lcl|FitnessBrowser__Btheta:350085  883 LFELDYVGIKASQFSFNRLQKADPVLGVDMASTGEVGCIGSDTSCAVLKAMLSVGYRIPKKNILLSTGTMKQKA 956 
                                         8999********************************************************************** PP

                          TIGR01369  953 ellelakklaekglkvyategtakvleeagikaevvlkvseea.ekilellkeeeielvinltskkkk.aaekg 1024
                                         +++++a++l++kg+k++at gt+k+++e+gi++++v++ see  +++le+l+++ei++v+n++++ ++ + ++g
  lcl|FitnessBrowser__Btheta:350085  957 DMMDAARMLVNKGYKLFATGGTHKTFAENGIESTLVYWPSEEGhPQALEMLHNKEIDMVVNIPKNLTAgELDNG 1030
                                         **************************************99876699****************9986655899** PP

                          TIGR01369 1025 ykirreaveykvplvteletaeallea 1051
                                         ykirr+a++ +vpl+t++++a+a+++a
  lcl|FitnessBrowser__Btheta:350085 1031 YKIRRAAIDLNVPLITNARLASAFINA 1057
                                         ************************987 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (1052 nodes)
Target sequences:                          1  (1075 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.06u 0.02s 00:00:00.08 Elapsed: 00:00:00.08
# Mc/sec: 13.29
//
[ok]
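
To read the hmmsearch domain table above as numbers, the single domain row can be split on whitespace and the model and target coverage computed from the hmm and ali coordinates. This is a minimal sketch, not GapMind's own parser; the field positions assume the HMMER 3 domain-table layout shown above, and the function name is hypothetical.

```python
# Domain row copied from the hmmsearch output above.
domain_row = (
    "   1 ! 1577.0   0.4         0         0       3    1051 .."
    "       2    1057 ..       1    1058 [. 0.98"
)
model_len = 1052   # from "[M=1052]" on the Query line
target_len = 1075  # from "1075 residues searched"


def parse_domain_row(row):
    """Split one domain-table row into named numeric fields."""
    f = row.split()
    return {
        "score": float(f[2]),
        "hmm_from": int(f[6]),
        "hmm_to": int(f[7]),
        "ali_from": int(f[9]),
        "ali_to": int(f[10]),
        "env_from": int(f[12]),
        "env_to": int(f[13]),
        "acc": float(f[15]),
    }


d = parse_domain_row(domain_row)

# Coordinates are 1-based and inclusive, hence the +1.
model_cov = (d["hmm_to"] - d["hmm_from"] + 1) / model_len
target_cov = (d["ali_to"] - d["ali_from"] + 1) / target_len

print(f"score {d['score']}, model coverage {model_cov:.3f}, "
      f"target coverage {target_cov:.3f}")
```

For this candidate the alignment spans nearly the whole TIGR01369 model and nearly the whole protein.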

This GapMind analysis is from Apr 09 2024. The underlying query database was built on Apr 09 2024.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
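
As an illustration of the 50% coverage floor, the sketch below computes coverage from 1-based, inclusive alignment coordinates, using the BLAST alignment shown earlier on this page (query 26 to 1112 of 1118, subject 3 to 1074 of 1075). It is an assumption that coverage here is measured on the characterized query protein, and GapMind itself uses ublast rather than this BLAST report; the function and constant names are hypothetical.

```python
LOW_CONFIDENCE_MIN_COVERAGE = 0.50  # "at least 50% coverage" in the text above


def aligned_coverage(start, end, seq_len):
    """Fraction of a sequence covered by a 1-based, inclusive range."""
    return (end - start + 1) / seq_len


# Coordinates taken from the first and last Query/Sbjct lines of the alignment.
query_cov = aligned_coverage(26, 1112, 1118)
subject_cov = aligned_coverage(3, 1074, 1075)

# Assumption: the threshold is applied to the characterized (query) protein.
meets_floor = query_cov >= LOW_CONFIDENCE_MIN_COVERAGE

print(f"query coverage   {query_cov:.3f}")
print(f"subject coverage {subject_cov:.3f}")
print(f"meets 50% coverage floor: {meets_floor}")
```

This checks only the coverage floor; it does not reproduce the high- or medium-confidence rules.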

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
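
The definition of a gap given earlier (a step with no high- or medium-confidence candidate) can be sketched as a simple filter. The input mapping, step names, and function name below are hypothetical placeholders, not GapMind output or code.

```python
def find_gaps(best_confidence_by_step):
    """Return the steps whose best candidate is neither high nor medium.

    The values are assumed to be "high", "medium", "low", or None
    (None meaning no candidate was found at all).
    """
    return sorted(
        step
        for step, confidence in best_confidence_by_step.items()
        if confidence not in ("high", "medium")
    )


# Hypothetical example data, for illustration only.
example = {
    "step1": "high",
    "step2": "medium",
    "step3": "low",
    "step4": None,
}

gap_steps = find_gaps(example)
print(gap_steps)
```

A step with only a low-confidence candidate still counts as a gap under this definition, which is why step3 is reported alongside step4.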

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory