GapMind for catabolism of small carbon sources

 

Alignments for a candidate for gcvP in Bacteroides thetaiotaomicron VPI-5482

Align Glycine dehydrogenase (aminomethyl-transferring) (EC 1.4.4.2) (characterized)
to candidate 350675 BT1147 glycine dehydrogenase [decarboxylating] (NCBI ptt file)

Query= reanno::WCS417:GFF4367
         (946 letters)



>FitnessBrowser__Btheta:350675
          Length = 949

 Score =  969 bits (2505), Expect = 0.0
 Identities = 501/936 (53%), Positives = 646/936 (69%), Gaps = 14/936 (1%)

Query: 13  ARHIGPRQEDEQQMLASLGFDSLEALSASVIPESIKGTSVLGLEDGLSEAEALAKIKAIA 72
           +RHIG  +ED   ML  +G DSL+ L    IP +I+    L L   L+E E    I  +A
Sbjct: 8   SRHIGINEEDTAVMLRKIGVDSLDELINKTIPANIRLKEPLALAKPLTEYEFGKHIADLA 67

Query: 73  GKNQLFKTYIGQGYYNCHTPSPILRNLLENPAWYTAYTPYQPEISQGRLEALLNFQTLIS 132
            KN+L+ TYIG G+YN  TP+ I RN+ ENP WYT+YTPYQ E+SQGRLEAL+NFQT + 
Sbjct: 68  SKNKLYTTYIGLGWYNTITPAVIQRNVFENPVWYTSYTPYQTEVSQGRLEALMNFQTAVC 127

Query: 133 DLTGLPIANASLLDEATAAAEAMTFC----KRLSKNKGSNAFFASIHSHPQTLDVLRTRA 188
           DLT +P+AN SLLDEATAAAEA+T       R  +  G+N  F   +  PQTL V+ TRA
Sbjct: 128 DLTAMPLANCSLLDEATAAAEAVTMMYALRSRTQQKAGANVVFVDENIFPQTLAVMTTRA 187

Query: 189 EPLGIDVVVGDERELTDVSPFFGALLQYPASNGDVFDYRELTERFHAAHGLVAVAADLLA 248
            P GI++ VG  +E       F  +LQYP S+G+V DY + T++ H A   VAVAAD+L+
Sbjct: 188 IPQGIELRVGKYKEFEPSPEIFACILQYPNSSGNVEDYADFTKKAHEADCKVAVAADILS 247

Query: 249 LTLLTPPGEFGADVAIGSAQRFGVPLGFGGPHAAYFSTKDAFKRDMPGRLVGVSVDRFGK 308
           L LLTPPGE+GAD+  G+ QR G P+ +GGP A YF+T+D +KR+MPGR++G S D++GK
Sbjct: 248 LALLTPPGEWGADIVFGTTQRLGTPMFYGGPSAGYFATRDEYKRNMPGRIIGWSKDKYGK 307

Query: 309 PALRLAMQTREQHIRREKATSNICTAQVLLANIASMYAVYHGPKGLTQIARRVHQLTAIL 368
              R+A+QTREQHI+REKATSNICTAQ LLA +A  YAVYHG +G+  IA R+H +T  L
Sbjct: 308 LCYRMALQTREQHIKREKATSNICTAQALLATMAGFYAVYHGQEGIKTIASRIHSITVFL 367

Query: 369 AKGLTALGQNVEQAHFFDTLTLNTGANTAA--VHDKARAQRINLRVVDAERVGVSVDETT 426
            K L   G     A +FDTL      + +A  +   A ++ +NLR  +   VG S+DETT
Sbjct: 368 DKQLKKFGYTQVNAQYFDTLRFELPEHVSAQQIRTIALSKEVNLRYYENGDVGFSIDETT 427

Query: 427 TQADIETLWAIFA-----DGKALPDFAAQVESTLPAALLRQSPVLSHPVFNRYHSETELM 481
             A    L +IFA     D + + D   +  S +  AL R +P L+H VF+ YH+ETE+M
Sbjct: 428 DIAATNVLLSIFAIAAGKDYQKVEDVPEK--SNIDKALKRTTPFLTHEVFSNYHTETEMM 485

Query: 482 RYLRKLADKDLALDRTMIPLGSCTMKLNAASEMIPVTWAEFGALHPFAPAEQSAGYLELT 541
           RY+++L  KD++L ++MI LGSCTMKLNAA+EM+P++  EF ++HP  P +Q+ GY EL 
Sbjct: 486 RYIKRLDRKDISLAQSMISLGSCTMKLNAAAEMLPLSRPEFMSMHPLVPEDQAEGYRELI 545

Query: 542 SDLEAMLCAATGYDAISLQPNAGSQGEYAGLLAIRAYHQSRGDERRDICLIPSSAHGTNP 601
           S+L   L   TG+  +SLQPN+G+ GEYAGL  IRAY +S G   R+  LIP+SAHGTNP
Sbjct: 546 SNLSEDLKVITGFAGVSLQPNSGAAGEYAGLRVIRAYLESIGQGHRNKILIPASAHGTNP 605

Query: 602 ATANMAGMRVVVTACDARGNVDIEDLRAKAIEHRDHLAALMITYPSTHGVFEEGIREICG 661
           A+A  AG   V  ACD +GNVD+ DLRAKA E+++ LAALMITYPSTHG+FE  I+EIC 
Sbjct: 606 ASAIQAGFETVTCACDEQGNVDMGDLRAKAEENKEALAALMITYPSTHGIFETEIKEICE 665

Query: 662 IIHDNGGQVYIDGANMNAMVGLCAPGKFGGDVSHLNLHKTFCIPHGGGGPGVGPIGVKSH 721
           IIH  G QVY+DGANMNA VGL  PG  G DV HLNLHKTF  PHGGGGPGVGPI V  H
Sbjct: 666 IIHACGAQVYMDGANMNAQVGLTNPGFIGADVCHLNLHKTFASPHGGGGPGVGPICVAEH 725

Query: 722 LTPFLPGHAAMERKEGAVCAAPFGSASILPITWMYISMMGGAGLKRASQLAILNANYISR 781
           L PFLPGH+     +  V AAPFGSA ILPIT+ YI MMG  GL +A+++AILNANY++ 
Sbjct: 726 LVPFLPGHSIFGSTQNQVSAAPFGSAGILPITYGYIRMMGTEGLTQATKIAILNANYLAA 785

Query: 782 RLEEHYPVLYTGSNGLVAHECILDLRPLKDSSGISVDDVAKRLIDFGFHAPTMSFPVAGT 841
            L++ Y ++Y G+ G V HE IL+ R + + +GIS +D+AKRL+D+G+HAPT+SFPV GT
Sbjct: 786 CLKDTYGIVYRGATGFVGHEMILECRKVHEETGISENDIAKRLMDYGYHAPTLSFPVHGT 845

Query: 842 LMIEPTESESKEELDRFCNAMIAIREEIRAVENGTLDKDDNPLKNAPHTAAELVSE-WTH 900
           LMIEPTESES  ELD F + M+ I +EI+ V+N   DK+DN L NAPH   E+V++ W H
Sbjct: 846 LMIEPTESESLAELDNFVDVMLNIWKEIQEVKNEEADKNDNVLINAPHPEYEIVNDNWEH 905

Query: 901 PYTREQAVYPVPSLIEGKYWPPVGRVDNVFGDRNLV 936
            YTRE+A YP+ S+ E K+W  V RVDN  GDR L+
Sbjct: 906 SYTREKAAYPIESVRENKFWVNVARVDNTLGDRKLL 941


Lambda     K      H
   0.319    0.135    0.398 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1951
Number of extensions: 68
Number of successful extensions: 5
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 946
Length of database: 949
Length adjustment: 44
Effective length of query: 902
Effective length of database: 905
Effective search space:   816310
Effective search space used:   816310
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 57 (26.6 bits)
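
The summary figures in the BLAST report above can be re-derived from the numbers it prints. The following is a minimal sketch, not part of GapMind: the variable and function names are illustrative, and the bit-score formula is the standard Karlin-Altschul conversion, applied here with the rounded gapped Lambda and K shown in the report, so it only reproduces the reported 969 bits to within rounding.

```python
import math

# Values copied from the BLAST report above.
identities, positives, gaps, aligned_columns = 501, 646, 14, 936
raw_score = 2505
gapped_lambda, gapped_k = 0.267, 0.0410
query_len, db_len, length_adjustment = 946, 949, 44


def percent(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


def bit_score(raw, lam, k):
    """Karlin-Altschul conversion of a raw score to bits."""
    return (lam * raw - math.log(k)) / math.log(2)


identity_pct = percent(identities, aligned_columns)
positive_pct = percent(positives, aligned_columns)
gap_pct = percent(gaps, aligned_columns)

# Effective lengths are the raw lengths minus the length adjustment.
effective_query_len = query_len - length_adjustment
effective_db_len = db_len - length_adjustment
search_space = effective_query_len * effective_db_len

bits = bit_score(raw_score, gapped_lambda, gapped_k)

print(f"identity {identity_pct:.1f}%, positives {positive_pct:.1f}%, gaps {gap_pct:.1f}%")
print(f"effective search space {search_space}")
print(f"approximate bit score {bits:.1f}")
```

The percentages printed in the report (53%, 69%, 1%) match these values with the fractional part dropped, and the product of the two effective lengths matches the reported effective search space.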

Align candidate 350675 BT1147 (glycine dehydrogenase [decarboxylating] (NCBI ptt file))
to HMM TIGR00461 (gcvP: glycine dehydrogenase (EC 1.4.4.2))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR00461.hmm
# target sequence database:        /tmp/gapView.29367.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR00461  [M=939]
Accession:   TIGR00461
Description: gcvP: glycine dehydrogenase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                          Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                          -----------
          0 1296.4   0.1          0 1296.2   0.1    1.0  1  lcl|FitnessBrowser__Btheta:350675  BT1147 glycine dehydrogenase [de


Domain annotation for each sequence (and alignments):
>> lcl|FitnessBrowser__Btheta:350675  BT1147 glycine dehydrogenase [decarboxylating] (NCBI ptt file)
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 ! 1296.2   0.1         0         0       1     936 [.       9     941 ..       9     944 .. 0.97

  Alignments for each domain:
  == domain 1  score: 1296.2 bits;  conditional E-value: 0
                          TIGR00461   1 rhlGpdeaeqkkmlktlGfddlnalieqlvpkdirlarplkleapakeyealaelkkiasknkkvksyiGkGyyat 76 
                                        rh+G +e +   ml+ +G+d+l++li++ +p +irl+ pl l  p +eye  +++  +asknk +++yiG G+y+t
  lcl|FitnessBrowser__Btheta:350675   9 RHIGINEEDTAVMLRKIGVDSLDELINKTIPANIRLKEPLALAKPLTEYEFGKHIADLASKNKLYTTYIGLGWYNT 84 
                                        9*************************************************************************** PP

                          TIGR00461  77 ilppviqrnllenpgwytaytpyqpeisqGrleallnfqtvvldltGlevanaslldegtaaaeamalsfrv...s 149
                                        i+p viqrn++enp wyt+ytpyq+e+sqGrleal+nfqt v dlt +++an sllde+taaaea+ + + +   +
  lcl|FitnessBrowser__Btheta:350675  85 ITPAVIQRNVFENPVWYTSYTPYQTEVSQGRLEALMNFQTAVCDLTAMPLANCSLLDEATAAAEAVTMMYALrsrT 160
                                        *****************************************************************99999874222 PP

                          TIGR00461 150 kkk..ankfvvakdvhpqtlevvktraeplgievivddaskvkkavdvlGvllqypatdGeildykalidelksrk 223
                                        ++k  an  +v++++ pqtl v+ tra p gie+ v+  ++++ + +++ ++lqyp ++G++ dy ++++++++  
  lcl|FitnessBrowser__Btheta:350675 161 QQKagANVVFVDENIFPQTLAVMTTRAIPQGIELRVGKYKEFEPSPEIFACILQYPNSSGNVEDYADFTKKAHEAD 236
                                        344468999******************************************************************* PP

                          TIGR00461 224 alvsvaadllaltlltppgklGadivlGsaqrfGvplGyGGphaaffavkdeykrklpGrivGvskdalGntalrl 299
                                          v+vaad+l+l+lltppg+ Gadiv+G++qr+G p+ yGGp a +fa++deykr++pGri+G skd+ G+   r+
  lcl|FitnessBrowser__Btheta:350675 237 CKVAVAADILSLALLTPPGEWGADIVFGTTQRLGTPMFYGGPSAGYFATRDEYKRNMPGRIIGWSKDKYGKLCYRM 312
                                        **************************************************************************** PP

                          TIGR00461 300 alqtreqhirrdkatsnictaqvllanvaslyavyhGpkGlkniarrifrltsilaaglkrknyelrnktyfdtlt 375
                                        alqtreqhi+r+katsnictaq+lla++a  yavyhG  G+k ia ri+++t+ l + lk+ +y   n++yfdtl+
  lcl|FitnessBrowser__Btheta:350675 313 ALQTREQHIKREKATSNICTAQALLATMAGFYAVYHGQEGIKTIASRIHSITVFLDKQLKKFGYTQVNAQYFDTLR 388
                                        **************************************************************************** PP

                          TIGR00461 376 vevgekaase.vlkaaeeaeinlravvltevgialdetttkedvldllkvlagkdnlglsseelsedv.ansfpae 449
                                         e+ e+++++ + + a ++e+nlr    ++vg+++dett  +    ll ++a+    g + +++++   ++ ++++
  lcl|FitnessBrowser__Btheta:350675 389 FELPEHVSAQqIRTIALSKEVNLRYYENGDVGFSIDETTDIAATNVLLSIFAIAA--GKDYQKVEDVPeKSNIDKA 462
                                        *****9988779999**************************************66..6666666543314479*** PP

                          TIGR00461 450 llrddeilrdevfnryhsetellrylhrleskdlalnqsmiplGsctmklnataemlpitwpefaeihpfapaeqv 525
                                        l+r+  +l++evf +yh+ete++ry++rl++kd++l+qsmi lGsctmklna+aemlp++ pef  +hp+ p +q+
  lcl|FitnessBrowser__Btheta:350675 463 LKRTTPFLTHEVFSNYHTETEMMRYIKRLDRKDISLAQSMISLGSCTMKLNAAAEMLPLSRPEFMSMHPLVPEDQA 538
                                        **************************************************************************** PP

                          TIGR00461 526 eGykeliaqlekwlveitGfdaislqpnsGaqGeyaGlrvirsyhesrgeehrniclipasahGtnpasaamaGlk 601
                                        eGy+eli++l ++l  itGf ++slqpnsGa GeyaGlrvir y+es g++hrn  lipasahGtnpasa  aG++
  lcl|FitnessBrowser__Btheta:350675 539 EGYRELISNLSEDLKVITGFAGVSLQPNSGAAGEYAGLRVIRAYLESIGQGHRNKILIPASAHGTNPASAIQAGFE 614
                                        **************************************************************************** PP

                          TIGR00461 602 vvpvkcdkeGnidlvdlkakaekagdelaavmvtypstyGvfeetirevidivhrfGGqvyldGanmnaqvGltsp 677
                                         v+ +cd++Gn+d+ dl+akae++ + laa+m+typst+G+fe+ i+e+++i+h  G qvy+dGanmnaqvGlt+p
  lcl|FitnessBrowser__Btheta:350675 615 TVTCACDEQGNVDMGDLRAKAEENKEALAALMITYPSTHGIFETEIKEICEIIHACGAQVYMDGANMNAQVGLTNP 690
                                        **************************************************************************** PP

                          TIGR00461 678 gdlGadvchlnlhktfsiphGGGGpgmgpigvkshlapflpktdlvsvvelegesksigavsaapyGsasilpisy 753
                                        g++Gadvchlnlhktf+ phGGGGpg+gpi+v  hl+pflp++++           ++++vsaap+Gsa ilpi+y
  lcl|FitnessBrowser__Btheta:350675 691 GFIGADVCHLNLHKTFASPHGGGGPGVGPICVAEHLVPFLPGHSIFG--------STQNQVSAAPFGSAGILPITY 758
                                        ******************************************76643........34678**************** PP

                          TIGR00461 754 myikmmGaeGlkkasevailnanylakrlkdaykilfvgrdervahecildlrelkekagiealdvakrlldyGfh 829
                                         yi+mmG+eGl++a+++ailnanyla+ lkd+y i+++g  + v he+il+ r++ e++gi++ d+akrl+dyG+h
  lcl|FitnessBrowser__Btheta:350675 759 GYIRMMGTEGLTQATKIAILNANYLAACLKDTYGIVYRGATGFVGHEMILECRKVHEETGISENDIAKRLMDYGYH 834
                                        **************************************************************************** PP

                          TIGR00461 830 aptlsfpvaGtlmveptesesleeldrfidamiaikeeidavkaGeiklednilknaphslqslivaewadpysre 905
                                        aptlsfpv Gtlm+eptesesl eld f+d m+ i +ei++vk+ e +++dn+l naph   + +  +w ++y+re
  lcl|FitnessBrowser__Btheta:350675 835 APTLSFPVHGTLMIEPTESESLAELDNFVDVMLNIWKEIQEVKNEEADKNDNVLINAPHPEYEIVNDNWEHSYTRE 910
                                        ***********************************************************8888888889999**** PP

                          TIGR00461 906 eaaypapvlkyfkfwptvarlddtyGdrnlv 936
                                        +aayp+  ++++kfw  var+d+t Gdr+l+
  lcl|FitnessBrowser__Btheta:350675 911 KAAYPIESVRENKFWVNVARVDNTLGDRKLL 941
                                        *****************************98 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (939 nodes)
Target sequences:                          1  (949 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.06u 0.03s 00:00:00.09 Elapsed: 00:00:00.10
# Mc/sec: 8.67
//
[ok]
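
The domain table row above can be read programmatically to see how much of the HMM and of the candidate protein the match covers. This is a minimal sketch that assumes the whitespace-separated column order shown in the header of that table; the function and key names are illustrative and are not part of HMMER or GapMind.

```python
# The single domain row from the hmmsearch output above.
domain_row = (
    "   1 ! 1296.2   0.1         0         0       1     936 [.       9"
    "     941 ..       9     944 .. 0.97"
)
model_length = 939    # from "Query: TIGR00461 [M=939]"
target_length = 949   # from "(949 residues searched)"


def parse_domain_row(row):
    """Split one domain row into named fields, following the table header."""
    f = row.split()
    return {
        "score": float(f[2]),
        "bias": float(f[3]),
        "hmm_from": int(f[6]),
        "hmm_to": int(f[7]),
        "ali_from": int(f[9]),
        "ali_to": int(f[10]),
        "env_from": int(f[12]),
        "env_to": int(f[13]),
        "acc": float(f[15]),
    }


d = parse_domain_row(domain_row)

# Fraction of the model and of the protein spanned by the aligned region.
hmm_coverage = (d["hmm_to"] - d["hmm_from"] + 1) / model_length
target_coverage = (d["ali_to"] - d["ali_from"] + 1) / target_length

print(f"score {d['score']} bits, accuracy {d['acc']}")
print(f"HMM coverage {hmm_coverage:.3f}, protein coverage {target_coverage:.3f}")
```

For this candidate the alignment spans nearly the entire model (positions 1 to 936 of 939) and nearly the entire protein (positions 9 to 941 of 949).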

This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFAMs). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
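
As an illustration of the 50% coverage bar, the sketch below computes alignment coverage from the coordinates of the BLAST alignment shown earlier on this page. It is an assumption that coverage means the fraction of a sequence spanned by the aligned region; the text here does not define it, and the high- and medium-confidence rules are not reproduced. All names are illustrative.

```python
LOW_CONFIDENCE_MIN_COVERAGE = 0.50  # "at least 50% coverage" from the text above


def aligned_coverage(start, end, length):
    """Fraction of a sequence spanned by an alignment (1-based, inclusive)."""
    if not (1 <= start <= end <= length):
        raise ValueError("alignment coordinates fall outside the sequence")
    return (end - start + 1) / length


# Coordinates from the BLAST alignment: Query 13..936 of 946, Sbjct 8..941 of 949.
query_cov = aligned_coverage(13, 936, 946)
subject_cov = aligned_coverage(8, 941, 949)

# Assumed interpretation: coverage of the characterized (query) protein.
meets_low_bar = query_cov >= LOW_CONFIDENCE_MIN_COVERAGE

print(f"query coverage {query_cov:.3f}, candidate coverage {subject_cov:.3f}")
print(f"meets the 50% coverage bar: {meets_low_bar}")
```

Under this assumed definition the hit clears the coverage bar by a wide margin; whether it also qualifies as high or medium confidence depends on the rules for those levels.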

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory