GapMind for catabolism of small carbon sources

Alignments for a candidate for gcvP in Cupriavidus basilensis 4G11

Align Glycine dehydrogenase (aminomethyl-transferring) (EC 1.4.4.2) (characterized)
to candidate RR42_RS20070 RR42_RS20070 glycine dehydrogenase

Query= reanno::pseudo13_GW456_L13:PfGW456L13_1868
         (950 letters)



>FitnessBrowser__Cup4G11:RR42_RS20070
          Length = 973

 Score = 1251 bits (3236), Expect = 0.0
 Identities = 626/957 (65%), Positives = 749/957 (78%), Gaps = 11/957 (1%)

Query: 2   TQINLTTANEFIARHIGPRAGDEQAMLNSLGFDSLEALSASVIPDSIKGTSVLGLED--- 58
           T   L   + F +RHIGP A ++Q ML  LG+D+  AL  +VIP +I+    + L +   
Sbjct: 16  TLAELEARDAFASRHIGPDAAEQQHMLKVLGYDNRAALIDAVIPAAIRRRDGMPLGEFTA 75

Query: 59  GLSEADALALIKSIATKNQLFKTFIGQGYYGTHTPSPILRNLLENPAWYTAYTPYQPEIS 118
            L+E  ALA ++ +A+KN++ K+FIGQGYY T TP  +LRN+ ENPAWYTAYTPYQPEIS
Sbjct: 76  PLTEEAALAKLRGLASKNRVLKSFIGQGYYNTLTPGVVLRNIFENPAWYTAYTPYQPEIS 135

Query: 119 QGRLEALLNFQTLISDLTGLPIANASLLDEATAAAEAMTFCKRLSKNKGSHAFFASVHCH 178
           QGRLEA+LNFQ +++DLTGL IANAS+LDE TAAAEAMT  +R++K+  S  F+ +    
Sbjct: 136 QGRLEAMLNFQQMVTDLTGLDIANASMLDEGTAAAEAMTLLQRVNKH-ASTTFYVADDVL 194

Query: 179 PQTLDVLRTRAEPLGIDVVVGDERELTDVTPFFGALLQYPASNGDVFDYRELTERFHAAN 238
           PQTL+V+RTRA PLGI+V VG   E      F G LLQYP  NGDV DYR + +  HAA 
Sbjct: 195 PQTLEVVRTRALPLGIEVKVGPAAEAAGAHAF-GVLLQYPGVNGDVADYRAIADAVHAAG 253

Query: 239 ALVAVAADLLALTVLTAPGEFGADVAIGSAQRFGVPLGFGGPHAAYFSTKDAFKRDMPGR 298
            LV  AADLLALT++ APGE+GADVA+G++QRFGVPLGFGGPHA Y + KDAFKR MPGR
Sbjct: 254 GLVVAAADLLALTLIAAPGEWGADVAVGNSQRFGVPLGFGGPHAGYMAVKDAFKRSMPGR 313

Query: 299 LVGVSVDRFGNPALRLAMQTREQHIRREKATSNICTAQVLLANIASMYAVYHGPKGLTQI 358
           LVGV++D  GN A RLA+QTREQHIRREKATSNICTAQVLLA +ASMYAVYHGP+GL +I
Sbjct: 314 LVGVTIDAQGNKAYRLALQTREQHIRREKATSNICTAQVLLAVMASMYAVYHGPQGLKRI 373

Query: 359 ANRVHHLTAILAKGLSALGLSVEQASFFDTLTVKAGAQTAALHDKAHAQRINLRVVDGER 418
           A RVH LTA LA GL  LG +   A+FFDTLT++ G  T ALH  A A+ INLR     R
Sbjct: 374 AQRVHRLTATLAGGLEQLGYARTNATFFDTLTLETGFNTEALHASATARGINLRHAGATR 433

Query: 419 LGLSLDETTTQADVETLWSLLSDGKALPDFAALAASVQSAIPATLVRQSPILSHPVFNRY 478
           +G+SLDET ++ DV  L  + + GK +P F AL A+ Q A PA L RQS  L+HPVFN +
Sbjct: 434 IGISLDETASREDVVALLEIFAHGKPVPGFDALEAAAQDAFPAGLARQSAYLTHPVFNTH 493

Query: 479 HSETELMRYLRKLADKDLALDRTMIPLGSCTMKLNAASEMIPVTWAEFGALHPFAPAEQS 538
           H+E E++RYLR LADKDLALDRTMIPLGSCTMKLNA SEMIPVTW EF  +HPFAP +Q+
Sbjct: 494 HAEHEMLRYLRMLADKDLALDRTMIPLGSCTMKLNATSEMIPVTWPEFSKIHPFAPLDQT 553

Query: 539 AGYQQLTDELEAMLCAATGYDSVSLQPNAGSQGEYAGLLAIRAYHQSRGEDRRDICLIPS 598
            GY+++ D+LEAMLCAATGY +VSLQPNAGSQGEYAGLL I AYH SRGE  RDICLIPS
Sbjct: 554 VGYREMIDQLEAMLCAATGYAAVSLQPNAGSQGEYAGLLIIHAYHASRGESHRDICLIPS 613

Query: 599 SAHGTNPATAQMAGMRVVVTACDARGNVDIEDLRAKAIEHREHLAALMITYPSTHGVFEE 658
           SAHGTNPA+AQMAGM+VVV ACD  GNVD+EDL  KA +H ++LAA+MITYPSTHGVFE+
Sbjct: 614 SAHGTNPASAQMAGMKVVVVACDENGNVDLEDLAKKAEQHSKNLAAIMITYPSTHGVFEQ 673

Query: 659 GIREICGIIHDNGGQVYIDGANMNAMVGLCAPGKFGGDVSHLNLHKTFCIPHGGGGPGVG 718
           G+++IC I+H +GGQVY+DGANMNAMVG  APG+FGGDVSHLNLHKTFCIPHGGGGPGVG
Sbjct: 674 GVQQICHIVHKHGGQVYVDGANMNAMVGTAAPGQFGGDVSHLNLHKTFCIPHGGGGPGVG 733

Query: 719 PIGVKSHLAPFLP-----GHAQMERKEGAVCAAPFGSASILPITWMYIRMMGGAGLKRAS 773
           P+ V +HLA FLP     G+ + ++  G V AAPFGSASILPI+WMYI MMG AGL  A+
Sbjct: 734 PVAVGAHLADFLPNQDSVGYRRDDQGIGGVSAAPFGSASILPISWMYIAMMGSAGLTAAT 793

Query: 774 QLAILNANYISRRLEEHYPVLYTGSNGLVAHECILDLRPLKDSSGISVDDVAKRLIDFGF 833
           + AIL ANY++RRL  H+PVLYTG +GLVAHECILD+R L+ ++GIS +DVAKRL+D+GF
Sbjct: 794 ENAILAANYVARRLSPHFPVLYTGQHGLVAHECILDVRALQKTTGISNEDVAKRLMDYGF 853

Query: 834 HAPTMSFPVAGTLMIEPTESESKEELDRFCDAMIRIREEIRAVENGTLDKDDNPLKNAPH 893
           HAPTMSFPV GTLMIEPTESE+  ELDRF DAMI IR EI  VE+GT D++DNPLKNAPH
Sbjct: 854 HAPTMSFPVPGTLMIEPTESEALHELDRFIDAMIAIRAEIARVEDGTFDREDNPLKNAPH 913

Query: 894 TAKELVGE-WSHPYSREQAVYPVASLIEGKYWPPVGRVDNVFGDRNLVCACPSIESY 949
           TA  +  + W H Y+R++A YPVA+L   KYWPPVGR DNV+GDRNL CAC  +  Y
Sbjct: 914 TAAVITADVWEHKYTRQEAAYPVAALRTQKYWPPVGRADNVYGDRNLFCACVPMSEY 970


Lambda     K      H
   0.319    0.135    0.398 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 2315
Number of extensions: 89
Number of successful extensions: 8
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 950
Length of database: 973
Length adjustment: 44
Effective length of query: 906
Effective length of database: 929
Effective search space:   841674
Effective search space used:   841674
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 57 (26.6 bits)
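
The figures in the BLAST report above can be cross-checked by hand. The following is a minimal sketch, not part of GapMind or BLAST, that recomputes the effective search space from the reported lengths and length adjustment, converts the raw score (3236) to bits with the standard Karlin-Altschul relation S' = (Lambda*S - ln K)/ln 2 using the gapped Lambda and K, and recomputes the identity and positives percentages. The function names are illustrative, and because Lambda and K are printed to only a few digits, the bit score is reproduced only approximately.

```python
import math

def effective_search_space(query_len, db_len, length_adjustment, num_seqs=1):
    """Effective search space: both lengths are shortened by the length adjustment."""
    eff_query = query_len - length_adjustment
    eff_db = db_len - num_seqs * length_adjustment
    return eff_query * eff_db

def bit_score(raw_score, lam, k):
    """Karlin-Altschul bit score from a raw score and the (gapped) Lambda and K."""
    return (lam * raw_score - math.log(k)) / math.log(2)

def percent(numerator, denominator):
    """Simple percentage, e.g. identities / alignment length."""
    return 100.0 * numerator / denominator

# Values copied from the report above.
space = effective_search_space(query_len=950, db_len=973, length_adjustment=44)
bits = bit_score(raw_score=3236, lam=0.267, k=0.0410)
identity = percent(626, 957)
positives = percent(749, 957)

print(space)             # effective search space
print(round(bits))       # bit score, rounded
print(round(identity))   # percent identity, rounded
print(round(positives))  # percent positives, rounded
```

With the values above, the effective lengths are 906 and 929, whose product is the reported effective search space of 841674.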

Align candidate RR42_RS20070 RR42_RS20070 (glycine dehydrogenase)
to HMM TIGR00461 (gcvP: glycine dehydrogenase (EC 1.4.4.2))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR00461.hmm
# target sequence database:        /tmp/gapView.7975.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR00461  [M=939]
Accession:   TIGR00461
Description: gcvP: glycine dehydrogenase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                 Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                 -----------
          0 1461.0   0.4          0 1460.8   0.4    1.0  1  lcl|FitnessBrowser__Cup4G11:RR42_RS20070  RR42_RS20070 glycine dehydrogena


Domain annotation for each sequence (and alignments):
>> lcl|FitnessBrowser__Cup4G11:RR42_RS20070  RR42_RS20070 glycine dehydrogenase
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 ! 1460.8   0.4         0         0       1     939 []      29     964 ..      29     964 .. 0.98

  Alignments for each domain:
  == domain 1  score: 1460.8 bits;  conditional E-value: 0
                                 TIGR00461   1 rhlGpdeaeqkkmlktlGfddlnalieqlvpkdirlarplkl...eapakeyealaelkkiasknkkvk 66 
                                               rh+Gpd+aeq++mlk lG+d+  ali+ ++p +ir +  + l    ap +e +ala+l+ +askn+++k
  lcl|FitnessBrowser__Cup4G11:RR42_RS20070  29 RHIGPDAAEQQHMLKVLGYDNRAALIDAVIPAAIRRRDGMPLgefTAPLTEEAALAKLRGLASKNRVLK 97 
                                               9***********************************98775522279999******************* PP

                                 TIGR00461  67 syiGkGyyatilppviqrnllenpgwytaytpyqpeisqGrleallnfqtvvldltGlevanaslldeg 135
                                               s+iG+Gyy+t++p v++rn++enp wytaytpyqpeisqGrlea+lnfq++v+dltGl++anas+ldeg
  lcl|FitnessBrowser__Cup4G11:RR42_RS20070  98 SFIGQGYYNTLTPGVVLRNIFENPAWYTAYTPYQPEISQGRLEAMLNFQQMVTDLTGLDIANASMLDEG 166
                                               ********************************************************************* PP

                                 TIGR00461 136 taaaeamalsfrvskkkankfvvakdvhpqtlevvktraeplgievivddaskvkkavdvlGvllqypa 204
                                               taaaeam l +rv k+ +  f+va+dv pqtlevv+tra plgiev v+ a +     + +Gvllqyp+
  lcl|FitnessBrowser__Cup4G11:RR42_RS20070 167 TAAAEAMTLLQRVNKHASTTFYVADDVLPQTLEVVRTRALPLGIEVKVGPAAEAAG-AHAFGVLLQYPG 234
                                               *************************************************9999765.568********* PP

                                 TIGR00461 205 tdGeildykalidelksrkalvsvaadllaltlltppgklGadivlGsaqrfGvplGyGGphaaffavk 273
                                                +G++ dy+a+ d+++    lv  aadllaltl+ +pg+ Gad+++G +qrfGvplG+GGpha ++avk
  lcl|FitnessBrowser__Cup4G11:RR42_RS20070 235 VNGDVADYRAIADAVHAAGGLVVAAADLLALTLIAAPGEWGADVAVGNSQRFGVPLGFGGPHAGYMAVK 303
                                               ********************************************************************* PP

                                 TIGR00461 274 deykrklpGrivGvskdalGntalrlalqtreqhirrdkatsnictaqvllanvaslyavyhGpkGlkn 342
                                               d +kr++pGr+vGv+ da+Gn a rlalqtreqhirr+katsnictaqvlla++as+yavyhGp+Glk+
  lcl|FitnessBrowser__Cup4G11:RR42_RS20070 304 DAFKRSMPGRLVGVTIDAQGNKAYRLALQTREQHIRREKATSNICTAQVLLAVMASMYAVYHGPQGLKR 372
                                               ********************************************************************* PP

                                 TIGR00461 343 iarrifrltsilaaglkrknyelrnktyfdtltvevgekaasevlkaaeeaeinlravvltevgialde 411
                                               ia+r++rlt+ la gl++ +y  +n+t+fdtlt+e g ++  ++  +a +++inlr   +t +gi+lde
  lcl|FitnessBrowser__Cup4G11:RR42_RS20070 373 IAQRVHRLTATLAGGLEQLGYARTNATFFDTLTLETGFNTE-ALHASATARGINLRHAGATRIGISLDE 440
                                               ************************************98876.89************************* PP

                                 TIGR00461 412 tttkedvldllkvlagkdnlglsseelsedvansfpaellrddeilrdevfnryhsetellrylhrles 480
                                               t+++edv+ ll+++a  +      + l+   +++fpa l r++ +l+++vfn++h+e e+lryl  l  
  lcl|FitnessBrowser__Cup4G11:RR42_RS20070 441 TASREDVVALLEIFAHGKP-VPGFDALEAAAQDAFPAGLARQSAYLTHPVFNTHHAEHEMLRYLRMLAD 508
                                               **************98553.336799******************************************* PP

                                 TIGR00461 481 kdlalnqsmiplGsctmklnataemlpitwpefaeihpfapaeqveGykeliaqlekwlveitGfdais 549
                                               kdlal+++miplGsctmklnat em+p+twpef++ihpfap +q+ Gy+e+i qle+ l+  tG+ a+s
  lcl|FitnessBrowser__Cup4G11:RR42_RS20070 509 KDLALDRTMIPLGSCTMKLNATSEMIPVTWPEFSKIHPFAPLDQTVGYREMIDQLEAMLCAATGYAAVS 577
                                               ********************************************************************* PP

                                 TIGR00461 550 lqpnsGaqGeyaGlrvirsyhesrgeehrniclipasahGtnpasaamaGlkvvpvkcdkeGnidlvdl 618
                                               lqpn+G+qGeyaGl +i  yh srge hr+iclip sahGtnpasa+maG+kvv+v+cd++Gn+dl+dl
  lcl|FitnessBrowser__Cup4G11:RR42_RS20070 578 LQPNAGSQGEYAGLLIIHAYHASRGESHRDICLIPSSAHGTNPASAQMAGMKVVVVACDENGNVDLEDL 646
                                               ********************************************************************* PP

                                 TIGR00461 619 kakaekagdelaavmvtypstyGvfeetirevidivhrfGGqvyldGanmnaqvGltspgdlGadvchl 687
                                                +kae+++++laa+m+typst+Gvfe++++++++ivh+ GGqvy+dGanmna vG ++pg++G dv+hl
  lcl|FitnessBrowser__Cup4G11:RR42_RS20070 647 AKKAEQHSKNLAAIMITYPSTHGVFEQGVQQICHIVHKHGGQVYVDGANMNAMVGTAAPGQFGGDVSHL 715
                                               ********************************************************************* PP

                                 TIGR00461 688 nlhktfsiphGGGGpgmgpigvkshlapflpktdlvsvvelegesksigavsaapyGsasilpisymyi 756
                                               nlhktf+iphGGGGpg+gp++v +hla flp+   ++ v  +  ++ ig vsaap+Gsasilpis+myi
  lcl|FitnessBrowser__Cup4G11:RR42_RS20070 716 NLHKTFCIPHGGGGPGVGPVAVGAHLADFLPN---QDSVGYRRDDQGIGGVSAAPFGSASILPISWMYI 781
                                               ********************************...889999**************************** PP

                                 TIGR00461 757 kmmGaeGlkkasevailnanylakrlkdaykilfvgrdervahecildlrelkekagiealdvakrlld 825
                                               +mmG+ Gl+ a+e ail+any+a+rl  ++++l++g+++ vahecild+r l++ +gi+++dvakrl+d
  lcl|FitnessBrowser__Cup4G11:RR42_RS20070 782 AMMGSAGLTAATENAILAANYVARRLSPHFPVLYTGQHGLVAHECILDVRALQKTTGISNEDVAKRLMD 850
                                               ********************************************************************* PP

                                 TIGR00461 826 yGfhaptlsfpvaGtlmveptesesleeldrfidamiaikeeidavkaGeiklednilknaphslqsli 894
                                               yGfhapt+sfpv+Gtlm+eptese+l+eldrfidamiai++ei  v  G+++ edn+lknaph++    
  lcl|FitnessBrowser__Cup4G11:RR42_RS20070 851 YGFHAPTMSFPVPGTLMIEPTESEALHELDRFIDAMIAIRAEIARVEDGTFDREDNPLKNAPHTAAVIT 919
                                               ***************************************************************888777 PP

                                 TIGR00461 895 vaewadpysreeaaypapvlkyfkfwptvarlddtyGdrnlvcsc 939
                                               +  w + y+r+eaayp++ l+ +k+wp v+r d++yGdrnl+c+c
  lcl|FitnessBrowser__Cup4G11:RR42_RS20070 920 ADVWEHKYTRQEAAYPVAALRTQKYWPPVGRADNVYGDRNLFCAC 964
                                               7779999*************************************9 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (939 nodes)
Target sequences:                          1  (973 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.05u 0.02s 00:00:00.07 Elapsed: 00:00:00.06
# Mc/sec: 13.21
//
[ok]
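
The coverage of the HMM by this hit can be read off the domain table above. As a rough sketch (an illustration only, not GapMind's own code), the snippet below splits the single domain row on whitespace and computes the fraction of the 939-node model and of the 973-residue protein spanned by the alignment. The field positions are assumed from the column header shown in this report; for anything beyond a quick check, hmmsearch's tabular `--domtblout` output is easier to parse reliably.

```python
def parse_domain_row(row):
    """Extract a few columns from one row of hmmsearch's per-domain table.

    Assumed column order (from the header in the report):
    #, flag, score, bias, c-Evalue, i-Evalue, hmmfrom, hmmto, hmm-ends,
    alifrom, alito, ali-ends, envfrom, envto, env-ends, acc
    """
    f = row.split()
    return {
        "score": float(f[2]),
        "hmm_from": int(f[6]),
        "hmm_to": int(f[7]),
        "ali_from": int(f[9]),
        "ali_to": int(f[10]),
        "acc": float(f[15]),
    }

def coverage(start, end, total_len):
    """Fraction of a sequence or model covered by the inclusive range start..end."""
    return (end - start + 1) / total_len

# Domain row copied from the report above.
row = "   1 ! 1460.8   0.4         0         0       1     939 []      29     964 ..      29     964 .. 0.98"
dom = parse_domain_row(row)

model_cov = coverage(dom["hmm_from"], dom["hmm_to"], 939)    # model has 939 nodes
protein_cov = coverage(dom["ali_from"], dom["ali_to"], 973)  # protein has 973 residues

print(f"model coverage:   {model_cov:.3f}")
print(f"protein coverage: {protein_cov:.3f}")
```

Here the alignment spans model positions 1 to 939, so the whole model is covered, which is also what the `[]` marker in the row indicates.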

This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory