GapMind for catabolism of small carbon sources

 

Alignments for a candidate for gcvP in Thiomicrorhabdus arctica DSM 13458

Align Glycine dehydrogenase (aminomethyl-transferring) (EC 1.4.4.2) (characterized)
to candidate WP_019556676.1 F612_RS0105105 aminomethyl-transferring glycine dehydrogenase

Query= reanno::WCS417:GFF4367
         (946 letters)



>NCBI__GCF_000381085.1:WP_019556676.1
          Length = 973

 Score = 1209 bits (3129), Expect = 0.0
 Identities = 602/954 (63%), Positives = 732/954 (76%), Gaps = 13/954 (1%)

Query: 4   QLTTANEFIARHIGPRQEDEQQMLASLGFDSLEALSASVIPESIKGTSVLGLEDGLSEAE 63
           +L  +++FI RH+GP  +++  ML SL   SL  L   V+P SI+    + L +GL+E +
Sbjct: 16  ELQQSDKFITRHLGPDDDEQLAMLRSLKMASLNELLDKVVPSSIRRHDPMDLAEGLTEQQ 75

Query: 64  ALAKIKAIAGKNQLFKTYIGQGYYNCHTPSPILRNLLENPAWYTAYTPYQPEISQGRLEA 123
           +L K+KAIA KN + K+YIG GYYN  TP  I RN+LENPAWYTAYTPYQ EISQGRLEA
Sbjct: 76  SLEKLKAIASKNIVLKSYIGMGYYNTFTPPTIQRNILENPAWYTAYTPYQAEISQGRLEA 135

Query: 124 LLNFQTLISDLTGLPIANASLLDEATAAAEAMTFCKRLSKNKGSNAFFASIHSHPQTLDV 183
           +LNFQT++SDLTGL +ANASLLDEATA AEAMT C+R+SK+KG   FF +   HPQ ++V
Sbjct: 136 MLNFQTMVSDLTGLELANASLLDEATACAEAMTLCQRMSKSKGK-VFFVADDCHPQNIEV 194

Query: 184 LRTRAEPLGIDVVVGDE-RELTDVSPFFGALLQYPASNGDVFDYRELTERFHAAHGLVAV 242
           ++TRAEPLGI+VVVG+   EL D    FG LLQYP + G+V D+ EL E+ HA   LVA+
Sbjct: 195 IQTRAEPLGIEVVVGNPINELADYD-LFGVLLQYPGTYGEVSDFSELIEKIHAKKALVAM 253

Query: 243 AADLLALTLLTPPGEFGADVAIGSAQRFGVPLGFGGPHAAYFSTKDAFKRDMPGRLVGVS 302
           +ADLLALTLL  PGE GADVAIG+ QRFGVPLG+GGPHAAY +TKD FKR MPGRL+GVS
Sbjct: 254 SADLLALTLLKTPGEMGADVAIGNTQRFGVPLGYGGPHAAYMATKDDFKRSMPGRLIGVS 313

Query: 303 VDRFGKPALRLAMQTREQHIRREKATSNICTAQVLLANIASMYAVYHGPKGLTQIARRVH 362
           VD  GK A RLA+QTREQHIRREKATSNICTAQ LLA +ASMYAVYHGP GLT+IA RVH
Sbjct: 314 VDSRGKKAYRLALQTREQHIRREKATSNICTAQALLAVMASMYAVYHGPVGLTKIAYRVH 373

Query: 363 QLTAILAKGLTALGQNVEQAHFFDTLTLNTGANTAAVHDKARAQRINLRVVDAERVGVSV 422
           + T + A GLT LG  V    +FDTL +       A+   A  + INLR +DA+ VG+S 
Sbjct: 374 RYTQLFANGLTQLGFKVNNKVYFDTLNIFVPGKAVAIQQAAVDKGINLRPIDADTVGLSF 433

Query: 423 DETTTQADIETLWAIFADGKALPDFAAQVESTLPAA----LLRQSPVLSHPVFNRYHSET 478
           DE++T  DI  LW +FA  +A+   A  +    P      LLR S  L+HPVF+ + SET
Sbjct: 434 DESSTLQDITVLWEVFAGEQAVSLKAEVLMKDQPPVIGEDLLRTSEFLTHPVFHEHRSET 493

Query: 479 ELMRYLRKLADKDLALDRTMIPLGSCTMKLNAASEMIPVTWAEFGALHPFAPAEQSAGYL 538
           ++MRY+R LADKDLALDR MIPLGSCTMKLNAA+E++P++W EF  LHPF P  Q+ GY 
Sbjct: 494 QMMRYMRCLADKDLALDRAMIPLGSCTMKLNAAAELMPISWPEFANLHPFVPLNQALGYQ 553

Query: 539 ELTSDLEAMLCAATGYDAISLQPNAGSQGEYAGLLAIRAYHQSRGDERRDICLIPSSAHG 598
           ++  +LE MLC ATGYDA+SLQPN+G+QGEYAGLLAIRAYH+SRG+  RD+CLIP+SAHG
Sbjct: 554 QMVLELEKMLCEATGYDAVSLQPNSGAQGEYAGLLAIRAYHKSRGEGHRDVCLIPASAHG 613

Query: 599 TNPATANMAGMRVVVTACDARGNVDIEDLRAKAIEHRDHLAALMITYPSTHGVFEEGIRE 658
           TNPA+A M GM VVV   + +G +D+ DL AK  +H   LAA+MITYPSTHGVFEE ++ 
Sbjct: 614 TNPASAQMVGMSVVVVKTNLKGEIDLADLHAKLEKHSAKLAAIMITYPSTHGVFEENVKT 673

Query: 659 ICGIIHDNGGQVYIDGANMNAMVGLCAPGKFGGDVSHLNLHKTFCIPHGGGGPGVGPIGV 718
           +C ++H +GGQVYIDGANMNAMVG+ APG+FGGDVSHLNLHKTF IPHGGGGPGVGPIGV
Sbjct: 674 VCDLVHQHGGQVYIDGANMNAMVGVAAPGQFGGDVSHLNLHKTFAIPHGGGGPGVGPIGV 733

Query: 719 KSHLTPFLPGHAAMERKE-----GAVCAAPFGSASILPITWMYISMMGGAGLKRASQLAI 773
             HL PFLPG+A +  ++     GAV AAP+GSA +LPI+W YI+MMG  GL +A+  AI
Sbjct: 734 GKHLKPFLPGYAVVNSEDEPKVVGAVSAAPWGSAGVLPISWSYIAMMGSNGLVQATATAI 793

Query: 774 LNANYISRRLEEHYPVLYTGSNGLVAHECILDLRPLKDSSGISVDDVAKRLIDFGFHAPT 833
           LNANYI+  L  HYP+LY+   G VAHECI+DLRP+K++SGI+VDD+AKRLID+GFHAPT
Sbjct: 794 LNANYIAHSLAPHYPILYSDDQGRVAHECIIDLRPIKEASGITVDDIAKRLIDYGFHAPT 853

Query: 834 MSFPVAGTLMIEPTESESKEELDRFCNAMIAIREEIRAVENGTLDKDDNPLKNAPHTA-A 892
           MSFPVAGT MIEPTESESK ELDRFC+AMIAI+ EI  V +G LD+ DNPLKNAPHTA  
Sbjct: 854 MSFPVAGTFMIEPTESESKVELDRFCDAMIAIKAEINDVMSGVLDEHDNPLKNAPHTADM 913

Query: 893 ELVSEWTHPYTREQAVYPVPSLIEGKYWPPVGRVDNVFGDRNLVCACPSIESYA 946
            +  EW H Y+R+ A YPV SL E KYW PVGRVDNV+GDR+LVC+CP +  Y+
Sbjct: 914 VMADEWNHAYSRQLAAYPVESLREVKYWCPVGRVDNVYGDRHLVCSCPPLSDYS 967


Lambda     K      H
   0.319    0.135    0.398 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 2071
Number of extensions: 69
Number of successful extensions: 8
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 946
Length of database: 973
Length adjustment: 44
Effective length of query: 902
Effective length of database: 929
Effective search space:   837958
Effective search space used:   837958
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 57 (26.6 bits)
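
The summary figures in the BLAST report above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch, not part of GapMind: the variable names are illustrative, the numbers are copied from the report, and the bit-score conversion assumes the usual Karlin-Altschul form S' = (lambda * S - ln K) / ln 2 with the gapped parameters. Because lambda and K are printed with limited precision, the recomputed bit score only approximately matches the reported 1209 bits.

```python
import math

# Values copied from the BLAST report above.
identities, positives, gaps, alignment_length = 602, 732, 13, 954
query_length, subject_length = 946, 973
query_start, query_end = 4, 946
length_adjustment = 44
raw_score = 3129
gapped_lambda, gapped_k = 0.267, 0.0410


def percent(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


pct_identity = percent(identities, alignment_length)
pct_positive = percent(positives, alignment_length)
pct_gaps = percent(gaps, alignment_length)

# Fraction of the query (the characterized protein) covered by the alignment.
query_coverage = percent(query_end - query_start + 1, query_length)

# Effective search space = effective query length * effective database length.
effective_search_space = (query_length - length_adjustment) * (
    subject_length - length_adjustment
)

# Raw score to bit score, assuming the Karlin-Altschul conversion.
bit_score = (gapped_lambda * raw_score - math.log(gapped_k)) / math.log(2)

print(f"identity {pct_identity:.1f}%, positives {pct_positive:.1f}%, gaps {pct_gaps:.1f}%")
print(f"query coverage {query_coverage:.1f}%")
print(f"effective search space {effective_search_space}")
print(f"approximate bit score {bit_score:.1f}")
```

The effective search space reproduces the reported value exactly, which confirms that the length adjustment of 44 is subtracted from both the query and the database length.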

Align candidate WP_019556676.1 F612_RS0105105 (aminomethyl-transferring glycine dehydrogenase)
to HMM TIGR00461 (gcvP: glycine dehydrogenase (EC 1.4.4.2))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR00461.hmm
# target sequence database:        /tmp/gapView.843220.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR00461  [M=939]
Accession:   TIGR00461
Description: gcvP: glycine dehydrogenase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
          0 1496.7   0.2          0 1496.6   0.2    1.0  1  NCBI__GCF_000381085.1:WP_019556676.1  


Domain annotation for each sequence (and alignments):
>> NCBI__GCF_000381085.1:WP_019556676.1  
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 ! 1496.6   0.2         0         0       1     939 []      26     960 ..      26     960 .. 1.00

  Alignments for each domain:
  == domain 1  score: 1496.6 bits;  conditional E-value: 0
                             TIGR00461   1 rhlGpdeaeqkkmlktlGfddlnalieqlvpkdirlarplkleapakeyealaelkkiasknkkvksyiGkGy 73 
                                           rhlGpd+ eq  ml++l + +ln+l++++vp +ir + p+ l    +e++ l++lk+iaskn ++ksyiG+Gy
  NCBI__GCF_000381085.1:WP_019556676.1  26 RHLGPDDDEQLAMLRSLKMASLNELLDKVVPSSIRRHDPMDLAEGLTEQQSLEKLKAIASKNIVLKSYIGMGY 98 
                                           9************************************************************************ PP

                             TIGR00461  74 yatilppviqrnllenpgwytaytpyqpeisqGrleallnfqtvvldltGlevanaslldegtaaaeamalsf 146
                                           y+t +pp iqrn+lenp wytaytpyq+eisqGrlea+lnfqt+v+dltGle+anasllde+ta aeam l++
  NCBI__GCF_000381085.1:WP_019556676.1  99 YNTFTPPTIQRNILENPAWYTAYTPYQAEISQGRLEAMLNFQTMVSDLTGLELANASLLDEATACAEAMTLCQ 171
                                           ************************************************************************* PP

                             TIGR00461 147 rvskkkankfvvakdvhpqtlevvktraeplgievivddaskvkkavdvlGvllqypatdGeildykalidel 219
                                           r+sk+k + f+va+d+hpq +ev++traeplgiev+v++  +     d++Gvllqyp+t Ge+ d+ +li+++
  NCBI__GCF_000381085.1:WP_019556676.1 172 RMSKSKGKVFFVADDCHPQNIEVIQTRAEPLGIEVVVGNPINELADYDLFGVLLQYPGTYGEVSDFSELIEKI 244
                                           *****************************************999***************************** PP

                             TIGR00461 220 ksrkalvsvaadllaltlltppgklGadivlGsaqrfGvplGyGGphaaffavkdeykrklpGrivGvskdal 292
                                           + +kalv+++adllaltll++pg++Gad+++G +qrfGvplGyGGphaa++a+kd++kr++pGr++Gvs d+ 
  NCBI__GCF_000381085.1:WP_019556676.1 245 HAKKALVAMSADLLALTLLKTPGEMGADVAIGNTQRFGVPLGYGGPHAAYMATKDDFKRSMPGRLIGVSVDSR 317
                                           ************************************************************************* PP

                             TIGR00461 293 GntalrlalqtreqhirrdkatsnictaqvllanvaslyavyhGpkGlkniarrifrltsilaaglkrknyel 365
                                           G+ a rlalqtreqhirr+katsnictaq+lla++as+yavyhGp+Gl +ia r++r t+++a+gl + ++++
  NCBI__GCF_000381085.1:WP_019556676.1 318 GKKAYRLALQTREQHIRREKATSNICTAQALLAVMASMYAVYHGPVGLTKIAYRVHRYTQLFANGLTQLGFKV 390
                                           ************************************************************************* PP

                             TIGR00461 366 rnktyfdtltvevgekaasevlkaaeeaeinlravvltevgialdetttkedvldllkvlagkdnlglsseel 438
                                           +nk yfdtl + v  ka  ++++aa +++inlr++++++vg+++de+ t +d++ l++v+ag++  +l++e l
  NCBI__GCF_000381085.1:WP_019556676.1 391 NNKVYFDTLNIFVPGKAV-AIQQAAVDKGINLRPIDADTVGLSFDESSTLQDITVLWEVFAGEQAVSLKAEVL 462
                                           *************98887.9***************************************************** PP

                             TIGR00461 439 sedvansfpaellrddeilrdevfnryhsetellrylhrleskdlalnqsmiplGsctmklnataemlpitwp 511
                                            +d    + ++llr++e+l+++vf+++ set+++ry+  l  kdlal+++miplGsctmklna+ae++pi+wp
  NCBI__GCF_000381085.1:WP_019556676.1 463 MKDQPPVIGEDLLRTSEFLTHPVFHEHRSETQMMRYMRCLADKDLALDRAMIPLGSCTMKLNAAAELMPISWP 535
                                           ************************************************************************* PP

                             TIGR00461 512 efaeihpfapaeqveGykeliaqlekwlveitGfdaislqpnsGaqGeyaGlrvirsyhesrgeehrniclip 584
                                           efa++hpf p +q+ Gy++++ +lek l+e tG+da+slqpnsGaqGeyaGl +ir yh+srge+hr++clip
  NCBI__GCF_000381085.1:WP_019556676.1 536 EFANLHPFVPLNQALGYQQMVLELEKMLCEATGYDAVSLQPNSGAQGEYAGLLAIRAYHKSRGEGHRDVCLIP 608
                                           ************************************************************************* PP

                             TIGR00461 585 asahGtnpasaamaGlkvvpvkcdkeGnidlvdlkakaekagdelaavmvtypstyGvfeetirevidivhrf 657
                                           asahGtnpasa+m+G+ vv+vk + +G+idl dl+ak ek++ +laa+m+typst+Gvfee +++v+d+vh+ 
  NCBI__GCF_000381085.1:WP_019556676.1 609 ASAHGTNPASAQMVGMSVVVVKTNLKGEIDLADLHAKLEKHSAKLAAIMITYPSTHGVFEENVKTVCDLVHQH 681
                                           ************************************************************************* PP

                             TIGR00461 658 GGqvyldGanmnaqvGltspgdlGadvchlnlhktfsiphGGGGpgmgpigvkshlapflpktdlvsvveleg 730
                                           GGqvy+dGanmna vG+++pg++G dv+hlnlhktf+iphGGGGpg+gpigv  hl+pflp+     vv+ e 
  NCBI__GCF_000381085.1:WP_019556676.1 682 GGQVYIDGANMNAMVGVAAPGQFGGDVSHLNLHKTFAIPHGGGGPGVGPIGVGKHLKPFLPG---YAVVNSED 751
                                           **************************************************************...889999** PP

                             TIGR00461 731 esksigavsaapyGsasilpisymyikmmGaeGlkkasevailnanylakrlkdaykilfvgrdervahecil 803
                                           e k +gavsaap+Gsa +lpis+ yi+mmG++Gl +a+  ailnany+a+ l  +y+il+   ++rvaheci+
  NCBI__GCF_000381085.1:WP_019556676.1 752 EPKVVGAVSAAPWGSAGVLPISWSYIAMMGSNGLVQATATAILNANYIAHSLAPHYPILYSDDQGRVAHECII 824
                                           ************************************************************************* PP

                             TIGR00461 804 dlrelkekagiealdvakrlldyGfhaptlsfpvaGtlmveptesesleeldrfidamiaikeeidavkaGei 876
                                           dlr++ke +gi + d+akrl+dyGfhapt+sfpvaGt+m+epteses++eldrf+damiaik+ei+ v++G++
  NCBI__GCF_000381085.1:WP_019556676.1 825 DLRPIKEASGITVDDIAKRLIDYGFHAPTMSFPVAGTFMIEPTESESKVELDRFCDAMIAIKAEINDVMSGVL 897
                                           ************************************************************************* PP

                             TIGR00461 877 klednilknaphslqslivaewadpysreeaaypapvlkyfkfwptvarlddtyGdrnlvcsc 939
                                           + +dn+lknaph+++ +++ ew+++ysr+ aayp+  l++ k+w  v+r+d++yGdr+lvcsc
  NCBI__GCF_000381085.1:WP_019556676.1 898 DEHDNPLKNAPHTADMVMADEWNHAYSRQLAAYPVESLREVKYWCPVGRVDNVYGDRHLVCSC 960
                                           *************************************************************** PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (939 nodes)
Target sequences:                          1  (973 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.02u 0.01s 00:00:00.03 Elapsed: 00:00:00.02
# Mc/sec: 43.18
//
[ok]
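
How much of the HMM and of the candidate protein the domain alignment spans can be read from the domain table above. The sketch below is illustrative only: the helper names are hypothetical, the domain line is copied from the hmmsearch output, and the field positions assume the whitespace-separated layout shown in that table.

```python
# Domain line and lengths copied from the hmmsearch output above.
DOMAIN_LINE = (
    "   1 ! 1496.6   0.2         0         0       1     939 []"
    "      26     960 ..      26     960 .. 1.00"
)
MODEL_LENGTH = 939      # from "Query: TIGR00461 [M=939]"
SEQUENCE_LENGTH = 973   # from "(973 residues searched)"


def parse_domain_line(line):
    """Pull the score and the alignment coordinates out of one domain row."""
    fields = line.split()
    return {
        "score": float(fields[2]),
        "hmm_from": int(fields[6]),
        "hmm_to": int(fields[7]),
        "ali_from": int(fields[9]),
        "ali_to": int(fields[10]),
    }


def span_fraction(start, end, total):
    """Fraction of a length covered by the inclusive range start..end."""
    return (end - start + 1) / total


domain = parse_domain_line(DOMAIN_LINE)
model_coverage = span_fraction(domain["hmm_from"], domain["hmm_to"], MODEL_LENGTH)
sequence_coverage = span_fraction(domain["ali_from"], domain["ali_to"], SEQUENCE_LENGTH)

print(f"model coverage: {model_coverage:.1%}")
print(f"sequence coverage: {sequence_coverage:.1%}")
```

The alignment covers model positions 1 to 939, that is, the full length of TIGR00461, which agrees with the "[]" flag in the table.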

This GapMind analysis is from Apr 09 2024. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFAMs). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
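
The tiered logic described above (high, otherwise medium, otherwise low for other hits with at least 50% coverage) can be sketched as follows. This is an assumption-laden illustration, not GapMind's code: the `Hit` fields, the `classify` function, and both example predicates are hypothetical, and the numeric thresholds inside the predicates are placeholders chosen for the example rather than GapMind's actual criteria. Only the ordering of the tiers and the 50% coverage floor come from the text.

```python
from dataclasses import dataclass
from typing import Callable


@dataclass
class Hit:
    identity: float    # percent identity to the curated protein
    coverage: float    # fraction of the curated protein covered (0..1)
    bits: float        # bit score of this hit
    other_bits: float  # bit score of the best "other" hit (different function)


def classify(hit: Hit,
             is_high: Callable[[Hit], bool],
             is_medium: Callable[[Hit], bool]) -> str:
    """Apply the tiers in order: high, then medium, then low, else none."""
    if is_high(hit):
        return "high"
    if is_medium(hit):
        return "medium"
    if hit.coverage >= 0.5:  # the 50% floor is stated in the text
        return "low"
    return "none"


# Hypothetical predicates: the thresholds below are placeholders only.
def example_is_high(hit: Hit) -> bool:
    return hit.identity >= 40 and hit.coverage >= 0.8 and hit.bits >= hit.other_bits + 10


def example_is_medium(hit: Hit) -> bool:
    return hit.identity >= 30 and hit.coverage >= 0.8 and hit.bits >= hit.other_bits


# Identity and bits follow the BLAST report; other_bits=0.0 is hypothetical,
# because this page does not report an "other" hit.
candidate = Hit(identity=63.0, coverage=943 / 946, bits=1209.0, other_bits=0.0)
print(classify(candidate, example_is_high, example_is_medium))
```

Passing the predicates in as arguments keeps the tier ordering separate from the thresholds, so the real criteria can be substituted without changing `classify`.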

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
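
For readers unfamiliar with the term, a six-frame translation reads a DNA sequence in three offsets on each strand. The sketch below shows the idea under stated assumptions: it uses the standard genetic code, the function names are illustrative, and it is not how GapMind or Curated BLAST implements the search.

```python
from itertools import product

# Standard genetic code, with codons ordered by base in the order T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons; unknown codons (e.g. with N) become 'X'."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )


def six_frame_translation(dna):
    """Return a dict mapping frame labels (+1..+3, -1..-3) to protein strings."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


for frame, protein in six_frame_translation("ATGGCCTAA").items():
    print(frame, protein)
```

Searching all six frames can recover a protein that the gene caller missed, which is why it complements a search restricted to predicted proteins.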

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory