GapMind for catabolism of small carbon sources

 

Alignments for a candidate for gcvP in Azoarcus sp. BH72

Align Glycine dehydrogenase (aminomethyl-transferring) (EC 1.4.4.2) (characterized)
to candidate WP_011765018.1 AZO_RS06490 glycine dehydrogenase (aminomethyl-transferring)

Query= reanno::Koxy:BWI76_RS23870
         (957 letters)



>NCBI__GCF_000061505.1:WP_011765018.1
          Length = 959

 Score = 1209 bits (3127), Expect = 0.0
 Identities = 611/963 (63%), Positives = 741/963 (76%), Gaps = 17/963 (1%)

Query: 1   MTQTLGQLENRDAFIERHIGPDALQQQEMLKTVGADSLNALIGQIVPQDIQLATPPQVGE 60
           ++  L QLE RDAFI RH+GP+  +   M  T+G   ++ LI Q VP  I+L     +  
Sbjct: 6   LSAPLAQLEQRDAFIHRHLGPNPDEIARMCATIGVPDIDTLIAQTVPASIRLPQALPLAG 65

Query: 61  ATTEFAALAELKAIAGRNKRFKSYIGMGYTAVQLPPVIQRNMLENPGWYTAYTPYQPEVS 120
              E  AL  L+ +A RN   KS IGMGY     P VI RN++ENPGWYTAYTPYQ E++
Sbjct: 66  PRPEHEALELLRGLAERNAVKKSMIGMGYYGTHTPAVILRNVMENPGWYTAYTPYQAEIA 125

Query: 121 QGRLESLLNFQQVTLDLTGLDIASASLLDEATAAAEAMAMAKRVSKLKNANRFFVAADVH 180
           QGRLE+LLNFQQ+ +DLTGL++A+ASLLDEATAAAEAMAMA+RVSK K+ N FFV A   
Sbjct: 126 QGRLEALLNFQQMVIDLTGLELANASLLDEATAAAEAMAMARRVSKSKS-NAFFVDAACF 184

Query: 181 PQTLDVVRTRAETFGFDVIVDDAEKALDHQDVFGVLLQQVGTTGEVHDYSKLIADLKARK 240
           PQTLDVVRTRAE FGF++++ DA +A +H DVFG LLQ     G V D   +IA LK R 
Sbjct: 185 PQTLDVVRTRAEYFGFNLVLGDAAEAAEH-DVFGALLQYPNVHGTVGDLGAVIAALKGRG 243

Query: 241 VIVSVAADFMALVLLTAPGKQGADIVFGSAQRFGVPMGYGGPHAAFFAAKDEFKRSMPGR 300
            I ++A D MALVLL +PG  GADI  GSAQRFGVPMG+GGPHAAFFA ++ + RSMPGR
Sbjct: 244 AITALATDLMALVLLKSPGAMGADIALGSAQRFGVPMGFGGPHAAFFATREAYVRSMPGR 303

Query: 301 IIGVSKDAAGNTALRMAMQTREQHIRREKANSNICTSQVLLANIASLYAVFHGPAGLKRI 360
           IIGVS+DA G TALRM +QTREQHIRREKANSNICTSQVLLAN+A  YAV+HGP GL+ I
Sbjct: 304 IIGVSRDARGKTALRMTLQTREQHIRREKANSNICTSQVLLANMAGFYAVYHGPQGLRTI 363

Query: 361 ASRIHRLTDILADGLQKKGLKLRHAHYFDTLCVEVADKAA-VLARAEALRINLRSDIHHA 419
           A+RIHRL  +L  GL+  G  +R + YFDTL V+  ++AA +L+ A+    NLR   H  
Sbjct: 364 AARIHRLAALLDAGLRAAGFAVRSSAYFDTLEVDADERAAAILSAADQAGFNLRDAGHGR 423

Query: 420 VGITLDEATTREDVLNLFRAILGDDHGLDIDTLDKDVALDSRSIPASMLRDDAILTHPVF 479
           +G+++DE TTR D+  +  A  G +  +D++TL    AL     PA +LRDDAIL HPVF
Sbjct: 424 IGLSVDETTTRADIAAVL-ACFGAN--VDLETLTPASAL-----PAGLLRDDAILAHPVF 475

Query: 480 NRYHSETEMMRYMHALERKDLALNQAMIPLGSCTMKLNAAAEMIPITWPEFAELHPFCPV 539
           N +H+E EM+RY+  L+ +DLAL+ +MI LGSCTMKLNA +EMIP+TWP FA LHPF P 
Sbjct: 476 NTHHTEHEMLRYLKKLQNRDLALDHSMISLGSCTMKLNATSEMIPVTWPAFANLHPFAPP 535

Query: 540 DQAEGYHQMIAQLSDWLVKLTGYDAVCMQPNSGAQGEYAGLLAIRHYHESRNEGHRDICL 599
            Q +GY  MI  L+D+L  +TG+DA+CMQPNSGAQGEYAGL+AIR YH SR E HRD+CL
Sbjct: 536 AQTQGYMAMIDGLADYLKAVTGFDAICMQPNSGAQGEYAGLVAIRRYHASRGEAHRDVCL 595

Query: 600 IPSSAHGTNPASAQMAGMQVVVVACDKNGNIDLADLRAKAEQHAANLSCIMVTYPSTHGV 659
           IP SAHGTNPA+AQM GM+VVVV CD +GN+DL +L++KA Q+A  L+ +M+TYPSTHGV
Sbjct: 596 IPRSAHGTNPATAQMCGMEVVVVDCDGSGNVDLENLQSKAAQYADRLAAMMITYPSTHGV 655

Query: 660 YEETIREVCEVVHQFGGQVYLDGANMNAQVGITSPGFIGADVSHLNLHKTFCIPHGGGGP 719
           +EE IRE+C  VH  GGQVY+DGAN+NAQVG+TSP  IGADVSH+NLHKTFCIPHGGGGP
Sbjct: 656 FEENIREICAAVHAHGGQVYMDGANLNAQVGLTSPAIIGADVSHMNLHKTFCIPHGGGGP 715

Query: 720 GMGPIGVKSHLAQFVPGHSVV---QIEGMLTRQGAVSAAPFGSASILPISWMYIRMMGAE 776
           GMGPIG+K+HLA F+  H+V      E +   QGAVSAAPFGSASILPISWMYI MMG E
Sbjct: 716 GMGPIGLKAHLAPFMADHAVAATGDAERVNKGQGAVSAAPFGSASILPISWMYITMMGGE 775

Query: 777 GLKQASQMAILNANYIATRLKDAFPVLYTGRDGRVAHECILDIRPLKEETGISELDIAKR 836
           GLK+A+++AILNANY+A+RL   +PVLYTG  GRVAHECILDIRP+K  TGISE+DIAKR
Sbjct: 776 GLKRATEVAILNANYLASRLAPHYPVLYTGSRGRVAHECILDIRPIKAATGISEVDIAKR 835

Query: 837 LIDFGFHAPTMSFPVAGTLMVEPTESESKVELDRFIDAMLAIRGEIDRVKAGEWPLEDNP 896
           L+D+GFHAPTMSFPVAGT+MVEPTESE   ELDRFI+AM+AIRGEI R++ GEWP +DNP
Sbjct: 836 LMDYGFHAPTMSFPVAGTIMVEPTESEDLAELDRFIEAMVAIRGEILRIERGEWPADDNP 895

Query: 897 LVNAPHTQGELVSA-WNHPYARELAVFPAG--LNNKYWPTVKRLDDVYGDRNLFCSCVPM 953
           L NAPHTQGE+ +A W  PY+RE AVFP     +NK+WP+V R+DDVYGDRNLFC+CVPM
Sbjct: 896 LRNAPHTQGEIAAAQWERPYSREQAVFPLPWVADNKFWPSVNRIDDVYGDRNLFCACVPM 955

Query: 954 SEY 956
             Y
Sbjct: 956 EAY 958
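
The summary figures for this alignment can be rechecked from the counts that BLAST reports. The Python sketch below is illustrative only and is not part of GapMind or BLAST; the function names `parse_blast_summary` and `coverage` are hypothetical, and the coverage calculation assumes the alignment coordinates shown above (query 1 to 956 of 957 letters, subject 6 to 958 of 959 letters).

```python
import re

def parse_blast_summary(line):
    """Return {name: (count, total)} for the Identities, Positives and Gaps fields."""
    counts = {}
    for name, num, den in re.findall(r"(\w+) = (\d+)/(\d+)", line):
        counts[name] = (int(num), int(den))
    return counts

def coverage(start, end, length):
    """Fraction of a sequence spanned by an alignment from start to end (1-based, inclusive)."""
    return (end - start + 1) / length

# Summary line copied from the alignment above.
summary = " Identities = 611/963 (63%), Positives = 741/963 (76%), Gaps = 17/963 (1%)"
counts = parse_blast_summary(summary)
for name, (num, den) in counts.items():
    print(f"{name}: {num}/{den} = {100 * num / den:.1f}%")

# Coordinates and lengths copied from the alignment above.
query_cov = coverage(1, 956, 957)
subject_cov = coverage(6, 958, 959)
print(f"query coverage {query_cov:.3f}, subject coverage {subject_cov:.3f}")
```

The percentages in the report are whole numbers and appear to be truncated rather than rounded, so a value computed to one decimal place may look slightly higher than the printed one.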


Lambda     K      H
   0.320    0.135    0.399 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 2172
Number of extensions: 83
Number of successful extensions: 9
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 957
Length of database: 959
Length adjustment: 44
Effective length of query: 913
Effective length of database: 915
Effective search space:   835395
Effective search space used:   835395
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 57 (26.6 bits)
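
The statistics above can be cross-checked with two short calculations. The sketch below is illustrative and is not GapMind or BLAST code. It assumes the standard Karlin-Altschul conversion from raw score to bit score, S' = (lambda * S - ln K) / ln 2, applied with the gapped lambda and K printed above, and it assumes that each effective length is the sequence length minus the reported length adjustment.

```python
import math

# Values copied from the report above.
raw_score = 3127            # raw score shown in parentheses after "1209 bits"
gapped_lambda = 0.267       # gapped Lambda
gapped_k = 0.0410           # gapped K
query_len = 957
db_len = 959                # the database holds a single 959-letter sequence
length_adjustment = 44

def bit_score(raw, lam, k):
    """Karlin-Altschul bit score: (lambda * S - ln K) / ln 2."""
    return (lam * raw - math.log(k)) / math.log(2)

bits = bit_score(raw_score, gapped_lambda, gapped_k)

# Effective lengths and the resulting search space.
eff_query = query_len - length_adjustment
eff_db = db_len - length_adjustment
search_space = eff_query * eff_db

print(f"bit score ~ {bits:.1f}")
print(f"effective lengths {eff_query} x {eff_db} = {search_space}")
```

Because lambda and K are printed to only a few significant digits, the recomputed bit score should be expected to agree with the reported one only approximately.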

Align candidate WP_011765018.1 AZO_RS06490 (glycine dehydrogenase (aminomethyl-transferring))
to HMM TIGR00461 (gcvP: glycine dehydrogenase (EC 1.4.4.2))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR00461.hmm
# target sequence database:        /tmp/gapView.29286.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR00461  [M=939]
Accession:   TIGR00461
Description: gcvP: glycine dehydrogenase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                 Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                 -----------
          0 1488.9   0.0          0 1488.7   0.0    1.0  1  lcl|NCBI__GCF_000061505.1:WP_011765018.1  AZO_RS06490 glycine dehydrogenas


Domain annotation for each sequence (and alignments):
>> lcl|NCBI__GCF_000061505.1:WP_011765018.1  AZO_RS06490 glycine dehydrogenase (aminomethyl-transferring)
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 ! 1488.7   0.0         0         0       1     939 []      22     952 ..      22     952 .. 0.99

  Alignments for each domain:
  == domain 1  score: 1488.7 bits;  conditional E-value: 0
                                 TIGR00461   1 rhlGpdeaeqkkmlktlGfddlnalieqlvpkdirlarplkleapakeyealaelkkiasknkkvksyi 69 
                                               rhlGp++ e  +m  t+G+ d++ li q vp +irl+++l l  p  e+eal+ l+ +a++n + ks i
  lcl|NCBI__GCF_000061505.1:WP_011765018.1  22 RHLGPNPDEIARMCATIGVPDIDTLIAQTVPASIRLPQALPLAGPRPEHEALELLRGLAERNAVKKSMI 90 
                                               9******************************************************************** PP

                                 TIGR00461  70 GkGyyatilppviqrnllenpgwytaytpyqpeisqGrleallnfqtvvldltGlevanaslldegtaa 138
                                               G+Gyy+t +p vi+rn++enpgwytaytpyq+ei+qGrleallnfq++v+dltGle+anasllde+taa
  lcl|NCBI__GCF_000061505.1:WP_011765018.1  91 GMGYYGTHTPAVILRNVMENPGWYTAYTPYQAEIAQGRLEALLNFQQMVIDLTGLELANASLLDEATAA 159
                                               ********************************************************************* PP

                                 TIGR00461 139 aeamalsfrvskkkankfvvakdvhpqtlevvktraeplgievivddaskvkkavdvlGvllqypatdG 207
                                               aeama++ rvsk+k+n+f+v+  + pqtl+vv+trae +g++++ +da +  +  dv+G+llqyp  +G
  lcl|NCBI__GCF_000061505.1:WP_011765018.1 160 AEAMAMARRVSKSKSNAFFVDAACFPQTLDVVRTRAEYFGFNLVLGDAAEAAE-HDVFGALLQYPNVHG 227
                                               ************************************************99876.59************* PP

                                 TIGR00461 208 eildykalidelksrkalvsvaadllaltlltppgklGadivlGsaqrfGvplGyGGphaaffavkdey 276
                                               ++ d+ a+i +lk r a+ ++a+dl+al+ll++pg +Gadi+lGsaqrfGvp+G+GGphaaffa+++ y
  lcl|NCBI__GCF_000061505.1:WP_011765018.1 228 TVGDLGAVIAALKGRGAITALATDLMALVLLKSPGAMGADIALGSAQRFGVPMGFGGPHAAFFATREAY 296
                                               ********************************************************************* PP

                                 TIGR00461 277 krklpGrivGvskdalGntalrlalqtreqhirrdkatsnictaqvllanvaslyavyhGpkGlkniar 345
                                                r++pGri+Gvs+da G+talr++lqtreqhirr+ka+snict+qvllan+a  yavyhGp+Gl+ ia 
  lcl|NCBI__GCF_000061505.1:WP_011765018.1 297 VRSMPGRIIGVSRDARGKTALRMTLQTREQHIRREKANSNICTSQVLLANMAGFYAVYHGPQGLRTIAA 365
                                               ********************************************************************* PP

                                 TIGR00461 346 rifrltsilaaglkrknyelrnktyfdtltvevgekaasevlkaaeeaeinlravvltevgialdettt 414
                                               ri+rl+++l agl+  ++ +r + yfdtl v+  e+aa ++l aa++a+ nlr    + +g+++dettt
  lcl|NCBI__GCF_000061505.1:WP_011765018.1 366 RIHRLAALLDAGLRAAGFAVRSSAYFDTLEVDADERAA-AILSAADQAGFNLRDAGHGRIGLSVDETTT 433
                                               ********************************888877.9***************************** PP

                                 TIGR00461 415 kedvldllkvlagkdnlglsseelsedvansfpaellrddeilrdevfnryhsetellrylhrleskdl 483
                                               ++d+  +l  + + +   ++ e+l+   a+++pa llrdd il ++vfn++h+e e+lryl++l+++dl
  lcl|NCBI__GCF_000061505.1:WP_011765018.1 434 RADIAAVLACFGA-N---VDLETLTP--ASALPAGLLRDDAILAHPVFNTHHTEHEMLRYLKKLQNRDL 496
                                               **********988.3...67888875..678************************************** PP

                                 TIGR00461 484 alnqsmiplGsctmklnataemlpitwpefaeihpfapaeqveGykeliaqlekwlveitGfdaislqp 552
                                               al++smi lGsctmklnat em+p+twp fa++hpfap+ q++Gy  +i  l+++l  +tGfdai++qp
  lcl|NCBI__GCF_000061505.1:WP_011765018.1 497 ALDHSMISLGSCTMKLNATSEMIPVTWPAFANLHPFAPPAQTQGYMAMIDGLADYLKAVTGFDAICMQP 565
                                               ********************************************************************* PP

                                 TIGR00461 553 nsGaqGeyaGlrvirsyhesrgeehrniclipasahGtnpasaamaGlkvvpvkcdkeGnidlvdlkak 621
                                               nsGaqGeyaGl +ir+yh srge hr++clip sahGtnpa+a+m+G++vv+v+cd+ Gn+dl++l++k
  lcl|NCBI__GCF_000061505.1:WP_011765018.1 566 NSGAQGEYAGLVAIRRYHASRGEAHRDVCLIPRSAHGTNPATAQMCGMEVVVVDCDGSGNVDLENLQSK 634
                                               ********************************************************************* PP

                                 TIGR00461 622 aekagdelaavmvtypstyGvfeetirevidivhrfGGqvyldGanmnaqvGltspgdlGadvchlnlh 690
                                               a +++d+laa+m+typst+Gvfee ire++  vh  GGqvy+dGan+naqvGltsp+ +Gadv+h+nlh
  lcl|NCBI__GCF_000061505.1:WP_011765018.1 635 AAQYADRLAAMMITYPSTHGVFEENIREICAAVHAHGGQVYMDGANLNAQVGLTSPAIIGADVSHMNLH 703
                                               ********************************************************************* PP

                                 TIGR00461 691 ktfsiphGGGGpgmgpigvkshlapflpktdlvsvvelegesksigavsaapyGsasilpisymyikmm 759
                                               ktf+iphGGGGpgmgpig+k+hlapf+  + +  + + e  +k +gavsaap+Gsasilpis+myi mm
  lcl|NCBI__GCF_000061505.1:WP_011765018.1 704 KTFCIPHGGGGPGMGPIGLKAHLAPFMADHAVAATGDAERVNKGQGAVSAAPFGSASILPISWMYITMM 772
                                               ********************************************************************* PP

                                 TIGR00461 760 GaeGlkkasevailnanylakrlkdaykilfvgrdervahecildlrelkekagiealdvakrlldyGf 828
                                               G eGlk+a+evailnanyla+rl  +y++l++g+ +rvahecild+r++k+ +gi+++d+akrl+dyGf
  lcl|NCBI__GCF_000061505.1:WP_011765018.1 773 GGEGLKRATEVAILNANYLASRLAPHYPVLYTGSRGRVAHECILDIRPIKAATGISEVDIAKRLMDYGF 841
                                               ********************************************************************* PP

                                 TIGR00461 829 haptlsfpvaGtlmveptesesleeldrfidamiaikeeidavkaGeiklednilknaphslqslivae 897
                                               hapt+sfpvaGt+mveptese+l eldrfi+am+ai+ ei  +  Ge++++dn+l+naph+  +  +a+
  lcl|NCBI__GCF_000061505.1:WP_011765018.1 842 HAPTMSFPVAGTIMVEPTESEDLAELDRFIEAMVAIRGEILRIERGEWPADDNPLRNAPHTQGEIAAAQ 910
                                               ************************************************************988888999 PP

                                 TIGR00461 898 wadpysreeaaypapvlkyfkfwptvarlddtyGdrnlvcsc 939
                                               w  pysre+a++p+p++  +kfwp+v+r+dd+yGdrnl+c+c
  lcl|NCBI__GCF_000061505.1:WP_011765018.1 911 WERPYSREQAVFPLPWVADNKFWPSVNRIDDVYGDRNLFCAC 952
                                               99***************************************9 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (939 nodes)
Target sequences:                          1  (959 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.05u 0.02s 00:00:00.07 Elapsed: 00:00:00.06
# Mc/sec: 13.64
//
[ok]
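
The domain table in the hmmsearch output above shows how much of the model and of the candidate the hit covers. The Python sketch below is illustrative only; `parse_domain_row` is a hypothetical helper, and it assumes the whitespace-separated column order shown in the table header (score, bias, c-Evalue, i-Evalue, hmmfrom, hmm to, alifrom, ali to, envfrom, env to, acc).

```python
def parse_domain_row(row):
    """Extract selected fields from one row of the hmmsearch per-domain table."""
    f = row.split()
    return {
        "score": float(f[2]),
        "hmm_from": int(f[6]),
        "hmm_to": int(f[7]),
        "ali_from": int(f[9]),
        "ali_to": int(f[10]),
        "acc": float(f[15]),
    }

# Row and lengths copied from the output above.
row = "   1 ! 1488.7   0.0         0         0       1     939 []      22     952 ..      22     952 .. 0.99"
model_len = 939   # TIGR00461 [M=939]
seq_len = 959     # residues in the candidate

dom = parse_domain_row(row)
hmm_cov = (dom["hmm_to"] - dom["hmm_from"] + 1) / model_len
ali_len = dom["ali_to"] - dom["ali_from"] + 1
seq_cov = ali_len / seq_len

print(f"model coverage {hmm_cov:.2f}, sequence coverage {seq_cov:.2f}, score {dom['score']}")
```

Here the alignment spans every node of the model (1 to 939) and residues 22 to 952 of the 959-residue candidate.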

This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
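
The definition of a gap given above (a step with no high- or medium-confidence candidate) can be sketched in a few lines of Python. This is an illustration and not GapMind's code; the function `find_gaps`, the input layout, and the step names are hypothetical.

```python
def find_gaps(candidates_by_step):
    """Return the steps that have no high- or medium-confidence candidate."""
    return [
        step
        for step, confidences in candidates_by_step.items()
        if not any(c in ("high", "medium") for c in confidences)
    ]

# Hypothetical input: step names and confidence labels are made up for illustration.
example = {
    "stepA": ["high", "low"],
    "stepB": ["low"],
    "stepC": [],
    "stepD": ["medium"],
}
gaps = find_gaps(example)
print(gaps)
```

A step with only low-confidence candidates, or with no candidates at all, is reported as a gap.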

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
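
For readers unfamiliar with the term, a six-frame translation translates a DNA sequence in all three reading frames on both strands. The Python sketch below is a generic illustration, not the code used by GapMind or Curated BLAST; it assumes the standard genetic code, writes stop codons as `*`, and writes codons containing other characters as `X`.

```python
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
# Standard genetic code, built from the conventional TCAG ordering of codons.
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]

def translate(dna):
    """Translate complete codons from the start of dna; trailing bases are ignored."""
    return "".join(
        CODON_TABLE.get(dna[i:i + 3], "X") for i in range(0, len(dna) - 2, 3)
    )

def six_frame_translation(dna):
    """Return translations keyed by frame: +1, +2, +3, -1, -2, -3."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames

frames = six_frame_translation("ATGGCC")
for name, protein in frames.items():
    print(name, protein)
```

Searching such translations can reveal proteins that gene prediction missed, which is why the six-frame search is useful when a step has no candidate among the predicted proteins.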

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory