GapMind for catabolism of small carbon sources

 

Alignments for a candidate for gcvP in Echinicola vietnamensis KMM 6221, DSM 17526

Align Glycine dehydrogenase (decarboxylating), mitochondrial; Glycine cleavage system P protein; Glycine decarboxylase; Glycine dehydrogenase (aminomethyl-transferring); EC 1.4.4.2 (characterized)
to candidate Echvi_0744 Echvi_0744 glycine dehydrogenase (decarboxylating)

Query= SwissProt::P26969
         (1057 letters)



>FitnessBrowser__Cola:Echvi_0744
          Length = 966

 Score = 1129 bits (2919), Expect = 0.0
 Identities = 561/970 (57%), Positives = 710/970 (73%), Gaps = 18/970 (1%)

Query: 92   LKPSDTFPRRHNSATPDEQTKMAESVGFDTLDSLVDATVPKSIRLKEMKFNKFDGGLTEG 151
            L PS  F  RHN  + ++ ++M   +G  ++D L+D T+PK+I+L +          +E 
Sbjct: 5    LTPSVKFEDRHNGPSANDVSEMLSKIGASSIDELIDQTIPKAIQLDQPL--NLPEAKSEA 62

Query: 152  QMIEHMKDLASKNKVFKSFIGMGYYNTHVPPVILRNIMENPAWYTQYTPYQAEISQGRLE 211
              ++  + +A+KNK++KSFIG+GYY+T  P VILRN++ENP WYT YTPYQAEI+QGRLE
Sbjct: 63   AFLKDFRKMAAKNKIYKSFIGLGYYDTITPGVILRNVLENPGWYTAYTPYQAEIAQGRLE 122

Query: 212  SLLNFQTMITDLTGLPMSNASLLDEGTAAAEAMSMCNNIQKGKKKT---FIIASNCHPQT 268
            +L+NFQTM+ DLTG+ ++NASLLDEGTAAAEAM+M    +   KK    F +      QT
Sbjct: 123  ALVNFQTMVMDLTGMELANASLLDEGTAAAEAMNMLFATRPRDKKKATKFFVDEKVFIQT 182

Query: 269  IDICQTRADGFELKVVVKDLKDIDYKSGDVCGVLVQYPGTEGEVLDYGEFIKKAHANEVK 328
             +I +TRA    + +V   L +++ +  ++ GVL+QYP  EGE +DY   ++KA  + V 
Sbjct: 183  KEILKTRALPIGVTLVEGSLNELNLEDPELYGVLLQYPNAEGEAIDYKALVEKAKQHNVT 242

Query: 329  VVMASDLLALTVLKPPGEFGADIVVGSAQRFGVPMGYGGPHAAFLATSQEYKRMMPGRII 388
               ++DLLALT+L PPGE GAD+VVG+ QRFGVPMG+GGPHAA+ AT   YKR +PGRII
Sbjct: 243  TAFSADLLALTLLTPPGEMGADVVVGTTQRFGVPMGFGGPHAAYFATKDAYKRQVPGRII 302

Query: 389  GVSVDSSGKQALRMAMQTREQHIRRDKATSNICTAQALLANMAAMYAVYHGPEGLKAIAQ 448
            G+SVD  G +A RMA+QTREQHI+R++ATSNICTAQ LLA MA MYAVYHGP+GLK IA 
Sbjct: 303  GISVDKDGNKAYRMALQTREQHIKRERATSNICTAQVLLAVMAGMYAVYHGPKGLKDIAL 362

Query: 449  RVHGLAGVFALGLKKLGLEVQDLGFFDTVKVKTSNAKA--IADAAIKSEINLRVVDGNTI 506
            ++HGLA + A GL KLG E ++  +FDT+K+K  + K   I   A+  E+N R   G  +
Sbjct: 363  KIHGLAKLTAQGLAKLGFEQENEHYFDTLKIKVDDVKQSKIKAFALSHEMNFRYEPGY-V 421

Query: 507  TAAFDETTTLEDVDKLFKVFAGGKPVSFTA---ASLAPEFQNAIPSGLVRESPYLTHPIF 563
              AFDE  T+EDV ++ +VFA     S      AS+       +  GL R S Y+ H IF
Sbjct: 422  YLAFDEAKTMEDVQEIIEVFARTTHSSADVVDLASMVDHLSFEVSDGLRRTSDYMDHMIF 481

Query: 564  NTYQTEHELLRYIHRLQSKDLSLCHSMIPLGSCTMKLNATTEMMPVTWPSFTDLHPFAPT 623
            N + +EHE+LRYI RL+++DLSL HSMI LGSCTMKLNAT EM+PVTWP F  LHPF P 
Sbjct: 482  NAFHSEHEMLRYIKRLENRDLSLVHSMISLGSCTMKLNATAEMIPVTWPEFGQLHPFVPQ 541

Query: 624  EQAQGYQEMFNNLGDLLCTITGFDSFSLQPNAGAAGEYAGLMVIRAYHLSRGDHHRNVCI 683
            +QA GY  +F +L + L  ITGF   SLQPN+GA GE+AGLMVIRAYH SRG+ HRN+ +
Sbjct: 542  DQAAGYYALFQDLRNWLSEITGFAETSLQPNSGAQGEFAGLMVIRAYHESRGESHRNIAL 601

Query: 684  IPASAHGTNPASAAMVGMKIVTIGTDAKGNINIEELKKAAEKHKDNLSAFMVTYPSTHGV 743
            IP+SAHGTNPASA M GMK+V +  D KGNI++ +LK+ AEKHK+NLS+F+VTYPSTHGV
Sbjct: 602  IPSSAHGTNPASAVMAGMKVVIVKCDDKGNIDLADLKEKAEKHKENLSSFLVTYPSTHGV 661

Query: 744  YEEGIDDICKIIHDNGGQVYMDGANMNAQVGLTSPGWIGADVCHLNLHKTFCIPHGGGGP 803
            +EE I ++C+I+H+NGGQVYMDGANMNAQVGLTSPG IGADVCHLNLHKTFCIPHGGGGP
Sbjct: 662  FEEAIREMCQIVHENGGQVYMDGANMNAQVGLTSPGVIGADVCHLNLHKTFCIPHGGGGP 721

Query: 804  GMGPIGVKKHLAPFLPSHPVVPTGGIPAPENPQPLGSISAAPWGSALILPISYTYIAMMG 863
            GMGPI V KHL  FLPS P+V TGG       QP+ +ISAAP+GSA ILPISY YIAMMG
Sbjct: 722  GMGPICVAKHLEEFLPSSPLVKTGG------QQPISAISAAPFGSASILPISYAYIAMMG 775

Query: 864  SQGLTDASKIAILNANYMAKRLESYYPVLFRGVNGTVAHEFIIDLRGFKNTAGIEPEDVA 923
             +GL  A++ AILNANY+  RL  ++P L+ G  G  AHE I+D R FK   G+E ED+A
Sbjct: 776  REGLKHATQTAILNANYIKARLGEFFPTLYTGAQGRAAHEMIVDFREFK-AVGVEVEDIA 834

Query: 924  KRLMDYGFHGPTMSWPVAGTLMIEPTESESKAELDRFCDALISIRKEIAEVEKGNADVHN 983
            KRL+DYGFH PT+S+PVAGT+MIEPTESESKAELDRFCDALI+IR EI E+E+G AD  N
Sbjct: 835  KRLIDYGFHSPTVSFPVAGTMMIEPTESESKAELDRFCDALIAIRGEIREIEEGKADAEN 894

Query: 984  NVLKGAPHPPSLLMADAWTKPYSREYAAFPAAWLRGAKFWPTTGRVDNVYGDRNLVCTLL 1043
            NVLK APH   ++M+DAW  PYSRE A +P  +++ +KFWPT  R+D+ YGDRNLVC+ +
Sbjct: 895  NVLKNAPHTAGMVMSDAWDMPYSREKAVYPLEYVKNSKFWPTVRRIDSAYGDRNLVCSCI 954

Query: 1044 PASQAVEEQA 1053
            P     EE A
Sbjct: 955  PTEDYAEEAA 964
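
The summary line for this hit can be checked by hand. The sketch below is a minimal illustration, not part of GapMind, and its function names are hypothetical. It recomputes the identity, positive, and gap percentages from the counts reported above, and derives how much of the query and of the candidate the alignment spans, using only the coordinates shown in this report (query 92 to 1053 of 1057 letters, candidate 5 to 964 of 966).

```python
def percent(count: int, total: int) -> float:
    """Return count as a percentage of total."""
    return 100.0 * count / total


def span(start: int, end: int) -> int:
    """Number of residues between 1-based inclusive coordinates."""
    return end - start + 1


ALIGN_LEN = 970                 # alignment columns, from the Identities line
QUERY_LEN, SUBJECT_LEN = 1057, 966

identity = percent(561, ALIGN_LEN)
positives = percent(710, ALIGN_LEN)
gaps = percent(18, ALIGN_LEN)

query_span = span(92, 1053)     # residues of P26969 in the alignment
subject_span = span(5, 964)     # residues of Echvi_0744 in the alignment

query_cov = query_span / QUERY_LEN
subject_cov = subject_span / SUBJECT_LEN

# Each alignment column that is not a residue of a sequence is a gap in it,
# so the two shortfalls should add up to the reported gap count.
total_gaps = (ALIGN_LEN - query_span) + (ALIGN_LEN - subject_span)

print(f"identity {identity:.1f}%, positives {positives:.1f}%, gaps {gaps:.1f}%")
print(f"query coverage {query_cov:.3f}, candidate coverage {subject_cov:.3f}")
print(f"gap columns recomputed: {total_gaps}")
```

The recomputed identity is about 57.8%, so the report's 57% looks truncated rather than rounded. The recomputed gap total matches the 18 gaps reported.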


Lambda     K      H
   0.317    0.133    0.393 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 2375
Number of extensions: 88
Number of successful extensions: 8
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1057
Length of database: 966
Length adjustment: 45
Effective length of query: 1012
Effective length of database: 921
Effective search space:   932052
Effective search space used:   932052
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 57 (26.6 bits)
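
The statistics above are enough to reproduce the reported bit score and to see why the Expect value prints as 0.0. The sketch below assumes the standard Karlin-Altschul relations: bit score S' = (lambda * S - ln K) / ln 2, and E = (effective search space) * 2^(-S'). It uses the gapped Lambda and K from this report. The helper names are hypothetical, and because Lambda and K are printed with limited precision the result is approximate.

```python
import math


def bit_score(raw_score: int, lam: float, k: float) -> float:
    """Convert a raw alignment score to bits using Karlin-Altschul parameters."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def log10_evalue(bits: float, search_space: int) -> float:
    """log10 of E = search_space * 2**(-bits), computed in log space to avoid underflow."""
    return math.log10(search_space) - bits * math.log10(2)


GAPPED_LAMBDA = 0.267
GAPPED_K = 0.0410
RAW_SCORE = 2919            # the value in parentheses on the Score line
LENGTH_ADJUSTMENT = 45

# Effective lengths are the sequence lengths minus the length adjustment.
effective_query = 1057 - LENGTH_ADJUSTMENT
effective_db = 966 - LENGTH_ADJUSTMENT
search_space = effective_query * effective_db

bits = bit_score(RAW_SCORE, GAPPED_LAMBDA, GAPPED_K)
log_e = log10_evalue(bits, search_space)

print(f"effective search space: {search_space}")
print(f"bit score: {bits:.1f}")
print(f"log10(E): {log_e:.1f}")
```

Under these assumptions the search space comes out to the reported 932052 and the bit score to roughly 1129. The resulting E-value is far smaller than the smallest positive double-precision number, which is consistent with the report showing Expect = 0.0.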

Align candidate Echvi_0744 Echvi_0744 (glycine dehydrogenase (decarboxylating))
to HMM TIGR00461 (gcvP: glycine dehydrogenase (EC 1.4.4.2))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR00461.hmm
# target sequence database:        /tmp/gapView.15801.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR00461  [M=939]
Accession:   TIGR00461
Description: gcvP: glycine dehydrogenase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                            Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                            -----------
          0 1437.6   0.1          0 1437.5   0.1    1.0  1  lcl|FitnessBrowser__Cola:Echvi_0744  Echvi_0744 glycine dehydrogenase


Domain annotation for each sequence (and alignments):
>> lcl|FitnessBrowser__Cola:Echvi_0744  Echvi_0744 glycine dehydrogenase (decarboxylating)
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 ! 1437.5   0.1         0         0       1     939 []      14     953 ..      14     953 .. 0.98

  Alignments for each domain:
  == domain 1  score: 1437.5 bits;  conditional E-value: 0
                            TIGR00461   1 rhlGpdeaeqkkmlktlGfddlnalieqlvpkdirlarplkleapakeyealaelkkiasknkkvksyiGkGyy 74 
                                          rh Gp++ ++ +ml  +G++++++li+q +pk+i+l +pl+l+ +++e + l++++k+a+knk +ks+iG Gyy
  lcl|FitnessBrowser__Cola:Echvi_0744  14 RHNGPSANDVSEMLSKIGASSIDELIDQTIPKAIQLDQPLNLPEAKSEAAFLKDFRKMAAKNKIYKSFIGLGYY 87 
                                          8************************************************************************* PP

                            TIGR00461  75 atilppviqrnllenpgwytaytpyqpeisqGrleallnfqtvvldltGlevanaslldegtaaaeamalsfrv 148
                                          +ti+p vi+rn+lenpgwytaytpyq+ei+qGrleal+nfqt+v+dltG+e+anaslldegtaaaeam + f  
  lcl|FitnessBrowser__Cola:Echvi_0744  88 DTITPGVILRNVLENPGWYTAYTPYQAEIAQGRLEALVNFQTMVMDLTGMELANASLLDEGTAAAEAMNMLFAT 161
                                          ********************************************************************998876 PP

                            TIGR00461 149 ...skkkankfvvakdvhpqtlevvktraeplgievivddaskvk.kavdvlGvllqypatdGeildykalide 218
                                              kkka kf+v+++v  qt+e++ktra p+g+ ++++  ++++ +  +++Gvllqyp ++Ge  dykal+++
  lcl|FitnessBrowser__Cola:Echvi_0744 162 rprDKKKATKFFVDEKVFIQTKEILKTRALPIGVTLVEGSLNELNlEDPELYGVLLQYPNAEGEAIDYKALVEK 235
                                          3336889***********************************9972567999********************** PP

                            TIGR00461 219 lksrkalvsvaadllaltlltppgklGadivlGsaqrfGvplGyGGphaaffavkdeykrklpGrivGvskdal 292
                                          +k++++  + +adllaltlltppg++Gad+v+G++qrfGvp+G+GGphaa+fa+kd ykr++pGri+G+s d+ 
  lcl|FitnessBrowser__Cola:Echvi_0744 236 AKQHNVTTAFSADLLALTLLTPPGEMGADVVVGTTQRFGVPMGFGGPHAAYFATKDAYKRQVPGRIIGISVDKD 309
                                          ************************************************************************** PP

                            TIGR00461 293 GntalrlalqtreqhirrdkatsnictaqvllanvaslyavyhGpkGlkniarrifrltsilaaglkrknyelr 366
                                          Gn a r+alqtreqhi+r++atsnictaqvlla++a +yavyhGpkGlk+ia +i+ l+++ a+gl + ++e +
  lcl|FitnessBrowser__Cola:Echvi_0744 310 GNKAYRMALQTREQHIKRERATSNICTAQVLLAVMAGMYAVYHGPKGLKDIALKIHGLAKLTAQGLAKLGFEQE 383
                                          ************************************************************************** PP

                            TIGR00461 367 nktyfdtltvevgekaasevlkaaeeaeinlravvltevgialdetttkedvldllkvlagkdnlglsseelse 440
                                          n++yfdtl+++v +   s+++  a ++e+n+r   ++ v +a+de+ t edv+++++v+a     + ++  l +
  lcl|FitnessBrowser__Cola:Echvi_0744 384 NEHYFDTLKIKVDDVKQSKIKAFALSHEMNFR-YEPGYVYLAFDEAKTMEDVQEIIEVFARTTHSSADVVDLAS 456
                                          ******************************88.5799************************9999999999999 PP

                            TIGR00461 441 dvan...sfpaellrddeilrdevfnryhsetellrylhrleskdlalnqsmiplGsctmklnataemlpitwp 511
                                           v +    +   l+r+++++ + +fn +hse e+lry++rle++dl+l +smi lGsctmklnataem+p+twp
  lcl|FitnessBrowser__Cola:Echvi_0744 457 MVDHlsfEVSDGLRRTSDYMDHMIFNAFHSEHEMLRYIKRLENRDLSLVHSMISLGSCTMKLNATAEMIPVTWP 530
                                          998883345678************************************************************** PP

                            TIGR00461 512 efaeihpfapaeqveGykeliaqlekwlveitGfdaislqpnsGaqGeyaGlrvirsyhesrgeehrniclipa 585
                                          ef+++hpf p +q+ Gy  l+ +l +wl+eitGf   slqpnsGaqGe+aGl vir yhesrge hrni lip 
  lcl|FitnessBrowser__Cola:Echvi_0744 531 EFGQLHPFVPQDQAAGYYALFQDLRNWLSEITGFAETSLQPNSGAQGEFAGLMVIRAYHESRGESHRNIALIPS 604
                                          ************************************************************************** PP

                            TIGR00461 586 sahGtnpasaamaGlkvvpvkcdkeGnidlvdlkakaekagdelaavmvtypstyGvfeetirevidivhrfGG 659
                                          sahGtnpasa+maG+kvv vkcd++Gnidl dlk+kaek+ ++l++ +vtypst+Gvfee+ire+++ivh+ GG
  lcl|FitnessBrowser__Cola:Echvi_0744 605 SAHGTNPASAVMAGMKVVIVKCDDKGNIDLADLKEKAEKHKENLSSFLVTYPSTHGVFEEAIREMCQIVHENGG 678
                                          ************************************************************************** PP

                            TIGR00461 660 qvyldGanmnaqvGltspgdlGadvchlnlhktfsiphGGGGpgmgpigvkshlapflpktdlvsvvelegesk 733
                                          qvy+dGanmnaqvGltspg +Gadvchlnlhktf+iphGGGGpgmgpi+v  hl  flp +     +  +g+++
  lcl|FitnessBrowser__Cola:Echvi_0744 679 QVYMDGANMNAQVGLTSPGVIGADVCHLNLHKTFCIPHGGGGPGMGPICVAKHLEEFLPSS----PLVKTGGQQ 748
                                          ************************************************************4....444579999 PP

                            TIGR00461 734 sigavsaapyGsasilpisymyikmmGaeGlkkasevailnanylakrlkdaykilfvgrdervahecildlre 807
                                           i a+saap+Gsasilpisy+yi+mmG eGlk+a++ ailnany+ +rl ++++ l++g ++r ahe+i+d+re
  lcl|FitnessBrowser__Cola:Echvi_0744 749 PISAISAAPFGSASILPISYAYIAMMGREGLKHATQTAILNANYIKARLGEFFPTLYTGAQGRAAHEMIVDFRE 822
                                          ************************************************************************** PP

                            TIGR00461 808 lkekagiealdvakrlldyGfhaptlsfpvaGtlmveptesesleeldrfidamiaikeeidavkaGeikledn 881
                                          +k+  g+e++d+akrl+dyGfh+pt+sfpvaGt+m+epteses+ eldrf+da+iai+ ei ++ +G+ ++e+n
  lcl|FitnessBrowser__Cola:Echvi_0744 823 FKAV-GVEVEDIAKRLIDYGFHSPTVSFPVAGTMMIEPTESESKAELDRFCDALIAIRGEIREIEEGKADAENN 895
                                          **98.********************************************************************* PP

                            TIGR00461 882 ilknaphslqslivaewadpysreeaaypapvlkyfkfwptvarlddtyGdrnlvcsc 939
                                          +lknaph++  ++   w  pysre+a+yp+ ++k+ kfwptv+r+d +yGdrnlvcsc
  lcl|FitnessBrowser__Cola:Echvi_0744 896 VLKNAPHTAGMVMSDAWDMPYSREKAVYPLEYVKNSKFWPTVRRIDSAYGDRNLVCSC 953
                                          ********99999999999*************************************** PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (939 nodes)
Target sequences:                          1  (966 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.06u 0.02s 00:00:00.08 Elapsed: 00:00:00.08
# Mc/sec: 11.04
//
[ok]
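
The domain table row in the hmmsearch output above can be turned into coverage figures. The sketch below is a minimal illustration, not GapMind's own code. It splits that row on whitespace, assuming the column order shown in the table header, and computes the fraction of the 939-node model and of the 966-residue candidate that the alignment spans. The `parse_domain_row` helper and its field names are hypothetical.

```python
from typing import NamedTuple


class DomainHit(NamedTuple):
    score: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int
    acc: float


def parse_domain_row(row: str) -> DomainHit:
    """Parse one row of the hmmsearch per-domain table (whitespace-separated)."""
    f = row.split()
    # Fields: #, !/?, score, bias, c-Evalue, i-Evalue, hmmfrom, hmmto, bounds,
    #         alifrom, alito, bounds, envfrom, envto, bounds, acc
    return DomainHit(
        score=float(f[2]),
        hmm_from=int(f[6]),
        hmm_to=int(f[7]),
        ali_from=int(f[9]),
        ali_to=int(f[10]),
        acc=float(f[15]),
    )


ROW = ("   1 ! 1437.5   0.1         0         0       1     939 []"
       "      14     953 ..      14     953 .. 0.98")
MODEL_LEN = 939       # from "[M=939]"
SEQ_LEN = 966         # from "966 residues searched"

hit = parse_domain_row(ROW)
hmm_cov = (hit.hmm_to - hit.hmm_from + 1) / MODEL_LEN
seq_cov = (hit.ali_to - hit.ali_from + 1) / SEQ_LEN

print(f"score {hit.score}, model coverage {hmm_cov:.3f}, candidate coverage {seq_cov:.3f}")
```

For this hit the alignment covers the model end to end, which is what the `[]` marker in the row indicates, and spans most of the candidate protein.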

This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory