GapMind for catabolism of small carbon sources

 

Alignments for a candidate for putA in Methylocystis bryophila S285

Align L-glutamate gamma-semialdehyde dehydrogenase (EC 1.2.1.88); Proline dehydrogenase (EC 1.5.5.2) (characterized)
to candidate WP_085770261.1 B1812_RS02885 bifunctional proline dehydrogenase/L-glutamate gamma-semialdehyde dehydrogenase PutA

Query= reanno::SB2B:6938573
         (1058 letters)



>NCBI__GCF_002117405.1:WP_085770261.1
          Length = 1018

 Score =  846 bits (2185), Expect = 0.0
 Identities = 484/1026 (47%), Positives = 641/1026 (62%), Gaps = 28/1026 (2%)

Query: 28   YIVDEEAYLKELIALVPSSDEEIARITSRAHDLVAKVRQYEKKGLMVG-IDAFLQQYSLE 86
            Y  D+      L+A  P      A   + A +LVA  R    +GL++G I+ FL+++SL 
Sbjct: 14   YAEDDAQIAARLLARRPDPGARDAT-EALARELVAASRP---QGLLIGGIEDFLREFSLT 69

Query: 87   TQEGIILMCLAEALLRIPDAETADALIADKLSGAKWDEHMSKSDSVLVNASTWGLMLTGK 146
            ++EG+ +M LAE+LLR+PD ET D L+ADKL+   +  H    D++LV A  + L L+ +
Sbjct: 70   SREGLAVMALAESLLRVPDDETLDRLLADKLAVGDFAHHRIAGDALLVQACAFALGLSAR 129

Query: 147  IVQLDKNLDGTPSNLLSRLVNRLGEPVIRQAMYAAMKIMGKQFVLGRTIEEGLKNAAEKR 206
            + +       +P      +  RLG P +R A   AM++MG  FV G TIE    NA  +R
Sbjct: 130  LFEEG----ASPHGAAESVARRLGLPALRLAAKQAMRLMGAHFVFGETIE----NALSRR 181

Query: 207  KLGYTHSYDMLGEAALTMKDADKYYRDYANAIQALGTAKFDESEAPRPTISIKLSALHPR 266
                 +S+DMLGEAA T +DA++Y+  YA+AI+A+G A   ++   RP +S+KLSALHPR
Sbjct: 182  APHLRYSFDMLGEAARTQEDAERYFEAYAHAIEAVGGAAGAQALPERPGVSVKLSALHPR 241

Query: 267  YEVANEDRVMTELYATLIKLIEQARSLNVGIQIDAEEVDRLELSLKLFKKLYQSDAAKGW 326
            YE  + +RV+ EL   L+ L   AR  ++   IDAEE DRLELSL +  ++    + KGW
Sbjct: 242  YEAISRERVLAELAPRLLDLARLAREHDLAFTIDAEEADRLELSLDVMARVVADPSLKGW 301

Query: 327  GLLGIVVQAYSKRALPVLMWLTRLAKEQGDEIPLRLVKGAYWDSELKWAQQAGEAGYPLF 386
               G+ VQAY KRA  V+  +   AK     + LRLVKGAYWD E+K AQ+ G A YP+F
Sbjct: 302  EGFGLAVQAYQKRAEAVIDHVAGWAKTLNRRLMLRLVKGAYWDLEIKRAQERGLADYPVF 361

Query: 387  TRKAATDVSYLACARYLLSEATRGVIYPQFASHNAQTVAAITAM-VGDRKFEFQRLHGMG 445
            TRKA TD +YLACA  L +E    +++PQFA+HNA T AAI A       FEFQRLHGMG
Sbjct: 362  TRKAMTDANYLACAARLFAEP---MLFPQFATHNAMTAAAILAQGAKPESFEFQRLHGMG 418

Query: 446  QELYDTVLAEAAVPTVRIYAPIGAHKDLLPYLVRRLLENGANTSFVHKLVDPKTPIESLV 505
            + LY  +L + A   VR+YAP+G H+DLL YLVRRL+ENGAN+S+V +L DP   IE L+
Sbjct: 419  EGLYGLLLQKGA--AVRVYAPVGKHRDLLAYLVRRLIENGANSSYVARLADPHCGIEDLL 476

Query: 506  THPLKTLQGYKTLANNKIVKPADIFGAERKNSKGLNMNIISESEPFFAALEKFKDTQWSA 565
              P   L   +   +  +  P D++   R NS G+              ++  +  ++ A
Sbjct: 477  EDPFAALGRPENARHRHLPLPKDLYRPLRSNSDGVEFGDQKALSALLVEIDASRGAKFFA 536

Query: 566  GPLVNGETLSGEVRDVVSPYNTTLKVGQVAFANEATIEQAIAGADKAFASWCRTPVETRA 625
             P     T   + R + SP++ T +VG V  A+ A+++ A+A A +AF++W   P ETRA
Sbjct: 537  QPASASMT---QKRAIHSPFDGT-RVGDVIEADAASLDVAMAAAHRAFSAWGAQPAETRA 592

Query: 626  NALQKLADLLEENREELIALCTREAGKSIQDGIDEVREAVDFCRYYAVQAKKMMSKPELL 685
              L++ A L+E  R  LIAL   E GK++ D + EVREA D+CRYYA  A+++M +   L
Sbjct: 593  QILERAAALIESRRGRLIALLQSEGGKTLDDALAEVREAADYCRYYANVARELM-RERAL 651

Query: 686  PGPTGELNELFLQGRGVFVCISPWNFPLAIFLGQVAAALATGNTVIAKPAEQTCLIGFRA 745
            PGPTGE N L   GRGVFVCISPWNFPLAIF+GQ+AAAL  GN V AKPAEQT LIGF A
Sbjct: 652  PGPTGEENRLRHVGRGVFVCISPWNFPLAIFVGQIAAALVAGNAVAAKPAEQTPLIGFEA 711

Query: 746  VQLAHEAGIPKDVLQFLPGTGAVVGAKLTSDERIGGVCFTGSTTTAKVINRALAGRDGAI 805
            V L  EAG+P D LQFLPG GA VGA L + E   GV FTGS   A+ INRALA + G I
Sbjct: 712  VALLREAGVPADALQFLPGDGA-VGASLVAHELTAGVVFTGSVAVAQEINRALAKKPGPI 770

Query: 806  IPLIAETGGQNAMVVDSTSQPEQVVNDVVSSAFTSAGQRCSALRVLYLQEDIAERVLDVL 865
            +PLIAETGG NAM+VDST+  E V +DV +SAF SAGQRCSALR+LYLQE+I +  L  +
Sbjct: 771  VPLIAETGGINAMIVDSTALFEHVADDVCASAFRSAGQRCSALRLLYLQEEIFDACLATI 830

Query: 866  KGAMDELTLGNPGSVKTDVGPVIDAAAKANLNAHIDHIKQVGRLIHQLSLPEGTENGHFV 925
             GA  EL LG+PG   T +GPVID  AKA L+ ++   +  G +++  + P     G FV
Sbjct: 831  VGAAKELRLGDPGDPATHIGPVIDRGAKAALDDYLARRRAEGSVVYTGAAP---GQGCFV 887

Query: 926  APTAVEIDSIKVLTKENFGPILHVVRYKAAGLQKVIDDINSTGFGLTLGIHSRNEGHALE 985
            AP  V + S + L +E FGP+LHV  ++A     ++ +I +  +GLT+G+HSR E  A  
Sbjct: 888  APHVVRLSSGRELNQEIFGPVLHVAPWRAGDFDSLVAEIMAANYGLTIGLHSRIEARAKR 947

Query: 986  VADKVNVGNVYINRNQIGAVVGVQPFGGQGLSGTGPKAGGPHYLTRFVTEKTRTNNITAI 1045
            +A     GN+YINR  IGAVVG QPFGG  LSGTGPKAGG  YL RFV E T T N  A+
Sbjct: 948  LAALAPAGNIYINRTMIGAVVGSQPFGGFNLSGTGPKAGGADYLRRFVREITVTTNTAAL 1007

Query: 1046 GGNATL 1051
            GG+  L
Sbjct: 1008 GGDLRL 1013
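
The summary figures reported for this alignment can be rechecked by hand. The sketch below is illustrative only: the function name and the definition of coverage (aligned span divided by sequence length) are assumptions made for this example, not necessarily how GapMind computes coverage. All input numbers are copied from the alignment above.

```python
def alignment_summary(identities, positives, gaps, aln_len,
                      q_start, q_end, q_len, s_start, s_end, s_len):
    """Summarize a pairwise alignment from its reported counts and coordinates."""
    return {
        # Percentages are relative to the alignment length (columns incl. gaps).
        "pct_identity": 100.0 * identities / aln_len,
        "pct_positives": 100.0 * positives / aln_len,
        "pct_gaps": 100.0 * gaps / aln_len,
        # Coverage: fraction of each sequence spanned by the alignment (assumed definition).
        "query_coverage": (q_end - q_start + 1) / q_len,
        "subject_coverage": (s_end - s_start + 1) / s_len,
    }


# Values from the report: Identities = 484/1026, Positives = 641/1026,
# Gaps = 28/1026, Query 28..1051 of 1058, Sbjct 14..1013 of 1018.
summary = alignment_summary(484, 641, 28, 1026, 28, 1051, 1058, 14, 1013, 1018)
for name, value in summary.items():
    print(f"{name}: {value:.3f}")
```

Under this definition, the alignment spans 1024 of the 1058 query residues and 1000 of the 1018 candidate residues.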


Lambda     K      H
   0.317    0.134    0.380 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 2313
Number of extensions: 92
Number of successful extensions: 10
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1058
Length of database: 1018
Length adjustment: 45
Effective length of query: 1013
Effective length of database: 973
Effective search space:   985649
Effective search space used:   985649
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 57 (26.6 bits)
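
The statistics above are related by the standard Karlin-Altschul formulas. As a rough check, the sketch below recomputes the bit score from the raw score (2185) using the gapped Lambda and K, and the effective search space from the reported lengths and length adjustment. The function names are hypothetical, and because Lambda and K are printed in rounded form, the bit score is only reproduced approximately.

```python
import math


def bit_score(raw_score, lam, k):
    """Normalized score: S' = (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def effective_search_space(query_len, db_len, length_adjustment, num_seqs=1):
    """Product of effective query length and effective database length."""
    eff_query = query_len - length_adjustment
    eff_db = db_len - num_seqs * length_adjustment
    return eff_query * eff_db


# Gapped parameters from the report: Lambda = 0.267, K = 0.0410.
bits = bit_score(2185, 0.267, 0.0410)
space = effective_search_space(1058, 1018, 45)

print(f"bit score ~ {bits:.1f}")          # report shows 846 bits
print(f"effective search space = {space}")  # report shows 985649
```

The effective lengths (1013 and 973) match the "Effective length of query" and "Effective length of database" lines, and their product matches the reported search space.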

Align candidate WP_085770261.1 B1812_RS02885 (bifunctional proline dehydrogenase/L-glutamate gamma-semialdehyde dehydrogenase PutA)
to HMM TIGR01238 (delta-1-pyrroline-5-carboxylate dehydrogenase (EC 1.2.1.88))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR01238.hmm
# target sequence database:        /tmp/gapView.4034767.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01238  [M=500]
Accession:   TIGR01238
Description: D1pyr5carbox3: delta-1-pyrroline-5-carboxylate dehydrogenase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
   4.7e-169  549.0   0.6   6.2e-169  548.6   0.6    1.1  1  NCBI__GCF_002117405.1:WP_085770261.1  


Domain annotation for each sequence (and alignments):
>> NCBI__GCF_002117405.1:WP_085770261.1  
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  548.6   0.6  6.2e-169  6.2e-169       1     497 [.     499     996 ..     499     999 .. 0.97

  Alignments for each domain:
  == domain 1  score: 548.6 bits;  conditional E-value: 6.2e-169
                             TIGR01238   1 dlygegrknslGvdlaneselksleeqllkaaakkfqaapivgekakaegeaqpvknpadrkdivGqvseada 73 
                                           dly   r+ns Gv++  ++ l  l  ++ ++   kf a p       +  + + + +p d    vG v eada
  NCBI__GCF_002117405.1:WP_085770261.1 499 DLYRPLRSNSDGVEFGDQKALSALLVEIDASRGAKFFAQPAS----ASMTQKRAIHSPFDG-TRVGDVIEADA 566
                                           799999********************************9987....4455667899***97.579******** PP

                             TIGR01238  74 aevqeavdsavaafaewsatdakeraailerladlleshmpelvallvreaGktlsnaiaevreavdflryya 146
                                           a  + a+ +a +af+ w a +a+ ra iler+a l+es+   l+all+ e Gktl++a+aevrea d++ryya
  NCBI__GCF_002117405.1:WP_085770261.1 567 ASLDVAMAAAHRAFSAWGAQPAETRAQILERAAALIESRRGRLIALLQSEGGKTLDDALAEVREAADYCRYYA 639
                                           ************************************************************************* PP

                             TIGR01238 147 kqvedvldeesaka............lGavvcispwnfplaiftGqiaaalaaGntviakpaeqtsliaarav 207
                                           + +++ + e +               +G++vcispwnfplaif+Gqiaaal+aGn+v akpaeqt+li  +av
  NCBI__GCF_002117405.1:WP_085770261.1 640 NVARELMRERALPGptgeenrlrhvgRGVFVCISPWNFPLAIFVGQIAAALVAGNAVAAKPAEQTPLIGFEAV 712
                                           ***998888865559999******************************************************* PP

                             TIGR01238 208 ellqeaGvpagviqllpGrGedvGaaltsderiaGviftGstevarlinkalakredapvpliaetGGqnami 280
                                           +ll+eaGvpa ++q+lpG G+ vGa+l ++e  aGv+ftGs +va+ in+alak+  + vpliaetGG nami
  NCBI__GCF_002117405.1:WP_085770261.1 713 ALLREAGVPADALQFLPGDGA-VGASLVAHELTAGVVFTGSVAVAQEINRALAKKPGPIVPLIAETGGINAMI 784
                                           *********************.*************************************************** PP

                             TIGR01238 281 vdstalaeqvvadvlasafdsaGqrcsalrvlcvqedvadrvltlikGamdelkvgkpirlttdvGpvidaea 353
                                           vdstal e v  dv asaf saGqrcsalr+l++qe++ d  l  i Ga +el++g p+   t +Gpvid  a
  NCBI__GCF_002117405.1:WP_085770261.1 785 VDSTALFEHVADDVCASAFRSAGQRCSALRLLYLQEEIFDACLATIVGAAKELRLGDPGDPATHIGPVIDRGA 857
                                           ************************************************************************* PP

                             TIGR01238 354 kqnllahiekmkakakkvaqvkleddvesekgtfvaptlfelddldelkkevfGpvlhvvrykadeldkvvdk 426
                                           k  l  ++ + +a +  v+           +g fvap ++ l +  el++e+fGpvlhv  ++a ++d++v +
  NCBI__GCF_002117405.1:WP_085770261.1 858 KAALDDYLARRRAEGSVVYTGAA-----PGQGCFVAPHVVRLSSGRELNQEIFGPVLHVAPWRAGDFDSLVAE 925
                                           **********9999988886544.....559****************************************** PP

                             TIGR01238 427 inakGygltlGvhsrieetvrqiekrakvGnvyvnrnlvGavvGvqpfGGeGlsGtGpkaGGplylyrltr 497
                                           i a  yglt+G+hsrie   +++   a +Gn+y+nr ++GavvG qpfGG  lsGtGpkaGG+ yl r++r
  NCBI__GCF_002117405.1:WP_085770261.1 926 IMAANYGLTIGLHSRIEARAKRLAALAPAGNIYINRTMIGAVVGSQPFGGFNLSGTGPKAGGADYLRRFVR 996
                                           ********************************************************************987 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (500 nodes)
Target sequences:                          1  (1018 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.00u 0.01s 00:00:00.01 Elapsed: 00:00:00.01
# Mc/sec: 31.08
//
[ok]
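
To see how much of the TIGR01238 model is covered by this hit, the domain row from the hmmsearch output above can be parsed directly. This is a minimal sketch: the parser name and field layout are assumptions based on splitting this one row on whitespace, and the model length (500) is taken from the "[M=500]" line.

```python
# Domain row copied from the hmmsearch output above.
domain_line = ("   1 !  548.6   0.6  6.2e-169  6.2e-169       1     497 [."
               "     499     996 ..     499     999 .. 0.97")


def parse_domain_row(line):
    """Split one row of the per-domain table into named fields."""
    f = line.split()
    return {
        "score": float(f[2]),
        "bias": float(f[3]),
        "c_evalue": float(f[4]),
        "i_evalue": float(f[5]),
        "hmm_from": int(f[6]),
        "hmm_to": int(f[7]),
        "ali_from": int(f[9]),
        "ali_to": int(f[10]),
        "env_from": int(f[12]),
        "env_to": int(f[13]),
        "acc": float(f[15]),
    }


MODEL_LENGTH = 500  # from "Query: TIGR01238 [M=500]"

dom = parse_domain_row(domain_line)
model_coverage = (dom["hmm_to"] - dom["hmm_from"] + 1) / MODEL_LENGTH
print(f"model coverage = {model_coverage:.3f}")
print(f"aligned region of candidate = {dom['ali_from']}..{dom['ali_to']}")
```

The hit covers model positions 1 to 497 and falls in the C-terminal half of the 1018-residue candidate (residues 499 to 996), consistent with the dehydrogenase domain of a bifunctional PutA.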

This GapMind analysis is from Apr 09 2024. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
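
For readers unfamiliar with the term, a six-frame translation translates a DNA sequence in all three reading frames on both strands, so that protein-coding regions missed by gene prediction can still be searched. The sketch below is a generic illustration, not GapMind's or Curated BLAST's code; the function names are hypothetical, and it assumes the standard amino acid assignments and an unambiguous A/C/G/T sequence (any other codon is translated as "X").

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def translate(dna):
    """Translate complete codons from position 0; '*' marks a stop codon."""
    return "".join(CODON_TABLE.get(dna[i:i + 3], "X")
                   for i in range(0, len(dna) - 2, 3))


def six_frame_translation(dna):
    """Return translations of frames +1..+3 and -1..-3 keyed by frame name."""
    dna = dna.upper()
    reverse_complement = dna.translate(COMPLEMENT)[::-1]
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(reverse_complement[offset:])
    return frames


frames = six_frame_translation("ATGGCCTAA")
for name in sorted(frames):
    print(name, frames[name])
```

In practice, the translated frames are split at stop codons and the resulting open reading frames are searched with the same protein queries used against the predicted proteome.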

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory