GapMind for catabolism of small carbon sources

 

Alignments for a candidate for putA in Methylohalobius crimeensis 10Ki

Align L-glutamate gamma-semialdehyde dehydrogenase (EC 1.2.1.88); Proline dehydrogenase (EC 1.5.5.2) (characterized)
to candidate WP_022948913.1 H035_RS0110405 bifunctional proline dehydrogenase/L-glutamate gamma-semialdehyde dehydrogenase PutA

Query= reanno::ANA3:7023590
         (1064 letters)



>NCBI__GCF_000421465.1:WP_022948913.1
          Length = 1042

 Score =  951 bits (2457), Expect = 0.0
 Identities = 511/1034 (49%), Positives = 684/1034 (66%), Gaps = 19/1034 (1%)

Query: 27   KAVTDNYIVDEEQYLSELIKLVPSSDEAIERVTRRAHELVNKVRQFDKKGLMVGIDAFLQ 86
            KA+   Y+ DE Q L  L+  V  SD A +RV  RA  LV  VRQ       + +DAFL 
Sbjct: 24   KAIAAAYLADEAQTLESLLPEVSLSDGARQRVHERARRLVQSVRQHP-----LPLDAFLS 78

Query: 87   QYSLETQEGIILMCLAEALLRIPDAATADALIEDKLSGAKWDEHLSKSDSVLVNASTWGL 146
            ++ L ++EG++LMCLAEALLRIPD  TA+ LI DKLS  KWD HL  S S LVNAST+GL
Sbjct: 79   EFELSSEEGVLLMCLAEALLRIPDDQTAERLIRDKLSRGKWDRHLGHSASFLVNASTFGL 138

Query: 147  MLTGKIVKLDKKIDGTPSNLLSRLVNRLGEPVIRQAMMAAMKIMGKQFVLGRTMKEALKN 206
            +LTG++++L+ +   T   L  +L+ R GE  +   ++ AMKI+G QF+L  T+++AL  
Sbjct: 139  LLTGRLMRLEIEDSDT---LFQKLLQR-GEAAVHVLVVRAMKILGGQFILATTIEKALA- 193

Query: 207  SEDKRKLGYTHSYDMLGEAALTRKDAEKYFNDYANAITELGAQSYNENES-PRPTISIKL 265
               +R   + +S+DMLGE A T +D E YF  Y +AIT L     +E++    P IS+KL
Sbjct: 194  ---RRSPDFRYSFDMLGEEAFTAEDVEAYFRSYRHAITVLARHRGDESDVLAMPGISVKL 250

Query: 266  SALHPRYEVANEDRVLTELYDTVIRLIKLARGLNIGISIDAEEVDRLELSLKLFQKLFNA 325
            SALHPRY     +RV  EL   ++ L K AR   IG+++DAEE +RL+L L +F+ +F  
Sbjct: 251  SALHPRYTFTQGERVRRELIPRLLTLAKEARAARIGLTLDAEEAERLDLMLDIFEAVFGH 310

Query: 326  DATKGWGLLGIVVQAYSKRALPVLVWLTRLAKEQGDEIPVRLVKGAYWDSELKWAQQAGE 385
                 W   G+ VQAY KRA P++ +L  +A  QG  IPVRLVKGAYWD+E+K AQQ G 
Sbjct: 311  PDLSSWEGFGLAVQAYQKRAGPLIEYLRHIAASQGKRIPVRLVKGAYWDTEIKRAQQQGW 370

Query: 386  AAYPLYTRKAGTDVSYLACARYLLSDATRGAIYPQFASHNAQTVAAISDMAGDRNHEFQR 445
            + YP++TRK  TDVSYLA AR LL  A   A +PQFA+HNA TVA I ++A  R  EFQR
Sbjct: 371  SDYPVFTRKMNTDVSYLALARRLL--AFPRAFFPQFATHNAHTVAWILEVAEGRPLEFQR 428

Query: 446  LHGMGQELYDTILSEAGAKAVRIYAPIGAHKDLLPYLVRRLLENGANTSFVHKLVDPKTP 505
            L GMG+ LY  +    G    R+YAP+G  ++LLPYL+RRLLENGANTSFVH+L D + P
Sbjct: 429  LQGMGEPLYRALRQTEGDIPCRVYAPVGGFRELLPYLMRRLLENGANTSFVHRLSDARLP 488

Query: 506  IESLVVHPLKTLTGYKTLANNKIVLPTDIFGSDRKNSKGLNMNIISEAEPFFAALDKFKS 565
            IE ++  P+  +   +T+ + +I  P D++   R NS G N++           L  F+ 
Sbjct: 489  IEEVIADPVAKVRTLETIPHPQIPQPADLYQPQRPNSPGPNLSDPLTYRRLNELLSSFRQ 548

Query: 566  TQWQAGPLVNGQTLTGEHKTVVSPFDTTQTVGQVAFADKAAIEQAVASADAAFATWTRTP 625
              W+A P V+G+   G+ + V SP D  Q VG V  A + +I +A+A A+AA   W R P
Sbjct: 549  NTWEAQPWVSGKGGEGKERMVYSPVDRGQVVGTVVEAGEESIAKALACAEAAAPEWDRVP 608

Query: 626  VEVRASALQKLADLLEENREELIALCTREAGKSIQDGIDEVREAVDFCRYYAVQAKKLMS 685
             E RA+ L++ A+L + +R+EL+ALC RE G+ + D +DEVREA+DF  YYA +A++L +
Sbjct: 609  AEDRAACLERAAELYQSHRDELLALCIREGGRCLSDAVDEVREAIDFLYYYAAEARRLFA 668

Query: 686  KPELLPGPTGELNELFLQGRGVFVCISPWNFPLAIFLGQVSAALAAGNTVVAKPAEQTSI 745
             P  LPGP GE N L+L GRG FVCISPWNFPLAIF GQ +AALAAGN V+AKPAEQT +
Sbjct: 669  NPLSLPGPVGESNRLYLHGRGPFVCISPWNFPLAIFTGQAAAALAAGNPVLAKPAEQTPL 728

Query: 746  IGYRAVQLAHQAGIPTDVLQYLPGTGATVGNALTADERIGGVCFTGSTGTAKLINRTLAN 805
            I   A +L  QAGIP +VL +LPG G  VG AL  D RI GV FTGST TA  I R LA 
Sbjct: 729  IAALAARLLQQAGIPNEVLHFLPG-GGEVGAALVGDPRIRGVAFTGSTDTAWEIQRNLAG 787

Query: 806  REGAIIPLIAETGGQNAMVVDSTSQPEQVVNDVVSSSFTSAGQRCSALRVLFLQEDIADR 865
            R   I   +AETGG NAM+VDS++ P+QVV DV++S+F SAGQRCSALR+LFLQ+++AD 
Sbjct: 788  RRAPIAAFVAETGGVNAMIVDSSALPQQVVADVLASTFNSAGQRCSALRILFLQDEVADP 847

Query: 866  VIDVLQGAMDELVIGNPSSVKTDVGPVIDATAKANLDAHIDHIKQVGKLIKQMSLPAGTE 925
            V+D+L GA+ EL IG+P  + TD+GP+ID  A A L++H  +++   + + ++  P  T 
Sbjct: 848  VLDMLCGALAELKIGDPWDLSTDLGPLIDEDAVARLESHRAYLETNVQPLARLDPPPLT- 906

Query: 926  NGHFVSPTAVEIDSIKVLEKEHFGPILHVIRYKASELAHVIDEINSTGFGLTLGIHSRNE 985
             G +  P   EI +++ + +E FGPILHV+RY A  L  +++ I+  G+GLTLG+HSR +
Sbjct: 907  -GCYFGPAIYEITALEPVSREVFGPILHVMRYSAGHLDELLERIHGLGYGLTLGVHSRID 965

Query: 986  GHALEVADKVNVGNVYINRNQIGAVVGVQPFGGQGLSGTGPKAGGPHYLTRFVTEKTRTN 1045
                 +A+   VGN+Y+NRN IGAVVG  PFGG+GLSGTGPKAGGP+YL RF  EK+ + 
Sbjct: 966  AIWERIAEHARVGNIYVNRNMIGAVVGTHPFGGEGLSGTGPKAGGPNYLARFAYEKSLSV 1025

Query: 1046 NITAIGGNATLLSL 1059
            N  A+GG+  LLSL
Sbjct: 1026 NTAAVGGDPDLLSL 1039


Lambda     K      H
   0.317    0.133    0.377 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 2442
Number of extensions: 95
Number of successful extensions: 9
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1064
Length of database: 1042
Length adjustment: 45
Effective length of query: 1019
Effective length of database: 997
Effective search space:  1015943
Effective search space used:  1015943
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 58 (26.9 bits)
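
The summary figures in the BLAST report above can be re-derived from the numbers it prints. The following is a minimal sketch, not part of GapMind itself: the variable names are illustrative, and "coverage" is assumed here to mean the aligned span divided by the full sequence length, which may differ from the exact definition GapMind applies.

```python
# Values copied from the BLAST report above.
identities, positives, gaps, aligned_columns = 511, 684, 19, 1034
query_start, query_end, query_len = 27, 1059, 1064
subject_start, subject_end, subject_len = 24, 1039, 1042
length_adjustment = 45


def pct(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


# Fractions over the alignment columns (the report prints them as whole percents).
percent_identity = pct(identities, aligned_columns)
percent_positives = pct(positives, aligned_columns)
percent_gaps = pct(gaps, aligned_columns)

# Assumed definition: aligned span divided by total sequence length.
query_coverage = pct(query_end - query_start + 1, query_len)
subject_coverage = pct(subject_end - subject_start + 1, subject_len)

# Effective lengths are the raw lengths minus the length adjustment;
# their product is the effective search space reported by BLAST.
effective_query_len = query_len - length_adjustment
effective_db_len = subject_len - length_adjustment
effective_search_space = effective_query_len * effective_db_len

print(f"identity  {percent_identity:.1f}%")
print(f"positives {percent_positives:.1f}%")
print(f"query coverage   {query_coverage:.1f}%")
print(f"subject coverage {subject_coverage:.1f}%")
print(f"effective search space {effective_search_space}")
```

Under that assumed definition, the alignment spans nearly the full length of both the characterized query and the candidate, and the product of the two effective lengths matches the reported effective search space.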

Align candidate WP_022948913.1 H035_RS0110405 (bifunctional proline dehydrogenase/L-glutamate gamma-semialdehyde dehydrogenase PutA)
to HMM TIGR01238 (delta-1-pyrroline-5-carboxylate dehydrogenase (EC 1.2.1.88))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR01238.hmm
# target sequence database:        /tmp/gapView.8532.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01238  [M=500]
Accession:   TIGR01238
Description: D1pyr5carbox3: delta-1-pyrroline-5-carboxylate dehydrogenase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                 Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                 -----------
   1.5e-185  603.4   0.0     2e-185  603.0   0.0    1.1  1  lcl|NCBI__GCF_000421465.1:WP_022948913.1  H035_RS0110405 bifunctional prol


Domain annotation for each sequence (and alignments):
>> lcl|NCBI__GCF_000421465.1:WP_022948913.1  H035_RS0110405 bifunctional proline dehydrogenase/L-glutamate gamma-semiald
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  603.0   0.0    2e-185    2e-185       1     496 [.     516    1018 ..     516    1022 .. 0.97

  Alignments for each domain:
  == domain 1  score: 603.0 bits;  conditional E-value: 2e-185
                                 TIGR01238    1 dlygegrknslGvdlaneselksleeqllkaaakkfqaapivgekakaegeaqpvknpadrkdivGq 67  
                                                dly   r ns G +l+   + ++l+e l +  +++++a p v++k   eg+ + v +p+dr ++vG+
  lcl|NCBI__GCF_000421465.1:WP_022948913.1  516 DLYQPQRPNSPGPNLSDPLTYRRLNELLSSFRQNTWEAQPWVSGK-GGEGKERMVYSPVDRGQVVGT 581 
                                                799999************************************665.57888899************* PP

                                 TIGR01238   68 vseadaaevqeavdsavaafaewsatdakeraailerladlleshmpelvallvreaGktlsnaiae 134 
                                                v ea ++ + +a+ +a aa++ew  ++a++raa+ler+a+l +sh  el+al++re G+ ls+a++e
  lcl|NCBI__GCF_000421465.1:WP_022948913.1  582 VVEAGEESIAKALACAEAAAPEWDRVPAEDRAACLERAAELYQSHRDELLALCIREGGRCLSDAVDE 648 
                                                ******************************************************************* PP

                                 TIGR01238  135 vreavdflryyakqvedvldeesaka.............lGavvcispwnfplaiftGqiaaalaaG 188 
                                                vrea+dfl yya +++  + +    +             +G++vcispwnfplaiftGq aaalaaG
  lcl|NCBI__GCF_000421465.1:WP_022948913.1  649 VREAIDFLYYYAAEARRLFANPLSLPgpvgesnrlylhgRGPFVCISPWNFPLAIFTGQAAAALAAG 715 
                                                *******************999777799*************************************** PP

                                 TIGR01238  189 ntviakpaeqtsliaaravellqeaGvpagviqllpGrGedvGaaltsderiaGviftGstevarli 255 
                                                n v+akpaeqt+liaa a  llq+aG+p  v+ +lpG Ge vGaal  d+ri+Gv+ftGst++a  i
  lcl|NCBI__GCF_000421465.1:WP_022948913.1  716 NPVLAKPAEQTPLIAALAARLLQQAGIPNEVLHFLPGGGE-VGAALVGDPRIRGVAFTGSTDTAWEI 781 
                                                ***************************************9.************************** PP

                                 TIGR01238  256 nkalakredapvpliaetGGqnamivdstalaeqvvadvlasafdsaGqrcsalrvlcvqedvadrv 322 
                                                +++la r ++ ++++aetGG namivds+al++qvvadvlas+f+saGqrcsalr+l++q++vad v
  lcl|NCBI__GCF_000421465.1:WP_022948913.1  782 QRNLAGRRAPIAAFVAETGGVNAMIVDSSALPQQVVADVLASTFNSAGQRCSALRILFLQDEVADPV 848 
                                                ******************************************************************* PP

                                 TIGR01238  323 ltlikGamdelkvgkpirlttdvGpvidaeakqnllahiekmkakakkvaqvkleddvesekgtfva 389 
                                                l+++ Ga+ elk+g p  l td+Gp+id++a  +l++h   ++++ + +a++          g +  
  lcl|NCBI__GCF_000421465.1:WP_022948913.1  849 LDMLCGALAELKIGDPWDLSTDLGPLIDEDAVARLESHRAYLETNVQPLARLDPPP----LTGCYFG 911 
                                                **************************************999998888877765554....789999* PP

                                 TIGR01238  390 ptlfelddldelkkevfGpvlhvvrykadeldkvvdkinakGygltlGvhsrieetvrqiekrakvG 456 
                                                p ++e+  l+ +++evfGp+lhv+ry a +ld+++++i   GygltlGvhsri+    +i ++a+vG
  lcl|NCBI__GCF_000421465.1:WP_022948913.1  912 PAIYEITALEPVSREVFGPILHVMRYSAGHLDELLERIHGLGYGLTLGVHSRIDAIWERIAEHARVG 978 
                                                ******************************************************************* PP

                                 TIGR01238  457 nvyvnrnlvGavvGvqpfGGeGlsGtGpkaGGplylyrlt 496 
                                                n+yvnrn++GavvG  pfGGeGlsGtGpkaGGp yl r+ 
  lcl|NCBI__GCF_000421465.1:WP_022948913.1  979 NIYVNRNMIGAVVGTHPFGGEGLSGTGPKAGGPNYLARFA 1018
                                                **************************************96 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (500 nodes)
Target sequences:                          1  (1042 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.03u 0.01s 00:00:00.04 Elapsed: 00:00:00.03
# Mc/sec: 13.14
//
[ok]
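
The domain table in the hmmsearch output above gives the model and sequence coordinates needed to estimate how much of the HMM the hit covers. This is a minimal sketch that splits the single domain row on whitespace; the field positions follow the column header printed above the row, and the model length of 500 is taken from the "[M=500]" line. The variable names are illustrative, and the coverage formula is an assumption rather than GapMind's documented calculation.

```python
# Domain row copied from the hmmsearch output above.
domain_line = (
    "   1 !  603.0   0.0    2e-185    2e-185       1     496 [."
    "     516    1018 ..     516    1022 .. 0.97"
)
model_len = 500  # from "Query: TIGR01238 [M=500]"

fields = domain_line.split()

# Column order: #, flag, score, bias, c-Evalue, i-Evalue,
# hmmfrom, hmm to, (bounds), alifrom, ali to, (bounds),
# envfrom, env to, (bounds), acc
score = float(fields[2])
hmm_from, hmm_to = int(fields[6]), int(fields[7])
ali_from, ali_to = int(fields[9]), int(fields[10])
accuracy = float(fields[15])

# Assumed definition: aligned model positions divided by model length.
model_coverage = 100.0 * (hmm_to - hmm_from + 1) / model_len

print(f"score {score} bits, model positions {hmm_from}-{hmm_to}")
print(f"sequence positions {ali_from}-{ali_to}, mean accuracy {accuracy}")
print(f"model coverage {model_coverage:.1f}%")
```

The hit aligns to sequence positions 516 to 1018, which corresponds to the C-terminal dehydrogenase portion of the bifunctional PutA candidate.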

This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory