GapMind for catabolism of small carbon sources

 

Alignments for a candidate for rocA in Azohydromonas australica DSM 1124

Align L-glutamate gamma-semialdehyde dehydrogenase (EC 1.2.1.88); Proline dehydrogenase (EC 1.5.5.2) (characterized)
to candidate WP_028999944.1 H537_RS0124445 trifunctional transcriptional regulator/proline dehydrogenase/L-glutamate gamma-semialdehyde dehydrogenase

Query= reanno::acidovorax_3H11:Ac3H11_2850
         (1261 letters)



>NCBI__GCF_000430725.1:WP_028999944.1
          Length = 1248

 Score = 1696 bits (4392), Expect = 0.0
 Identities = 886/1258 (70%), Positives = 997/1258 (79%), Gaps = 20/1258 (1%)

Query: 6    APFADFAPRTPLANPLRAAITAAITAATRHPEPEALAPLLAQARLPADQAAAAEQLALRI 65
            APFA+F    P +  LR AIT A     R PE EAL  L   ARLPA     A+ LA RI
Sbjct: 9    APFAEFLHPAPYSAELRQAITEA----WRKPEVEALPMLAEMARLPAPLKEQAQALAARI 64

Query: 66   AKALRERKASAGRAGIVQGLLQEFSLSSQEGVALMCLAEALLRIPDKATRDALIRDKISH 125
            A  LR+RK SAGRAG+VQGLLQE++LSSQEGVALMCLAEALLRIPD+ TRDALIRDKI+ 
Sbjct: 65   ATTLRDRKPSAGRAGLVQGLLQEYALSSQEGVALMCLAEALLRIPDRETRDALIRDKIAR 124

Query: 126  GQWDAHLGKSPSLFVNAATWGLLITGKLVATHSEGSLGNSLSRLIGKGGEPLIRKGVDMA 185
            GQW  HLG+SPSLFVNAATWGLLITG+L ATHSE  L ++L R++  GGEPLIRK VDMA
Sbjct: 125  GQWHTHLGRSPSLFVNAATWGLLITGRLTATHSESGLSSALGRMLAVGGEPLIRKSVDMA 184

Query: 186  MRMMGEQFVTGETIDEALRNARTMEAEGFRYSYDMLGEAALTSEDAKRYYSSYEQAIHAI 245
            MR+MGEQFVTGETID+AL NAR  EAEGFRYSYDMLGEAALT++DA+RY +SYE+AIHAI
Sbjct: 185  MRVMGEQFVTGETIDQALANARVREAEGFRYSYDMLGEAALTAQDAQRYLASYERAIHAI 244

Query: 246  GKASAGRGIYEGPGISIKLSALHPRYSRAQFGRVMDELYPLVLRLTALAKQYDIGLNIDA 305
            GKASAGRGIYEGPGISIKLSALHPRYSRAQ  RV+DELYP++LRL  LAK+YDIGLNIDA
Sbjct: 245  GKASAGRGIYEGPGISIKLSALHPRYSRAQLDRVLDELYPVLLRLALLAKRYDIGLNIDA 304

Query: 306  EETDRLELSLDLLERLCHEPTLAGWNGIGFVIQAYQKRCPFVIDCVVDLARRTQRRLMVR 365
            EE DRLE+SLDLLERLC EP L GWNGIGFVIQAYQKRCP+VID +VDLARR++RRLM+R
Sbjct: 305  EEADRLEISLDLLERLCFEPALKGWNGIGFVIQAYQKRCPYVIDFIVDLARRSERRLMIR 364

Query: 366  LVKGAYWDSEIKRAQVDGLKDYPVYTRKVHTDISYIACAKKLLAAPEAVYPQFATHNAET 425
            LVKGAYWDSEIKRAQ+DG  DYPVYTRK +TDI+YIACA+KLLAAP+ VYPQFATHNA T
Sbjct: 365  LVKGAYWDSEIKRAQLDGQLDYPVYTRKPYTDIAYIACARKLLAAPQQVYPQFATHNAHT 424

Query: 426  VATIYQLAG-SNYYAGQYEFQCLHGMGEPLYEQVVGAITAGKLGREIGKGGLGRPCRIYA 484
            +A IY LA  + +  GQYEFQCLHGMGEPLYEQVVG           G G LGRPCR+YA
Sbjct: 425  LAAIYHLADPARWQPGQYEFQCLHGMGEPLYEQVVGT----------GAGKLGRPCRVYA 474

Query: 485  PVGTHETLLAYLVRRLLENGANTSFVNRIADETIALDELVKSPVQVVDQQAATEGTAGLP 544
            PVGTHETLLAYLVRRLLENGANTSFVNRIAD TI++ ELV+ PV V D+ A  EG  GLP
Sbjct: 475  PVGTHETLLAYLVRRLLENGANTSFVNRIADATISITELVRDPVDVADELARKEGRFGLP 534

Query: 545  HPRIPLPAALYGAHRSNSRGLDLSNENTLTELAATLQATASHAWTAAPLLAADVPAGTTQ 604
            HP IP P  LYG  R NSRG+DLSNE+ L +L A L+ +A  AW+A PLLAA   AG  +
Sbjct: 535  HPAIPAPRDLYGPARPNSRGIDLSNEHELAKLQAALRQSAGEAWSAEPLLAAGPVAGERE 594

Query: 605  PVRNPADHNDVVGQVQEATTADVDQALVHAQAAATSWAATPPAERAAALLRTADLLEERI 664
             VRNPADH DVVG VQ A+   VD A  HA AAA  WAATPPA RA  L R AD LEE +
Sbjct: 595  AVRNPADHGDVVGFVQNASAEHVDLAFSHAAAAAGRWAATPPAARADMLDRAADRLEEDM 654

Query: 665  QPLMGLLMREAGKSASNAVAEVREAVDFLRYYAAQVQSTFDNATHIPLGPVACISPWNFP 724
              LMGLL+REAGK+A+NA+AEVREAVDFLRYYA QV+  F+  TH+P+GPV CISPWNFP
Sbjct: 655  PRLMGLLIREAGKTAANAIAEVREAVDFLRYYARQVREDFEPDTHVPVGPVVCISPWNFP 714

Query: 725  LAIFMGQVAAALAAGNPVLAKPAEQTPLIAAEAVRLLWQAGVPRAAVQLLPGQGETVGAR 784
            LAIFMGQV+AALAAGNPVLAKPAEQTPLIAAEAVR+LW AGVPR  +Q LPG GE VGAR
Sbjct: 715  LAIFMGQVSAALAAGNPVLAKPAEQTPLIAAEAVRVLWAAGVPRDVLQFLPGAGEVVGAR 774

Query: 785  LIGDARVMGVMFTGSTEVARILQRTVAGRLDAAGRPIPLIAETGGQNAMIVDSSALVEQV 844
            L+GDARV GV+FTGSTEVARILQR VAGRLDA GRPIPLIAETGGQNAMIVDSSAL EQV
Sbjct: 775  LVGDARVRGVLFTGSTEVARILQRAVAGRLDADGRPIPLIAETGGQNAMIVDSSALAEQV 834

Query: 845  VGDAVSSAFDSAGQRCSALRVLCVQEEAADRVVEMLQGAMGELRVGNPGELRVDVGPVID 904
            V D ++SAFDSAGQRCSALRVLCVQE+ ADR++EML GAM E R+G+P  L VDVGPVID
Sbjct: 835  VTDVLASAFDSAGQRCSALRVLCVQEDVADRLIEMLLGAMAEWRIGSPDRLAVDVGPVID 894

Query: 905  AEAQAGIAQHIEKFKAQGHRVFQHPNHVSAISAPGTFVPPTLIELNHIGELQREVFGPVL 964
            AEA AG+ +H+E  +A G RV Q     + +   GTF+ PT+IEL+ + ELQREVFGPVL
Sbjct: 895  AEALAGLQRHVEGMRASGRRVHQLGACDAGVLGRGTFMLPTVIELDQLSELQREVFGPVL 954

Query: 965  HLVRYARSDLDQLLDQINATGYGLTQGVHTRIDETIARVVNRAHAGNVYVNRNMVGAVVG 1024
            H++RY R DLD LL Q+NATGYGLT G+HTRIDETI+RV+  +HAGNVYVNRN+VGAVVG
Sbjct: 955  HVLRYQREDLDALLAQVNATGYGLTMGLHTRIDETISRVLKASHAGNVYVNRNIVGAVVG 1014

Query: 1025 VQPFGGEGLSGTGPKAGGPLYLLRLLSQRPADALARTFAEADRTSPHDTERRERHLAP-L 1083
            VQPFGGEGLSGTGPKAGGPLYL RLLS+RPAD + R F E D T   D       + P L
Sbjct: 1015 VQPFGGEGLSGTGPKAGGPLYLYRLLSRRPADVMTRLF-EPDATL--DLGPFAGQMPPAL 1071

Query: 1084 ATLQQWAHNQGNLALAGHCQRFAQETQSGTSRTLPGPTGERNVYTLAPRARVLCLAHSVD 1143
              L  WA  Q    LA  C RF Q ++SG S+TLPGPTGE+NVY LA R  VLCLA S D
Sbjct: 1072 RALHDWAQQQHQTQLAQVCHRFWQRSRSGASQTLPGPTGEKNVYALAAREAVLCLAGSDD 1131

Query: 1144 DLLVQTAAVLASGGTALWPHAHAGLRAKLPTHVQAQVMLQDNTLSDGSVALDAVLHHGDA 1203
            D LVQ AAVLA GG  +WP     L  KLP  VQA V++  +  S   V+ DA L HG A
Sbjct: 1132 DRLVQLAAVLAVGGRCVWPVECEALMRKLPAAVQASVVIARDWASP-EVSFDAALFHGAA 1190

Query: 1204 PSLQAVCTTLARRPGPIVGVTALQPGAADIPLERLLIERALSVNTAAAGGNASLMTIG 1261
            P LQA+   LA RPGPIVG+  LQPG  D+PLERL++E+A S+NTAAAGGNASLMTIG
Sbjct: 1191 PELQAIRQRLAERPGPIVGIERLQPGETDVPLERLVVEKATSINTAAAGGNASLMTIG 1248


Lambda     K      H
   0.318    0.133    0.387 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 3901
Number of extensions: 171
Number of successful extensions: 7
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1261
Length of database: 1248
Length adjustment: 48
Effective length of query: 1213
Effective length of database: 1200
Effective search space:  1455600
Effective search space used:  1455600
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 59 (27.3 bits)
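
The summary figures in the BLAST report above can be cross-checked by hand. The following is a minimal Python sketch, not part of GapMind, that recomputes the percentages, the effective search space, and the bit score from numbers printed in the report. It assumes the standard Karlin-Altschul conversion from raw score to bits using the gapped Lambda and K values, and it assumes the report truncates percentages rather than rounding them. The function names are illustrative only.

```python
import math

def percent(numerator, denominator):
    """Integer percentage, truncated (assumed to match how the report prints it)."""
    return (100 * numerator) // denominator

def effective_search_space(query_len, db_len, length_adjustment):
    """Product of the effective query and database lengths (single-sequence database)."""
    return (query_len - length_adjustment) * (db_len - length_adjustment)

def bit_score(raw_score, lam, k):
    """Karlin-Altschul normalization: S' = (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)

# Values copied from the report above.
identity_pct = percent(886, 1258)
positives_pct = percent(997, 1258)
gaps_pct = percent(20, 1258)
space = effective_search_space(1261, 1248, 48)
bits = bit_score(4392, 0.267, 0.0410)  # gapped Lambda and K, as printed (rounded)

print(identity_pct, positives_pct, gaps_pct, space, round(bits))
```

Because Lambda and K are printed to only a few digits, the recomputed bit score should be expected to agree with the reported value only approximately.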

Align candidate WP_028999944.1 H537_RS0124445 (trifunctional transcriptional regulator/proline dehydrogenase/L-glutamate gamma-semialdehyde dehydrogenase)
to HMM TIGR01238 (delta-1-pyrroline-5-carboxylate dehydrogenase (EC 1.2.1.88))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR01238.hmm
# target sequence database:        /tmp/gapView.4105264.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01238  [M=500]
Accession:   TIGR01238
Description: D1pyr5carbox3: delta-1-pyrroline-5-carboxylate dehydrogenase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
   2.6e-229  747.7   0.3   3.8e-229  747.2   0.3    1.2  1  NCBI__GCF_000430725.1:WP_028999944.1  


Domain annotation for each sequence (and alignments):
>> NCBI__GCF_000430725.1:WP_028999944.1  
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  747.2   0.3  3.8e-229  3.8e-229       1     498 [.     543    1042 ..     543    1044 .. 0.99

  Alignments for each domain:
  == domain 1  score: 747.2 bits;  conditional E-value: 3.8e-229
                             TIGR01238    1 dlygegrknslGvdlaneselksleeqllkaaakkfqaapivgekakaegeaqpvknpadrkdivGqvsea 71  
                                            dlyg +r ns G+dl+ne+el++l++ l+++a + + a p+++  ++  ge + v+npad+ d+vG v++a
  NCBI__GCF_000430725.1:WP_028999944.1  543 DLYGPARPNSRGIDLSNEHELAKLQAALRQSAGEAWSAEPLLAAGPV-AGEREAVRNPADHGDVVGFVQNA 612 
                                            89****************************************76666.5899******************* PP

                             TIGR01238   72 daaevqeavdsavaafaewsatdakeraailerladlleshmpelvallvreaGktlsnaiaevreavdfl 142 
                                            +a++v+ a + a aa+  w at+++ ra +l+r+ad le++mp l++ll+reaGkt  naiaevreavdfl
  NCBI__GCF_000430725.1:WP_028999944.1  613 SAEHVDLAFSHAAAAAGRWAATPPAARADMLDRAADRLEEDMPRLMGLLIREAGKTAANAIAEVREAVDFL 683 
                                            *********************************************************************** PP

                             TIGR01238  143 ryyakqvedvldeesakalGavvcispwnfplaiftGqiaaalaaGntviakpaeqtsliaaravellqea 213 
                                            ryya+qv+++++ +++ ++G+vvcispwnfplaif+Gq++aalaaGn v+akpaeqt+liaa+av +l  a
  NCBI__GCF_000430725.1:WP_028999944.1  684 RYYARQVREDFEPDTHVPVGPVVCISPWNFPLAIFMGQVSAALAAGNPVLAKPAEQTPLIAAEAVRVLWAA 754 
                                            *********************************************************************** PP

                             TIGR01238  214 GvpagviqllpGrGedvGaaltsderiaGviftGstevarlinkalakredap...vpliaetGGqnamiv 281 
                                            Gvp  v+q+lpG+Ge vGa l  d+r++Gv+ftGstevar +++a+a r da+   +pliaetGGqnamiv
  NCBI__GCF_000430725.1:WP_028999944.1  755 GVPRDVLQFLPGAGEVVGARLVGDARVRGVLFTGSTEVARILQRAVAGRLDADgrpIPLIAETGGQNAMIV 825 
                                            ***************************************************97777*************** PP

                             TIGR01238  282 dstalaeqvvadvlasafdsaGqrcsalrvlcvqedvadrvltlikGamdelkvgkpirlttdvGpvidae 352 
                                            ds+alaeqvv+dvlasafdsaGqrcsalrvlcvqedvadr+++++ Gam e ++g p rl  dvGpvidae
  NCBI__GCF_000430725.1:WP_028999944.1  826 DSSALAEQVVTDVLASAFDSAGQRCSALRVLCVQEDVADRLIEMLLGAMAEWRIGSPDRLAVDVGPVIDAE 896 
                                            *********************************************************************** PP

                             TIGR01238  353 akqnllahiekmkakakkvaqvkleddvesekgtfvaptlfelddldelkkevfGpvlhvvrykadeldkv 423 
                                            a   l++h+e m+a +++v+q+   d     +gtf+ pt++eld+l+el++evfGpvlhv+ry++++ld +
  NCBI__GCF_000430725.1:WP_028999944.1  897 ALAGLQRHVEGMRASGRRVHQLGACDAGVLGRGTFMLPTVIELDQLSELQREVFGPVLHVLRYQREDLDAL 967 
                                            ************************99999****************************************** PP

                             TIGR01238  424 vdkinakGygltlGvhsrieetvrqiekrakvGnvyvnrnlvGavvGvqpfGGeGlsGtGpkaGGplylyr 494 
                                            + ++na+Gyglt+G+h+ri+et+ ++ k  ++Gnvyvnrn+vGavvGvqpfGGeGlsGtGpkaGGplylyr
  NCBI__GCF_000430725.1:WP_028999944.1  968 LAQVNATGYGLTMGLHTRIDETISRVLKASHAGNVYVNRNIVGAVVGVQPFGGEGLSGTGPKAGGPLYLYR 1038
                                            *********************************************************************** PP

                             TIGR01238  495 ltrv 498 
                                            l++ 
  NCBI__GCF_000430725.1:WP_028999944.1 1039 LLSR 1042
                                            *986 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (500 nodes)
Target sequences:                          1  (1248 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.01u 0.00s 00:00:00.01 Elapsed: 00:00:00.01
# Mc/sec: 55.59
//
[ok]
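
To read the hmmsearch domain table above programmatically, one option is to split the row on whitespace. The sketch below is an illustration under stated assumptions, not GapMind's own parser: it assumes the 16-column layout shown in the domain table and takes the model length from the `[M=500]` value in the query line. The function name and dictionary keys are hypothetical.

```python
def parse_domain_row(row):
    """Parse one row of the hmmsearch per-domain table (whitespace-separated)."""
    f = row.split()
    return {
        "score": float(f[2]),
        "i_evalue": float(f[5]),
        "hmm_from": int(f[6]),
        "hmm_to": int(f[7]),
        "ali_from": int(f[9]),
        "ali_to": int(f[10]),
        "env_from": int(f[12]),
        "env_to": int(f[13]),
        "acc": float(f[15]),
    }

# The single domain row from the report above.
row = ("   1 !  747.2   0.3  3.8e-229  3.8e-229       1     498 [."
       "     543    1042 ..     543    1044 .. 0.99")
dom = parse_domain_row(row)

MODEL_LENGTH = 500  # from "[M=500]" in the query line
model_coverage = (dom["hmm_to"] - dom["hmm_from"] + 1) / MODEL_LENGTH
aligned_residues = dom["ali_to"] - dom["ali_from"] + 1

print(f"model coverage {model_coverage:.3f}, aligned residues {aligned_residues}")
```

For this candidate the hit spans positions 1 to 498 of the 500-node model, so nearly the whole dehydrogenase model aligns to residues 543 to 1042 of the protein.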

This GapMind analysis is from Apr 09 2024. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
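
The 50% coverage threshold for low-confidence hits can be illustrated with the alignment coordinates from this page. This is a rough sketch only: it assumes coverage means the fraction of a sequence spanned by the alignment (1-based, inclusive coordinates), and it does not encode the high- or medium-confidence criteria. The names below are hypothetical.

```python
LOW_CONFIDENCE_MIN_COVERAGE = 0.5  # "at least 50% coverage", from the text above

def alignment_coverage(start, end, total_length):
    """Fraction of a sequence spanned by an alignment (1-based, inclusive)."""
    return (end - start + 1) / total_length

# Coordinates from the BLAST alignment on this page.
query_cov = alignment_coverage(6, 1261, 1261)    # characterized protein
subject_cov = alignment_coverage(9, 1248, 1248)  # candidate WP_028999944.1
passes_coverage = query_cov >= LOW_CONFIDENCE_MIN_COVERAGE

print(f"query {query_cov:.3f}, subject {subject_cov:.3f}, passes: {passes_coverage}")
```

Here the alignment spans over 99% of both proteins, far above the 50% minimum.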

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
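
The definition of a gap given above (a step with no high- or medium-confidence candidate) can be sketched in a few lines of Python. This is an illustration, not GapMind's implementation: the step names `stepB` and `stepC` and all confidence labels in the example are invented, and only `rocA` is a step name taken from this page.

```python
CONFIDENT = {"high", "medium"}

def find_gaps(step_candidates):
    """Return steps whose candidates include no high- or medium-confidence hit."""
    return sorted(
        step
        for step, levels in step_candidates.items()
        if not CONFIDENT.intersection(levels)
    )

# Hypothetical pathway: confidence labels are made up for illustration.
example = {
    "rocA": ["high"],
    "stepB": ["low", "low"],  # only low-confidence candidates
    "stepC": [],              # no candidates at all
}
gaps = find_gaps(example)
print(gaps)
```

A step with only low-confidence candidates still counts as a gap under this definition, as does a step with no candidates.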

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory