GapMind for catabolism of small carbon sources

 

Alignments for a candidate for putA in Acidovorax caeni R-24608

Align L-glutamate gamma-semialdehyde dehydrogenase (EC 1.2.1.88); Proline dehydrogenase (EC 1.5.5.2) (characterized)
to candidate WP_054255670.1 BN2503_RS05615 trifunctional transcriptional regulator/proline dehydrogenase/L-glutamate gamma-semialdehyde dehydrogenase

Query= reanno::Cup4G11:RR42_RS20125
         (1333 letters)



>NCBI__GCF_001298675.1:WP_054255670.1
          Length = 1335

 Score = 1492 bits (3863), Expect = 0.0
 Identities = 817/1360 (60%), Positives = 971/1360 (71%), Gaps = 57/1360 (4%)

Query: 3    TTTLGVKLDDASRERLKRVAQSIDRTPHWLIKQAIFTYLEQVERGNIPHETSAAGTGSEG 62
            +TT+G+K+DD  RER++  A ++ RTPHWLIKQA+  Y++ +ERG      +  G G   
Sbjct: 2    STTIGIKVDDTLRERIRNAAHNMGRTPHWLIKQAVLQYVDALERGATTIRLT--GLGEPP 59

Query: 63   AADGADAFDGAASDGAIQPFLEFAQSVQPQSVLRAAITAAYRRPESECVPVLLEQARLPH 122
              DGA+           QPFL FAQS+ PQ+ LRAAITAA+ RPE+EC+P LL  AR   
Sbjct: 60   QDDGAEDAPPPPPMETAQPFLVFAQSILPQTPLRAAITAAWHRPETECLPALLPLARA-- 117

Query: 123  QQAEAALAMARTLATRL----RERKVGTGREGLVQGLIQEFSLSSQEGVALMCLAEALLR 178
            Q AE +    R LATRL    R+   G+G    V  L+QEFSLSSQEGVALMCLAEALLR
Sbjct: 118  QDAEQS-GKVRELATRLVQGLRDAPAGSG----VAALVQEFSLSSQEGVALMCLAEALLR 172

Query: 179  IPDKATRDALIRDKISGANWQSHLGQSPSVFVNAATWGLLFTGKLVATHTEAGLSKALTR 238
            IPD+ATRDALIRDKIS  +W+SH+G+SPS+FVNAA WGL+ TGKL +T +E  LS AL+R
Sbjct: 173  IPDRATRDALIRDKISKGDWKSHVGRSPSLFVNAAAWGLVLTGKLTSTSSEKSLSAALSR 232

Query: 239  IIGKGGEPLIRKGVDMAMRLMGEQFVTGETISEALANARKYEAEGFRYSYDMLGEAAMTE 298
            +IGKGGEPLIR+GV  AM+LMGEQFVTG+ I+EALAN+R YE +GFRYSYDMLGEAA T+
Sbjct: 233  VIGKGGEPLIRQGVHRAMKLMGEQFVTGQNIAEALANSRTYEKQGFRYSYDMLGEAAATD 292

Query: 299  ADAQRYLASYEQAINAIGQASRGRGIYEGPGISIKLSALHPRYSRAQHERVIGELYGRLK 358
            ADAQRYL +YEQAI+AIG AS GRGI+EGPGISIKLSALHPRYSRAQ++RV+ EL  R+ 
Sbjct: 293  ADAQRYLQAYEQAIHAIGAASNGRGIFEGPGISIKLSALHPRYSRAQYDRVMAELLPRVL 352

Query: 359  SLTLLARQYDIGINIDAEEADRLEISLDLLERLCFEPELAGWNGIGFVVQGYQKRCPFVI 418
             L  LA+QYDIG+NIDAEEADRLE+SLDL+E LC  P L GW+GIGFVVQ YQKRCP VI
Sbjct: 353  RLAELAKQYDIGMNIDAEEADRLELSLDLMEALCAAPSLKGWSGIGFVVQAYQKRCPHVI 412

Query: 419  DYLIDLARRSRHRLMIRLVKGAYWDSEIKRAQVDGLEGYPVYTRKVYTDVSYVACARKLL 478
            DYL+DLARRSR RLM+RLVKGAYWDSEIKRAQ+DGL GYPVYTRKVYTDVSY+ACARKLL
Sbjct: 413  DYLVDLARRSRRRLMVRLVKGAYWDSEIKRAQLDGLAGYPVYTRKVYTDVSYLACARKLL 472

Query: 479  SVPDVIYPQFATHNAHTLAAIYQIA---GHNYYPGQYEFQCLHGMGEPLYDQVVGPLADG 535
              PD IYPQFATHNA TLA+IY +A   G +YY GQYEFQCLHGMGEPLY QV G  ADG
Sbjct: 473  EAPDAIYPQFATHNAQTLASIYHLAQSVGGSYYSGQYEFQCLHGMGEPLYAQVTGTAADG 532

Query: 536  KFNRPCRIYAPVGTHETLLAYLVRRLLENGANTSFVNRIADDTISLDELVADPVAVVEQM 595
            K  RPCRIYAPVG+HETLLAYLVRRLLENGANTSFVNRI D ++ + ELV DPV    ++
Sbjct: 533  KLARPCRIYAPVGSHETLLAYLVRRLLENGANTSFVNRIGDASVPIAELVTDPVEDALRI 592

Query: 596  HADEGALGLPHPRIAQPRTLY----GESRANSAGIDLSNEHRLASLSSALLAGTSEAVSA 651
               EG LG PHPRIA P  L+     +SR NS G++L++E +LASL++ALL  T +   A
Sbjct: 593  ANQEGRLGAPHPRIALPADLFADLGAQSRPNSHGLNLAHEQQLASLAAALLYSTRQTYLA 652

Query: 652  VPLLGTEAAAGEDVNQPAPVRNPSDQRDVVGHVTEASMAEVEAALQAAVNAAPIWQATPA 711
             P      A         P+RNP++  D VG V EA+  +V+AA + A  AAPIW  TP 
Sbjct: 653  APPGVPLPADPSRTPGWQPLRNPAELSDTVGWVYEAAAHDVQAACERAAQAAPIWAGTPP 712

Query: 712  DVRAAALERAAELMEAQMQSLMGIIVREAGKTFSNAIAEVREAVDFLRYYAAQVRETFSS 771
              RA AL+RAA+L+E + Q LMG+I+REAGKT  NA+AE+REAVDFLRYY AQV   F +
Sbjct: 713  ATRADALQRAADLLEQRSQPLMGLIMREAGKTLPNAVAEIREAVDFLRYYGAQVATQFDN 772

Query: 772  DTHRPLGPVVCISPWNFPLAIFTGQVAAALAAGNTVLAKPAEQTPLIAAQAVRLLREAGV 831
               RPLG V+ ISPWNFPLAIF GQVAAALAAGNTVLAKPAEQTPL AA  V LL EAGV
Sbjct: 773  AAQRPLGVVLAISPWNFPLAIFCGQVAAALAAGNTVLAKPAEQTPLTAAAMVALLHEAGV 832

Query: 832  PAGAVQLLPGRGETVGAALVGDARVKGVMFTGSTEVARLLQRSVAGRLDAAGRPVPLIAE 891
            P  A+QL+PG+GE+VGAALV   +V GVMFTGSTEVARL+ R ++ RL   G+ +PL+AE
Sbjct: 833  PRDALQLVPGQGESVGAALVAHPQVAGVMFTGSTEVARLIARQLSTRLSPTGQAIPLVAE 892

Query: 892  TGGQNAMIVDSSALAEQVVGDVVNSAFDSAGQRCSALRVLCLQEEVADRVLEMLKGAMDE 951
            TGGQNAM+VDSSALAEQVV DV+ SAFDSAGQRCSALR+LCLQ++VADR L ML+ A+ E
Sbjct: 893  TGGQNAMVVDSSALAEQVVADVLASAFDSAGQRCSALRLLCLQDDVADRTLTMLRDALQE 952

Query: 952  LTMGNPDRLSTDVGPVIDEEARGNIVRHIDAMRAKGRRVHQAD-PNGALSAACRNGTFVS 1010
             T+GNPDRL TDVGPVID EAR  I  HI  M   G+ V + +  +GAL     NG FV+
Sbjct: 953  WTLGNPDRLHTDVGPVIDAEARAQIEAHIARMADAGQTVTRVERTDGAL-----NGHFVA 1007

Query: 1011 PTLIELDSIEELQREVFGPVLHVVRYPRAGLDTLLAQINGTGYGLTMGIHTRIDETIEHI 1070
            P +IE+DS   L REVFGPVLHV+RYPR  LD LL  IN TGYGLT G+H+RIDETI+H+
Sbjct: 1008 PAIIEIDSTSRLTREVFGPVLHVIRYPREQLDALLDGINATGYGLTFGVHSRIDETIQHL 1067

Query: 1071 VERAEVGNLYVNRNIVGAVVGVQPFGGEGLSGTGPKAGGPLYLHRLLSVCPLDAVARVVR 1130
             ER   GNLYVNRN++GAVVGVQPFGG GLSGTGPKAGGPLYLHRL+   P +    ++ 
Sbjct: 1068 SERIHAGNLYVNRNVIGAVVGVQPFGGMGLSGTGPKAGGPLYLHRLVH-GPANTALALLP 1126

Query: 1131 ASDTVGGADETGPVRRTLTETL-ATLKEWAQRESAALPGLVAACERFAAASAAGLSVTLP 1189
             + ++        +RR    TL     E AQ  +         CE     S  G S+ LP
Sbjct: 1127 PTPSLSEHPALLLLRRLRQTTLPLPAPEQAQAHT--------TCEAALTTSRLGASLLLP 1178

Query: 1190 GPTGERNTYTLLPRAAVLCLAQQETDLAVQLAAVLAAGS-------------QAVWVESP 1236
            GPTGE N Y LLPR  V  L +    L  Q+AA LA+G+              AVW ++ 
Sbjct: 1179 GPTGESNRYRLLPRGPVWALPRTPLGLVAQVAAALASGNPCHMVLPQDDNGCAAVW-QAL 1237

Query: 1237 MARALFARLPKAVQSRVRLVADWSAGDTGFDAVLHHGDSDQLRAVCEQLATRPGPIISVQ 1296
             A A  A +     +    +AD   G     A+L  GD D L   C  +A RPGP++ V+
Sbjct: 1238 RAAAGDAGVAWLHSAEGAALAD---GAIPVAALLFEGDGDALLQACRAVAARPGPLVRVE 1294

Query: 1297 GLAHGE----PNIAIERLLIERSLSVNTAAAGGNASLMTI 1332
             L   E        +  L  E+S+S NTAAAGGNA LMT+
Sbjct: 1295 SLGSDELQAGQGYDLAALCHEQSISTNTAAAGGNAQLMTM 1334


Lambda     K      H
   0.318    0.133    0.383 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 4121
Number of extensions: 175
Number of successful extensions: 10
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1333
Length of database: 1335
Length adjustment: 49
Effective length of query: 1284
Effective length of database: 1286
Effective search space:  1651224
Effective search space used:  1651224
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 59 (27.3 bits)
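
The summary figures in the BLAST report above can be re-derived from the raw numbers it prints. The following is a minimal sketch for illustration only and is not part of GapMind; all function and variable names are hypothetical. It assumes the standard Karlin-Altschul conversion from raw score to bit score and uses the gapped Lambda and K exactly as printed (rounded), so the bit score is only approximate.

```python
import math
import re

# Values copied from the BLAST report above.
stats_line = "Identities = 817/1360 (60%), Positives = 971/1360 (71%), Gaps = 57/1360 (4%)"
query_len, subject_len, length_adjustment = 1333, 1335, 49
query_start, query_end = 3, 1332          # first and last aligned query positions
raw_score = 3863
gapped_lambda, gapped_k = 0.267, 0.0410   # rounded as printed, so results are approximate


def parse_fractions(line):
    """Return {label: (count, alignment_length)} for each 'Label = n/m' item."""
    return {m.group(1): (int(m.group(2)), int(m.group(3)))
            for m in re.finditer(r"(\w+) = (\d+)/(\d+)", line)}


fractions = parse_fractions(stats_line)
percent = {label: round(100 * n / total) for label, (n, total) in fractions.items()}

# Fraction of the characterized (query) protein spanned by the alignment.
query_coverage = (query_end - query_start + 1) / query_len

# Effective search space = effective query length * effective database length.
effective_space = (query_len - length_adjustment) * (subject_len - length_adjustment)

# Bit score from raw score: (lambda * S - ln K) / ln 2.
bit_score = (gapped_lambda * raw_score - math.log(gapped_k)) / math.log(2)

print(percent)
print(f"query coverage: {query_coverage:.3f}")
print(f"effective search space: {effective_space}")
print(f"bit score: {bit_score:.1f}")
```

The effective search space computed this way matches the 1651224 shown in the report, and the bit score lands within a bit or so of the reported 1492; the small difference comes from the rounded Lambda and K.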

Align candidate WP_054255670.1 BN2503_RS05615 (trifunctional transcriptional regulator/proline dehydrogenase/L-glutamate gamma-semialdehyde dehydrogenase)
to HMM TIGR01238 (delta-1-pyrroline-5-carboxylate dehydrogenase (EC 1.2.1.88))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR01238.hmm
# target sequence database:        /tmp/gapView.3686508.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01238  [M=500]
Accession:   TIGR01238
Description: D1pyr5carbox3: delta-1-pyrroline-5-carboxylate dehydrogenase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
   3.5e-218  711.0   0.7   7.1e-218  710.0   0.7    1.5  1  NCBI__GCF_001298675.1:WP_054255670.1  


Domain annotation for each sequence (and alignments):
>> NCBI__GCF_001298675.1:WP_054255670.1  
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  710.0   0.7  7.1e-218  7.1e-218       2     497 ..     616    1115 ..     615    1118 .. 0.97

  Alignments for each domain:
  == domain 1  score: 710.0 bits;  conditional E-value: 7.1e-218
                             TIGR01238    2 lygegrknslGvdlaneselksleeqllkaaakkfqaapi...vgekakaegeaqpvknpadrkdivGqvs 69  
                                            l +++r ns G++la e++l+sl + ll +  +++ aap    +  ++      qp++npa+  d vG v 
  NCBI__GCF_001298675.1:WP_054255670.1  616 LGAQSRPNSHGLNLAHEQQLASLAAALLYSTRQTYLAAPPgvpLPADPSRTPGWQPLRNPAELSDTVGWVY 686 
                                            66899*******************************998421233666778889***************** PP

                             TIGR01238   70 eadaaevqeavdsavaafaewsatdakeraailerladlleshmpelvallvreaGktlsnaiaevreavd 140 
                                            ea a +vq+a + a +a+++w  t+++ ra  l+r+adlle++   l++l++reaGktl na+ae+reavd
  NCBI__GCF_001298675.1:WP_054255670.1  687 EAAAHDVQAACERAAQAAPIWAGTPPATRADALQRAADLLEQRSQPLMGLIMREAGKTLPNAVAEIREAVD 757 
                                            *********************************************************************** PP

                             TIGR01238  141 flryyakqvedvldeesakalGavvcispwnfplaiftGqiaaalaaGntviakpaeqtsliaaravellq 211 
                                            flryy+ qv  ++d+ ++++lG+v++ispwnfplaif Gq+aaalaaGntv+akpaeqt+l aa  v+ll+
  NCBI__GCF_001298675.1:WP_054255670.1  758 FLRYYGAQVATQFDNAAQRPLGVVLAISPWNFPLAIFCGQVAAALAAGNTVLAKPAEQTPLTAAAMVALLH 828 
                                            *********************************************************************** PP

                             TIGR01238  212 eaGvpagviqllpGrGedvGaaltsderiaGviftGstevarlinkalakredap...vpliaetGGqnam 279 
                                            eaGvp  ++ql+pG+Ge+vGaal +++++aGv+ftGstevarli ++l+ r  +    +pl+aetGGqnam
  NCBI__GCF_001298675.1:WP_054255670.1  829 EAGVPRDALQLVPGQGESVGAALVAHPQVAGVMFTGSTEVARLIARQLSTRLSPTgqaIPLVAETGGQNAM 899 
                                            ****************************************************9887889************ PP

                             TIGR01238  280 ivdstalaeqvvadvlasafdsaGqrcsalrvlcvqedvadrvltlikGamdelkvgkpirlttdvGpvid 350 
                                            +vds+alaeqvvadvlasafdsaGqrcsalr+lc+q+dvadr+lt+++ a++e  +g+p rl+tdvGpvid
  NCBI__GCF_001298675.1:WP_054255670.1  900 VVDSSALAEQVVADVLASAFDSAGQRCSALRLLCLQDDVADRTLTMLRDALQEWTLGNPDRLHTDVGPVID 970 
                                            *********************************************************************** PP

                             TIGR01238  351 aeakqnllahiekmkakakkvaqvkleddvesekgtfvaptlfelddldelkkevfGpvlhvvrykadeld 421 
                                            aea+ +++ahi +m + +++v++v + d   + +g fvap ++e+d+ + l +evfGpvlhv+ry +++ld
  NCBI__GCF_001298675.1:WP_054255670.1  971 AEARAQIEAHIARMADAGQTVTRVERTD--GALNGHFVAPAIIEIDSTSRLTREVFGPVLHVIRYPREQLD 1039
                                            **************************99..8899************************************* PP

                             TIGR01238  422 kvvdkinakGygltlGvhsrieetvrqiekrakvGnvyvnrnlvGavvGvqpfGGeGlsGtGpkaGGplyl 492 
                                             ++d ina+Gyglt+Gvhsri+et++++ +r+++Gn+yvnrn++GavvGvqpfGG GlsGtGpkaGGplyl
  NCBI__GCF_001298675.1:WP_054255670.1 1040 ALLDGINATGYGLTFGVHSRIDETIQHLSERIHAGNLYVNRNVIGAVVGVQPFGGMGLSGTGPKAGGPLYL 1110
                                            *********************************************************************** PP

                             TIGR01238  493 yrltr 497 
                                            +rl++
  NCBI__GCF_001298675.1:WP_054255670.1 1111 HRLVH 1115
                                            **985 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (500 nodes)
Target sequences:                          1  (1335 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.01u 0.00s 00:00:00.01 Elapsed: 00:00:00.00
# Mc/sec: 77.57
//
[ok]
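
How much of the TIGR01238 model the domain hit covers can be read off the domain table above. The sketch below is an illustration under stated assumptions, not GapMind's own code: it splits the single domain row on whitespace and relies on the column order shown in the hmmsearch header line; the variable names are hypothetical.

```python
# Domain row and model length copied from the hmmsearch report above.
domain_line = ("   1 !  710.0   0.7  7.1e-218  7.1e-218       2     497 .."
               "     616    1115 ..     615    1118 .. 0.97")
model_length = 500      # from "Query: TIGR01238 [M=500]"
protein_length = 1335   # from "Target sequences: 1 (1335 residues searched)"

fields = domain_line.split()
score = float(fields[2])
hmm_from, hmm_to = int(fields[6]), int(fields[7])
ali_from, ali_to = int(fields[9]), int(fields[10])

# Fraction of the HMM's match states covered by the aligned region.
hmm_coverage = (hmm_to - hmm_from + 1) / model_length

# Number of residues of the candidate protein in the aligned region,
# and the share of the whole protein that they represent.
aligned_residues = ali_to - ali_from + 1
protein_fraction = aligned_residues / protein_length

print(f"score: {score}")
print(f"HMM coverage: {hmm_coverage:.3f}")
print(f"aligned residues: {aligned_residues} ({protein_fraction:.1%} of the protein)")
```

The hit covers nearly the whole model but only part of the 1335-residue candidate, which is consistent with the candidate being a multi-domain protein in which the dehydrogenase domain occupies the region around residues 616 to 1115.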

This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
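
As a small illustration of the 50% coverage floor stated above, the sketch below checks whether an alignment spans at least half of a protein. It is hypothetical and encodes only that one threshold; it assumes coverage is measured as aligned positions over the length of the characterized protein, and it does not reproduce the high- or medium-confidence rules.

```python
LOW_CONFIDENCE_MIN_COVERAGE = 0.5  # "at least 50% coverage"


def coverage(first_aligned, last_aligned, protein_length):
    """Fraction of a protein spanned by an alignment (1-based, inclusive)."""
    return (last_aligned - first_aligned + 1) / protein_length


def meets_low_confidence_floor(first_aligned, last_aligned, protein_length):
    """True if the hit spans at least half of the protein."""
    return coverage(first_aligned, last_aligned, protein_length) >= LOW_CONFIDENCE_MIN_COVERAGE


# The putA alignment on this page spans query positions 3 to 1332 of 1333.
print(meets_low_confidence_floor(3, 1332, 1333))
# A hypothetical hit covering only positions 1 to 400 of the same protein.
print(meets_low_confidence_floor(1, 400, 1333))
```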

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory