GapMind for catabolism of small carbon sources

 

Alignments for a candidate for rocA in Sinorhizobium fredii NGR234

Align L-glutamate gamma-semialdehyde dehydrogenase (EC 1.2.1.88); Proline dehydrogenase (EC 1.5.5.2) (characterized)
to candidate WP_012706686.1 NGR_RS11860 trifunctional transcriptional regulator/proline dehydrogenase/L-glutamate gamma-semialdehyde dehydrogenase

Query= reanno::azobra:AZOBR_RS23695
         (1235 letters)



>NCBI__GCF_000018545.1:WP_012706686.1
          Length = 1235

 Score = 1656 bits (4288), Expect = 0.0
 Identities = 869/1232 (70%), Positives = 974/1232 (79%), Gaps = 9/1232 (0%)

Query: 5    TAPPSAAPGEAAPFADFAPPIRPATELRAAITAAYRRPEPECLPFLFEQASLPPGVITAA 64
            T    AAP   APFADFAPPIR  T LR AITAAYRRPE ECLP L E A+LP G   AA
Sbjct: 10   TISTDAAP---APFADFAPPIRTQTTLRQAITAAYRRPETECLPPLVEAATLPRGTRDAA 66

Query: 65   AATARKLITALRAKPRGRGVEGLIHEYSLSSQEGMALMCLAEALLRIPDHATRDALIRDK 124
            A TARKL+ ALRAK +G GVEGL+ EYSLSSQEG+ALMCLAEALLRIPD ATRDALIRDK
Sbjct: 67   AGTARKLVEALRAKHKGSGVEGLVQEYSLSSQEGVALMCLAEALLRIPDTATRDALIRDK 126

Query: 125  IAGGDWQAHLGKGGSMFVNAATWGLLITGKLTSAGGEQALSSALTRLIARGGEPLIRRGV 184
            I+ G+W++HLG G S+FVNAATWGL++TGKLTS   +++L++ALTRLI+R GEP+IRRGV
Sbjct: 127  ISDGNWKSHLGGGRSLFVNAATWGLVVTGKLTSTVNDRSLAAALTRLISRCGEPVIRRGV 186

Query: 185  DFAMRMMGEQFVTGQTIQEALTNARTMEAEGFRYSYDMLGEAALTAEDAARYYADYVNAI 244
            D AMRMMGEQFVTG+TI EAL  AR +E +GFRYSYDMLGEAA TA DA RYY DY  AI
Sbjct: 187  DMAMRMMGEQFVTGETIDEALRRARALEQKGFRYSYDMLGEAATTAADAERYYKDYEAAI 246

Query: 245  HAIGTASAGRGVYEGPGISIKLSAIHPRYSRAQADRVMDELLPRVKALALLAKGYDIGLN 304
            HAIG ASAGRG+YEGPGISIKLSA+HPRY+R QA RVM ELLP+VKALALLAK YDIG N
Sbjct: 247  HAIGKASAGRGIYEGPGISIKLSALHPRYARTQAARVMGELLPKVKALALLAKTYDIGFN 306

Query: 305  IDAEEADRLELSLDLMESLCFDPDLAGWNGIGFVVQAYGKRCPYVIDFLIDLARRSGHRL 364
            IDAEEADRLELSLDL+E LC D DLA WNG+GFVVQAYGKRCP+V+DF+IDLARR+G R+
Sbjct: 307  IDAEEADRLELSLDLLEELCLDSDLADWNGMGFVVQAYGKRCPFVLDFIIDLARRAGRRI 366

Query: 365  MIRLVKGAYWDSEIKRAQLDGLPDFPVYTRKVYTDVSYVACARKLLAAPEAVFPQFATHN 424
            M+RLVKGAYWD+EIKRAQLDGL DFPV+TRK++TDVSY+ACARKLL+A +AVFPQFATHN
Sbjct: 367  MVRLVKGAYWDAEIKRAQLDGLEDFPVFTRKIHTDVSYIACARKLLSATDAVFPQFATHN 426

Query: 425  AQTLATIYEMAGSDFQVGKYEFQCLHGMGEPLYKEVVG--PLKRPCRIYAPVGTHETLLA 482
            AQTLATI+ MAG DF VGKYEFQCLHGMGEPLY+EVVG   L RPCRIYAPVGTHETLLA
Sbjct: 427  AQTLATIHHMAGKDFHVGKYEFQCLHGMGEPLYEEVVGRENLGRPCRIYAPVGTHETLLA 486

Query: 483  YLVRRLLENGANSSFVNRIADPAVPVDELVADPVAVARAIAPTGAPHALIALPRNLYAPE 542
            YLVRRLLENGANSSFV+RIADP V +D L+ADPV + RA+   GA H  IALP  L+   
Sbjct: 487  YLVRRLLENGANSSFVHRIADPGVSIDALIADPVEIVRAMPVVGAKHEKIALPAELFGAA 546

Query: 543  RANSAGIDLSDETELARLSAALSASAEMTWTAAPLLADGERAGQAQPVRNPADRRDVVGS 602
            R NSAG+D+S+E  LA L+  L ASA + WTAAP LA G  +G+ +PV NP D RD VGS
Sbjct: 547  RPNSAGLDISNEATLAALTETLKASAAIGWTAAPQLATGAASGETRPVVNPGDHRDRVGS 606

Query: 603  VTEASEALVAEAFGHAVAAASAWAATPPEERAASLFRAADTMQERMPTLLGLIVREAGKS 662
            VTE SE     A   A  AA +WAA  P ERAA L RAAD MQ RMPTLLGLIVREAGKS
Sbjct: 607  VTETSEEDAKRAVRLAAEAAPSWAAVSPAERAACLDRAADLMQARMPTLLGLIVREAGKS 666

Query: 663  LPNAIAEVREAIDFLRYYGAQVRDRFDNATHRPLGPVVCISPWNFPLAIFSGQIAAALAA 722
            L NAIAEVREAIDFLRYY  Q R     A HRPLGPV+CISPWNFPLAIF+GQIAAAL A
Sbjct: 667  LLNAIAEVREAIDFLRYYAEQTRRTLGQA-HRPLGPVICISPWNFPLAIFTGQIAAALVA 725

Query: 723  GNPVLAKPAEETPLIAAEAVRILHAAGIPAGALQLLPGAGEVGAALVGHEAVRGVMFTGS 782
            GNPVLAKPAEETPLIAAE VRILH AG+PA ALQLLPG G VGAALV      GVMFTGS
Sbjct: 726  GNPVLAKPAEETPLIAAEGVRILHEAGVPASALQLLPGDGRVGAALVAAPQTAGVMFTGS 785

Query: 783  TEVARLIQRQLAGRLLPDGAPIPLIAETGGQNAMIVDSSALAEQVVGDVIASAFDSAGQR 842
            TEVARLIQ QLA RL P G PIPLIAETGGQNAMIVDSSALAEQVVGDVIASAFDSAGQR
Sbjct: 786  TEVARLIQAQLADRLSPAGRPIPLIAETGGQNAMIVDSSALAEQVVGDVIASAFDSAGQR 845

Query: 843  CSALRILCLQEDVADRTLAMLKGAMRELRIGNPDRLAVDVGPVISEEARATIAAHIEAMR 902
            CSALR+LCLQED+ADRTLAMLKGA+ EL IG  DRL+VDVGPVIS EA+  I  HIE MR
Sbjct: 846  CSALRVLCLQEDIADRTLAMLKGALHELNIGRTDRLSVDVGPVISAEAKDIIETHIERMR 905

Query: 903  AKGRNVEFLPLPAETADGTFIAPTVIEIGGIHELEREVFGPVLHVVRFHRDDLDALVDSI 962
              GR VE + L AET  GTF+ PT+IE+  + +L+REVFGPVLHV+R+ RD+LD L+D I
Sbjct: 906  GLGRKVEQIGLAAETEAGTFVPPTIIELEKLADLQREVFGPVLHVIRYRRDNLDRLIDDI 965

Query: 963  NATGYGLTFGLHTRIDATIERVTGRIGAGNVYVNRNTIGAVVGVQPFGGHGLSGTGPKAG 1022
            NATGYGLTFGLHTR+D TI  VT RI AGN+YVNRN IGAVVGVQPFGG GLSGTGPKAG
Sbjct: 966  NATGYGLTFGLHTRLDETIAHVTSRIKAGNLYVNRNIIGAVVGVQPFGGRGLSGTGPKAG 1025

Query: 1023 GPLYLSRLLSRRPKGWLEFRGPDAARAAGLAYGEWLRAKGFTAEASRCAGYVARSAIGGG 1082
            GPLYL RL++  P                L + +WL  KG TAEA       + SA+G  
Sbjct: 1026 GPLYLGRLVTPAPVP--PQHSSVHIDPTLLDFAKWLDGKGATAEAETARNAGSSSALGLD 1083

Query: 1083 AELNGPVGERNLYELHGRGRVLLLPQTRTGLLLQLGAVLATGNSAAVDAPPDLAELLRGL 1142
             EL GPVGERNLY LH RGRVLL+P T +GL  QL A LATGNS  VDA   L   L+ L
Sbjct: 1084 LELPGPVGERNLYALHPRGRVLLVPATESGLYHQLAAALATGNSVVVDAACGLQASLKSL 1143

Query: 1143 PPALAARVRTTADWRDVGPLAAVLVEGDRERVTAINRRVADLPGPILLVQAATAEALAAG 1202
            P ++A RV  + DW   GP A  LVEGD ER+  +N+ +A LPGP++LVQAA+ E +A  
Sbjct: 1144 PHSVALRVSWSKDWAADGPFAGALVEGDAERIRQVNKAIAALPGPLVLVQAASGEEIAR- 1202

Query: 1203 RGEGYDLDLLLNERSVSVNTAAAGGNASLVAM 1234
              + Y L+ L+ E S S+NTAAAGGNASL+ +
Sbjct: 1203 NPDAYCLNWLVEEVSTSINTAAAGGNASLMTI 1234


Lambda     K      H
   0.319    0.136    0.396 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 3764
Number of extensions: 168
Number of successful extensions: 5
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1235
Length of database: 1235
Length adjustment: 48
Effective length of query: 1187
Effective length of database: 1187
Effective search space:  1408969
Effective search space used:  1408969
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 59 (27.3 bits)
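
The summary statistics above can be cross-checked against the alignment header. The following is a minimal sketch, assuming the standard Karlin-Altschul conversion from raw score to bit score and using the gapped Lambda and K values reported above. The function names are illustrative and are not part of BLAST or GapMind.

```python
import math

def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to bits: (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)

def effective_search_space(query_len, db_len, length_adjustment):
    """Product of the effective query and database lengths."""
    return (query_len - length_adjustment) * (db_len - length_adjustment)

def percent(numerator, denominator):
    """Express a ratio as a percentage."""
    return 100.0 * numerator / denominator

# Values copied from the BLAST report above.
bits = bit_score(4288, lam=0.267, k=0.0410)      # raw score 4288, gapped Lambda and K
space = effective_search_space(1235, 1235, 48)   # lengths and length adjustment
identity = percent(869, 1232)                    # Identities = 869/1232
query_coverage = percent(1234 - 5 + 1, 1235)     # query residues 5..1234 of 1235

print(f"bit score ~ {bits:.0f}")
print(f"effective search space = {space}")
print(f"identity = {identity:.1f}%, query coverage = {query_coverage:.1f}%")
```

Under these assumptions the computed values agree with the reported score of 1656 bits and the effective search space of 1408969.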

Align candidate WP_012706686.1 NGR_RS11860 (trifunctional transcriptional regulator/proline dehydrogenase/L-glutamate gamma-semialdehyde dehydrogenase)
to HMM TIGR01238 (delta-1-pyrroline-5-carboxylate dehydrogenase (EC 1.2.1.88))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR01238.hmm
# target sequence database:        /tmp/gapView.1596480.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01238  [M=500]
Accession:   TIGR01238
Description: D1pyr5carbox3: delta-1-pyrroline-5-carboxylate dehydrogenase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
   1.5e-230  751.8   2.7   1.6e-230  751.7   0.7    1.9  2  NCBI__GCF_000018545.1:WP_012706686.1  


Domain annotation for each sequence (and alignments):
>> NCBI__GCF_000018545.1:WP_012706686.1  
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  751.7   0.7  1.6e-230  1.6e-230       1     496 [.     541    1034 ..     541    1038 .. 0.99
   2 ?   -0.9   0.1     0.023     0.023     180     196 ..    1117    1133 ..    1113    1146 .. 0.82

  Alignments for each domain:
  == domain 1  score: 751.7 bits;  conditional E-value: 1.6e-230
                             TIGR01238    1 dlygegrknslGvdlaneselksleeqllkaaakkfqaapivgekakaegeaqpvknpadrkdivGqvsea 71  
                                            +l+g +r ns+G+d++ne +l+ l e l+++aa  + aap +     a+ge++pv+np d++d vG v+e+
  NCBI__GCF_000018545.1:WP_012706686.1  541 ELFGAARPNSAGLDISNEATLAALTETLKASAAIGWTAAPQL-ATGAASGETRPVVNPGDHRDRVGSVTET 610 
                                            79****************************************.67789*********************** PP

                             TIGR01238   72 daaevqeavdsavaafaewsatdakeraailerladlleshmpelvallvreaGktlsnaiaevreavdfl 142 
                                            +++++++av  a +a++ w a+ ++eraa+l+r+adl++ +mp+l++l+vreaGk+l naiaevrea+dfl
  NCBI__GCF_000018545.1:WP_012706686.1  611 SEEDAKRAVRLAAEAAPSWAAVSPAERAACLDRAADLMQARMPTLLGLIVREAGKSLLNAIAEVREAIDFL 681 
                                            *********************************************************************** PP

                             TIGR01238  143 ryyakqvedvldeesakalGavvcispwnfplaiftGqiaaalaaGntviakpaeqtsliaaravellqea 213 
                                            ryya+q + +l++  +++lG+v+cispwnfplaiftGqiaaal+aGn v+akpae+t+liaa++v +l+ea
  NCBI__GCF_000018545.1:WP_012706686.1  682 RYYAEQTRRTLGQA-HRPLGPVICISPWNFPLAIFTGQIAAALVAGNPVLAKPAEETPLIAAEGVRILHEA 751 
                                            ************99.******************************************************** PP

                             TIGR01238  214 GvpagviqllpGrGedvGaaltsderiaGviftGstevarlinkalakredap...vpliaetGGqnamiv 281 
                                            Gvpa+++qllpG G  vGaal + ++ aGv+ftGstevarli+ +la+r  +    +pliaetGGqnamiv
  NCBI__GCF_000018545.1:WP_012706686.1  752 GVPASALQLLPGDGR-VGAALVAAPQTAGVMFTGSTEVARLIQAQLADRLSPAgrpIPLIAETGGQNAMIV 821 
                                            **************9.*********************************9986666*************** PP

                             TIGR01238  282 dstalaeqvvadvlasafdsaGqrcsalrvlcvqedvadrvltlikGamdelkvgkpirlttdvGpvidae 352 
                                            ds+alaeqvv dv+asafdsaGqrcsalrvlc+qed+adr+l ++kGa++el++g+  rl  dvGpvi ae
  NCBI__GCF_000018545.1:WP_012706686.1  822 DSSALAEQVVGDVIASAFDSAGQRCSALRVLCLQEDIADRTLAMLKGALHELNIGRTDRLSVDVGPVISAE 892 
                                            *********************************************************************** PP

                             TIGR01238  353 akqnllahiekmkakakkvaqvkleddvesekgtfvaptlfelddldelkkevfGpvlhvvrykadeldkv 423 
                                            ak+ +++hie+m++ ++kv q+ l++  e+e gtfv+pt++el++l++l++evfGpvlhv+ry++d+ld++
  NCBI__GCF_000018545.1:WP_012706686.1  893 AKDIIETHIERMRGLGRKVEQIGLAA--ETEAGTFVPPTIIELEKLADLQREVFGPVLHVIRYRRDNLDRL 961 
                                            *************************9..999**************************************** PP

                             TIGR01238  424 vdkinakGygltlGvhsrieetvrqiekrakvGnvyvnrnlvGavvGvqpfGGeGlsGtGpkaGGplylyr 494 
                                            +d ina+Gyglt+G+h+r +et++++++r+k+Gn+yvnrn++GavvGvqpfGG+GlsGtGpkaGGplyl r
  NCBI__GCF_000018545.1:WP_012706686.1  962 IDDINATGYGLTFGLHTRLDETIAHVTSRIKAGNLYVNRNIIGAVVGVQPFGGRGLSGTGPKAGGPLYLGR 1032
                                            **********************************************************************9 PP

                             TIGR01238  495 lt 496 
                                            l+
  NCBI__GCF_000018545.1:WP_012706686.1 1033 LV 1034
                                            97 PP

  == domain 2  score: -0.9 bits;  conditional E-value: 0.023
                             TIGR01238  180 qiaaalaaGntviakpa 196 
                                            q+aaala+Gn+v+   a
  NCBI__GCF_000018545.1:WP_012706686.1 1117 QLAAALATGNSVVVDAA 1133
                                            89**********98766 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (500 nodes)
Target sequences:                          1  (1235 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.01u 0.00s 00:00:00.01 Elapsed: 00:00:00.00
# Mc/sec: 63.08
//
[ok]
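
The fraction of the HMM covered by the top-scoring domain can be read off the domain table above. This is a minimal sketch, assuming the whitespace-separated column layout shown in this report (with "hmmfrom" and "hmm to" as the 7th and 8th fields). `hmm_coverage` is a hypothetical helper, not a function of HMMER or GapMind.

```python
def hmm_coverage(domain_line, model_length):
    """Fraction of the HMM's match states spanned by one domain hit."""
    fields = domain_line.split()
    # Fields 6 and 7 (0-based) hold the "hmmfrom" and "hmm to" coordinates.
    hmm_from, hmm_to = int(fields[6]), int(fields[7])
    return (hmm_to - hmm_from + 1) / model_length

# Domain 1 row copied from the table above; M=500 comes from the Query line.
line = "   1 !  751.7   0.7  1.6e-230  1.6e-230       1     496 [.     541    1034 ..     541    1038 .. 0.99"
coverage = hmm_coverage(line, model_length=500)
print(f"HMM coverage: {coverage:.1%}")
```

For domain 1, positions 1 to 496 of the 500-node model are aligned, so nearly the whole HMM is covered.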

This GapMind analysis is from Apr 09 2024. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory