GapMind for catabolism of small carbon sources

 

Alignments for a candidate for putA in Sinorhizobium medicae WSM419

Align L-glutamate gamma-semialdehyde dehydrogenase (EC 1.2.1.88); Proline dehydrogenase (EC 1.5.5.2) (characterized)
to candidate YP_001325787.1 Smed_0092 bifunctional proline dehydrogenase/pyrroline-5-carboxylate dehydrogenase

Query= reanno::azobra:AZOBR_RS23695
         (1235 letters)



>NCBI__GCF_000017145.1:YP_001325787.1
          Length = 1233

 Score = 1620 bits (4194), Expect = 0.0
 Identities = 853/1227 (69%), Positives = 961/1227 (78%), Gaps = 9/1227 (0%)

Query: 10   AAPGEAAPFADFAPPIRPATELRAAITAAYRRPEPECLPFLFEQASLPPGVITAAAATAR 69
            AAP   APFADFAPP++P + LR AITAAYRRPE ECLP L E A+       AA+ TAR
Sbjct: 13   AAP---APFADFAPPVQPQSTLRQAITAAYRRPETECLPPLVEAATQSKETREAASRTAR 69

Query: 70   KLITALRAKPRGRGVEGLIHEYSLSSQEGMALMCLAEALLRIPDHATRDALIRDKIAGGD 129
            KLI ALR K  G GVEGL+ EYSLSSQEG+ALMCLAEALLRIPD ATRDALIRDKIA G+
Sbjct: 70   KLIEALRGKHSGSGVEGLVQEYSLSSQEGVALMCLAEALLRIPDTATRDALIRDKIADGN 129

Query: 130  WQAHLGKGGSMFVNAATWGLLITGKLTSAGGEQALSSALTRLIARGGEPLIRRGVDFAMR 189
            W++HLG   S+FVNAATWGL++TGKLTS   +++LS+ALTRLI+R GEP+IRRGVD AMR
Sbjct: 130  WKSHLGGSRSLFVNAATWGLVVTGKLTSTVNDRSLSAALTRLISRCGEPVIRRGVDMAMR 189

Query: 190  MMGEQFVTGQTIQEALTNARTMEAEGFRYSYDMLGEAALTAEDAARYYADYVNAIHAIGT 249
            MMGEQFVTG+TI+EAL  ++ +E +GFRYSYDMLGEAA TA DA RYY DY +AIHAIG 
Sbjct: 190  MMGEQFVTGETIEEALKRSKELEGKGFRYSYDMLGEAATTAADAERYYRDYESAIHAIGK 249

Query: 250  ASAGRGVYEGPGISIKLSAIHPRYSRAQADRVMDELLPRVKALALLAKGYDIGLNIDAEE 309
            ASAGRG+YEGPGISIKLSA+HPRY+RAQA RVM ELLP+VKALALLA+ YDIGLNIDAEE
Sbjct: 250  ASAGRGIYEGPGISIKLSALHPRYTRAQAVRVMGELLPKVKALALLARKYDIGLNIDAEE 309

Query: 310  ADRLELSLDLMESLCFDPDLAGWNGIGFVVQAYGKRCPYVIDFLIDLARRSGHRLMIRLV 369
            ADRLELSLDL+E LC D +L+GWNG+GFVVQAYGKRCP+V+DF+IDLARRSG R+M+RLV
Sbjct: 310  ADRLELSLDLLEELCLDAELSGWNGMGFVVQAYGKRCPFVLDFVIDLARRSGRRIMVRLV 369

Query: 370  KGAYWDSEIKRAQLDGLPDFPVYTRKVYTDVSYVACARKLLAAPEAVFPQFATHNAQTLA 429
            KGAYWD+EIKRAQLDGL DFPV+TRK++TDVSY+ACA KLLAA + +FPQFATHNAQTLA
Sbjct: 370  KGAYWDAEIKRAQLDGLEDFPVFTRKIHTDVSYLACAAKLLAATDVIFPQFATHNAQTLA 429

Query: 430  TIYEMAGSDFQVGKYEFQCLHGMGEPLYKEVV--GPLKRPCRIYAPVGTHETLLAYLVRR 487
             IY MAG DFQVGKYEFQCLHGMGEPLY+EVV  G L RPCRIYAPVGTHETLLAYLVRR
Sbjct: 430  AIYHMAGKDFQVGKYEFQCLHGMGEPLYEEVVGLGNLDRPCRIYAPVGTHETLLAYLVRR 489

Query: 488  LLENGANSSFVNRIADPAVPVDELVADPVAVARAIAPTGAPHALIALPRNLYAPERANSA 547
            LLENGANSSFV+RI DP V +DEL+ADPV + R +   GA H  IALP  L+   R NSA
Sbjct: 490  LLENGANSSFVHRINDPKVSIDELIADPVEIVRGMPVVGARHDKIALPAELFGDSRTNSA 549

Query: 548  GIDLSDETELARLSAALSASAEMTWTAAPLLADGERAGQAQPVRNPADRRDVVGSVTEAS 607
            G+DLS+E  LA L+ AL ASA   WT  P LA G  AG+  PV NP D RDVVGSVTE S
Sbjct: 550  GLDLSNEETLASLTDALKASAANGWTCVPRLATGPVAGETLPVLNPGDHRDVVGSVTETS 609

Query: 608  EALVAEAFGHAVAAASAWAATPPEERAASLFRAADTMQERMPTLLGLIVREAGKSLPNAI 667
            E     A   A  AA+ WAA  P ERAA L RAA  MQ RMPTLLGLI+REAGKS  NAI
Sbjct: 610  EEDARRAARLAAEAAADWAAVSPSERAACLDRAAGLMQARMPTLLGLIIREAGKSALNAI 669

Query: 668  AEVREAIDFLRYYGAQVRDRFDNATHRPLGPVVCISPWNFPLAIFSGQIAAALAAGNPVL 727
            AEVREAIDFLRYY  Q R R     HRPLGPVVCISPWNFPLAIF+GQIAAAL AGNPVL
Sbjct: 670  AEVREAIDFLRYYAEQTR-RTLGPGHRPLGPVVCISPWNFPLAIFTGQIAAALVAGNPVL 728

Query: 728  AKPAEETPLIAAEAVRILHAAGIPAGALQLLPGAGEVGAALVGHEAVRGVMFTGSTEVAR 787
            AKPAEETPLIAAE VRIL  AGIPA ALQLLPG G +GAALV      GVMFTGSTEVAR
Sbjct: 729  AKPAEETPLIAAEGVRILQEAGIPARALQLLPGDGRIGAALVAAPETAGVMFTGSTEVAR 788

Query: 788  LIQRQLAGRLLPDGAPIPLIAETGGQNAMIVDSSALAEQVVGDVIASAFDSAGQRCSALR 847
            LIQ QLA RL   G PIPLIAETGGQNAMIVDSSALAEQVVGDVIASAFDSAGQRCSALR
Sbjct: 789  LIQAQLADRLSLAGRPIPLIAETGGQNAMIVDSSALAEQVVGDVIASAFDSAGQRCSALR 848

Query: 848  ILCLQEDVADRTLAMLKGAMRELRIGNPDRLAVDVGPVISEEARATIAAHIEAMRAKGRN 907
            +LCLQEDVA+R L MLKGA+ EL+IG  DRL+VDVGPVI+ EA+  I  HI  MR  GR 
Sbjct: 849  VLCLQEDVAERILTMLKGALHELQIGRTDRLSVDVGPVITSEAKENIEKHIARMRELGRK 908

Query: 908  VEFLPLPAETADGTFIAPTVIEIGGIHELEREVFGPVLHVVRFHRDDLDALVDSINATGY 967
            VE + L +ET  GTF+ PT+IE+  + +LEREVFGPVLHV+R+ RD LD LVD INATGY
Sbjct: 909  VEQIGLASETDQGTFVPPTIIELEKLSDLEREVFGPVLHVIRYRRDGLDRLVDDINATGY 968

Query: 968  GLTFGLHTRIDATIERVTGRIGAGNVYVNRNTIGAVVGVQPFGGHGLSGTGPKAGGPLYL 1027
            GLTFGLHTR+D TI  VT RI AGN+Y+NRN IGAVVGVQPFGG GLSGTGPKAGGPLYL
Sbjct: 969  GLTFGLHTRLDETIAHVTSRIKAGNLYINRNIIGAVVGVQPFGGRGLSGTGPKAGGPLYL 1028

Query: 1028 SRLLSRRPKGWLEFRGPDAARAAGLAYGEWLRAKGFTAEASRCAGYVARSAIGGGAELNG 1087
             RL++  P                L + +WL  KG  AEA       + SA+G   EL G
Sbjct: 1029 GRLVATAPVP--PQHSSVHTDPVLLDFTKWLDGKGACAEAEAARNAGSSSALGLDLELPG 1086

Query: 1088 PVGERNLYELHGRGRVLLLPQTRTGLLLQLGAVLATGNSAAVDAPPDLAELLRGLPPALA 1147
            PVGERNLY LH RGR+LL+P T +GL  QL A LATGNS  +DA   L   L+ LP  + 
Sbjct: 1087 PVGERNLYTLHARGRILLVPATASGLYHQLAAALATGNSVVIDAASGLQPSLKDLPQTVG 1146

Query: 1148 ARVRTTADWRDVGPLAAVLVEGDRERVTAINRRVADLPGPILLVQAATAEALAAGRGEGY 1207
             RV  + DW   GP A  LVEGD ER+ A+N+ +A LPGP+LLVQAA++E +A    + Y
Sbjct: 1147 LRVSWSKDWAADGPFAGALVEGDGERIRAVNKAIAALPGPLLLVQAASSEEIAR-NPDAY 1205

Query: 1208 DLDLLLNERSVSVNTAAAGGNASLVAM 1234
             L+ L+ E S S+NTAAAGGNASL+A+
Sbjct: 1206 CLNWLVEEVSASINTAAAGGNASLMAI 1232


Lambda     K      H
   0.319    0.136    0.396 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 3769
Number of extensions: 166
Number of successful extensions: 5
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1235
Length of database: 1233
Length adjustment: 48
Effective length of query: 1187
Effective length of database: 1185
Effective search space:  1406595
Effective search space used:  1406595
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 59 (27.3 bits)
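
The summary figures in the BLAST report above can be re-derived from the raw values it prints. The Python sketch below is illustrative only: it assumes the standard Karlin-Altschul conversion from raw score to bit score (using the gapped Lambda and K), and it assumes the reported percentages are truncated rather than rounded, which is what "853/1227 (69%)" suggests. All variable names are ours.

```python
import math

# Values copied from the BLAST report above.
query_len, subject_len = 1235, 1233
length_adjustment = 48
raw_score = 4194
gapped_lambda, gapped_k = 0.267, 0.0410
identities, positives, gaps, aln_len = 853, 961, 9, 1227

# Effective lengths: each sequence is shortened by the length adjustment.
eff_query = query_len - length_adjustment
eff_subject = subject_len - length_adjustment
search_space = eff_query * eff_subject

# Karlin-Altschul conversion from raw score to bits (gapped parameters).
bits = (gapped_lambda * raw_score - math.log(gapped_k)) / math.log(2)

# Integer percentages, truncated (assumption: this matches the report).
pct_identity = 100 * identities // aln_len
pct_positive = 100 * positives // aln_len
pct_gaps = 100 * gaps // aln_len

# Fraction of the query covered by the alignment (query residues 10..1234).
query_coverage = (1234 - 10 + 1) / query_len

print(eff_query, eff_subject, search_space)
print(round(bits), pct_identity, pct_positive, pct_gaps)
print(f"{query_coverage:.3f}")
```

With the numbers above, this reproduces the effective lengths (1187 and 1185), the effective search space (1406595), a bit score of about 1620, and the 69% / 78% / 0% figures. The alignment covers roughly 99% of the query.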

Align candidate YP_001325787.1 Smed_0092 (bifunctional proline dehydrogenase/pyrroline-5-carboxylate dehydrogenase)
to HMM TIGR01238 (delta-1-pyrroline-5-carboxylate dehydrogenase (EC 1.2.1.88))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR01238.hmm
# target sequence database:        /tmp/gapView.4469.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01238  [M=500]
Accession:   TIGR01238
Description: D1pyr5carbox3: delta-1-pyrroline-5-carboxylate dehydrogenase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                 Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                 -----------
     4e-224  730.6   1.0   7.5e-224  729.7   0.1    1.7  2  lcl|NCBI__GCF_000017145.1:YP_001325787.1  Smed_0092 bifunctional proline d


Domain annotation for each sequence (and alignments):
>> lcl|NCBI__GCF_000017145.1:YP_001325787.1  Smed_0092 bifunctional proline dehydrogenase/pyrroline-5-carboxylate dehydr
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  729.7   0.1  7.5e-224  7.5e-224       1     497 [.     539    1033 ..     539    1036 .. 0.98
   2 ?   -1.3   0.1      0.03      0.03     180     196 ..    1115    1131 ..    1110    1138 .. 0.84

  Alignments for each domain:
  == domain 1  score: 729.7 bits;  conditional E-value: 7.5e-224
                                 TIGR01238    1 dlygegrknslGvdlaneselksleeqllkaaakkfqaapivgekakaegeaqpvknpadrkdivGq 67  
                                                +l+g++r+ns+G+dl+ne++l+sl + l+++aa+ +  +p ++  ++  ge+ pv np d++d+vG 
  lcl|NCBI__GCF_000017145.1:YP_001325787.1  539 ELFGDSRTNSAGLDLSNEETLASLTDALKASAANGWTCVPRLATGPV-AGETLPVLNPGDHRDVVGS 604 
                                                79****************************************76666.58999************** PP

                                 TIGR01238   68 vseadaaevqeavdsavaafaewsatdakeraailerladlleshmpelvallvreaGktlsnaiae 134 
                                                v+e++++++++a   a +a+a w a+ + eraa+l+r+a l++ +mp+l++l++reaGk+  naiae
  lcl|NCBI__GCF_000017145.1:YP_001325787.1  605 VTETSEEDARRAARLAAEAAADWAAVSPSERAACLDRAAGLMQARMPTLLGLIIREAGKSALNAIAE 671 
                                                ******************************************************************* PP

                                 TIGR01238  135 vreavdflryyakqvedvldeesakalGavvcispwnfplaiftGqiaaalaaGntviakpaeqtsl 201 
                                                vrea+dflryya+q + +l+   +++lG+vvcispwnfplaiftGqiaaal+aGn v+akpae+t+l
  lcl|NCBI__GCF_000017145.1:YP_001325787.1  672 VREAIDFLRYYAEQTRRTLGPG-HRPLGPVVCISPWNFPLAIFTGQIAAALVAGNPVLAKPAEETPL 737 
                                                ********************98.******************************************** PP

                                 TIGR01238  202 iaaravellqeaGvpagviqllpGrGedvGaaltsderiaGviftGstevarlinkalakredap.. 266 
                                                iaa++v +lqeaG+pa ++qllpG G  +Gaal + +  aGv+ftGstevarli+ +la+r      
  lcl|NCBI__GCF_000017145.1:YP_001325787.1  738 IAAEGVRILQEAGIPARALQLLPGDGR-IGAALVAAPETAGVMFTGSTEVARLIQAQLADRLSLAgr 803 
                                                **************************9.*********************************875444 PP

                                 TIGR01238  267 .vpliaetGGqnamivdstalaeqvvadvlasafdsaGqrcsalrvlcvqedvadrvltlikGamde 332 
                                                 +pliaetGGqnamivds+alaeqvv dv+asafdsaGqrcsalrvlc+qedva+r+lt++kGa++e
  lcl|NCBI__GCF_000017145.1:YP_001325787.1  804 pIPLIAETGGQNAMIVDSSALAEQVVGDVIASAFDSAGQRCSALRVLCLQEDVAERILTMLKGALHE 870 
                                                4****************************************************************** PP

                                 TIGR01238  333 lkvgkpirlttdvGpvidaeakqnllahiekmkakakkvaqvkleddvesekgtfvaptlfelddld 399 
                                                l++g+  rl  dvGpvi +eak+n+++hi +m++ ++kv q+ l++  e+++gtfv+pt++el++l+
  lcl|NCBI__GCF_000017145.1:YP_001325787.1  871 LQIGRTDRLSVDVGPVITSEAKENIEKHIARMRELGRKVEQIGLAS--ETDQGTFVPPTIIELEKLS 935 
                                                ********************************************99..999**************** PP

                                 TIGR01238  400 elkkevfGpvlhvvrykadeldkvvdkinakGygltlGvhsrieetvrqiekrakvGnvyvnrnlvG 466 
                                                +l++evfGpvlhv+ry++d ld++vd ina+Gyglt+G+h+r +et++++++r+k+Gn+y+nrn++G
  lcl|NCBI__GCF_000017145.1:YP_001325787.1  936 DLEREVFGPVLHVIRYRRDGLDRLVDDINATGYGLTFGLHTRLDETIAHVTSRIKAGNLYINRNIIG 1002
                                                ******************************************************************* PP

                                 TIGR01238  467 avvGvqpfGGeGlsGtGpkaGGplylyrltr 497 
                                                avvGvqpfGG+GlsGtGpkaGGplyl rl+ 
  lcl|NCBI__GCF_000017145.1:YP_001325787.1 1003 AVVGVQPFGGRGLSGTGPKAGGPLYLGRLVA 1033
                                                ****************************986 PP

  == domain 2  score: -1.3 bits;  conditional E-value: 0.03
                                 TIGR01238  180 qiaaalaaGntviakpa 196 
                                                q+aaala+Gn+v+   a
  lcl|NCBI__GCF_000017145.1:YP_001325787.1 1115 QLAAALATGNSVVIDAA 1131
                                                89**********98766 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (500 nodes)
Target sequences:                          1  (1233 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.02u 0.01s 00:00:00.03 Elapsed: 00:00:00.02
# Mc/sec: 22.35
//
[ok]
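
To judge how much of the TIGR01238 model each domain hit spans, the "hmmfrom" and "hmm to" columns of the domain table above can be compared with the model length (M=500). The Python sketch below is a minimal illustration that assumes the whitespace-separated column layout shown in this report. The function names are hypothetical and are not part of HMMER or GapMind.

```python
def parse_domain_row(row: str) -> dict:
    """Parse one row of the hmmsearch per-domain table into named fields."""
    f = row.split()
    return {
        "score": float(f[2]),
        "i_evalue": float(f[5]),
        "hmm_from": int(f[6]),
        "hmm_to": int(f[7]),
        "ali_from": int(f[9]),
        "ali_to": int(f[10]),
    }


def model_coverage(dom: dict, model_len: int) -> float:
    """Fraction of the HMM's match states spanned by this domain hit."""
    return (dom["hmm_to"] - dom["hmm_from"] + 1) / model_len


MODEL_LEN = 500  # from "Query: TIGR01238 [M=500]"

# The two domain rows, copied from the report above.
rows = [
    "   1 !  729.7   0.1  7.5e-224  7.5e-224       1     497 [.     539    1033 ..     539    1036 .. 0.98",
    "   2 ?   -1.3   0.1      0.03      0.03     180     196 ..    1115    1131 ..    1110    1138 .. 0.84",
]

domains = [parse_domain_row(r) for r in rows]
coverages = [model_coverage(d, MODEL_LEN) for d in domains]
for d, c in zip(domains, coverages):
    print(d["ali_from"], d["ali_to"], d["score"], f"{c:.3f}")
```

Domain 1 spans 497 of the 500 model positions (about 99%) and maps to residues 539-1033 of the candidate. Domain 2 spans only 17 positions and has a negative score.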

This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFAMs). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
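
The definition of a gap given earlier (a step with no high- or medium-confidence candidate) can be sketched in Python as follows. This is only an illustration: the function name, the data layout, and every entry in the example input (including the step names other than putA and all the confidence levels) are hypothetical and do not reflect GapMind's actual data model or results.

```python
def find_gaps(candidates_by_step: dict[str, list[str]]) -> list[str]:
    """Return the steps that have no "high" or "medium" confidence candidate."""
    return [
        step
        for step, levels in candidates_by_step.items()
        if not any(level in ("high", "medium") for level in levels)
    ]


# Hypothetical input: each step maps to the confidence levels of its candidates.
example = {
    "putA": ["high"],
    "stepB": ["low", "low"],
    "stepC": [],
    "stepD": ["medium", "low"],
}

gaps = find_gaps(example)
print(gaps)
```

In this made-up input, stepB (only low-confidence candidates) and stepC (no candidates at all) would be reported as gaps.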

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory