GapMind for catabolism of small carbon sources

 

Alignments for a candidate for putA in Brucella inopinata BO1

Align L-glutamate gamma-semialdehyde dehydrogenase (EC 1.2.1.88); Proline dehydrogenase (EC 1.5.5.2) (characterized)
to candidate WP_008510302.1 BIBO1_RS16940 trifunctional transcriptional regulator/proline dehydrogenase/L-glutamate gamma-semialdehyde dehydrogenase

Query= reanno::azobra:AZOBR_RS23695
         (1235 letters)



>NCBI__GCF_000182725.1:WP_008510302.1
          Length = 1227

 Score = 1620 bits (4194), Expect = 0.0
 Identities = 849/1226 (69%), Positives = 971/1226 (79%), Gaps = 13/1226 (1%)

Query: 14   EAAPFADFAPPIRPATELRAAITAAYRRPEPECLPFLFEQASLPPGVITAAAATARKLIT 73
            + A F +FAPPIR  + LR AITAAYRRPE EC+  L EQA+LP        +TARKLI 
Sbjct: 9    KVAVFQNFAPPIREQSALRQAITAAYRRPEAECVSALAEQATLPEETRQQIRSTARKLIE 68

Query: 74   ALRAKPRGRGVEGLIHEYSLSSQEGMALMCLAEALLRIPDHATRDALIRDKIAGGDWQAH 133
            ALRAK +G GVEGL+HEYSLSSQEG+ALMCLAEALLRIPD ATRDALIRDKI+ GDW++H
Sbjct: 69   ALRAKHKGTGVEGLVHEYSLSSQEGVALMCLAEALLRIPDMATRDALIRDKISNGDWKSH 128

Query: 134  LGKGGSMFVNAATWGLLITGKLTSAGGEQALSSALTRLIARGGEPLIRRGVDFAMRMMGE 193
            +G G S+FVNAATWGL++TGKLT+   ++ LS+ALTRLIAR GEP+IRRGVD AMRMMGE
Sbjct: 129  IGGGRSLFVNAATWGLVVTGKLTNTVNDRGLSAALTRLIARCGEPVIRRGVDMAMRMMGE 188

Query: 194  QFVTGQTIQEALTNARTMEAEGFRYSYDMLGEAALTAEDAARYYADYVNAIHAIGTASAG 253
            QFVTG+TI EAL  A+T+E  GFRYSYDMLGEAA TA DA RYY DY  AIHAIG ASAG
Sbjct: 189  QFVTGETIDEALKRAKTLEERGFRYSYDMLGEAATTAADAERYYKDYETAIHAIGRASAG 248

Query: 254  RGVYEGPGISIKLSAIHPRYSRAQADRVMDELLPRVKALALLAKGYDIGLNIDAEEADRL 313
            RG+Y+GPGISIKLSA+HPRY+RAQ++RVM ELLP+VKALA +AK Y+IGLNIDAEEADRL
Sbjct: 249  RGIYDGPGISIKLSALHPRYTRAQSERVMGELLPKVKALAAIAKSYNIGLNIDAEEADRL 308

Query: 314  ELSLDLMESLCFDPDLAGWNGIGFVVQAYGKRCPYVIDFLIDLARRSGHRLMIRLVKGAY 373
            ELSLDL++SLC DPDLAGW+GIGFVVQAYGKRCP+V+DF+IDLARR+  R+M+RLVKGAY
Sbjct: 309  ELSLDLLQSLCEDPDLAGWDGIGFVVQAYGKRCPFVLDFIIDLARRTKRRVMVRLVKGAY 368

Query: 374  WDSEIKRAQLDGLPDFPVYTRKVYTDVSYVACARKLLAAPEAVFPQFATHNAQTLATIYE 433
            WD+EIKRAQ+DGL DFPVYTRKV+TDVSY+ACARKLLAA + +FPQFATHNAQTLATIY 
Sbjct: 369  WDAEIKRAQVDGLEDFPVYTRKVHTDVSYIACARKLLAATDVIFPQFATHNAQTLATIYH 428

Query: 434  MAGSDFQVGKYEFQCLHGMGEPLYKEVVGP--LKRPCRIYAPVGTHETLLAYLVRRLLEN 491
            +AG DF+ GK+EFQCLHGMGEPLY EVVGP  L RP RIYAPVGTHETLLAYLVRRLLEN
Sbjct: 429  LAGPDFKTGKFEFQCLHGMGEPLYDEVVGPEKLGRPARIYAPVGTHETLLAYLVRRLLEN 488

Query: 492  GANSSFVNRIADPAVPVDELVADPVAVARAIAPTGAPHALIALPRNLYAPERANSAGIDL 551
            GANSSFVNRI D  V VDEL+ADPV V R++A  GA H  IALP NLY   R NSAG DL
Sbjct: 489  GANSSFVNRIGDKNVSVDELIADPVEVVRSMAVVGARHDQIALPENLYGARR-NSAGFDL 547

Query: 552  SDETELARLSAALSASAEMTWTAAPLLADGERAGQAQPVRNPADRRDVVGSVTEASEALV 611
            S+E  LA LS  L  +A   WTA P +A  +  G ++PV NP DR DVVG+VTE +EA V
Sbjct: 548  SNEVTLAELSKTLKETAGRAWTAEPQVAGAKVKGVSRPVLNPGDRNDVVGTVTEIAEADV 607

Query: 612  AEAFGHAVAAASAWAATPPEERAASLFRAADTMQERMPTLLGLIVREAGKSLPNAIAEVR 671
            A+A   A  A  +W+A  P ERAA L RAAD MQ  MP LLGL++REAGKS+PNAIAEVR
Sbjct: 608  AKAMKAAQTATISWSAVAPTERAACLERAADIMQRDMPALLGLVMREAGKSMPNAIAEVR 667

Query: 672  EAIDFLRYYGAQVRDRFDNATHRPLGPVVCISPWNFPLAIFSGQIAAALAAGNPVLAKPA 731
            EAIDFLRYY  Q R R     H+ LGPVVCISPWNFPLAIF+GQIAAAL AGNPVLAKPA
Sbjct: 668  EAIDFLRYYAEQTR-RTLGVGHKALGPVVCISPWNFPLAIFTGQIAAALVAGNPVLAKPA 726

Query: 732  EETPLIAAEAVRILHAAGIPAGALQLLPGAGEVGAALVGHEAVRGVMFTGSTEVARLIQR 791
            EETPLIAAE VRILH  GIPA ALQLLPG G +GAALV      GVMFTGSTEVARLIQ 
Sbjct: 727  EETPLIAAEGVRILHEGGIPADALQLLPGDGRIGAALVAAPETCGVMFTGSTEVARLIQA 786

Query: 792  QLAGRLLPDGAPIPLIAETGGQNAMIVDSSALAEQVVGDVIASAFDSAGQRCSALRILCL 851
            QLA RLLP+G PIPLIAETGGQNAMIVDSSALAEQVV DVIASAFDSAGQRCSALR+LCL
Sbjct: 787  QLASRLLPNGKPIPLIAETGGQNAMIVDSSALAEQVVFDVIASAFDSAGQRCSALRVLCL 846

Query: 852  QEDVADRTLAMLKGAMRELRIGNPDRLAVDVGPVISEEARATIAAHIEAMRAKGRNVEFL 911
            QEDVADR L MLKGA+REL IG  D+L VD+GPVI++EA+ TI  HI+AMR  GR VE L
Sbjct: 847  QEDVADRILTMLKGALRELSIGRTDQLKVDIGPVITDEAKNTIEKHIQAMRDLGRKVEQL 906

Query: 912  PLPAETADGTFIAPTVIEIGGIHELEREVFGPVLHVVRFHRDDLDALVDSINATGYGLTF 971
            PL  ET +GTF+APT+IEI  + +L+REVFGPVLHVVR+ RDD++ L+D IN+TGYGLTF
Sbjct: 907  PLGPETQNGTFVAPTIIEIESLRDLKREVFGPVLHVVRYKRDDMENLIDDINSTGYGLTF 966

Query: 972  GLHTRIDATIERVTGRIGAGNVYVNRNTIGAVVGVQPFGGHGLSGTGPKAGGPLYLSRLL 1031
            GLHTR+D TI  V  RI  GN+Y+NRN IGAVVGVQPFGG GLSGTGPKAGGPLYL RL+
Sbjct: 967  GLHTRLDETIANVADRIRVGNIYINRNIIGAVVGVQPFGGRGLSGTGPKAGGPLYLGRLV 1026

Query: 1032 SRRPKGWLEFRGPDAARAAGLA-YGEWLRAKGFT--AEASRCAGYVARSAIGGGAELNGP 1088
               P   +  R       A L  +  WL  +G    A+A+R  G  + SA+G   EL GP
Sbjct: 1027 ETAP---IPPRHASVHTDAALKDFARWLGNRGMNDLAQAARDTG--SASALGLELELPGP 1081

Query: 1089 VGERNLYELHGRGRVLLLPQTRTGLLLQLGAVLATGNSAAVDAPPDLAELLRGLPPALAA 1148
            VGERNLY LH RGRVLL+PQT  GL  QL AVLATGN+A +D    L  +L+ LP  +AA
Sbjct: 1082 VGERNLYALHPRGRVLLVPQTEIGLYRQLTAVLATGNTAVIDEACGLRAVLKDLPETVAA 1141

Query: 1149 RVRTTADWRDVGPLAAVLVEGDRERVTAINRRVADLPGPILLVQAATAEALAAGRGEGYD 1208
            R   T DW+   P A  L+EGD  R+  +N R+A LPGP++L QAA+ E LAA + + Y 
Sbjct: 1142 RAIWTGDWQADAPFAGALIEGDSARIKEVNSRIAALPGPLVLTQAASPEDLAANQ-DAYC 1200

Query: 1209 LDLLLNERSVSVNTAAAGGNASLVAM 1234
            L+ LL E S S+NT AAGGNASL+A+
Sbjct: 1201 LNWLLEEVSTSINTTAAGGNASLMAI 1226
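
The summary percentages and the alignment coverage can be recomputed from the counts and coordinates shown above. The following is a minimal sketch in Python, not GapMind's own code; the helper names `percent` and `coverage` are illustrative assumptions, and rounding to the nearest percent is assumed because it reproduces the reported values.

```python
def percent(numerator: int, denominator: int) -> int:
    """Return a ratio as a whole-number percentage (nearest integer)."""
    return round(100 * numerator / denominator)


def coverage(start: int, end: int, length: int) -> float:
    """Fraction of a sequence spanned by an alignment (1-based, inclusive)."""
    return (end - start + 1) / length


# Counts from the "Identities / Positives / Gaps" line of the alignment.
aligned_columns = 1226
identity_pct = percent(849, aligned_columns)
positives_pct = percent(971, aligned_columns)
gaps_pct = percent(13, aligned_columns)

# Coordinates from the first and last Query/Sbjct lines of the alignment.
query_coverage = coverage(14, 1234, 1235)    # characterized protein
subject_coverage = coverage(9, 1226, 1227)   # candidate WP_008510302.1

print(identity_pct, positives_pct, gaps_pct)
print(f"{query_coverage:.3f} {subject_coverage:.3f}")
```

Both sequences are covered almost end to end, which is consistent with the candidate being a full-length homolog of the characterized protein rather than a partial domain match.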


Lambda     K      H
   0.319    0.136    0.396 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 3636
Number of extensions: 144
Number of successful extensions: 6
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1235
Length of database: 1227
Length adjustment: 47
Effective length of query: 1188
Effective length of database: 1180
Effective search space:  1401840
Effective search space used:  1401840
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 59 (27.3 bits)
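
The statistics block above is enough to check the reported bit score and search space. This sketch uses the standard Karlin-Altschul conversion from raw score to bit score, S' = (lambda * S - ln K) / ln 2, with the gapped lambda and K; the function name `bit_score` is an illustrative assumption. The E-value is handled in log10 form because the value itself is far too small for a double-precision float.

```python
import math


def bit_score(raw_score: int, lam: float, k: float) -> float:
    """Convert a raw alignment score to bits: (lambda*S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


# Raw score and gapped Karlin-Altschul parameters from the BLAST report.
bits = bit_score(4194, lam=0.267, k=0.0410)

# Effective lengths are the sequence lengths minus the length adjustment.
length_adjustment = 47
effective_query = 1235 - length_adjustment
effective_database = 1227 - length_adjustment
search_space = effective_query * effective_database

# E = search_space * 2**(-bits); computed as log10(E) to avoid underflow.
log10_expect = math.log10(search_space) - bits * math.log10(2)

print(round(bits), search_space)
print(f"log10(E) is about {log10_expect:.0f}")
```

The rounded bit score and the search space agree with the report, and the very negative log10(E) is why BLAST prints the Expect value as 0.0.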

Align candidate WP_008510302.1 BIBO1_RS16940 (trifunctional transcriptional regulator/proline dehydrogenase/L-glutamate gamma-semialdehyde dehydrogenase)
to HMM TIGR01238 (delta-1-pyrroline-5-carboxylate dehydrogenase (EC 1.2.1.88))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR01238.hmm
# target sequence database:        /tmp/gapView.302764.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01238  [M=500]
Accession:   TIGR01238
Description: D1pyr5carbox3: delta-1-pyrroline-5-carboxylate dehydrogenase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
   3.3e-224  730.9   0.6   4.6e-224  730.4   0.6    1.2  1  NCBI__GCF_000182725.1:WP_008510302.1  


Domain annotation for each sequence (and alignments):
>> NCBI__GCF_000182725.1:WP_008510302.1  
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  730.4   0.6  4.6e-224  4.6e-224       2     497 ..     535    1027 ..     534    1030 .. 0.98

  Alignments for each domain:
  == domain 1  score: 730.4 bits;  conditional E-value: 4.6e-224
                             TIGR01238    2 lygegrknslGvdlaneselksleeqllkaaakkfqaapivgekakaegeaqpvknpadrkdivGqvsead 72  
                                            lyg  r+ns+G dl+ne +l++l++ l+++a + + a p v   ak +g  +pv np dr+d+vG+v+e  
  NCBI__GCF_000182725.1:WP_008510302.1  535 LYGA-RRNSAGFDLSNEVTLAELSKTLKETAGRAWTAEPQV-AGAKVKGVSRPVLNPGDRNDVVGTVTEIA 603 
                                            8998.************************************.67788999********************* PP

                             TIGR01238   73 aaevqeavdsavaafaewsatdakeraailerladlleshmpelvallvreaGktlsnaiaevreavdflr 143 
                                            +a+v +a+++a++a   wsa+ + eraa+ler+ad+++++mp+l++l++reaGk++ naiaevrea+dflr
  NCBI__GCF_000182725.1:WP_008510302.1  604 EADVAKAMKAAQTATISWSAVAPTERAACLERAADIMQRDMPALLGLVMREAGKSMPNAIAEVREAIDFLR 674 
                                            *********************************************************************** PP

                             TIGR01238  144 yyakqvedvldeesakalGavvcispwnfplaiftGqiaaalaaGntviakpaeqtsliaaravellqeaG 214 
                                            yya+q + +l+   +kalG+vvcispwnfplaiftGqiaaal+aGn v+akpae+t+liaa++v +l+e G
  NCBI__GCF_000182725.1:WP_008510302.1  675 YYAEQTRRTLGVG-HKALGPVVCISPWNFPLAIFTGQIAAALVAGNPVLAKPAEETPLIAAEGVRILHEGG 744 
                                            ***********98.********************************************************* PP

                             TIGR01238  215 vpagviqllpGrGedvGaaltsderiaGviftGstevarlinkalakredap...vpliaetGGqnamivd 282 
                                            +pa ++qllpG G  +Gaal + +   Gv+ftGstevarli+ +la+r  ++   +pliaetGGqnamivd
  NCBI__GCF_000182725.1:WP_008510302.1  745 IPADALQLLPGDGR-IGAALVAAPETCGVMFTGSTEVARLIQAQLASRLLPNgkpIPLIAETGGQNAMIVD 814 
                                            *************9.*********************************8765555**************** PP

                             TIGR01238  283 stalaeqvvadvlasafdsaGqrcsalrvlcvqedvadrvltlikGamdelkvgkpirlttdvGpvidaea 353 
                                            s+alaeqvv dv+asafdsaGqrcsalrvlc+qedvadr+lt++kGa+ el +g+  +l+ d+Gpvi +ea
  NCBI__GCF_000182725.1:WP_008510302.1  815 SSALAEQVVFDVIASAFDSAGQRCSALRVLCLQEDVADRILTMLKGALRELSIGRTDQLKVDIGPVITDEA 885 
                                            *********************************************************************** PP

                             TIGR01238  354 kqnllahiekmkakakkvaqvkleddvesekgtfvaptlfelddldelkkevfGpvlhvvrykadeldkvv 424 
                                            k+ +++hi++m++ ++kv q+ l    e+++gtfvapt++e+++l +lk+evfGpvlhvvryk+d++++++
  NCBI__GCF_000182725.1:WP_008510302.1  886 KNTIEKHIQAMRDLGRKVEQLPLGP--ETQNGTFVAPTIIEIESLRDLKREVFGPVLHVVRYKRDDMENLI 954 
                                            ***********************99..9******************************************* PP

                             TIGR01238  425 dkinakGygltlGvhsrieetvrqiekrakvGnvyvnrnlvGavvGvqpfGGeGlsGtGpkaGGplylyrl 495 
                                            d in++Gyglt+G+h+r +et++++ +r++vGn+y+nrn++GavvGvqpfGG+GlsGtGpkaGGplyl rl
  NCBI__GCF_000182725.1:WP_008510302.1  955 DDINSTGYGLTFGLHTRLDETIANVADRIRVGNIYINRNIIGAVVGVQPFGGRGLSGTGPKAGGPLYLGRL 1025
                                            **********************************************************************9 PP

                             TIGR01238  496 tr 497 
                                            ++
  NCBI__GCF_000182725.1:WP_008510302.1 1026 VE 1027
                                            87 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (500 nodes)
Target sequences:                          1  (1227 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.02u 0.00s 00:00:00.02 Elapsed: 00:00:00.01
# Mc/sec: 41.26
//
[ok]
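
The domain table row in the hmmsearch output above can be split on whitespace to recover the score and coordinates. This sketch assumes that column layout (it is copied from this report, not from a parser shipped with HMMER), and takes the model length of 500 from the `[M=500]` line. It computes how much of the TIGR01238 model the alignment covers.

```python
# The single domain row from the "Domain annotation" table above.
domain_line = (
    "   1 !  730.4   0.6  4.6e-224  4.6e-224       2     497 ..     "
    "535    1027 ..     534    1030 .. 0.98"
)

fields = domain_line.split()

# Column positions follow the table header:
# # ! score bias c-Evalue i-Evalue hmmfrom hmmto .. alifrom alito .. envfrom envto .. acc
score = float(fields[2])
i_evalue = float(fields[5])
hmm_from, hmm_to = int(fields[6]), int(fields[7])
ali_from, ali_to = int(fields[9]), int(fields[10])

model_length = 500  # from "Query: TIGR01238 [M=500]"
hmm_coverage = (hmm_to - hmm_from + 1) / model_length

print(f"score={score} bits, model coverage={hmm_coverage:.1%}")
print(f"candidate residues {ali_from}-{ali_to} match the model")
```

The match covers nearly the whole model and falls in the C-terminal half of the 1227-residue candidate, the same region that aligns to the dehydrogenase portion of the characterized protein in the BLAST alignment.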

This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
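
The definition of a gap given above (a step with no high- or medium-confidence candidate) can be expressed as a short filter. This is a hedged sketch only: the function `find_gaps`, the dictionary layout, and the step names `stepX` and `stepY` are hypothetical and do not reflect GapMind's actual data structures.

```python
from typing import Dict, List


def find_gaps(candidates_by_step: Dict[str, List[str]]) -> List[str]:
    """Return steps that have no high- or medium-confidence candidate."""
    return [
        step
        for step, confidences in candidates_by_step.items()
        if not any(level in ("high", "medium") for level in confidences)
    ]


# Hypothetical input: each step maps to the confidence of its candidates.
example_pathway = {
    "putA": ["high"],      # has a high-confidence candidate, so not a gap
    "stepX": ["low"],      # only a low-confidence candidate, so a gap
    "stepY": [],           # no candidates at all, so a gap
}

gaps = find_gaps(example_pathway)
print(gaps)
```

Low-confidence candidates do not close a gap under this definition, so a step with only such hits is still reported.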

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory