GapMind for catabolism of small carbon sources

 

Alignments for a candidate for putA in Amantichitinum ursilacus IGB-41

Align L-glutamate gamma-semialdehyde dehydrogenase (EC 1.2.1.88); Proline dehydrogenase (EC 1.5.5.2) (characterized)
to candidate WP_053939191.1 WG78_RS17855 trifunctional transcriptional regulator/proline dehydrogenase/L-glutamate gamma-semialdehyde dehydrogenase

Query= reanno::Cup4G11:RR42_RS20125
         (1333 letters)



>NCBI__GCF_001294205.1:WP_053939191.1
          Length = 1316

 Score = 1925 bits (4988), Expect = 0.0
 Identities = 1008/1344 (75%), Positives = 1120/1344 (83%), Gaps = 41/1344 (3%)

Query: 1    MATTTLGVKLDDASRERLKRVAQSIDRTPHWLIKQAIFTYLEQVERGN----IPHETSAA 56
            M+TTTLGVKLDD +R+RLK  AQ ++RTPHWLIKQAIF  L+QVERG+    +P E  AA
Sbjct: 1    MSTTTLGVKLDDLTRDRLKTAAQQLERTPHWLIKQAIFNLLDQVERGHPVVALPDEADAA 60

Query: 57   GTGSEGAADGADAFDGAASDGAIQPFLEFAQSVQPQSVLRAAITAAYRRPESECVPVLLE 116
             T            +    DG++QPFLEFAQ+VQPQSVLRAAITAA+RRPE+EC+P+LL+
Sbjct: 61   ATD-----------EAPKRDGSVQPFLEFAQNVQPQSVLRAAITAAWRRPETECIPLLLQ 109

Query: 117  QARLPHQQAEAALAMARTLATRLRERKVGTGREGLVQGLIQEFSLSSQEGVALMCLAEAL 176
            QAR+P  QA+AA  +A TLA +LR +  G  REGLVQGLI EFSLSSQEGVALMCLAEAL
Sbjct: 110  QARMPEAQAKAATQLAHTLAGKLRAQNPGASREGLVQGLIHEFSLSSQEGVALMCLAEAL 169

Query: 177  LRIPDKATRDALIRDKISGANWQSHLGQSPSVFVNAATWGLLFTGKLVATHTEAGLSKAL 236
            LRIPDKATRD+LIRDKI+  NW SH+GQSPS+FVNAATWGLL TGKLV+THTE+GLS AL
Sbjct: 170  LRIPDKATRDSLIRDKIARGNWHSHIGQSPSLFVNAATWGLLLTGKLVSTHTESGLSTAL 229

Query: 237  TRIIGKGGEPLIRKGVDMAMRLMGEQFVTGETISEALANARKYEAEGFRYSYDMLGEAAM 296
            TRIIGK GEP+IRKGVDMAMRLMGEQFVTGETISEALANAR +E  GFRYSYDMLGEAAM
Sbjct: 230  TRIIGKRGEPVIRKGVDMAMRLMGEQFVTGETISEALANARNFEGHGFRYSYDMLGEAAM 289

Query: 297  TEADAQRYLASYEQAINAIGQASRGRGIYEGPGISIKLSALHPRYSRAQHERVIGELYGR 356
            T+ADA  Y+ASYEQAI AIGQAS GRGIYEGPGISIKLSALHPRYSRAQHERVI ELY R
Sbjct: 290  TDADAHAYMASYEQAIRAIGQASGGRGIYEGPGISIKLSALHPRYSRAQHERVISELYPR 349

Query: 357  LKSLTLLARQYDIGINIDAEEADRLEISLDLLERLCFEPELAGWNGIGFVVQGYQKRCPF 416
            +K+LT+LARQ+DIGINIDAEEADRLEISLDLLERLCFEPELAGWNGIGFVVQ YQKRCP+
Sbjct: 350  VKALTVLARQFDIGINIDAEEADRLEISLDLLERLCFEPELAGWNGIGFVVQAYQKRCPY 409

Query: 417  VIDYLIDLARRSRHRLMIRLVKGAYWDSEIKRAQVDGLEGYPVYTRKVYTDVSYVACARK 476
            V+DY+IDLA+RS HRLMIRLVKGAYWDSEIKRAQ+DGLEGYPVYTRKVYTDVSY+ACARK
Sbjct: 410  VLDYIIDLAKRSGHRLMIRLVKGAYWDSEIKRAQIDGLEGYPVYTRKVYTDVSYLACARK 469

Query: 477  LLSVPDVIYPQFATHNAHTLAAIYQIAGHNYYPGQYEFQCLHGMGEPLYDQVVGPLADGK 536
            LL  P+ +YPQFATHNAHTLAAIYQ+AG NYYPGQYEFQCLHGMGEPLY+ VVG  A+GK
Sbjct: 470  LLGAPEAVYPQFATHNAHTLAAIYQMAGQNYYPGQYEFQCLHGMGEPLYEHVVGKAAEGK 529

Query: 537  FNRPCRIYAPVGTHETLLAYLVRRLLENGANTSFVNRIADDTISLDELVADPVAVVEQMH 596
             NRPCRIYAPVGTHETLLAYLVRRLLENGANTSFVNRIAD +I+L++LVADPV VVEQM 
Sbjct: 530  LNRPCRIYAPVGTHETLLAYLVRRLLENGANTSFVNRIADQSIALEDLVADPVVVVEQMA 589

Query: 597  ADEGALGLPHPRIAQPRTLYGESRANSAGIDLSNEHRLASLSSALLAGTSEAVSAVPLLG 656
              EG  GLPHP I  PR LYGE+RANSAG+DL+NE RLASLS+ALL  T  A  A P   
Sbjct: 590  RQEGTQGLPHPSIPLPRALYGEARANSAGLDLANEQRLASLSAALL--TDHAWLATP--- 644

Query: 657  TEAAAGEDVNQPAPVRNPSDQRDVVGHVTEASMAEVEAALQAAVNAAPIWQATPADVRAA 716
              AAAGE       VRNP+D RDVVG+V EA++ +V AAL  A NA PIWQATPA  RAA
Sbjct: 645  PHAAAGE----KQAVRNPADHRDVVGYVVEAALDDVAAALTRAQNAGPIWQATPAVTRAA 700

Query: 717  ALERAAELMEAQMQSLMGIIVREAGKTFSNAIAEVREAVDFLRYYAAQVRETFSSDTHRP 776
             LERAA+LMEA+MQ+LMG+IVREAGKT SNAIAEVREAVDFLRYYAAQVR +FS+DTHRP
Sbjct: 701  LLERAADLMEAEMQTLMGLIVREAGKTLSNAIAEVREAVDFLRYYAAQVRGSFSNDTHRP 760

Query: 777  LGPVVCISPWNFPLAIFTGQVAAALAAGNTVLAKPAEQTPLIAAQAVRLLREAGVPAGAV 836
            LGPVVCISPWNFPLAIFTGQVAAALAAGNTVLAKPAEQTPLIAAQAVR+L EAGVP  AV
Sbjct: 761  LGPVVCISPWNFPLAIFTGQVAAALAAGNTVLAKPAEQTPLIAAQAVRILHEAGVPQDAV 820

Query: 837  QLLPGRGETVGAALVGDARVKGVMFTGSTEVARLLQRSVAGRLDAAGRPVPLIAETGGQN 896
            QLLPG+GETVGAALVGDARVKGVMFTGSTEVAR+LQR++AGRLDA G+P+PLIAETGGQN
Sbjct: 821  QLLPGQGETVGAALVGDARVKGVMFTGSTEVARILQRNLAGRLDAHGQPIPLIAETGGQN 880

Query: 897  AMIVDSSALAEQVVGDVVNSAFDSAGQRCSALRVLCLQEEVADRVLEMLKGAMDELTMGN 956
            AMIVDSSALAEQVVGDV+ SAFDSAGQRCSALRVLCLQE+VADRVL MLKG + EL +GN
Sbjct: 881  AMIVDSSALAEQVVGDVLASAFDSAGQRCSALRVLCLQEDVADRVLTMLKGGLAELVVGN 940

Query: 957  PDRLSTDVGPVIDEEARGNIVRHIDAMRAKGRRVHQADPNGALSAACRNGTFVSPTLIEL 1016
            PDRL+TDVGPVID EA+ NI  HIDAMRA+GR+VHQ        AAC  G+FV PTLIEL
Sbjct: 941  PDRLATDVGPVIDAEAQRNINSHIDAMRARGRKVHQVQ----TGAACNQGSFVPPTLIEL 996

Query: 1017 DSIEELQREVFGPVLHVVRYPR----AGLDTLLAQINGTGYGLTMGIHTRIDETIEHIVE 1072
            DS+ EL+RE+FGPVLHVVR+ R    AGLD L+  INGTGYGLT+GIHTRIDETI HIVE
Sbjct: 997  DSLAELEREIFGPVLHVVRWQRTADQAGLDALIDDINGTGYGLTLGIHTRIDETIAHIVE 1056

Query: 1073 RAEVGNLYVNRNIVGAVVGVQPFGGEGLSGTGPKAGGPLYLHRLLSVCPLDAVARVVRAS 1132
            RA+VGNLYVNRNIVGAVVGVQPFGGEGLSGTGPKAGGPLYLHRLLS CP D +   V  +
Sbjct: 1057 RAQVGNLYVNRNIVGAVVGVQPFGGEGLSGTGPKAGGPLYLHRLLSSCPRDTLHAAVLLT 1116

Query: 1133 DTVGGADETGPVRRTLTETLATLKEWAQRESAALPGLVAACERFAAASAAGLSVTLPGPT 1192
               G   ET   R  L + L  L +W+ + +   P  VA CER +AA+A G +VTLPGPT
Sbjct: 1117 SGNGAPFETA-AREQLLQPLQALAQWSAQHA---PQHVALCERLSAATAVGAAVTLPGPT 1172

Query: 1193 GERNTYTLLPRAAVLCLAQ---QETDLAVQLAAVLAAGSQAVWVESPMARALFARLPKAV 1249
            GERNTYTLLPR  VLC         DL  Q+A VLA GS+AV ++ P A A+   LPKAV
Sbjct: 1173 GERNTYTLLPREHVLCALPANGSRDDLLAQIATVLAVGSKAVVLDGPYA-AVVRELPKAV 1231

Query: 1250 QSRVRL-VADWSAGDTGFDAVLHHGDSDQLRAVCEQLATRPGPIISVQGLAHGEPNIAIE 1308
            Q+R+ +  A         DAVLHHGD+DQL A+   +A R GPI+SVQGLA GE   A+E
Sbjct: 1232 QARIDVQPAGSDLATLNIDAVLHHGDADQLLALTAAMAQRSGPIVSVQGLAPGEDGFALE 1291

Query: 1309 RLLIERSLSVNTAAAGGNASLMTI 1332
            RLLIERSLSVNTAAAGGNASLMTI
Sbjct: 1292 RLLIERSLSVNTAAAGGNASLMTI 1315


Lambda     K      H
   0.318    0.133    0.383 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 4307
Number of extensions: 167
Number of successful extensions: 10
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1333
Length of database: 1316
Length adjustment: 49
Effective length of query: 1284
Effective length of database: 1267
Effective search space:  1626828
Effective search space used:  1626828
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 59 (27.3 bits)
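
The summary figures in this report can be cross-checked against one another. The following is a minimal sketch, not part of GapMind: it assumes the standard Karlin-Altschul conversion from raw score to bit score, assumes that the effective search space is the product of the two effective lengths, and uses only numbers copied from the report above. The function names are illustrative.

```python
import math

# Values copied from the BLAST report above.
query_len, subject_len = 1333, 1316
length_adjustment = 49
raw_score = 4988                         # the number in parentheses after "1925 bits"
gapped_lambda, gapped_k = 0.267, 0.0410  # from the "Gapped" Lambda/K line
identities, positives, gaps, columns = 1008, 1120, 41, 1344


def percent(part: int, whole: int) -> float:
    """Share of alignment columns, as a percentage."""
    return 100.0 * part / whole


def effective_search_space(m: int, n: int, adjustment: int) -> int:
    """Product of the effective query and database lengths."""
    return (m - adjustment) * (n - adjustment)


def bit_score(raw: float, lam: float, k: float) -> float:
    """Karlin-Altschul normalisation: (lambda * S - ln K) / ln 2."""
    return (lam * raw - math.log(k)) / math.log(2)


space = effective_search_space(query_len, subject_len, length_adjustment)
bits = bit_score(raw_score, gapped_lambda, gapped_k)

print(space)  # 1626828, matching "Effective search space"
print(round(percent(identities, columns)))  # 75
print(round(percent(positives, columns)))   # 83
print(f"{bits:.0f} bits")  # close to the reported 1925; Lambda and K are printed rounded
```

The recomputed bit score need not match the report exactly, because the report prints Lambda and K to only three significant figures.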

Align candidate WP_053939191.1 WG78_RS17855 (trifunctional transcriptional regulator/proline dehydrogenase/L-glutamate gamma-semialdehyde dehydrogenase)
to HMM TIGR01238 (delta-1-pyrroline-5-carboxylate dehydrogenase (EC 1.2.1.88))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR01238.hmm
# target sequence database:        /tmp/gapView.2788895.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01238  [M=500]
Accession:   TIGR01238
Description: D1pyr5carbox3: delta-1-pyrroline-5-carboxylate dehydrogenase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
   5.6e-241  786.2   5.5   1.1e-240  785.3   5.5    1.5  1  NCBI__GCF_001294205.1:WP_053939191.1  


Domain annotation for each sequence (and alignments):
>> NCBI__GCF_001294205.1:WP_053939191.1  
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  785.3   5.5  1.1e-240  1.1e-240       2     498 ..     608    1103 ..     607    1105 .. 0.98

  Alignments for each domain:
  == domain 1  score: 785.3 bits;  conditional E-value: 1.1e-240
                             TIGR01238    2 lygegrknslGvdlaneselksleeqllkaaakkfqaapivgekakaegeaqpvknpadrkdivGqvsead 72  
                                            lyge+r ns+G+dlane++l+sl++ ll   ++ + a p       a ge q v+npad++d+vG+v ea 
  NCBI__GCF_001294205.1:WP_053939191.1  608 LYGEARANSAGLDLANEQRLASLSAALL--TDHAWLATPPH----AAAGEKQAVRNPADHRDVVGYVVEAA 672 
                                            8************************997..5899**99965....799*********************** PP

                             TIGR01238   73 aaevqeavdsavaafaewsatdakeraailerladlleshmpelvallvreaGktlsnaiaevreavdflr 143 
                                            +++v +a++ a++a ++w+at+a  raa ler+adl+e +m +l++l+vreaGktlsnaiaevreavdflr
  NCBI__GCF_001294205.1:WP_053939191.1  673 LDDVAAALTRAQNAGPIWQATPAVTRAALLERAADLMEAEMQTLMGLIVREAGKTLSNAIAEVREAVDFLR 743 
                                            *********************************************************************** PP

                             TIGR01238  144 yyakqvedvldeesakalGavvcispwnfplaiftGqiaaalaaGntviakpaeqtsliaaravellqeaG 214 
                                            yya qv+ +++++++++lG+vvcispwnfplaiftGq+aaalaaGntv+akpaeqt+liaa+av +l+eaG
  NCBI__GCF_001294205.1:WP_053939191.1  744 YYAAQVRGSFSNDTHRPLGPVVCISPWNFPLAIFTGQVAAALAAGNTVLAKPAEQTPLIAAQAVRILHEAG 814 
                                            *********************************************************************** PP

                             TIGR01238  215 vpagviqllpGrGedvGaaltsderiaGviftGstevarlinkalakredap...vpliaetGGqnamivd 282 
                                            vp  ++qllpG+Ge+vGaal  d+r++Gv+ftGstevar ++++la r da+   +pliaetGGqnamivd
  NCBI__GCF_001294205.1:WP_053939191.1  815 VPQDAVQLLPGQGETVGAALVGDARVKGVMFTGSTEVARILQRNLAGRLDAHgqpIPLIAETGGQNAMIVD 885 
                                            **************************************************87777**************** PP

                             TIGR01238  283 stalaeqvvadvlasafdsaGqrcsalrvlcvqedvadrvltlikGamdelkvgkpirlttdvGpvidaea 353 
                                            s+alaeqvv dvlasafdsaGqrcsalrvlc+qedvadrvlt++kG + el+vg+p rl tdvGpvidaea
  NCBI__GCF_001294205.1:WP_053939191.1  886 SSALAEQVVGDVLASAFDSAGQRCSALRVLCLQEDVADRVLTMLKGGLAELVVGNPDRLATDVGPVIDAEA 956 
                                            *********************************************************************** PP

                             TIGR01238  354 kqnllahiekmkakakkvaqvkleddvesekgtfvaptlfelddldelkkevfGpvlhvvrykade....l 420 
                                            ++n+ +hi++m+a+++kv+qv++ +  ++++g+fv+ptl+eld+l+el++e+fGpvlhvvr+++      l
  NCBI__GCF_001294205.1:WP_053939191.1  957 QRNINSHIDAMRARGRKVHQVQTGA--ACNQGSFVPPTLIELDSLAELEREIFGPVLHVVRWQRTAdqagL 1025
                                            ***********************99..9**********************************975333449 PP

                             TIGR01238  421 dkvvdkinakGygltlGvhsrieetvrqiekrakvGnvyvnrnlvGavvGvqpfGGeGlsGtGpkaGGply 491 
                                            d ++d in +GygltlG+h+ri+et+++i +ra+vGn+yvnrn+vGavvGvqpfGGeGlsGtGpkaGGply
  NCBI__GCF_001294205.1:WP_053939191.1 1026 DALIDDINGTGYGLTLGIHTRIDETIAHIVERAQVGNLYVNRNIVGAVVGVQPFGGEGLSGTGPKAGGPLY 1096
                                            *********************************************************************** PP

                             TIGR01238  492 lyrltrv 498 
                                            l+rl+++
  NCBI__GCF_001294205.1:WP_053939191.1 1097 LHRLLSS 1103
                                            ****976 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (500 nodes)
Target sequences:                          1  (1316 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.01u 0.00s 00:00:00.01 Elapsed: 00:00:00.01
# Mc/sec: 42.82
//
[ok]
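
The per-domain row of the hmmsearch report above gives the model and sequence coordinates of the hit, from which the fraction of the model that is aligned can be derived. The sketch below is illustrative only: it assumes the whitespace-separated column layout shown in this report, and the `DomainHit` class and helper names are hypothetical rather than part of HMMER or GapMind.

```python
from dataclasses import dataclass


@dataclass
class DomainHit:
    score: float
    i_evalue: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int


def parse_domain_row(row: str) -> DomainHit:
    """Parse one row of the per-domain table of an hmmsearch text report.

    Assumed token order after splitting on whitespace:
    #, flag, score, bias, c-Evalue, i-Evalue, hmmfrom, hmmto, '..',
    alifrom, alito, '..', envfrom, envto, '..', acc
    """
    f = row.split()
    return DomainHit(
        score=float(f[2]),
        i_evalue=float(f[5]),
        hmm_from=int(f[6]),
        hmm_to=int(f[7]),
        ali_from=int(f[9]),
        ali_to=int(f[10]),
    )


def model_coverage(hit: DomainHit, model_len: int) -> float:
    """Fraction of HMM positions spanned by the alignment."""
    return (hit.hmm_to - hit.hmm_from + 1) / model_len


# Row copied from the report above; TIGR01238 has M=500 positions.
row = ("   1 !  785.3   5.5  1.1e-240  1.1e-240       2     498 ..     "
       "608    1103 ..     607    1105 .. 0.98")
hit = parse_domain_row(row)
print(f"model coverage: {model_coverage(hit, 500):.1%}")  # 99.4%
print(f"aligned residues: {hit.ali_to - hit.ali_from + 1} of 1316")  # 496 of 1316
```

In this case the alignment spans nearly the whole model but only the C-terminal part of the 1316-residue candidate, which fits its annotation as a multifunctional protein.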

This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified either by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFAMs). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
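
The definition of a gap given above (a step with no high- or medium-confidence candidate) can be sketched as follows. This is an illustration under stated assumptions, not GapMind's own code: the `find_gaps` function, the dictionary layout, and the example data are all hypothetical, including the confidence level shown for putA.

```python
from typing import Dict, List

# Confidence labels that count as filling a step, per the text above.
ACCEPTED = {"high", "medium"}


def find_gaps(candidates_by_step: Dict[str, List[str]]) -> List[str]:
    """Return the steps that have no high- or medium-confidence candidate.

    candidates_by_step maps a step name to the confidence labels of its
    candidates ("high", "medium" or "low"); an empty list means no candidate.
    """
    return [
        step
        for step, confidences in candidates_by_step.items()
        if not any(c in ACCEPTED for c in confidences)
    ]


# Hypothetical example data, for illustration only.
example = {
    "putA": ["high"],          # filled by a high-confidence candidate
    "stepX": ["low", "low"],   # only low-confidence candidates: a gap
    "stepY": [],               # no candidates at all: a gap
    "stepZ": ["low", "medium"],
}
print(find_gaps(example))  # ['stepX', 'stepY']
```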

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory