GapMind for catabolism of small carbon sources

 

Alignments for a candidate for putA in Echinicola vietnamensis KMM 6221, DSM 17526

Align L-glutamate gamma-semialdehyde dehydrogenase (EC 1.2.1.88) (characterized)
to candidate Echvi_1300 Echvi_1300 delta-1-pyrroline-5-carboxylate dehydrogenase, group 1

Query= BRENDA::P30038
         (563 letters)



>FitnessBrowser__Cola:Echvi_1300
          Length = 543

 Score =  513 bits (1321), Expect = e-150
 Identities = 258/530 (48%), Positives = 351/530 (66%), Gaps = 3/530 (0%)

Query: 34  NEPVLAFTQGSPERDALQKALKDLKGRMEAIPCVVGDEEVWTSDVQYQVSPFNHGHKVAK 93
           NEPV  +  G+P R  LQ AL++ + +   +P  +G EEV T +      P +H H +  
Sbjct: 13  NEPVFDYAPGTPARAKLQAALQEARSKEVDVPMYIGSEEVRTGNKIPLSPPHDHQHLLGH 72

Query: 94  FCYADKSLLNKAIEAALAARKEWDLKPIADRAQIFLKAADMLSGPRRAEILAKTMVGQGK 153
           F   DKS + +AI AAL A++ W+      RA IFLKAAD+++GP R ++ A TM+GQ K
Sbjct: 73  FHEGDKSHVEQAINAALGAKEAWETMEWEQRAAIFLKAADLIAGPYRYKMNAATMLGQSK 132

Query: 154 TVIQAEIDAAAELIDFFRFNAKYAVELEGQQP-ISVPPSTNSTVYRGLEGFVAAISPFNF 212
              QAEID+A E++DF RFN KY  E+  QQP IS     N    R LEGFV A++PFNF
Sbjct: 133 NAFQAEIDSACEIVDFLRFNVKYMTEIYKQQPPISGDGVWNRLEQRPLEGFVFALTPFNF 192

Query: 213 TAIGGNLAGAPALMGNVVLWKPSDTAMLASYAVYRILREAGLPPNIIQFVPADGPLFGDT 272
           TAI GNL  APA+MGN V+WKP+ T +  +  + ++ REAG+P  +I  V  DGP  G+ 
Sbjct: 193 TAIAGNLPTAPAMMGNTVVWKPAYTQIYTANLLMQVFREAGVPDGVINLVYVDGPAAGEV 252

Query: 273 VTSSEHLCGINFTGSVPTFKHLWKQVAQNLDRFHTFPRLAGECGGKNFHFVHRSADVESV 332
           +       GI+FTGS   F+ +WK +  N++++ ++PR+ GE GGK+F   H+SAD + +
Sbjct: 253 IFEHPEFAGIHFTGSTAVFQTIWKTIGNNIEKYKSYPRIVGETGGKDFVIAHKSADAKQL 312

Query: 333 VSGTLRSAFEYGGQKCSACSRLYVPHSLWPQIKGRLLEEHSRIKVGDPAEDFGTFFSAVI 392
            +G +R AFE+ GQKCSA SR Y+P +LW  +K  + E+ + IK+G P EDF  F +AVI
Sbjct: 313 ATGLVRGAFEFQGQKCSAASRAYIPSNLWEDVKKYMQEDLASIKMGGP-EDFSNFINAVI 371

Query: 393 DAKSFARIKKWLEHARSSPSLTILAGGKCDDSVGYFVEPCIVESKDPQEPIMKEEIFGPV 452
           D KSF +I K+++ A+S   L ++AGG  D S GYFVEP ++ +KDP    M EEIFGPV
Sbjct: 372 DEKSFDKIAKYIDTAKSD-GLEVVAGGHYDKSKGYFVEPTVLLTKDPMYTTMCEEIFGPV 430

Query: 453 LSVYVYPDDKYKETLQLVDSTTSYGLTGAVFSQDKDVVQEATKVLRNAAGNFYINDKSTG 512
           L++YVY +D ++E L+LVD T+ YGLTGA+FS D+   Q AT+ LRNAAGNFYINDK TG
Sbjct: 431 LTIYVYQEDHFEEALELVDQTSPYGLTGAIFSHDRYAAQLATQKLRNAAGNFYINDKPTG 490

Query: 513 SIVGQQPFGGARASGTNDKPGGPHYILRWTSPQVIKETHKPLGDWSYAYM 562
           ++VGQQPFGGAR SGTNDK G    +LRW SP+ IKET     D+ Y ++
Sbjct: 491 AVVGQQPFGGARKSGTNDKAGAMINMLRWVSPRTIKETFVTPTDYRYPFL 540


Lambda     K      H
   0.319    0.135    0.411 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 870
Number of extensions: 37
Number of successful extensions: 4
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 563
Length of database: 543
Length adjustment: 36
Effective length of query: 527
Effective length of database: 507
Effective search space:   267189
Effective search space used:   267189
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 53 (25.0 bits)
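
The summary figures in the BLAST report above can be rechecked by hand. The following is a minimal sketch, not part of GapMind: all numbers are copied from the report, the variable names are illustrative, and treating coverage as the aligned fraction of each sequence is an assumption about how coverage is defined.

```python
def percent(numerator: int, denominator: int) -> float:
    """Return numerator/denominator as a percentage."""
    return 100.0 * numerator / denominator

# Values copied from the BLAST report above.
identities, positives, aligned_columns = 258, 351, 530
query_start, query_end, query_length = 34, 562, 563
subject_start, subject_end, subject_length = 13, 540, 543
length_adjustment = 36

# BLAST prints these percentages as whole numbers (48% and 66%).
identity_pct = percent(identities, aligned_columns)
positive_pct = percent(positives, aligned_columns)

# Assumption: coverage is the aligned span divided by the full sequence length.
query_coverage = percent(query_end - query_start + 1, query_length)
subject_coverage = percent(subject_end - subject_start + 1, subject_length)

# Effective lengths subtract the length adjustment; with a single database
# sequence the adjustment is applied once to each side.
effective_query = query_length - length_adjustment
effective_database = subject_length - length_adjustment
effective_search_space = effective_query * effective_database

print(f"identity {identity_pct:.1f}%, positives {positive_pct:.1f}%")
print(f"query coverage {query_coverage:.1f}%, subject coverage {subject_coverage:.1f}%")
print(f"effective search space {effective_search_space}")
```

The product of the two effective lengths (527 and 507) reproduces the reported effective search space of 267189.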

Align candidate Echvi_1300 Echvi_1300 (delta-1-pyrroline-5-carboxylate dehydrogenase, group 1)
to HMM TIGR01236 (pruA: 1-pyrroline-5-carboxylate dehydrogenase (EC 1.2.1.88))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR01236.hmm
# target sequence database:        /tmp/gapView.23446.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01236  [M=533]
Accession:   TIGR01236
Description: D1pyr5carbox1: 1-pyrroline-5-carboxylate dehydrogenase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                            Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                            -----------
   2.6e-246  804.2   0.2     3e-246  804.1   0.2    1.0  1  lcl|FitnessBrowser__Cola:Echvi_1300  Echvi_1300 delta-1-pyrroline-5-c


Domain annotation for each sequence (and alignments):
>> lcl|FitnessBrowser__Cola:Echvi_1300  Echvi_1300 delta-1-pyrroline-5-carboxylate dehydrogenase, group 1
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  804.1   0.2    3e-246    3e-246       1     531 [.      12     539 ..      12     541 .. 0.99

  Alignments for each domain:
  == domain 1  score: 804.1 bits;  conditional E-value: 3e-246
                            TIGR01236   1 knePvkefrpgskerdllrkelkelkskvleiPlviggkevvksnelievvaPadhqaklakltnateedvkka 74 
                                          knePv +++pg+++r +l+++l+e++sk +++P+ ig +ev + n +i    P+dhq+ l+++++ ++++v++a
  lcl|FitnessBrowser__Cola:Echvi_1300  12 KNEPVFDYAPGTPARAKLQAALQEARSKEVDVPMYIGSEEVRTGN-KIPLSPPHDHQHLLGHFHEGDKSHVEQA 84 
                                          6***************************************66655.7*************************** PP

                            TIGR01236  75 veaaldakkeWselpfadraaiflkaadllsgkyreeilaatmlgqsktvyqaeidavaelidffrfnvkyare 148
                                          ++aal ak+ W+++++++raaiflkaadl++g+yr++++aatmlgqsk+ +qaeid+++e++df+rfnvky +e
  lcl|FitnessBrowser__Cola:Echvi_1300  85 INAALGAKEAWETMEWEQRAAIFLKAADLIAGPYRYKMNAATMLGQSKNAFQAEIDSACEIVDFLRFNVKYMTE 158
                                          ************************************************************************** PP

                            TIGR01236 149 lleqqPsvsapgelnkveyrpleGfvaaisPfnftaiaanlagaPalmGnvvvWkPsktavlsnyllmkileea 222
                                          +++qqP +s++g++n+ e rpleGfv+a++Pfnftaia+nl++aPa+mGn+vvWkP+ t+++++ llm++++ea
  lcl|FitnessBrowser__Cola:Echvi_1300 159 IYKQQPPISGDGVWNRLEQRPLEGFVFALTPFNFTAIAGNLPTAPAMMGNTVVWKPAYTQIYTANLLMQVFREA 232
                                          ************************************************************************** PP

                            TIGR01236 223 GlPpgvinfvpadgvkvsdvvladkdlaalhftGstavfkelwkkvasnldkyrnfPrivGetGGkdfvlvhps 296
                                          G+P gvin+v +dg + ++v+  ++++a++hftGstavf+++wk++ +n++ky+++PrivGetGGkdfv++h+s
  lcl|FitnessBrowser__Cola:Echvi_1300 233 GVPDGVINLVYVDGPAAGEVIFEHPEFAGIHFTGSTAVFQTIWKTIGNNIEKYKSYPRIVGETGGKDFVIAHKS 306
                                          ************************************************************************** PP

                            TIGR01236 297 adveevvaatirgafeyqGqkcsaasrlyvpkslwkelkeellaelkkvkvgdvddlssfmgavideksfakiv 370
                                          ad ++++++++rgafe+qGqkcsaasr+y+p  lw+++k+ + ++l+++k+g ++d+s+f+ avideksf+ki 
  lcl|FitnessBrowser__Cola:Echvi_1300 307 ADAKQLATGLVRGAFEFQGQKCSAASRAYIPSNLWEDVKKYMQEDLASIKMGGPEDFSNFINAVIDEKSFDKIA 380
                                          ************************************************************************** PP

                            TIGR01236 371 kviekakkdpeeleilaGGkyddskGyfvePtvveskdPkeklmkeeifGPvltvyvydddkykeilevvdsts 444
                                          k+i++ak+d   le++aGG+yd+skGyfvePtv+++kdP   +m eeifGPvlt+yvy++d+++e le+vd+ts
  lcl|FitnessBrowser__Cola:Echvi_1300 381 KYIDTAKSDG--LEVVAGGHYDKSKGYFVEPTVLLTKDPMYTTMCEEIFGPVLTIYVYQEDHFEEALELVDQTS 452
                                          ********88..************************************************************** PP

                            TIGR01236 445 kyaltGavfakdreaieeaekklrfaaGnfyindkstGavvgqqpfGGarlsGtndkaGapkillrfvsarsik 518
                                           y+ltGa+f++dr a + a++klr+aaGnfyindk+tGavvgqqpfGGar sGtndkaGa+ ++lr+vs+r+ik
  lcl|FitnessBrowser__Cola:Echvi_1300 453 PYGLTGAIFSHDRYAAQLATQKLRNAAGNFYINDKPTGAVVGQQPFGGARKSGTNDKAGAMINMLRWVSPRTIK 526
                                          ************************************************************************** PP

                            TIGR01236 519 etfkeltdfkypy 531
                                          etf+++td++yp+
  lcl|FitnessBrowser__Cola:Echvi_1300 527 ETFVTPTDYRYPF 539
                                          ***********97 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (533 nodes)
Target sequences:                          1  (543 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.03u 0.01s 00:00:00.04 Elapsed: 00:00:00.04
# Mc/sec: 6.39
//
[ok]
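
How much of the HMM the candidate covers can be read off the domain table in the hmmsearch output above. The sketch below is illustrative only: `parse_domain_row` and `model_coverage` are hypothetical helpers, not part of HMMER or GapMind, and the parser assumes the whitespace-separated column layout of the domain row shown in this report.

```python
from typing import NamedTuple

class DomainHit(NamedTuple):
    score: float
    i_evalue: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int

def parse_domain_row(row: str) -> DomainHit:
    """Parse one row of the per-domain table (hypothetical helper).

    Assumed field order after splitting on whitespace: #, flag, score, bias,
    c-Evalue, i-Evalue, hmmfrom, hmmto, bounds, alifrom, alito, ...
    """
    f = row.split()
    return DomainHit(
        score=float(f[2]),
        i_evalue=float(f[5]),
        hmm_from=int(f[6]),
        hmm_to=int(f[7]),
        ali_from=int(f[9]),
        ali_to=int(f[10]),
    )

def model_coverage(hit: DomainHit, model_length: int) -> float:
    """Fraction of the HMM's match states spanned by the hit."""
    return (hit.hmm_to - hit.hmm_from + 1) / model_length

# Domain row and model length (M=533) copied from the report above.
ROW = "   1 !  804.1   0.2    3e-246    3e-246       1     531 [.      12     539 ..      12     541 .. 0.99"
hit = parse_domain_row(ROW)
coverage = model_coverage(hit, model_length=533)
print(f"score {hit.score} bits, model coverage {coverage:.1%}")
```

For this candidate the hit spans model positions 1 to 531 of 533, so nearly the whole of TIGR01236 is covered.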

This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
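
To make "six-frame translation" concrete, here is a minimal sketch of the idea: a DNA sequence is translated in three reading frames on each strand, so that genes missing from the predicted proteins can still be found. This is not how Curated BLAST is implemented. The function names are illustrative, the standard genetic code is assumed, and codons containing ambiguous bases are translated as "X".

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order (an assumption:
# organisms using a different code would need a different table).
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]

def translate(dna: str) -> str:
    """Translate complete codons from position 0; '*' marks a stop codon."""
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)

def six_frame_translation(dna: str) -> dict[str, str]:
    """Translate all three frames of both strands, keyed '+1'..'+3', '-1'..'-3'."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames

frames = six_frame_translation("ATGAAATAG")
for name, protein in frames.items():
    print(name, protein)
```

Each of the six translated strings can then be searched against protein queries, which is how a protein-level search can find a gene that gene prediction missed.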

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory