GapMind for catabolism of small carbon sources

 

Alignments for a candidate for edd in Phyllobacterium brassicacearum STM 196

Align phosphogluconate dehydratase (characterized)
to candidate WP_106709318.1 CU102_RS02320 phosphogluconate dehydratase

Query= CharProtDB::CH_024239
         (603 letters)



>NCBI__GCF_003010955.1:WP_106709318.1
          Length = 607

 Score =  743 bits (1918), Expect = 0.0
 Identities = 371/595 (62%), Positives = 457/595 (76%), Gaps = 3/595 (0%)

Query: 8   VTNRIIERSRETRSAYLARIEQAKTSTVHRSQLACGNLAHGFAACQPEDKASLKSMLRNN 67
           +T+RI E+S+ TR  YL  + +A +    RS LAC NLAHGFAAC P DKA+L   +  N
Sbjct: 10  ITHRICEQSKPTRDVYLDHLREAASRKPKRSALACANLAHGFAACSPSDKAALAGDVVPN 69

Query: 68  IAIITSYNDMLSAHQPYEHYPEIIRKALHEANAVGQVAGGVPAMCDGVTQGQDGMELSLL 127
           + IITSYNDMLSAHQP+E YP++I+ A  EA  V QVAGGVPAMCDGVTQGQ GMELSL 
Sbjct: 70  LGIITSYNDMLSAHQPFETYPQLIKAAAKEAGGVAQVAGGVPAMCDGVTQGQPGMELSLF 129

Query: 128 SREVIAMSAAVGLSHNMFDGALFLGVCDKIVPGLTMAALSFGHLPAVFVPSGPMASGLPN 187
           SR+VIAM+ A+GLSH+MFD A++LGVCDKIVPGL + AL+FGHLPAVF+P+GPM +GLPN
Sbjct: 130 SRDVIAMATAIGLSHDMFDAAVYLGVCDKIVPGLVIGALTFGHLPAVFIPAGPMTTGLPN 189

Query: 188 KEKVRIRQLYAEGKVDRMALLESEAASYHAPGTCTFYGTANTNQMVVEFMGMQLPGSSFV 247
            EK + RQLYAEGKV R ALLESE+ SYH PGTCTFYGTAN+NQM++E MG+ +PGSSF+
Sbjct: 190 DEKAKTRQLYAEGKVGREALLESESKSYHGPGTCTFYGTANSNQMLMEIMGLHMPGSSFI 249

Query: 248 HPDSPLRDALTAAAARQVTRMTGNGNEWMPIGKMIDEKVVVNGIVALLATGGSTNHTMHL 307
           +P +PLRDALT  AA++   +T  GNE+ P+G+MIDE+ +VNG+V L ATGGSTNHTMHL
Sbjct: 250 NPGTPLRDALTREAAKRALAITALGNEYTPVGEMIDERSIVNGVVGLHATGGSTNHTMHL 309

Query: 308 VAMARAAGIQINWDDFSDLSDVVPLMARLYPNGPADINHFQAAGGVPVLVRELLKAGLLH 367
           VAMA AAGI++ W D SDLSDVVPL+AR+YPNG AD+NHF AAGG+  ++RELL  GLLH
Sbjct: 310 VAMAAAAGIKLTWQDISDLSDVVPLLARVYPNGLADVNHFHAAGGMGYIIRELLDGGLLH 369

Query: 368 EDVNTV--AGFGLSRYTLEPWL-NNGELDWREGAEKSLDSNVIASFEQPFSHHGGTKVLS 424
           EDV TV   G GL  YT+EP L  NG +     A +S D  V+++ +QPF   GG K+L 
Sbjct: 370 EDVKTVWGGGDGLRAYTIEPKLGENGTVVREPVAAESADKKVLSTCKQPFQVTGGLKMLK 429

Query: 425 GNLGRAVMKTSAVPVENQVIEAPAVVFESQHDVMPAFEAGLLDRDCVVVVRHQGPKANGM 484
           GNLG AV+KTSAV  +  +IEAPA+VF+SQ  +  AF+AG LDRD V VVR QGP+ANGM
Sbjct: 430 GNLGTAVIKTSAVKADRHIIEAPAIVFDSQAALQDAFKAGKLDRDFVAVVRFQGPRANGM 489

Query: 485 PELHKLMPPLGVLLDRCFKIALVTDGRLSGASGKVPSAIHVTPEAYDGGLLAKVRDGDII 544
           PELHKL P LGVL DR   +ALVTDGR+SGASGKVP+AIHVTPEA DGG++ K+ DGD++
Sbjct: 490 PELHKLTPALGVLQDRGHMVALVTDGRMSGASGKVPAAIHVTPEALDGGIIGKIHDGDVV 549

Query: 545 RVNGQTGELTLLVDEAELAAREPHIPDLSASRVGTGRELFSALREKLSGAEQGAT 599
           R++ + G L +L D   LAAR     D++ +  G GRELF+A R  +  AE GA+
Sbjct: 550 RLDAEIGTLDVLEDPDVLAARPTPEVDINHNSYGMGRELFAAFRNVVGKAENGAS 604
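
The summary figures for this alignment can be re-derived from the counts and coordinates shown above. The following is a minimal sketch, not part of GapMind itself; the function names are hypothetical, and the assumption that percentages are truncated rather than rounded is inferred only from this report (457/595 is about 76.8% but is printed as 76%).

```python
def percent_truncated(count: int, total: int) -> int:
    """Integer percentage, truncated (assumed from the figures in this report)."""
    return (100 * count) // total


def coverage(start: int, end: int, length: int) -> float:
    """Fraction of a sequence spanned by an alignment (1-based, inclusive)."""
    return (end - start + 1) / length


# Counts copied from the "Identities / Positives / Gaps" line.
identities, positives, gaps, align_len = 371, 457, 3, 595

pct_identity = percent_truncated(identities, align_len)
pct_positive = percent_truncated(positives, align_len)
pct_gaps = percent_truncated(gaps, align_len)

# Query residues 8..599 of 603; subject residues 10..604 of 607.
query_cov = coverage(8, 599, 603)
subject_cov = coverage(10, 604, 607)

print(pct_identity, pct_positive, pct_gaps)
print(round(query_cov, 3), round(subject_cov, 3))
```

Under these assumptions the alignment covers roughly 98% of both the characterized query and the candidate.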


Lambda     K      H
   0.318    0.134    0.392 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1129
Number of extensions: 49
Number of successful extensions: 2
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 603
Length of database: 607
Length adjustment: 37
Effective length of query: 566
Effective length of database: 570
Effective search space:   322620
Effective search space used:   322620
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 53 (25.0 bits)
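
The statistics above are related by the standard Karlin-Altschul formulas, and the reported values can be checked against each other. This is a sketch under stated assumptions: it uses the gapped Lambda and K printed in the report (which are rounded, so the result is approximate), and the variable and function names are illustrative rather than taken from any BLAST source code.

```python
import math


def bit_score(raw_score: float, lam: float, k: float) -> float:
    """Convert a raw score to bits: S' = (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


# Values copied from the report.
query_len, db_len, num_sequences = 603, 607, 1
length_adjustment = 37
gapped_lambda, gapped_k = 0.267, 0.0410
raw_score = 1918

# Effective lengths subtract the length adjustment (once per database sequence).
eff_query = query_len - length_adjustment
eff_db = db_len - num_sequences * length_adjustment
search_space = eff_query * eff_db

bits = bit_score(raw_score, gapped_lambda, gapped_k)

# Expected number of chance hits with at least this score: E = m * n * 2^(-S').
e_value = search_space * 2.0 ** (-bits)

print(eff_query, eff_db, search_space)
print(round(bits), e_value)
```

The computed E-value is far below any printable precision, which is consistent with the report showing "Expect = 0.0".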

Align candidate WP_106709318.1 CU102_RS02320 (phosphogluconate dehydratase)
to HMM TIGR01196 (edd: phosphogluconate dehydratase (EC 4.2.1.12))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR01196.hmm
# target sequence database:        /tmp/gapView.3250564.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01196  [M=601]
Accession:   TIGR01196
Description: edd: phosphogluconate dehydratase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
          0 1038.1   0.9          0 1037.9   0.9    1.0  1  NCBI__GCF_003010955.1:WP_106709318.1  


Domain annotation for each sequence (and alignments):
>> NCBI__GCF_003010955.1:WP_106709318.1  
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 ! 1037.9   0.9         0         0       3     600 ..       6     606 ..       4     607 .] 0.99

  Alignments for each domain:
  == domain 1  score: 1037.9 bits;  conditional E-value: 0
                             TIGR01196   3 rlaeiteriierskktrekylekirsaktkgklrstlgcgnlahgvaalsesekvelksekrknlaiitaynd 75 
                                           r+++it+ri e+sk+tr+ yl+++r+a+++ ++rs+l+c+nlahg+aa+s+s+k++l+ + ++nl+iit+ynd
  NCBI__GCF_003010955.1:WP_106709318.1   6 RVKSITHRICEQSKPTRDVYLDHLREAASRKPKRSALACANLAHGFAACSPSDKAALAGDVVPNLGIITSYND 78 
                                           7999********************************************************************* PP

                             TIGR01196  76 mlsahqpfkeypdlikkalqeanavaqvagGvpamcdGvtqGedGmelsllsrdvialstaiglshnmfdgal 148
                                           mlsahqpf++yp+lik a++ea++vaqvagGvpamcdGvtqG++Gmelsl+srdvia++taiglsh+mfd+a+
  NCBI__GCF_003010955.1:WP_106709318.1  79 MLSAHQPFETYPQLIKAAAKEAGGVAQVAGGVPAMCDGVTQGQPGMELSLFSRDVIAMATAIGLSHDMFDAAV 151
                                           ************************************************************************* PP

                             TIGR01196 149 flGvcdkivpGlliaalsfGhlpavfvpaGpmasGlenkekakvrqlfaeGkvdreellksemasyhapGtct 221
                                           +lGvcdkivpGl+i+al+fGhlpavf+paGpm++Gl+n+ekak+rql+aeGkv+re+ll+se++syh+pGtct
  NCBI__GCF_003010955.1:WP_106709318.1 152 YLGVCDKIVPGLVIGALTFGHLPAVFIPAGPMTTGLPNDEKAKTRQLYAEGKVGREALLESESKSYHGPGTCT 224
                                           ************************************************************************* PP

                             TIGR01196 222 fyGtansnqmlvelmGlhlpgasfvnpntplrdaltreaakrlarltakngevlplaelideksivnalvgll 294
                                           fyGtansnqml+e+mGlh+pg+sf+np tplrdaltreaakr+ ++ta ++e++p++e+ide+sivn++vgl+
  NCBI__GCF_003010955.1:WP_106709318.1 225 FYGTANSNQMLMEIMGLHMPGSSFINPGTPLRDALTREAAKRALAITALGNEYTPVGEMIDERSIVNGVVGLH 297
                                           ************************************************************************* PP

                             TIGR01196 295 atGGstnhtlhlvaiaraaGiilnwddlselsdlvpllarvypnGkadvnhfeaaGGlsflirellkeGllhe 367
                                           atGGstnht+hlva+a aaGi l+w+d+s+lsd+vpllarvypnG advnhf+aaGG++++irell+ Gllhe
  NCBI__GCF_003010955.1:WP_106709318.1 298 ATGGSTNHTMHLVAMAAAAGIKLTWQDISDLSDVVPLLARVYPNGLADVNHFHAAGGMGYIIRELLDGGLLHE 370
                                           ************************************************************************* PP

                             TIGR01196 368 dvetvag..kGlrrytkepfled.gkleyreaaeksldedilrkvdkpfsaeGGlkllkGnlGravikvsavk 437
                                           dv+tv g   Glr+yt+ep+l + g +++++ a +s+d+++l + ++pf+ +GGlk+lkGnlG avik+savk
  NCBI__GCF_003010955.1:WP_106709318.1 371 DVKTVWGggDGLRAYTIEPKLGEnGTVVREPVAAESADKKVLSTCKQPFQVTGGLKMLKGNLGTAVIKTSAVK 443
                                           *****987799**********764899999999**************************************** PP

                             TIGR01196 438 eesrvieapaivfkdqaellaafkagelerdlvavvrfqGpkanGmpelhklttvlGvlqdrgfkvalvtdGr 510
                                            + ++ieapaivf++qa l++afkag+l+rd+vavvrfqGp+anGmpelhklt++lGvlqdrg+ valvtdGr
  NCBI__GCF_003010955.1:WP_106709318.1 444 ADRHIIEAPAIVFDSQAALQDAFKAGKLDRDFVAVVRFQGPRANGMPELHKLTPALGVLQDRGHMVALVTDGR 516
                                           ************************************************************************* PP

                             TIGR01196 511 lsGasGkvpaaihvtpealegGalakirdGdlirldavngelevlvddaelkareleeldlednelGlGrelf 583
                                           +sGasGkvpaaihvtpeal+gG + ki+dGd++rlda  g l+vl+d   l+ar + e+d+++n++G+Grelf
  NCBI__GCF_003010955.1:WP_106709318.1 517 MSGASGKVPAAIHVTPEALDGGIIGKIHDGDVVRLDAEIGTLDVLEDPDVLAARPTPEVDINHNSYGMGRELF 589
                                           ************************************************************************* PP

                             TIGR01196 584 aalrekvssaeeGassl 600
                                           aa+r+ v+ ae+Gas++
  NCBI__GCF_003010955.1:WP_106709318.1 590 AAFRNVVGKAENGASVF 606
                                           **************998 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (601 nodes)
Target sequences:                          1  (607 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.01u 0.01s 00:00:00.02 Elapsed: 00:00:00.01
# Mc/sec: 30.44
//
[ok]
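
The domain row in the hmmsearch output above gives the coordinates needed to estimate how much of the model and of the candidate the alignment covers. The parser below is a minimal sketch that assumes the whitespace-separated column layout shown in this report; `DomainHit` and `parse_domain_row` are hypothetical names, not part of HMMER or GapMind.

```python
from typing import NamedTuple


class DomainHit(NamedTuple):
    score: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int


def parse_domain_row(row: str) -> DomainHit:
    """Parse one row of the per-domain table (column order as printed above)."""
    f = row.split()
    # f[0]=#, f[1]=!/?, f[2]=score, f[3]=bias, f[4]=c-Evalue, f[5]=i-Evalue,
    # f[6]=hmmfrom, f[7]=hmm to, f[8]=.., f[9]=alifrom, f[10]=ali to
    return DomainHit(float(f[2]), int(f[6]), int(f[7]), int(f[9]), int(f[10]))


row = ("   1 ! 1037.9   0.9         0         0       3     600 .."
       "       6     606 ..       4     607 .] 0.99")
hit = parse_domain_row(row)

model_len = 601  # from "Query: TIGR01196 [M=601]"
seq_len = 607    # from "Target sequences: 1 (607 residues searched)"

hmm_cov = (hit.hmm_to - hit.hmm_from + 1) / model_len
seq_cov = (hit.ali_to - hit.ali_from + 1) / seq_len

print(hit)
print(round(hmm_cov, 3), round(seq_cov, 3))
```

For scripted use, hmmsearch's tabular output options are generally a more robust source than the human-readable report parsed here.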

This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
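
The definition of a gap given above (a step with no high- or medium-confidence candidate) can be expressed in a few lines. This is only an illustrative sketch: the data layout, the function name `find_gaps`, and the example steps other than `edd` are hypothetical and do not reflect GapMind's internal representation.

```python
def find_gaps(candidates_by_step: dict[str, list[str]]) -> list[str]:
    """Return the steps that have no high- or medium-confidence candidate."""
    return [
        step
        for step, levels in candidates_by_step.items()
        if not any(level in ("high", "medium") for level in levels)
    ]


# Hypothetical input: each step maps to the confidence levels of its candidates.
example = {
    "edd": ["high"],           # a step with a high-confidence candidate
    "stepA": ["low", "low"],   # only low-confidence candidates: a gap
    "stepB": [],               # no candidates at all: a gap
    "stepC": ["medium", "low"],
}

gaps = find_gaps(example)
print(gaps)
```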

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory