GapMind for catabolism of small carbon sources

 

Alignments for a candidate for glcB in Stenotrophomonas chelatiphaga DSM 21508

Align malate synthase A (EC 2.3.3.9) (characterized)
to candidate WP_057509261.1 ABB28_RS14270 malate synthase A

Query= ecocyc::MALATE-SYNTHASE
         (533 letters)



>NCBI__GCF_001431535.1:WP_057509261.1
          Length = 545

 Score =  501 bits (1290), Expect = e-146
 Identities = 267/520 (51%), Positives = 338/520 (65%), Gaps = 17/520 (3%)

Query: 21  EKQILTAEAVEFLTELVTHFTPQRNKLLAARIQQQQDIDNGTLPDFISETASIRDADWKI 80
           +  +L A  +  L  L     P R   LAAR+Q+Q   D G LPDF ++TA IR  DWK+
Sbjct: 28  QSALLPAPLLALLVSLHRAVEPGRQARLAARVQRQAFFDQGGLPDFRTDTAPIRAEDWKV 87

Query: 81  RGIPADLEDRRVEITGPVERKMVINALNANVKVFMADFEDSLAPDWNKVIDGQINLRDAV 140
             +PA L DRRVEITGP + KMVINALN+  KVFMADFED+ AP W  ++ GQ +L  AV
Sbjct: 88  APLPAALLDRRVEITGPTDPKMVINALNSGAKVFMADFEDATAPTWGNLLAGQQSLIGAV 147

Query: 141 NGTISYTNEAG-----KIYQLKP--NPAVLICRVRGLHLPEKHVTWRGEAIPGSLFDFAL 193
            G + +T  A      K Y L+P    AVLI R RG HL EKHV   G+ + G LFD A+
Sbjct: 148 RGDLQFTAPASGSKPSKHYSLRPYEERAVLIVRPRGWHLDEKHVLVDGQRLAGGLFDAAV 207

Query: 194 YFFHNYQALLAKGSGPYFYLPKTQSWQEAAWWSEVFSYAEDRFNLPRGTIKATLLIETLP 253
           + +HN + L+A   GPYFYLPK QS +EAA W    S+ E    LP G IK T+LIETLP
Sbjct: 208 FAYHNARTLMANDRGPYFYLPKLQSMEEAALWETALSHIEGMLGLPHGQIKVTVLIETLP 267

Query: 254 AVFQMDEILHALRDHIVGLNCGRWDYIFSYIKTLKNYPDRVLPDRQAVTMDKPFLNAYSR 313
           AVF+MDEILHALRD IVGLNCGRWDYIFSY+KT + + D+VLP+R  VTM +PFL AYS 
Sbjct: 268 AVFEMDEILHALRDRIVGLNCGRWDYIFSYLKTFRRHADKVLPERGQVTMTQPFLKAYSE 327

Query: 314 LLIKTCHKRGAFAMGGMAAFIP-SKDEEHNNQVLNKVKADKSLEANNGHDGTWIAHPGLA 372
           LLI+TCH+RGA AMGGMAA IP   D   N Q + +V+ADK  E   GHDGTW+AHP L 
Sbjct: 328 LLIQTCHRRGAHAMGGMAAQIPIGNDAAANEQAMARVRADKLREVTAGHDGTWVAHPALI 387

Query: 373 DTAMAVFNDILGSRKNQLEVMREQDAPITADQLLAPCDGERTEEGMRANIRVAVQYIEAW 432
             AMA+F++ +    NQ +V+R QD  +  DQL+A   G  +  G   N+ V V+Y+ AW
Sbjct: 388 PVAMAIFDEHMPG-ANQQQVLR-QDVRVDRDQLIARPPGSISRAGFEGNVEVCVRYLAAW 445

Query: 433 ISGNGCVPIYGLMEDAATAEISRTSIWQWIHHQ-KTLSNGKPVTKALFRQMLGEEMKVIA 491
           + GNGCVPI+ LMEDAATAEISR+ +WQW+H   + L +G  +  AL    L +    + 
Sbjct: 446 LDGNGCVPIHNLMEDAATAEISRSQLWQWLHTPGQQLDDGTAIDAALLDNALAQ----LP 501

Query: 492 SELGEERF--SQGRFDDAARLMEQITTSDELIDFLTLPGY 529
           + LG+       GR D+A  L+ +++ +DEL DFLTLP Y
Sbjct: 502 ARLGDRATLPGGGRVDEAIALLAELSRADELNDFLTLPAY 541


Lambda     K      H
   0.320    0.135    0.407 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 776
Number of extensions: 34
Number of successful extensions: 5
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 533
Length of database: 545
Length adjustment: 35
Effective length of query: 498
Effective length of database: 510
Effective search space:   253980
Effective search space used:   253980
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 52 (24.6 bits)
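
The summary numbers in the report above can be cross-checked by hand. The sketch below recomputes the percentages, the effective search space, the bit score, and the E-value from the printed values, using the standard Karlin-Altschul relations S' = (lambda*S - ln K) / ln 2 and E = m*n*2^(-S'). It assumes the "Gapped" lambda and K apply to this alignment, and because those constants are printed in rounded form the results only approximate what BLAST reports. All variable names are illustrative.

```python
import math

# Values copied from the BLAST report above.
raw_score = 1290
identities, positives, gaps, align_len = 267, 338, 17, 520
query_len, db_len, length_adjustment = 533, 545, 35
gapped_lambda, gapped_k = 0.267, 0.0410  # rounded as printed in the report


def percent(count, total):
    """Return count/total as a whole-number percentage."""
    return round(100 * count / total)


# Effective lengths are the raw lengths minus the length adjustment.
effective_space = (query_len - length_adjustment) * (db_len - length_adjustment)

# Bit score from the raw score, then the E-value from the bit score.
bit_score = (gapped_lambda * raw_score - math.log(gapped_k)) / math.log(2)
e_value = effective_space * 2 ** (-bit_score)

print("identity %:", percent(identities, align_len))
print("positives %:", percent(positives, align_len))
print("gaps %:", percent(gaps, align_len))
print("effective search space:", effective_space)
print("bit score: %.1f" % bit_score)
print("E-value: %.1e" % e_value)
```

The recomputed bit score and E-value land close to the reported 501 bits and e-146; small differences are expected from the rounded constants.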

Align candidate WP_057509261.1 ABB28_RS14270 (malate synthase A)
to HMM TIGR01344 (aceB: malate synthase A (EC 2.3.3.9))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR01344.hmm
# target sequence database:        /tmp/gapView.31932.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01344  [M=511]
Accession:   TIGR01344
Description: malate_syn_A: malate synthase A
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                 Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                 -----------
   4.3e-211  687.5   0.0   4.9e-211  687.3   0.0    1.0  1  lcl|NCBI__GCF_001431535.1:WP_057509261.1  ABB28_RS14270 malate synthase A


Domain annotation for each sequence (and alignments):
>> lcl|NCBI__GCF_001431535.1:WP_057509261.1  ABB28_RS14270 malate synthase A
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  687.3   0.0  4.9e-211  4.9e-211       5     510 ..      35     544 ..      31     545 .] 0.97

  Alignments for each domain:
  == domain 1  score: 687.3 bits;  conditional E-value: 4.9e-211
                                 TIGR01344   5 ealeflaelhrrfaerrkellarrekkqakldkgelldflpetkeireddwkvaaipadlldrrveitG 73 
                                                 l++l++lhr  ++ r++ la+r ++qa +d+g l+df  +t+ ir++dwkva++pa+lldrrveitG
  lcl|NCBI__GCF_001431535.1:WP_057509261.1  35 PLLALLVSLHRAVEPGRQARLAARVQRQAFFDQGGLPDFRTDTAPIRAEDWKVAPLPAALLDRRVEITG 103
                                               568899*************************************************************** PP

                                 TIGR01344  74 PvdrkmvinalnaeakvfladfedsssPtwenvveGqinlkdairgeidftdeesg....keyalk..a 136
                                               P+d kmvinaln++akvf+adfed+++Ptw n++ Gq +l  a+rg+++ft++ sg    k+y l+   
  lcl|NCBI__GCF_001431535.1:WP_057509261.1 104 PTDPKMVINALNSGAKVFMADFEDATAPTWGNLLAGQQSLIGAVRGDLQFTAPASGskpsKHYSLRpyE 172
                                               ***************************************************9765433339*****777 PP

                                 TIGR01344 137 klavlivrprGwhlkerhleidgkaisgslldfglyffhnarellkkGkGPyfylPkleshlearlwnd 205
                                               ++avlivrprGwhl e+h+ +dg+ ++g l+d +++++hnar+l+++ +GPyfylPkl+s  ea lw+ 
  lcl|NCBI__GCF_001431535.1:WP_057509261.1 173 ERAVLIVRPRGWHLDEKHVLVDGQRLAGGLFDAAVFAYHNARTLMANDRGPYFYLPKLQSMEEAALWET 241
                                               99******************************************************************* PP

                                 TIGR01344 206 vfllaqevlglprGtikatvlietlpaafemdeilyelrehssGlncGrwdyifslikklkkaeevvlP 274
                                                +   + +lglp+G ik tvlietlpa+femdeil+ lr+ ++GlncGrwdyifs++k+++++ ++vlP
  lcl|NCBI__GCF_001431535.1:WP_057509261.1 242 ALSHIEGMLGLPHGQIKVTVLIETLPAVFEMDEILHALRDRIVGLNCGRWDYIFSYLKTFRRHADKVLP 310
                                               ********************************************************************* PP

                                 TIGR01344 275 drdavtmdkaflnaysklliqtchrrgafalGGmaafiPikddpaaneaalekvradkereaknGhdGt 343
                                               +r +vtm+++fl+ays+lliqtchrrga+a+GGmaa+iPi +d+aane+a+++vradk re+++GhdGt
  lcl|NCBI__GCF_001431535.1:WP_057509261.1 311 ERGQVTMTQPFLKAYSELLIQTCHRRGAHAMGGMAAQIPIGNDAAANEQAMARVRADKLREVTAGHDGT 379
                                               ********************************************************************* PP

                                 TIGR01344 344 wvahPdlvevalevfdevlgepnqldrvrledvsitaaellevkdasrteeGlrenirvglryieawlr 412
                                               wvahP+l++va+++fde+++ +nq +++r +dv++ + +l++ + +s + +G+  n++v +ry++awl+
  lcl|NCBI__GCF_001431535.1:WP_057509261.1 380 WVAHPALIPVAMAIFDEHMPGANQQQVLR-QDVRVDRDQLIARPPGSISRAGFEGNVEVCVRYLAAWLD 447
                                               ***************************99.9************************************** PP

                                 TIGR01344 413 GsGavpiynlmedaataeisraqlwqwikh.GvvledGekvtselvrdllkeeleklkkesgkeeyaka 480
                                               G+G+vpi+nlmedaataeisr+qlwqw+++ G+ l+dG  + ++l+ ++l++  ++l + ++    + +
  lcl|NCBI__GCF_001431535.1:WP_057509261.1 448 GNGCVPIHNLMEDAATAEISRSQLWQWLHTpGQQLDDGTAIDAALLDNALAQLPARLGDRATLP--GGG 514
                                               ****************************9669*************************9996665..599 PP

                                 TIGR01344 481 rleeaaellerlvlseeledfltlpaydel 510
                                               r++ea  ll +l+ ++el+dfltlpay ++
  lcl|NCBI__GCF_001431535.1:WP_057509261.1 515 RVDEAIALLAELSRADELNDFLTLPAYARI 544
                                               ***************************886 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (511 nodes)
Target sequences:                          1  (545 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.02u 0.00s 00:00:00.02 Elapsed: 00:00:00.02
# Mc/sec: 13.00
//
[ok]
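
To make the domain table easier to read, the following sketch splits the single domain row from the hmmsearch output above into named fields and computes how much of the model and of the protein the alignment spans. The field order is taken from the column header printed in the report; the function and key names are hypothetical, not part of HMMER or GapMind.

```python
# The domain row copied from the hmmsearch output above.
domain_row = (
    "   1 !  687.3   0.0  4.9e-211  4.9e-211       5     510 .."
    "      35     544 ..      31     545 .] 0.97"
)
hmm_length = 511  # M=511 in the "Query:" line
seq_length = 545  # residues searched


def parse_domain_row(row):
    """Split one whitespace-delimited domain row into named fields."""
    f = row.split()
    return {
        "score": float(f[2]),
        "bias": float(f[3]),
        "c_evalue": float(f[4]),
        "i_evalue": float(f[5]),
        "hmm_from": int(f[6]),
        "hmm_to": int(f[7]),
        "ali_from": int(f[9]),
        "ali_to": int(f[10]),
        "env_from": int(f[12]),
        "env_to": int(f[13]),
        "acc": float(f[15]),
    }


dom = parse_domain_row(domain_row)

# Fraction of the model and of the protein covered by the aligned region.
hmm_coverage = (dom["hmm_to"] - dom["hmm_from"] + 1) / hmm_length
seq_coverage = (dom["ali_to"] - dom["ali_from"] + 1) / seq_length

print("score:", dom["score"])
print("HMM coverage: %.3f" % hmm_coverage)
print("sequence coverage: %.3f" % seq_coverage)
```

Here the alignment spans model positions 5 to 510 of 511, so nearly the whole model is matched.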

This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
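
As an illustration of the 50% coverage cutoff, the sketch below computes coverage for the BLAST alignment shown earlier on this page. It assumes that "coverage" means the fraction of the characterized (query) protein spanned by the alignment, which this page does not state explicitly. The high- and medium-confidence criteria are not reproduced here, and the function names are hypothetical.

```python
LOW_CONFIDENCE_MIN_COVERAGE = 0.50  # "at least 50% coverage"


def alignment_coverage(align_start, align_end, protein_length):
    """Fraction of a protein spanned by an alignment (1-based, inclusive)."""
    return (align_end - align_start + 1) / protein_length


def meets_coverage_cutoff(coverage, cutoff=LOW_CONFIDENCE_MIN_COVERAGE):
    """True if the coverage reaches the cutoff."""
    return coverage >= cutoff


# Query positions 21..529 of the 533-letter characterized protein.
query_coverage = alignment_coverage(21, 529, 533)

print("query coverage: %.3f" % query_coverage)
print("meets 50% cutoff:", meets_coverage_cutoff(query_coverage))
```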

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
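
For readers unfamiliar with the term, a six-frame translation reads a DNA sequence in three frames on each strand. The sketch below is a minimal pure-Python version that assumes the standard genetic code and an unambiguous A/C/G/T sequence. It is not the code used by GapMind or Curated BLAST, and every name in it is illustrative.

```python
from itertools import product

# Standard genetic code, with codons enumerated in T, C, A, G order.
BASES = "TCAG"
AMINO_ACIDS = (
    "FFLLSSSSYY**CC*W"
    "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR"
    "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an uppercase A/C/G/T string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons from the start of dna; '*' marks a stop."""
    usable = len(dna) - len(dna) % 3
    return "".join(CODON_TABLE[dna[i:i + 3]] for i in range(0, usable, 3))


def six_frame(dna):
    """Map frame labels (+1..+3, -1..-3) to their translations."""
    dna = dna.upper()
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames["+%d" % (offset + 1)] = translate(dna[offset:])
        frames["-%d" % (offset + 1)] = translate(rc[offset:])
    return frames


for label, protein in sorted(six_frame("ATGGCCTAA").items()):
    print(label, protein)
```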

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory