GapMind for catabolism of small carbon sources

 

Alignments for a candidate for lacZ in Flavobacterium glycines Gm-149

Align β-galactosidase (Gal4214-1) (EC 3.2.1.23) (characterized)
to candidate WP_066328382.1 BLR17_RS01125 DUF4981 domain-containing protein

Query= CAZy::AAX48919.1
         (1046 letters)



>NCBI__GCF_900100165.1:WP_066328382.1
          Length = 1041

 Score = 1000 bits (2586), Expect = 0.0
 Identities = 492/1039 (47%), Positives = 665/1039 (64%), Gaps = 18/1039 (1%)

Query: 12   IFAFISIIVFAQEKPSRNDWENPEVFQINREPARAAFLPFADEASAIADDYTRSPWYMSL 71
            IFA +  +     +  +N+WENP + + N+E  R +F+ F +E +A  +   +S +Y SL
Sbjct: 6    IFALLLAVQLINAQQQQNEWENPNIVERNKEAGRTSFVLFQNETTAATNQAEQSSYYKSL 65

Query: 72   DGKWKFNWSPTPDERPKDFFNTDFNTTTWKEIGVPSNWELVGYGIPIYTNITYPFVKNPP 131
            DG+WKFN    P +RP+DF+  D +  +W  I VPSNWEL G+ IPIYTNITYPF +NPP
Sbjct: 66   DGEWKFNIVKKPIDRPQDFYKVDLDDKSWASIPVPSNWELQGFDIPIYTNITYPFPRNPP 125

Query: 132  FIDHADNPVGSYRRTFELPENWDGRRVYLHFEGGTSAMYVWINGEKVGYSQNTKSPTEFD 191
            F+D+  NPVGSYR  F +PENW+ + V LHF   +    V++NG++VG ++  K+  EFD
Sbjct: 126  FVDNNYNPVGSYRTKFTVPENWNSKEVILHFGSISGYARVYLNGKEVGMTKAAKTAAEFD 185

Query: 192  ITKYVKVGKNQVAVEVYRWSDGSYLEDQDFWRLSGIDRSVYLYSTANTRIADFFARPDLD 251
            +T +++ G+N +AV+V+RW DGSYLEDQDFWRLSGI+RSVYL +     + D+F   DLD
Sbjct: 186  VTPFLQKGENVLAVQVFRWHDGSYLEDQDFWRLSGIERSVYLQAVPKLTVWDYFVHSDLD 245

Query: 252  TSYKNGSLSVDIKLKNANSVAKNNQTVEAKLVDAAGKEVFIKTIKINLGANTVSSTTFEQ 311
             +Y NG     ++L+        N  +   L+DA GK+V+ +T K++   N     TF  
Sbjct: 246  ENYANGLFKTTVQLRAFQQNKIKNANLTLTLLDAQGKKVYAETKKVS---NLTKEVTFNS 302

Query: 312  MVKSPKLWNNETPNLYTLVLTLKDENGKFVETVATSIGFRKVELKNGQLLVNGIRIMVHG 371
             V +   W++E P LY  ++ L+  N      +    GFRKVE+KN QLLVNG  I++ G
Sbjct: 303  TVTNVNKWSSEMPYLYRYIIKLESVNPDENSVIYGKTGFRKVEIKNAQLLVNGKAILIRG 362

Query: 372  VNIHEHNPKTGHYQDEATMMKDIKLMKQLNINAVRCSHYPNNLLWVKLCNKYGLFLVDEA 431
            VN  +H+   GH  D  TM KDI LMKQ NINA+R SHYP++    +LC++YG++++DEA
Sbjct: 363  VNTQDHHETKGHTPDYPTMRKDIALMKQNNINAIRMSHYPHDPYLYQLCDEYGMYVIDEA 422

Query: 432  NIETHGMGAELQGSFDKTKHPAYLPEWKAAHMDRIYSLVERDKNQPSIILWSLGNECGNG 491
            NIE+HGMG E Q    ++KHP+Y+  W  AH+DRI  + E +KN PS+I+WS+GNECGNG
Sbjct: 423  NIESHGMGVEGQNISAESKHPSYVSMWFPAHLDRIKRMFEINKNHPSVIIWSMGNECGNG 482

Query: 492  PVFHEAYNWIKNRDKTRLVQFEQAGEQENTDVVCPMYPSMEYMKEYANRKDVKRPFIMCE 551
            PVFHEAY W+K+ D +R V  EQA E  NTD+V PMYP+++YMKEYA  KD  RPFIMCE
Sbjct: 483  PVFHEAYKWLKSTDPSRPVHSEQAKENSNTDIVAPMYPTIKYMKEYAAAKDKTRPFIMCE 542

Query: 552  YSHAMGNSNGNFQEYWDIIHSSTNMQGGFIWDWVDQGFEETDEAGRKYWAYGGDMGGQNY 611
            YSHAMGNS+GNFQEYWDII+SS +MQGGFIWDWVDQG +  DE G  Y+AYGGD+G ++ 
Sbjct: 543  YSHAMGNSSGNFQEYWDIINSSRHMQGGFIWDWVDQGIKTKDENGTVYYAYGGDLGSKDL 602

Query: 612  TNDQNFCHNGLVWPDRTPHPGAFEVKKVYQDILF--KGVNLDKGIIEVENGFGYTNLDKY 669
             ND+NFC NGL+  +R PHPG  EVKKVYQ ILF  KG+N     + ++N F YTNL++Y
Sbjct: 603  RNDENFCDNGLISANRKPHPGLAEVKKVYQSILFSLKGLNQ----LTIKNIFQYTNLNQY 658

Query: 670  LFKFEVLKNGLVIKSGVINIRLAPQSKKQIQIELPKLTTEDGVEYLLNVFAYTKEGTELL 729
             FKFE+ KNG +++    N+   P   + + I+LP        EY LNV+AYTK  T+++
Sbjct: 659  QFKFELFKNGHLVQEKTFNVDCEP--GENVTIDLPVEALNANAEYFLNVYAYTKTATDMV 716

Query: 730  PQNFEIAREQFSIGESNYFV-KVAKASTNPIVKDSQDA--ITLSANGVEVTINKKTGLMQ 786
            P N E+AREQF+IG+ N+F  K  + S    +K    A  +T   N V  + +   G+++
Sbjct: 717  PVNHEVAREQFAIGQGNFFAEKTMEISKKKDLKFKTKASVLTFETNAVAGSFDLSKGVLK 776

Query: 787  KYTS--GEENYFNQMPVPNFWRAPTDNDFGNYMQVNSNVWRTVGRFSSLDSIEV-KEVST 843
             Y S           P P FWRAPTDNDFGN M     VW+       + S+ V K+   
Sbjct: 777  SYISKNNPSEIVTAFPTPYFWRAPTDNDFGNKMPEKLVVWKEAHLNPEVKSVTVGKQTEQ 836

Query: 844  QTTVVAHLFLKDIASTYTITYSMDADGSLTLQNSFKAGEMALSEMPRFGMLFSLKKELDN 903
               +     L +    Y + Y +  +G + +  S    +  + EMPRFGM   L    +N
Sbjct: 837  GLPIDVEYALAEAKIPYIVNYLIQNNGEIKVTASIDVTDKKMPEMPRFGMRLELPGAYEN 896

Query: 904  FSYYGRGPWENYQDRNTSSLKGIYESKVADQYV-PYTRPQENGYKTDIRWITLTNSSGNG 962
             SYYGRGP ENY DR ++S  GIY+ KVA+Q+   Y RPQE G KTD RW +L N  G G
Sbjct: 897  LSYYGRGPLENYSDRKSASFVGIYQDKVANQFTWEYIRPQECGNKTDARWFSLQNDKGVG 956

Query: 963  IEILGLQPLGVSALNNYPEDFDPGLTKKQQHTNDITPRDEVIICVDLAQRGLGGDNSWGA 1022
            ++I G QPL  S LN   E  DPG TK Q+HTND+ P+D+V + +DLAQRGLGGD+SW +
Sbjct: 957  LKITGEQPLSFSTLNVSTESLDPGKTKMQRHTNDVKPQDKVFVHLDLAQRGLGGDDSWKS 1016

Query: 1023 MPHEQYQLRNKAYSYGFVI 1041
            +PHEQY L    YSY + +
Sbjct: 1017 LPHEQYLLTAPKYSYSYTL 1035


Lambda     K      H
   0.316    0.134    0.410 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 3330
Number of extensions: 183
Number of successful extensions: 7
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1046
Length of database: 1041
Length adjustment: 45
Effective length of query: 1001
Effective length of database: 996
Effective search space:   996996
Effective search space used:   996996
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 58 (26.9 bits)
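
As a rough check on the statistics reported above, the sketch below recomputes the bit score, the effective search space, and the expect value from the raw score (2586) and the gapped Lambda and K. It assumes the standard Karlin-Altschul relations, bit score = (Lambda * S - ln K) / ln 2 and E = search space * 2^(-bit score). The function names are illustrative and are not part of BLAST or GapMind. Because Lambda and K are printed in rounded form, the recomputed bit score agrees with the reported 1000 bits only approximately.

```python
import math

# Values copied from the BLAST report above.
RAW_SCORE = 2586
LAMBDA_GAPPED = 0.267
K_GAPPED = 0.0410
QUERY_LEN = 1046
DB_LEN = 1041
LENGTH_ADJUSTMENT = 45


def bit_score(raw: float, lam: float, k: float) -> float:
    """Convert a raw alignment score to bits (Karlin-Altschul normalization)."""
    return (lam * raw - math.log(k)) / math.log(2)


def effective_search_space(query_len: int, db_len: int, adjustment: int) -> int:
    """Product of the effective query and database lengths (single-sequence database)."""
    return (query_len - adjustment) * (db_len - adjustment)


def expect_value(bits: float, search_space: int) -> float:
    """Expected number of chance hits scoring at least `bits` in the search space."""
    return search_space * 2.0 ** (-bits)


bits = bit_score(RAW_SCORE, LAMBDA_GAPPED, K_GAPPED)
space = effective_search_space(QUERY_LEN, DB_LEN, LENGTH_ADJUSTMENT)
evalue = expect_value(bits, space)

print(f"bit score ~ {bits:.1f}")
print(f"effective search space = {space}")
print(f"expect value ~ {evalue:.3g}")
```

The effective lengths (1046 - 45 = 1001 and 1041 - 45 = 996) multiply to the reported search space of 996996, and the recomputed expect value is far too small to print as anything but 0.0 in the report.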

This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
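
To make the coverage criterion concrete, here is a minimal sketch that computes alignment coverage and percent identity from the coordinates shown in the alignment above. It assumes coverage means the fraction of a sequence spanned by the aligned region (1-based, inclusive coordinates); which sequence GapMind measures coverage on is not stated here, so both are computed. Only the 50% coverage threshold quoted above is encoded, and the function and constant names are hypothetical.

```python
def coverage(start: int, end: int, length: int) -> float:
    """Fraction of a sequence spanned by an alignment (1-based, inclusive)."""
    return (end - start + 1) / length


# Coordinates and lengths copied from the alignment above.
query_cov = coverage(12, 1041, 1046)    # characterized protein (Query)
subject_cov = coverage(6, 1035, 1041)   # candidate protein (Sbjct)
identity = 492 / 1039                   # identical positions / alignment length

# The only threshold stated in the text: at least 50% coverage.
LOW_CONFIDENCE_MIN_COVERAGE = 0.50


def meets_low_confidence_coverage(cov: float) -> bool:
    """True if a hit satisfies the 50% coverage rule for low confidence."""
    return cov >= LOW_CONFIDENCE_MIN_COVERAGE


print(f"query coverage   {query_cov:.1%}")
print(f"subject coverage {subject_cov:.1%}")
print(f"identity         {identity:.1%}")
print("meets 50% coverage rule:", meets_low_confidence_coverage(query_cov))
```

The identity and coverage thresholds for high and medium confidence are not given in this text, so they are deliberately left out of the sketch.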

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory