GapMind for catabolism of small carbon sources

 

Alignments for a candidate for iorAB in Herbaspirillum seropedicae SmR1

Align phenylpyruvate ferredoxin oxidoreductase (EC 1.2.7.8) (characterized)
to candidate HSERO_RS12755 HSERO_RS12755 MFS transporter

Query= reanno::Marino:GFF880
         (1172 letters)



>lcl|FitnessBrowser__HerbieS:HSERO_RS12755 HSERO_RS12755 MFS
            transporter
          Length = 1180

 Score = 1090 bits (2819), Expect = 0.0
 Identities = 578/1168 (49%), Positives = 781/1168 (66%), Gaps = 30/1168 (2%)

Query: 2    SADTPQLDDYKLEDRYLRESGRVFLTGTQALVRIPLMQAALDRKQGLNTAGLVSGYRGSP 61
            S D   L    L+D+Y   +G ++L+G QALVR+PL+Q   DR  GLNTAG +SGYRGSP
Sbjct: 8    STDDVPLAPVSLDDKYTATTGAIYLSGIQALVRLPLLQQIRDRAAGLNTAGFISGYRGSP 67

Query: 62   LGAVDQALWQAKDLLDENRIDFVPAINEDLAATILLGTQQVETDEDRQVEGVFGLWYGKG 121
            LG +D+ALW A+  L  +R+ F P +NED+AAT + GTQQV+       +GV+ +WYGKG
Sbjct: 68   LGGLDEALWHAQPHLAAHRVKFQPGVNEDMAATAVWGTQQVKLIGPSDYDGVYAMWYGKG 127

Query: 122  PGVDRAGDALKHGTTYGSSPHGGVLVVAGDDHGCVSSSMPHQSDVAFMSFFMPTINPANI 181
            PGVDR GD LKH    G+S HGGVL+VAGDDHG  SS++PHQSD  F +  +P + P N+
Sbjct: 128  PGVDRCGDVLKHMNHAGTSAHGGVLLVAGDDHGAYSSTLPHQSDHIFSASMIPMLYPCNV 187

Query: 182  AEYLEFGLWGYALSRYSGCWVGFKAISETVESAASVEIPP-APDFVTPDDFTAPESGLHY 240
             EYL+ GL G+A+SRYSGC VGFKA+++TVES+ASV+  P       P DF  PE GL+ 
Sbjct: 188  QEYLDLGLHGWAMSRYSGCVVGFKALADTVESSASVDADPFRVQIRLPSDFVMPEGGLNA 247

Query: 241  RW-PDLPGPQLETR----IEHKLAAVQAFARANRIDRCLFDNKEARFGIVTTGKGHLDLL 295
            R   D  G Q   +     ++K+ A  A+ARANR++R + D+ +AR GI+ +GK +LD+L
Sbjct: 248  RLSTDTLGVQARKQEALMQDYKIYAALAYARANRLNRVMIDSPKARLGIIASGKSYLDVL 307

Query: 296  EALDLLGIDEDKARDMGLDIYKVGMVWPLERRGILDFVHGKEEVLVIEEKRGIIESQIKE 355
            EAL  LGIDE  A ++GL ++KV M WPLE  G+ +F  G +E+LV+EEKR ++E Q+KE
Sbjct: 308  EALSELGIDEAFAAEIGLRLFKVSMPWPLEPDGVREFAQGLDEILVVEEKRQMVEYQLKE 367

Query: 356  YMSEPDRPGEVLITGKQDELGRPLIP-------YVGELSPKLVAGFLAARLGRFFEVD-F 407
             +          + GK DE G  + P          + S   +A  +AAR+ RF   D  
Sbjct: 368  QLYNWRDDVRPRVIGKFDEKGEWVAPRGEWLLTSKADFSVAQIARVIAARIARFHTSDLI 427

Query: 408  SERMAEISA--MTTAQDPGGVKRMPYFCSGCPHNTSTKVPEGSKALAGIGCHFMASWMGR 465
              R+A + A     ++      R  Y+CSGCPHN+ST+VPEGS ALAGIGCH MA+ +  
Sbjct: 428  KARLAFLDAKDAVLSKAVNTPPRPAYYCSGCPHNSSTRVPEGSFALAGIGCHVMATAIYP 487

Query: 466  NTESL-IQMGGEGVNWIGKSRYTGNPHVFQNLGEGTYFHSGSMAIRQAVAAGINITYKIL 524
                L   MGGEG  WIG++ ++  PHVF NLG+GTYFHSG +AIR AVAA +N+TYKIL
Sbjct: 488  EFNKLTTHMGGEGAPWIGQAPFSKVPHVFANLGDGTYFHSGYLAIRAAVAAKVNMTYKIL 547

Query: 525  FNDAVAMTGGQPVDGQITVDRIAQQMAAEGVNRVVVLSDEPEKYDGHHDLFPKDVTFHDR 584
            +NDAVAMTGGQPVDG ++V  IAQQMAAEGV R+ ++S++P +Y     L P  VT HDR
Sbjct: 548  YNDAVAMTGGQPVDGTVSVPMIAQQMAAEGVQRIALVSEDPGRYADRSSL-PAAVTVHDR 606

Query: 585  SELDQVQRELRDIPGCTVLIYDQTCAAEKRRRRKRKQFPDPAKRAFINHHVCEGCGDCSV 644
             ++D VQRELR++PG TV+IYDQTCAAEKRRRRK+  +PDP +R FIN  VCEGCGDC V
Sbjct: 607  KDMDAVQRELRELPGVTVIIYDQTCAAEKRRRRKKGDYPDPNQRLFINAAVCEGCGDCGV 666

Query: 645  QSNCLSVVPRKTELGRKRKIDQSSCNKDFSCVNGFCPSFVTIEGGQLRKSRGVDTGSVLT 704
            QSNC S++P +T+LGRKR IDQSSCNKD+SCV GFCPSFVT+ GG+LRKSR   TG    
Sbjct: 667  QSNCTSILPLETDLGRKRVIDQSSCNKDYSCVKGFCPSFVTVTGGKLRKSR---TGVSRQ 723

Query: 705  RKLAD---IPAPKLPEMTGSYDLLVGGVGGTGVVTVGQLITMAAHLESRGASVLDFMGFA 761
             +  D   +P P LP     +++L+ G+GGTGV+T+G L+ MAAHLE +GASVLD  G +
Sbjct: 724  EERDDFGLLPQPVLPACDTPFNILINGIGGTGVITIGALMGMAAHLEGKGASVLDMTGMS 783

Query: 762  QKGGTVLSYVRMAPSPDKLHQVRISNGQADAVIACDLVVASSQKALSVLRPNHTRIVANE 821
            QK G+V S+VR+A   D +   RI+ G+AD V+ CD++ A +  A+S +RP  TR V N 
Sbjct: 784  QKNGSVTSHVRIAARRDAIRAQRIATGEADLVLGCDMLTAGAFDAISKMRPGRTRAVVNL 843

Query: 822  AELPTADYVLFRDADMKADKRLGLLKNAVGEDHFDQLDANGIAEKLMGDTVFSNVMMLGF 881
             + P   +    D +   ++   L+  +VG+   D +DA  +A  LMGD++ +N+ MLG+
Sbjct: 844  HQQPPGQFARNPDWEFPVEEVKALIVESVGQ-QADFIDATRLATALMGDSIATNLFMLGY 902

Query: 882  AWQKGLLPLSEAALMKAIELNGVAIDRNKEAFGWGRLSAVDPSAVTDLLDDSNAQVVEVK 941
            AWQ+G LPL+EA+L++AIELNGVA+  NK AF WGR +AVD + V  +     AQ V + 
Sbjct: 903  AWQRGELPLTEASLLRAIELNGVAVQANKTAFAWGRRAAVDLARVEQIA--VPAQPVLLH 960

Query: 942  PEPTLDELINTRHKHLVNYQNQRWADQYRDAVAGVRKAEESLGETNLLLTRAVAQQLYRF 1001
               +LD+LI  R   L +YQ+  +A Q+ + V  VR AE +LG     L  AVA+ L R 
Sbjct: 961  LPQSLDQLIKRRVSLLTDYQDAAYAAQFLEVVEAVRAAEAALGSDK--LATAVARNLSRL 1018

Query: 1002 MAYKDEYEVARLFAETDFMKEVNETFEGDFKVHFHLAPPLLSGETDAQGRPKKRRFGPWM 1061
            MAYKDEYEVARL+    F KE+ + FEGDF + FHLAPPLL+   D QG   K R+G W+
Sbjct: 1019 MAYKDEYEVARLYTNGQFQKELAQQFEGDFSLSFHLAPPLLA-RKDGQGHLLKARYGGWV 1077

Query: 1062 FRAFRLLAKLRGLRGTAIDPFRYSADRKLDRAMLKDYQSLVDRIGRELNASNYETFLQLA 1121
             +AF+LLA+++GLRG  +D F ++ +R+++R ++  Y+ LV  +   L A+N  T ++LA
Sbjct: 1078 MQAFKLLARMKGLRGGLLDLFGHTEERRMERELIVQYRQLVLDLLPRLTAANLATAIELA 1137

Query: 1122 ELPADVRGYGPVREQAAESIREKQTQLI 1149
            +LP  VRGYG V+ +A  ++R +Q QL+
Sbjct: 1138 QLPEQVRGYGHVKLKAVHAMRTRQQQLL 1165


Lambda     K      H
   0.319    0.136    0.405 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 2948
Number of extensions: 104
Number of successful extensions: 11
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1172
Length of database: 1180
Length adjustment: 47
Effective length of query: 1125
Effective length of database: 1133
Effective search space:  1274625
Effective search space used:  1274625
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 58 (26.9 bits)
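
The bit score and effective search space reported above can be checked from the printed parameters. The following is a minimal sketch, assuming the standard Karlin-Altschul normalization S' = (lambda * S - ln K) / ln 2 with the gapped lambda and K, and assuming the effective search space is the product of the two length-adjusted sequence lengths. The function names are illustrative and are not part of BLAST or GapMind.

```python
import math

def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to bits (Karlin-Altschul normalization)."""
    return (lam * raw_score - math.log(k)) / math.log(2)

def effective_search_space(query_len, db_len, length_adjustment):
    """Product of the query and database lengths after the length adjustment."""
    return (query_len - length_adjustment) * (db_len - length_adjustment)

# Values copied from the report above. Lambda and K are rounded as printed,
# so the recomputed bit score only approximately matches the reported one.
bits = bit_score(2819, lam=0.267, k=0.0410)
space = effective_search_space(1172, 1180, 47)

print(f"bit score: {bits:.1f}")
print(f"effective search space: {space}")
```

Because lambda and K are printed to only three significant digits, the recomputed bit score agrees with the reported 1090 bits to within rounding rather than exactly.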

This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
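
As an illustration of the 50% coverage floor, the coverage and identity of the alignment shown on this page can be computed from its coordinates. This is a sketch only: it assumes coverage means the fraction of the characterized (query) protein spanned by the alignment, which is not stated here, and it does not reproduce the high- or medium-confidence criteria. The names `coverage` and `MIN_COVERAGE` are hypothetical.

```python
def coverage(start, end, length):
    """Fraction of a sequence spanned by an alignment (1-based, inclusive)."""
    return (end - start + 1) / length

# Coordinates and lengths copied from the alignment above.
query_cov = coverage(2, 1149, 1172)    # characterized protein (Query)
subject_cov = coverage(8, 1165, 1180)  # candidate HSERO_RS12755 (Sbjct)
identity = 578 / 1168                  # identities / alignment columns

MIN_COVERAGE = 0.50  # the "low confidence" floor stated in the text

# Assumption: coverage is measured on the characterized (query) protein.
meets_floor = query_cov >= MIN_COVERAGE

print(f"query coverage:   {query_cov:.1%}")
print(f"subject coverage: {subject_cov:.1%}")
print(f"identity:         {identity:.1%}")
print(f"meets 50% coverage floor: {meets_floor}")
```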

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
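
For readers unfamiliar with the term, a six-frame translation translates a DNA sequence in all three reading frames on both strands, so that protein-coding regions missed by gene prediction can still be searched. The sketch below assumes the standard genetic code and is not the code used by GapMind or Curated BLAST; all names in it are illustrative.

```python
from itertools import product

# Standard genetic code, with codons ordered T, C, A, G at each position.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def translate(dna):
    """Translate complete codons from the start of dna; unknown codons become X."""
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)

def six_frame_translation(dna):
    """Return the translations of frames +1..+3 and -1..-3 as a dict."""
    dna = dna.upper()
    reverse_complement = dna.translate(COMPLEMENT)[::-1]
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(reverse_complement[offset:])
    return frames

# Toy input, not taken from any genome.
for name, protein in six_frame_translation("ATGGCCTAA").items():
    print(name, protein)
```

Stop codons are kept as `*` rather than splitting the translation into open reading frames, which keeps the sketch short; a real search would usually split on stops first.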

For more information, see the paper from 2019 on GapMind for amino acid biosynthesis, the paper from 2022 on GapMind for carbon sources, or view the source code.

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory