GapMind for catabolism of small carbon sources

 

Alignments for a candidate for put1 in Kocuria turfanensis HO-9042

Align L-glutamate gamma-semialdehyde dehydrogenase (EC 1.2.1.88); Proline dehydrogenase (EC 1.5.5.2) (characterized)
to candidate WP_084271505.1 AYX06_RS07690 bifunctional proline dehydrogenase/L-glutamate gamma-semialdehyde dehydrogenase

Query= reanno::azobra:AZOBR_RS23695
         (1235 letters)



>NCBI__GCF_001580365.1:WP_084271505.1
          Length = 1168

 Score =  253 bits (647), Expect = 5e-71
 Identities = 299/971 (30%), Positives = 418/971 (43%), Gaps = 114/971 (11%)

Query: 262  ISIKLSAIHPRYSRAQADRVMDELLPRVKALALLAKGYDIG--LNIDAEEADRLELSLDL 319
            +SIK+S+    +S    D  + E++ ++  L   A  +     +N+D EE   L+++L +
Sbjct: 196  VSIKVSSTVAPHSAWAFDEAVAEVVEKLTPLYARAVSFSTPKFINLDMEEYKDLDMTLAV 255

Query: 320  MESLCFDPDLAGWNGIGFVVQAYGKRCPYVIDFLIDL-----ARRS--GHRLMIRLVKGA 372
               +   P        G V+Q Y    P  +  +I L     ARR+  G R+ +R+VKGA
Sbjct: 256  FTRILDQPQFQDLEA-GIVLQTY---LPDALQAMIRLQEWAQARRAAGGARVKVRVVKGA 311

Query: 373  YWDSEIKRAQLDGLPDFPVYTRKVYTDVSYVACARKLLAAP--EAVFPQFATHNAQTLAT 430
                E   A L   P    +  K  +D  Y       L     +AV    A HN   +A 
Sbjct: 312  NLPMEQVEASLHDWP-LATWGTKQDSDTHYKRVLNYALQPEHADAVRIGVAGHNLFDIAF 370

Query: 431  IYEMAGSDFQVGKYEFQCLHGMGE---PLYKEVVGPLKRPCRIYAPV---GTHETLLAYL 484
             + +A         +F+ L GM        K  VG L     +Y PV      +  +AYL
Sbjct: 371  AWLLASERGVTEAIDFEMLLGMATGQAQAVKRDVGSLL----LYTPVVHPAEFDVAIAYL 426

Query: 485  VRRLLENGANSSFVNRIADPAVPVDELVADPVAVARAIAPTGAPHALIALPRNLYAPERA 544
            +RRL E  +  +F++ + + +   +    +     R +A   A    + LP         
Sbjct: 427  IRRLEEGASQENFMSAVFELSANEELFEREK---NRFLASLAALDDTVPLPHR------- 476

Query: 545  NSAGIDLSDETELARLSAALSASAEMTWTAAPLLADGERAGQAQPVRNPADRRDVVGSVT 604
                   + +  L  L A     A  T    P +A     G+A   R P       G+ T
Sbjct: 477  -------TQDRRLPELPAPTDGFAN-TPDTDPAVAANRDWGRAILDRVPGS---TAGTAT 525

Query: 605  EASEALVAE-----AFGHAVAAASAWAATPPEERAASLFRAADTMQERMPTLLGLIVREA 659
             A+ AL  E         A+AA   WAA    +RA  L RA + +  R   LL +   EA
Sbjct: 526  VAAAALQTEEQLDRVIDAALAAGRGWAALTGAQRAEVLHRAGEALAARRAELLEVAAAEA 585

Query: 660  GKSLPNAIAEVREAIDFLRYYGAQVR--DRFDNATHRPLGPVVCISPWNFPLAIFSGQIA 717
            GK+L  A  E+ EAIDF  YY  + R  D  D AT  P    V   PWNFP+AI +G   
Sbjct: 586  GKTLDQADPEISEAIDFAHYYAERARELDEVDGATFLPSRLTVVTPPWNFPVAIPAGSAL 645

Query: 718  AALAAGNPVLAKPAEETP---LIAAEAV-RILHAAGIPAGALQLLP-GAGEVGAALVGHE 772
            AALAAG+PV+ KPA  T     + AEA+   L A+GIP   L L+      +G  L+ H 
Sbjct: 646  AALAAGSPVVFKPAGPTKRCGAVVAEAIWEALDASGIPRDVLALVDLEERALGRQLISHP 705

Query: 773  AVRGVMFTGSTEVARLIQRQLAGRLLPDGAPIPLIAETGGQNAMIVDSSALAEQVVGDVI 832
            +V  ++ TG+ E A L +           + +PL+AET G+NA++V  SA  +    DV 
Sbjct: 706  SVDRLILTGAYETAELFR--------SFRSDLPLLAETSGKNALVVTPSADFDLAAKDVA 757

Query: 833  ASAFDSAGQRCSALRILCLQEDVA--DRTLAMLKGAMRELRIGNPDRLAVDVGPVISEEA 890
             SAF  AGQ+CSA  ++ L   VA   R    L  A+  L +  P   A  +GP ++E A
Sbjct: 758  YSAFGHAGQKCSAASLVILVGSVAKSKRFRNQLVDAVSSLTVDYPTNPAAQMGP-LTEPA 816

Query: 891  RATIAAHIEAMRAKGRNVEFLPLPAETAD-GTFIAP---TVIEIGGIHELEREVFGPVLH 946
               +   ++A+        +L  P +  D G   +P   T ++ G    L  E FGPVL 
Sbjct: 817  EGKL---LQALTTLEDGQSWLIEPQQLDDTGKLWSPGVRTGVQRGSEFHLV-EYFGPVLG 872

Query: 947  VVRFHRDDLDALVDSINATGYGLTFGLHTRIDATIERVTGRIGAGNVYVNRNTIGAVVGV 1006
            V+    + L+  V   N   YGLT GLH+     I      + AGNVYVNR   GA+V  
Sbjct: 873  VMT--AETLEEAVAIQNEVDYGLTGGLHSLDPEEIGYWLEHVQAGNVYVNRGITGAIVQR 930

Query: 1007 QPFGGHGLSGTGP--KAGGPLYLSRLLSRRPKGWLEFRG-----PDAARAAGLAYGEWLR 1059
            QPFGG   S  GP  KAGGP YL  L    P      RG      DAA A  LA     R
Sbjct: 931  QPFGGWKKSAVGPGTKAGGPSYLMGLGEWEP---APLRGDGRPVTDAAVADLLAAARAGR 987

Query: 1060 AKGFTAEASRCAGYVARSAIGGGAELNGPVGE-RNLYELHGRGRVLLLPQT--------R 1110
            A G T + +       RS     AE  G   +   LY      R L LP T         
Sbjct: 988  A-GVTEQEAGTLEQALRSDEAAWAEEFGVARDVSQLYAERNVFRYLPLPVTVRLSEGEPL 1046

Query: 1111 TGLLLQLGAVLATGN----SAAVDAPPDLAELL--RGLPPALAARVRTTADW----RDVG 1160
              LL  +GA L  G     S+A++ P  +  +L  RG+P     RV   A W     ++G
Sbjct: 1047 ADLLRVVGAGLRAGAELSVSSALELPAAVRSVLTGRGVP----VRVEDDAAWLARAAELG 1102

Query: 1161 PLAAVLVEGDR 1171
                 L+ GDR
Sbjct: 1103 GGRVRLIGGDR 1113


Lambda     K      H
   0.319    0.136    0.396 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 3182
Number of extensions: 167
Number of successful extensions: 7
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1235
Length of database: 1168
Length adjustment: 47
Effective length of query: 1188
Effective length of database: 1121
Effective search space:  1331748
Effective search space used:  1331748
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 59 (27.3 bits)
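
The bit score and E-value reported for this hit can be checked against the statistics printed above. The sketch below assumes the standard Karlin-Altschul relations (bit score = (lambda * S - ln K) / ln 2, and E = effective search space * 2^(-bit score)) and uses the gapped Lambda and K values from the report. The function names are illustrative and are not part of BLAST or GapMind.

```python
import math

# Values copied from the BLAST report above (gapped Lambda and K).
LAMBDA_GAPPED = 0.267
K_GAPPED = 0.0410
RAW_SCORE = 647          # the number in parentheses after "253 bits"
QUERY_LENGTH = 1235
DATABASE_LENGTH = 1168   # a single subject sequence
LENGTH_ADJUSTMENT = 47


def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to bits (Karlin-Altschul normalization)."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def effective_search_space(query_length, database_length, adjustment):
    """Product of the effective query and database lengths (one sequence)."""
    return (query_length - adjustment) * (database_length - adjustment)


def expect_value(bits, search_space):
    """Expected number of chance hits scoring at least this many bits."""
    return search_space * 2.0 ** (-bits)


bits = bit_score(RAW_SCORE, LAMBDA_GAPPED, K_GAPPED)
space = effective_search_space(QUERY_LENGTH, DATABASE_LENGTH, LENGTH_ADJUSTMENT)
evalue = expect_value(bits, space)

print(f"bit score: {bits:.1f}")
print(f"effective search space: {space}")
print(f"E-value: {evalue:.1e}")
```

Under these assumptions the computed search space equals the reported 1331748, and the bit score and E-value land close to the reported 253 bits and 5e-71; small differences come from the rounding of Lambda and K in the printed report.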

This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
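
As a rough illustration of the identity and coverage figures these rules refer to, the sketch below recomputes them for the alignment shown on this page. It assumes that coverage means the fraction of a protein's residues spanned by the alignment; GapMind's exact definition is not stated here, and the helper names are hypothetical.

```python
# Numbers read from the alignment above.
IDENTITIES = 299
ALIGNMENT_LENGTH = 971            # aligned columns, including gaps
QUERY_START, QUERY_END = 262, 1171
QUERY_LENGTH = 1235               # characterized protein
SUBJECT_START, SUBJECT_END = 196, 1113
SUBJECT_LENGTH = 1168             # candidate protein


def percent_identity(identities, alignment_length):
    """Identical positions as a percentage of aligned columns."""
    return 100.0 * identities / alignment_length


def coverage(start, end, length):
    """Fraction of a sequence spanned by the alignment (assumed definition)."""
    return (end - start + 1) / length


identity = percent_identity(IDENTITIES, ALIGNMENT_LENGTH)
query_cov = coverage(QUERY_START, QUERY_END, QUERY_LENGTH)
subject_cov = coverage(SUBJECT_START, SUBJECT_END, SUBJECT_LENGTH)

print(f"identity: {identity:.1f}%")
print(f"query coverage: {query_cov:.1%}")
print(f"subject coverage: {subject_cov:.1%}")

# The 50% figure comes from the text above; applying it to query coverage
# is an assumption made for this sketch.
print("meets 50% coverage:", query_cov >= 0.5)
```

The identity computed this way is slightly above the 30% printed in the report, which appears to round down.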

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory