GapMind for catabolism of small carbon sources

Alignments for a candidate for put1 in Methylobacterium nodulans ORS 2060

Align L-glutamate gamma-semialdehyde dehydrogenase (EC 1.2.1.88); Proline dehydrogenase (EC 1.5.5.2) (characterized)
to candidate WP_015928132.1 MNOD_RS06935 trifunctional transcriptional regulator/proline dehydrogenase/L-glutamate gamma-semialdehyde dehydrogenase

Query= reanno::azobra:AZOBR_RS23695
         (1235 letters)



>NCBI__GCF_000022085.1:WP_015928132.1
          Length = 1231

 Score = 1593 bits (4124), Expect = 0.0
 Identities = 832/1222 (68%), Positives = 953/1222 (77%), Gaps = 11/1222 (0%)

Query: 18   FADFAPPIRPATELRAAITAAYRRPEPECLPFLFEQASLPPGVITAAAATARKLITALRA 77
            F  F    R  T LRAA+TAAYRRPE EC+  L   A+LP     A A TA +L+ ALR 
Sbjct: 15   FGRFIAGTRHQTGLRAAVTAAYRRPEGECVQALLPLAALPEAQARAVAGTAERLVRALRE 74

Query: 78   KPRGRGVEGLIHEYSLSSQEGMALMCLAEALLRIPDHATRDALIRDKIAGGDWQAHLGKG 137
            K R  GVEGLIHEY+LSSQEG+ALMCLAEALLRIPD ATRDALIRDKIA GDW++H+G  
Sbjct: 75   KKRSGGVEGLIHEYALSSQEGVALMCLAEALLRIPDDATRDALIRDKIATGDWKSHVGHS 134

Query: 138  GSMFVNAATWGLLITGKLTSAGGEQALSSALTRLIARGGEPLIRRGVDFAMRMMGEQFVT 197
             S+FVNAATWGL++TGKLT+   E +LS++LTRLIA+GGEPLIRRG D AMR+MGEQFVT
Sbjct: 135  PSLFVNAATWGLVVTGKLTATTSESSLSASLTRLIAKGGEPLIRRGTDLAMRLMGEQFVT 194

Query: 198  GQTIQEALTNARTMEAEGFRYSYDMLGEAALTAEDAARYYADYVNAIHAIGTASAGRGVY 257
            GQTI EAL N+R MEA+GFRYSYDMLGEAA TA DAARY ADY  AIHAIG A+ GRG+Y
Sbjct: 195  GQTIAEALANSRRMEAKGFRYSYDMLGEAATTAADAARYLADYEGAIHAIGRAAQGRGIY 254

Query: 258  EGPGISIKLSAIHPRYSRAQADRVMDELLPRVKALALLAKGYDIGLNIDAEEADRLELSL 317
            EGPGISIKLSA+HPRYSRA+ +RVM ELLPRVK LALLAKGYDIGLNIDAEEADRLE+SL
Sbjct: 255  EGPGISIKLSALHPRYSRAKIERVMGELLPRVKGLALLAKGYDIGLNIDAEEADRLEISL 314

Query: 318  DLMESLCFDPDLAGWNGIGFVVQAYGKRCPYVIDFLIDLARRSGHRLMIRLVKGAYWDSE 377
            DL+E+L  DPDL+GWNGIGFV+QAYGKRCP+V+D+++DLARR+  R+M+RLVKGAYWDSE
Sbjct: 315  DLLEALALDPDLSGWNGIGFVIQAYGKRCPFVVDWIVDLARRANRRIMVRLVKGAYWDSE 374

Query: 378  IKRAQLDGLPDFPVYTRKVYTDVSYVACARKLLAAPEAVFPQFATHNAQTLATIYEMAGS 437
            IKRAQ DGL DFPV+TRKV+TDVSY+ACARKLLAAP+AVFPQFATHNAQTLA I  MAG 
Sbjct: 375  IKRAQTDGLEDFPVFTRKVHTDVSYLACARKLLAAPDAVFPQFATHNAQTLAAIMTMAGP 434

Query: 438  DFQVGKYEFQCLHGMGEPLYKEVVGP--LKRPCRIYAPVGTHETLLAYLVRRLLENGANS 495
            +F  G+YEFQCLHGMGEPLY+EVVGP  L RPCRIYAPVGTHETLLAYLVRRLLENGANS
Sbjct: 435  NFYRGQYEFQCLHGMGEPLYEEVVGPDKLNRPCRIYAPVGTHETLLAYLVRRLLENGANS 494

Query: 496  SFVNRIADPAVPVDELVADPVAVARAIAPTGAPHALIALPRNLYAPERANSAGIDLSDET 555
            SFVNRIAD  VP+ +L+ADPV V +A  P G PH  I LPR+L+ P+R NSAG+DLS+E 
Sbjct: 495  SFVNRIADEKVPIADLIADPVTVVQATNPAGEPHERITLPRHLFGPDRENSAGLDLSNEE 554

Query: 556  ELARLSAALSASAEMTWTAAPLLADGERAGQAQPVRNPADRRDVVGSVTEASEALVAEAF 615
             LA L+  L  S    W A P  +  +       VRNPADRRDVVG   EA+ A + EA 
Sbjct: 555  RLAALAEDLKRSVAQDWQALP--SKAKAPAPLTDVRNPADRRDVVGRWREATAAEMTEAL 612

Query: 616  GHAVAAASAWAATPPEERAASLFRAADTMQERMPTLLGLIVREAGKSLPNAIAEVREAID 675
              AV AA  WAATP  +RAA+L RAAD M+ RM  L+GLIVREAGKS PNA+AEVREA+D
Sbjct: 613  DAAVQAAPGWAATPAGDRAAALRRAADLMEARMGILIGLIVREAGKSFPNAVAEVREAVD 672

Query: 676  FLRYYGAQVRDRFDNATHRPLGPVVCISPWNFPLAIFSGQIAAALAAGNPVLAKPAEETP 735
            FLRYY A+V     +     LGPVVCISPWNFPLAIF+GQ+AAALAAGN VLAKPAEETP
Sbjct: 673  FLRYYAAEVVRTLGSDRLPALGPVVCISPWNFPLAIFTGQVAAALAAGNVVLAKPAEETP 732

Query: 736  LIAAEAVRILHAAGIPAGALQLLPGAGEVGAALVGHEAVRGVMFTGSTEVARLIQRQLAG 795
            LIAAEAVR+LH AGIP   LQL+PGAG VGAALV    V GVMFTGST VARLIQRQLA 
Sbjct: 733  LIAAEAVRLLHEAGIPIDGLQLVPGAGAVGAALVADPRVMGVMFTGSTAVARLIQRQLAD 792

Query: 796  RLLPDGAPIPLIAETGGQNAMIVDSSALAEQVVGDVIASAFDSAGQRCSALRILCLQEDV 855
            RL     PIP IAETGGQNA++VDSSALAEQVVGDVIASAFDSAGQRCSALRILCLQE+V
Sbjct: 793  RLTASDQPIPFIAETGGQNALVVDSSALAEQVVGDVIASAFDSAGQRCSALRILCLQEEV 852

Query: 856  ADRTLAMLKGAMRELRIGNPDRLAVDVGPVISEEARATIAAHIEAMRAKGRNVEFLPLPA 915
            ADR L ML+GAM EL +GNP  L+VDVGPVI+ EAR  I  H+  M A+G  V  LPL  
Sbjct: 853  ADRILEMLRGAMDELAVGNPGDLSVDVGPVITAEARDGIERHVTTMEARGHRVIRLPLGP 912

Query: 916  ETADGTFIAPTVIEIGGIHELEREVFGPVLHVVRFHRDDLDALVDSINATGYGLTFGLHT 975
            ET+ GTF+APT+IEIG I ++E+EVFGPVLHV+R+ R DLD L+D IN TGYGLTFGLHT
Sbjct: 913  ETSHGTFVAPTIIEIGRIADVEQEVFGPVLHVLRYRRADLDRLIDDINETGYGLTFGLHT 972

Query: 976  RIDATIERVTGRIGAGNVYVNRNTIGAVVGVQPFGGHGLSGTGPKAGGPLYLSRLLSRRP 1035
            RID TI RV  R+GAGNVYVNRN IGAVVGVQPFGG GLSGTGPKAGGPLYL RL+   P
Sbjct: 973  RIDETISRVVNRVGAGNVYVNRNIIGAVVGVQPFGGSGLSGTGPKAGGPLYLGRLMGTPP 1032

Query: 1036 KGWLEFRGPDAA---RAAGLAYGEWLRAKGFTAEASRCAGYVARSAIGGGAELNGPVGER 1092
            +  L  RG DAA    +    Y +WLRA+GF  +A R  G+V+RS++G   EL GPVGER
Sbjct: 1033 QTAL--RGLDAAPTTLSVARIYADWLRAQGFGEQAERVVGHVSRSSLGARVELPGPVGER 1090

Query: 1093 NLYELHGRGRVLLLPQTRTGLLLQLGAVLATGNSAAVDAPPDLAELLRGLPPALAARVRT 1152
            N+Y L  RGRV  L +T  GLL+Q+GA+LATGN A +DA   +  +L  LP  +   +  
Sbjct: 1091 NVYALRPRGRVAALAKTAEGLLVQVGAILATGNVAVLDAGSPVRAVLDHLPKEVRPSIEI 1150

Query: 1153 TADWRDVGPLAAVLVEGDRERVTAINRRVADLPGPILLVQAATAEALAAGRGEGYDLDLL 1212
              DWR    L  VL EGDR+ +  +NR VA   G I+ VQA TA AL    G+ YDL+ L
Sbjct: 1151 VPDWRATPELRGVLFEGDRDDLLQLNRAVAAREGSIVPVQATTAAALK--NGDDYDLNRL 1208

Query: 1213 LNERSVSVNTAAAGGNASLVAM 1234
            L E S+S NTAAAGGNASL+++
Sbjct: 1209 LEECSISTNTAAAGGNASLMSI 1230


Lambda     K      H
   0.319    0.136    0.396 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 3828
Number of extensions: 160
Number of successful extensions: 7
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1235
Length of database: 1231
Length adjustment: 47
Effective length of query: 1188
Effective length of database: 1184
Effective search space:  1406592
Effective search space used:  1406592
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 59 (27.3 bits)
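
The figures in this report can be cross-checked with a few lines of Python. The sketch below is not part of BLAST or GapMind. It assumes the standard Karlin-Altschul normalization, bits = (lambda * raw - ln K) / ln 2, applied with the gapped Lambda and K values printed above, and it assumes percentages are truncated rather than rounded, which matches 953/1222 being reported as 77%. The function names are illustrative.

```python
import math


def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to bits (Karlin-Altschul normalization)."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def effective_search_space(query_len, db_len, length_adjustment):
    """Product of the effective (length-adjusted) query and database lengths."""
    return (query_len - length_adjustment) * (db_len - length_adjustment)


def percent(numerator, denominator):
    """Whole-number percentage, truncated (an assumption about the report's rounding)."""
    return 100 * numerator // denominator


# Values copied from the report above.
print(round(bit_score(4124, 0.267, 0.0410)))    # raw score 4124, gapped Lambda and K
print(effective_search_space(1235, 1231, 47))   # query length, database length, adjustment
print(percent(832, 1222), percent(953, 1222))   # identities, positives
```

With the values from this report, the three lines should reproduce the 1593-bit score, the effective search space of 1406592, and the 68% and 77% figures.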

This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
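
The definition of a gap given above (a step with no high- or medium-confidence candidate) can be sketched in a few lines of Python. This is only an illustration, not GapMind's own code: the input format, the function name, and the step names are hypothetical, and the confidence labels are made-up inputs rather than results from this analysis.

```python
# Confidence levels that keep a step from being counted as a gap.
CONFIDENT = {"high", "medium"}


def find_gaps(candidates_by_step):
    """Return the steps that have no high- or medium-confidence candidate."""
    return [
        step
        for step, levels in candidates_by_step.items()
        if not CONFIDENT.intersection(levels)
    ]


# Hypothetical pathway: each step maps to the confidence levels of its candidates.
example = {
    "step_a": ["high", "low"],   # has a high-confidence candidate
    "step_b": ["low"],           # only a low-confidence candidate: a gap
    "step_c": [],                # no candidates at all: a gap
}
print(find_gaps(example))
```

A step with only low-confidence candidates is treated the same as a step with none, which follows the wording of the definition.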

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory