Align L-glutamate gamma-semialdehyde dehydrogenase (EC 1.2.1.88); Proline dehydrogenase (EC 1.5.5.2) (characterized)
to candidate WP_015928132.1 MNOD_RS06935 trifunctional transcriptional regulator/proline dehydrogenase/L-glutamate gamma-semialdehyde dehydrogenase
Query= reanno::azobra:AZOBR_RS23695
         (1235 letters)

>NCBI__GCF_000022085.1:WP_015928132.1
          Length = 1231

 Score = 1593 bits (4124), Expect = 0.0
 Identities = 832/1222 (68%), Positives = 953/1222 (77%), Gaps = 11/1222 (0%)

Query: 18   FADFAPPIRPATELRAAITAAYRRPEPECLPFLFEQASLPPGVITAAAATARKLITALRA 77
            F  F    R  T LRAA+TAAYRRPE EC+  L   A+LP     A A TA +L+ ALR
Sbjct: 15   FGRFIAGTRHQTGLRAAVTAAYRRPEGECVQALLPLAALPEAQARAVAGTAERLVRALRE 74

Query: 78   KPRGRGVEGLIHEYSLSSQEGMALMCLAEALLRIPDHATRDALIRDKIAGGDWQAHLGKG 137
            K R  GVEGLIHEY+LSSQEG+ALMCLAEALLRIPD ATRDALIRDKIA GDW++H+G
Sbjct: 75   KKRSGGVEGLIHEYALSSQEGVALMCLAEALLRIPDDATRDALIRDKIATGDWKSHVGHS 134

Query: 138  GSMFVNAATWGLLITGKLTSAGGEQALSSALTRLIARGGEPLIRRGVDFAMRMMGEQFVT 197
             S+FVNAATWGL++TGKLT+   E +LS++LTRLIA+GGEPLIRRG D AMR+MGEQFVT
Sbjct: 135  PSLFVNAATWGLVVTGKLTATTSESSLSASLTRLIAKGGEPLIRRGTDLAMRLMGEQFVT 194

Query: 198  GQTIQEALTNARTMEAEGFRYSYDMLGEAALTAEDAARYYADYVNAIHAIGTASAGRGVY 257
            GQTI EAL N+R MEA+GFRYSYDMLGEAA TA DAARY ADY  AIHAIG A+ GRG+Y
Sbjct: 195  GQTIAEALANSRRMEAKGFRYSYDMLGEAATTAADAARYLADYEGAIHAIGRAAQGRGIY 254

Query: 258  EGPGISIKLSAIHPRYSRAQADRVMDELLPRVKALALLAKGYDIGLNIDAEEADRLELSL 317
            EGPGISIKLSA+HPRYSRA+ +RVM ELLPRVK LALLAKGYDIGLNIDAEEADRLE+SL
Sbjct: 255  EGPGISIKLSALHPRYSRAKIERVMGELLPRVKGLALLAKGYDIGLNIDAEEADRLEISL 314

Query: 318  DLMESLCFDPDLAGWNGIGFVVQAYGKRCPYVIDFLIDLARRSGHRLMIRLVKGAYWDSE 377
            DL+E+L  DPDL+GWNGIGFV+QAYGKRCP+V+D+++DLARR+  R+M+RLVKGAYWDSE
Sbjct: 315  DLLEALALDPDLSGWNGIGFVIQAYGKRCPFVVDWIVDLARRANRRIMVRLVKGAYWDSE 374

Query: 378  IKRAQLDGLPDFPVYTRKVYTDVSYVACARKLLAAPEAVFPQFATHNAQTLATIYEMAGS 437
            IKRAQ DGL DFPV+TRKV+TDVSY+ACARKLLAAP+AVFPQFATHNAQTLA I  MAG
Sbjct: 375  IKRAQTDGLEDFPVFTRKVHTDVSYLACARKLLAAPDAVFPQFATHNAQTLAAIMTMAGP 434

Query: 438  DFQVGKYEFQCLHGMGEPLYKEVVGP--LKRPCRIYAPVGTHETLLAYLVRRLLENGANS 495
            +F  G+YEFQCLHGMGEPLY+EVVGP  L RPCRIYAPVGTHETLLAYLVRRLLENGANS
Sbjct: 435  NFYRGQYEFQCLHGMGEPLYEEVVGPDKLNRPCRIYAPVGTHETLLAYLVRRLLENGANS 494

Query: 496  SFVNRIADPAVPVDELVADPVAVARAIAPTGAPHALIALPRNLYAPERANSAGIDLSDET 555
            SFVNRIAD  VP+ +L+ADPV V +A  P G PH  I LPR+L+ P+R NSAG+DLS+E
Sbjct: 495  SFVNRIADEKVPIADLIADPVTVVQATNPAGEPHERITLPRHLFGPDRENSAGLDLSNEE 554

Query: 556  ELARLSAALSASAEMTWTAAPLLADGERAGQAQPVRNPADRRDVVGSVTEASEALVAEAF 615
             LA L+  L  S    W A P  +  +       VRNPADRRDVVG   EA+ A + EA
Sbjct: 555  RLAALAEDLKRSVAQDWQALP--SKAKAPAPLTDVRNPADRRDVVGRWREATAAEMTEAL 612

Query: 616  GHAVAAASAWAATPPEERAASLFRAADTMQERMPTLLGLIVREAGKSLPNAIAEVREAID 675
              AV AA  WAATP  +RAA+L RAAD M+ RM  L+GLIVREAGKS PNA+AEVREA+D
Sbjct: 613  DAAVQAAPGWAATPAGDRAAALRRAADLMEARMGILIGLIVREAGKSFPNAVAEVREAVD 672

Query: 676  FLRYYGAQVRDRFDNATHRPLGPVVCISPWNFPLAIFSGQIAAALAAGNPVLAKPAEETP 735
            FLRYY A+V     +     LGPVVCISPWNFPLAIF+GQ+AAALAAGN VLAKPAEETP
Sbjct: 673  FLRYYAAEVVRTLGSDRLPALGPVVCISPWNFPLAIFTGQVAAALAAGNVVLAKPAEETP 732

Query: 736  LIAAEAVRILHAAGIPAGALQLLPGAGEVGAALVGHEAVRGVMFTGSTEVARLIQRQLAG 795
            LIAAEAVR+LH AGIP   LQL+PGAG VGAALV    V GVMFTGST VARLIQRQLA
Sbjct: 733  LIAAEAVRLLHEAGIPIDGLQLVPGAGAVGAALVADPRVMGVMFTGSTAVARLIQRQLAD 792

Query: 796  RLLPDGAPIPLIAETGGQNAMIVDSSALAEQVVGDVIASAFDSAGQRCSALRILCLQEDV 855
            RL     PIP IAETGGQNA++VDSSALAEQVVGDVIASAFDSAGQRCSALRILCLQE+V
Sbjct: 793  RLTASDQPIPFIAETGGQNALVVDSSALAEQVVGDVIASAFDSAGQRCSALRILCLQEEV 852

Query: 856  ADRTLAMLKGAMRELRIGNPDRLAVDVGPVISEEARATIAAHIEAMRAKGRNVEFLPLPA 915
            ADR L ML+GAM EL +GNP  L+VDVGPVI+ EAR  I  H+  M A+G  V  LPL
Sbjct: 853  ADRILEMLRGAMDELAVGNPGDLSVDVGPVITAEARDGIERHVTTMEARGHRVIRLPLGP 912

Query: 916  ETADGTFIAPTVIEIGGIHELEREVFGPVLHVVRFHRDDLDALVDSINATGYGLTFGLHT 975
            ET+ GTF+APT+IEIG I ++E+EVFGPVLHV+R+ R DLD L+D IN TGYGLTFGLHT
Sbjct: 913  ETSHGTFVAPTIIEIGRIADVEQEVFGPVLHVLRYRRADLDRLIDDINETGYGLTFGLHT 972

Query: 976  RIDATIERVTGRIGAGNVYVNRNTIGAVVGVQPFGGHGLSGTGPKAGGPLYLSRLLSRRP 1035
            RID TI RV  R+GAGNVYVNRN IGAVVGVQPFGG GLSGTGPKAGGPLYL RL+   P
Sbjct: 973  RIDETISRVVNRVGAGNVYVNRNIIGAVVGVQPFGGSGLSGTGPKAGGPLYLGRLMGTPP 1032

Query: 1036 KGWLEFRGPDAA---RAAGLAYGEWLRAKGFTAEASRCAGYVARSAIGGGAELNGPVGER 1092
            +  L  RG DAA    +    Y +WLRA+GF  +A R  G+V+RS++G   EL GPVGER
Sbjct: 1033 QTAL--RGLDAAPTTLSVARIYADWLRAQGFGEQAERVVGHVSRSSLGARVELPGPVGER 1090

Query: 1093 NLYELHGRGRVLLLPQTRTGLLLQLGAVLATGNSAAVDAPPDLAELLRGLPPALAARVRT 1152
            N+Y L  RGRV  L +T  GLL+Q+GA+LATGN A +DA   +  +L  LP  +   +
Sbjct: 1091 NVYALRPRGRVAALAKTAEGLLVQVGAILATGNVAVLDAGSPVRAVLDHLPKEVRPSIEI 1150

Query: 1153 TADWRDVGPLAAVLVEGDRERVTAINRRVADLPGPILLVQAATAEALAAGRGEGYDLDLL 1212
              DWR    L  VL EGDR+ +  +NR VA   G I+ VQA TA AL    G+ YDL+ L
Sbjct: 1151 VPDWRATPELRGVLFEGDRDDLLQLNRAVAAREGSIVPVQATTAAALK--NGDDYDLNRL 1208

Query: 1213 LNERSVSVNTAAAGGNASLVAM 1234
            L E S+S NTAAAGGNASL+++
Sbjct: 1209 LEECSISTNTAAAGGNASLMSI 1230

Lambda     K      H
   0.319    0.136    0.396

Gapped
Lambda     K      H
   0.267   0.0410    0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 3828
Number of extensions: 160
Number of successful extensions: 7
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1235
Length of database: 1231
Length adjustment: 47
Effective length of query: 1188
Effective length of database: 1184
Effective search space: 1406592
Effective search space used: 1406592
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 59 (27.3 bits)
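The summary statistics in the report above can be cross-checked against each other. The following is a minimal sketch, not part of GapMind: it assumes the standard Karlin-Altschul bit-score conversion with the gapped Lambda and K, assumes the effective search space is the product of the two length-adjusted sequence lengths, and assumes percentages are truncated rather than rounded (inferred from 953/1222 being printed as 77%). All variable and function names are illustrative.

```python
import math

# Values copied from the alignment report above.
raw_score = 4124
gapped_lambda = 0.267
gapped_k = 0.0410
query_len, subject_len = 1235, 1231
length_adjustment = 47
identities, positives, gaps, aligned_columns = 832, 953, 11, 1222

# Assumed Karlin-Altschul conversion: bits = (lambda * S - ln K) / ln 2
bit_score = (gapped_lambda * raw_score - math.log(gapped_k)) / math.log(2)

# Assumed: both lengths are shortened by the reported length adjustment,
# and the search space is the product of the shortened lengths.
effective_query = query_len - length_adjustment
effective_subject = subject_len - length_adjustment
search_space = effective_query * effective_subject


def percent(count: int, total: int) -> int:
    """Integer percentage, truncated (assumption inferred from the report)."""
    return count * 100 // total


print("bit score:", round(bit_score))
print("effective lengths:", effective_query, effective_subject)
print("effective search space:", search_space)
print("identity %:", percent(identities, aligned_columns))
print("positives %:", percent(positives, aligned_columns))
print("gaps %:", percent(gaps, aligned_columns))
```

Under these assumptions the sketch reproduces the reported 1593 bits, the effective lengths 1188 and 1184, and the effective search space of 1406592.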
This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
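As a rough illustration of the 50% coverage floor, the sketch below computes coverage for the alignment on this page. It assumes that coverage means the fraction of a sequence spanned by the aligned region; the exact definition GapMind uses is not stated here, and the function and variable names are hypothetical.

```python
def coverage(start: int, end: int, length: int) -> float:
    """Fraction of a sequence spanned by an alignment (1-based, inclusive)."""
    return (end - start + 1) / length


# Alignment coordinates taken from the report on this page.
query_coverage = coverage(18, 1234, 1235)    # characterized protein
subject_coverage = coverage(15, 1230, 1231)  # candidate WP_015928132.1

# The text gives 50% coverage as the floor for a "low confidence" hit.
LOW_CONFIDENCE_MIN_COVERAGE = 0.5
meets_floor = query_coverage >= LOW_CONFIDENCE_MIN_COVERAGE

print(f"query coverage:   {query_coverage:.3f}")
print(f"subject coverage: {subject_coverage:.3f}")
print("meets 50% coverage floor:", meets_floor)
```

This only checks the coverage floor; the additional identity and score criteria for high and medium confidence are not modeled.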
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory