Align L-glutamate gamma-semialdehyde dehydrogenase (EC 1.2.1.88); Proline dehydrogenase (EC 1.5.5.2) (characterized)
to candidate WP_013090943.1 BC1002_RS15505 trifunctional transcriptional regulator/proline dehydrogenase/L-glutamate gamma-semialdehyde dehydrogenase
Query= reanno::Cup4G11:RR42_RS20125 (1333 letters)

>NCBI__GCF_000092885.1:WP_013090943.1
Length = 1312

Score = 1754 bits (4544), Expect = 0.0
Identities = 926/1350 (68%), Positives = 1063/1350 (78%), Gaps = 55/1350 (4%)

Query: 1    MATTTLGVKLDDASRERLKRVAQSIDRTPHWLIKQAIFTYLEQVERGNIPHETSAAGTGS 60
            MA+TTLGVK+DD R RLK A ++RTPHWLIKQAIF YLE++E G +P E S
Sbjct: 1    MASTTLGVKVDDLLRSRLKDAATRLERTPHWLIKQAIFAYLEKIEHGQLPAELS------ 54

Query: 61   EGAADGADAFDGAA----SDGAIQPFLEFAQSVQPQSVLRAAITAAYRRPESECVPVLLE 116
            GAA AD DGAA DGA PFL+FAQ+VQPQSVLRAAITAAYRRPE ECVP LL
Sbjct: 55   -GAAGTADLADGAAVENEEDGASHPFLDFAQNVQPQSVLRAAITAAYRRPEPECVPFLLG 113

Query: 117  QARLPHQQAEAALAMARTLATRLRERKVGTGREGLVQGLIQEFSLSSQEGVALMCLAEAL 176
            QARLP A AMA L LR + G G V+GLI EFSLSSQEGVALMCLAEAL
Sbjct: 114  QARLPANLAGDVQAMAGKLVETLRTKSKGGG----VEGLIHEFSLSSQEGVALMCLAEAL 169

Query: 177  LRIPDKATRDALIRDKISGANWQSHLGQSPSVFVNAATWGLLFTGKLVATHTEAGLSKAL 236
            LRIPD+ATRDALIRDKIS +W+SH+GQ+PS+FVNAATWGL+ TGKLV T++E LS AL
Sbjct: 170  LRIPDRATRDALIRDKISKGDWKSHVGQAPSLFVNAATWGLMITGKLVTTNSETSLSSAL 229

Query: 237  TRIIGKGGEPLIRKGVDMAMRLMGEQFVTGETISEALANARKYEAEGFRYSYDMLGEAAM 296
            TR+IGKGGEPLIRKGVDMAMRLMGEQFVTGE ISEALAN+RKYEA GFRYSYDMLGEAA
Sbjct: 230  TRLIGKGGEPLIRKGVDMAMRLMGEQFVTGENISEALANSRKYEARGFRYSYDMLGEAAT 289

Query: 297  TEADAQRYLASYEQAINAIGQASRGRGIYEGPGISIKLSALHPRYSRAQHERVIGELYGR 356
            TEADAQRY ASYEQAI+AIG+A+ GRGIYEGPGISIKLSALHPRYSRAQ ER + EL R
Sbjct: 290  TEADAQRYFASYEQAIHAIGKAAGGRGIYEGPGISIKLSALHPRYSRAQQERTMSELLPR 349

Query: 357  LKSLTLLARQYDIGINIDAEEADRLEISLDLLERLCFEPELAGWNGIGFVVQGYQKRCPF 416
            ++SL +LAR+YDIG+NIDAEEADRLE+SLDLLE LCF+PEL GWNGIGFVVQ YQKRCPF
Sbjct: 350  VRSLAILARRYDIGLNIDAEEADRLELSLDLLEALCFDPELQGWNGIGFVVQAYQKRCPF 409

Query: 417  VIDYLIDLARRSRHRLMIRLVKGAYWDSEIKRAQVDGLEGYPVYTRKVYTDVSYVACARK 476
            VIDY++DLARRSRHR+M+RLVKGAYWD+EIKRAQVDGLEGYPVYTRKVYTDVSY+ACA+K
Sbjct: 410  VIDYIVDLARRSRHRIMVRLVKGAYWDTEIKRAQVDGLEGYPVYTRKVYTDVSYLACAKK 469

Query: 477  LLSVPDVIYPQFATHNAHTLAAIYQIAGHNYYPGQYEFQCLHGMGEPLYDQVVGPLADGK 536
            LL PD +YPQFATHNAHTL+AIY +AG+NYYPGQYEFQCLHGMGEPLY++V G K
Sbjct: 470  LLGAPDAVYPQFATHNAHTLSAIYHLAGNNYYPGQYEFQCLHGMGEPLYEEVTG---RDK 526

Query: 537  FNRPCRIYAPVGTHETLLAYLVRRLLENGANTSFVNRIADDTISLDELVADPVAVVEQMH 596
            NRPCR+YAPVGTHETLLAYLVRRLLENGANTSFVNRIAD+++++ +LVADPV ++
Sbjct: 527  LNRPCRVYAPVGTHETLLAYLVRRLLENGANTSFVNRIADESVAIKDLVADPVEEASKI- 585

Query: 597  ADEGALGLPHPRIAQPRTLYGESRANSAGIDLSNEHRLASLSSALLAGTSEAVSAVPLLG 656
            LG PH RI PR LYG R NS G+DLSNEHRLASLSSALLA A P+L
Sbjct: 586  ---VPLGAPHARIPLPRNLYGAERLNSMGLDLSNEHRLASLSSALLASAHHPWRAAPMLE 642

Query: 657  TEAAAGEDVNQPAPVRNPSDQRDVVGHVTEASMAEVEAALQAAVNAAPIWQATPADVRAA 716
            A V VRNP+DQRD+VG V EA+ V AAL AV AAPIWQATP + RA
Sbjct: 643  DNQIA---VGAGRDVRNPADQRDLVGTVVEATPEHVSAALGHAVAAAPIWQATPVEARAD 699

Query: 717  ALERAAELMEAQMQSLMGIIVREAGKTFSNAIAEVREAVDFLRYYAAQVRETFSSDTHRP 776
            L RAA+L+EAQM +LMG++VREAGK+ +NA+AE+REA+DFLRYY+AQ+R+ FS+DTHRP
Sbjct: 700  CLARAADLLEAQMHTLMGLVVREAGKSLANAVAEIREAIDFLRYYSAQIRDEFSNDTHRP 759

Query: 777  LGPVVCISPWNFPLAIFTGQVAAALAAGNTVLAKPAEQTPLIAAQAVRLLREAGVPAGAV 836
            LGPVVCISPWNFPLAIF GQVAAALAAGNTVLAKPAEQTPLIAAQAVR+LREAGVPAGAV
Sbjct: 760  LGPVVCISPWNFPLAIFMGQVAAALAAGNTVLAKPAEQTPLIAAQAVRILREAGVPAGAV 819

Query: 837  QLLPGRGETVGAALVGDARVKGVMFTGSTEVARLLQRSVAGRLDAAGRPVPLIAETGGQN 896
            QLLPG GETVGAALV D R + VMFTGSTEVARL+ ++++GRLD G+P+PLIAETGGQN
Sbjct: 820  QLLPGNGETVGAALVADPRTRAVMFTGSTEVARLINKTLSGRLDPDGKPIPLIAETGGQN 879

Query: 897  AMIVDSSALAEQVVGDVVNSAFDSAGQRCSALRVLCLQEEVADRVLEMLKGAMDELTMGN 956
            AMIVDSSALAEQVV DV+ S+FDSAGQRCSALRVLCLQ+++ADR LEML GAM EL +GN
Sbjct: 880  AMIVDSSALAEQVVADVLQSSFDSAGQRCSALRVLCLQDDIADRTLEMLTGAMRELALGN 939

Query: 957  PDRLSTDVGPVIDEEARGNIVRHIDAMRAKGRRVHQAD-PNGALSAACRNGTFVSPTLIE 1015
            PDRLSTDVGPVID +A+ I HI AMR KGR+V Q P GA GTFV PTLIE
Sbjct: 940  PDRLSTDVGPVIDLDAKRGIDAHIAAMRDKGRKVEQLPLPEGA-----TQGTFVPPTLIE 994

Query: 1016 LDSIEELQREVFGPVLHVVRYPRAGLDTLLAQINGTGYGLTMGIHTRIDETIEHIVERAE 1075
            LDSI+EL+REVFGPVLHVVRY R+ LD LL QI TGYGLT+GIHTRIDETI H++ A
Sbjct: 995  LDSIDELKREVFGPVLHVVRYRRSQLDKLLEQIRATGYGLTLGIHTRIDETIAHVISHAH 1054

Query: 1076 VGNLYVNRNIVGAVVGVQPFGGEGLSGTGPKAGGPLYLHRLLSVCPLD-----AVARVVR 1130
            VGN+YVNRN++GAVVGVQPFGGEGLSGTGPKAGG LYL RLL+ P A A VV
Sbjct: 1055 VGNIYVNRNVIGAVVGVQPFGGEGLSGTGPKAGGALYLQRLLAKRPAGLPKSLAQALVVE 1114

Query: 1131 ASDTVGGADETGPVRRTLTETLATLKEW--AQRESAALPGLVAACERFAAASAAGLSVTL 1188
            A++ G LT L++W A+RE P L A C+ + + AG + L
Sbjct: 1115 VPQAAQAAEKGGNPAAALT----ALRDWLIAERE----PQLAARCDGYLSHVPAGATAVL 1166

Query: 1189 PGPTGERNTYTLLPRAAVLCLAQQETDLAVQLAAVLAAGSQAVWVESPMARALFARLPKA 1248
            GPTGERNTYTL PR VLC+A + VQ AA LA G++A++ E L ++LP +
Sbjct: 1167 SGPTGERNTYTLGPRGTVLCIASTASGARVQFAAALATGNRALF-EGAAGEQLVSQLPAS 1225

Query: 1249 VQSRVRLVADWSAGDTGFDAVLHHGDSDQLRAVCEQLATRPGPIISVQG-----LAHGEP 1303
            +++ + + D FDAVL GDSD+L + +++A R GPI+SVQG LA G+
Sbjct: 1226 LKAHASVK---KSADAPFDAVLFEGDSDELLTLVKEVAKRAGPIVSVQGVAARALASGDE 1282

Query: 1304 NIAIERLLIERSLSVNTAAAGGNASLMTIG 1333
            + A+ERLL ERS+SVNTAAAGGNA+LMTIG
Sbjct: 1283 DYALERLLTERSVSVNTAAAGGNANLMTIG 1312

Lambda K H
0.318 0.133 0.383

Gapped
Lambda K H
0.267 0.0410 0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 4007
Number of extensions: 154
Number of successful extensions: 11
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1333
Length of database: 1312
Length adjustment: 49
Effective length of query: 1284
Effective length of database: 1263
Effective search space: 1621692
Effective search space used: 1621692
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 59 (27.3 bits)
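The summary figures in the alignment report can be checked with simple arithmetic. The sketch below is illustrative only: the variable and function names are hypothetical, the input numbers are copied from the report, and rounding the percentages down is an assumption that happens to match the values shown.

```python
# Values copied from the alignment report.
identities, positives, gaps, aln_len = 926, 1063, 55, 1350
query_len, subject_len, length_adjustment = 1333, 1312, 49


def pct(part, whole):
    """Whole-number percentage, rounded down (assumed to match the report)."""
    return part * 100 // whole


# Effective lengths are the raw lengths minus the length adjustment.
eff_query = query_len - length_adjustment
eff_subject = subject_len - length_adjustment

# The effective search space is the product of the two effective lengths.
search_space = eff_query * eff_subject

print(pct(identities, aln_len), pct(positives, aln_len), pct(gaps, aln_len))
print(eff_query, eff_subject, search_space)
```

With the numbers from this report, the percentages come out as 68, 78 and 4, and the search space as 1621692, in agreement with the values printed above.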
This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
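As a rough illustration of the 50% coverage rule for low-confidence hits, the sketch below computes coverage as the aligned span divided by the full length of a protein. This definition, the function names, and the choice of which protein's length to use are assumptions for illustration; the text above does not spell them out.

```python
def coverage(start, end, length):
    """Fraction of a protein covered by an alignment spanning start..end (1-based, inclusive)."""
    return (end - start + 1) / length


def meets_low_confidence_coverage(start, end, length, threshold=0.5):
    """True if the aligned span covers at least `threshold` of the protein."""
    return coverage(start, end, length) >= threshold


# The alignment on this page spans residues 1..1333 of the 1333-letter query.
print(coverage(1, 1333, 1333))
print(meets_low_confidence_coverage(1, 1333, 1333))
```

For the alignment on this page the query is covered end to end, so the coverage check is comfortably satisfied.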
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory