Align L-glutamate gamma-semialdehyde dehydrogenase (EC 1.2.1.88); Proline dehydrogenase (EC 1.5.5.2) (characterized)
to candidate WP_062735451.1 AYX06_RS08775 aldehyde dehydrogenase family protein
Query= reanno::Cup4G11:RR42_RS20125 (1333 letters)
>NCBI__GCF_001580365.1:WP_062735451.1 Length = 1178
Score = 286 bits (733), Expect = 6e-81
Identities = 323/1051 (30%), Positives = 446/1051 (42%), Gaps = 126/1051 (11%)

Query: 257  RLMGEQFVTGETISEALANARKYEAEGFRYSYDMLGEAAMTEADAQRYLASYEQAINAIG 316
            RL+G V A RK A+G R + ++LGE + + +A ++LA +A
Sbjct: 119  RLVGHMIVDARPKPFGQA-VRKLTAQGHRLNINLLGEEVLGQEEADKHLA------DAAA 171

Query: 317  QASRGRGIYEGPGISIKLSALHPRYSRAQHERVIGELYGRLKSLTLLARQYDIG---INI 373
            R Y +S+K+S++ P+ S E + + RL + A Q G IN+
Sbjct: 172  LLDRDDVDY----VSLKVSSVVPQLSHWGFEGTVAAVVERLLPVCRAAAQAPAGSKFINL 227

Query: 374  DAEEADRLEISLDLLERLCFEPELAGWNGIGFVVQGYQKRCPFVIDYLIDLARRSRHR-- 431
            D E L ++ ++ RL PEL G G VVQ Y P + + L+ S R
Sbjct: 228  DMEVYSDLHLTTEVFTRLLERPELRGLEA-GIVVQAY---LPDALAVVRRLSAWSAERVA 283

Query: 432  -----LMIRLVKGAYWDSEIKRAQVDGLEGYPVYT--RKVYTDVSYVACARKLLSVPDVI 484
            + +RLVKGA E A+ L G+P+ T K TD +Y +++L V
Sbjct: 284  AGGAGIKVRLVKGANLPMERVDAE---LHGWPLATWGTKQDTDTNY----KRVLDW--VF 334

Query: 485  YPQ--------FATHNAHTLAAIYQIAGHNYYPGQYEFQCLHGMGEPLYDQVVGPLADGK 536
            P+ A HN +A + +A + EF+ L GM V + +
Sbjct: 335  RPERMSGLRLGVAGHNLFDIAFAHLLAQRRGVTERIEFEMLQGMAVEQARAVSADVGE-- 392

Query: 537  FNRPCRIYAPV---GTHETLLAYLVRRLLENGANTSFVNRIADDTISLDELVADPVAVVE 593
            +Y P + ++YLVRRL EN ++ +F++ I D + +
Sbjct: 393  ----LLLYVPAVRPAEFDVAVSYLVRRLEENASHENFMSGIFDLAPDNEVFAREERRFRA 448

Query: 594  QMHA-DEGALGLPHPRIAQPRTLYGESRANSAGIDLSNEHRLASLSSALLAGTSEAVSAV 652
            + A D P P AQ R+ G A S + L R + + +
Sbjct: 449  ALEALDPRQDADPVPARAQDRSAPG---AGSEHLPLDEAFRNEPDTDVAVPANQAWAEGI 505

Query: 653  PLLGTEAAAGEDVNQPAPVRNPSDQRDVVGHVTEASMAEVEAALQAAVNAAPIWQATPAD 712
            T+ E++ P PV A+VE + AA W A A
Sbjct: 506  AARITDRRWYEELPVPGPV----------------DAADVERLVATGRAAAEDWHARGAR 549

Query: 713  VRAAALERAAELMEAQMQSLMGIIVREAGKTFSNAIAEVREAVDFLRYYAAQVRETFSSD 772
            RA L RAA+++ A+ L+ + EAGKT + EV EAVDF R+YA + E + D
Sbjct: 550  ERAEILNRAADVLAARRAHLLAVAGAEAGKTLGQSDPEVSEAVDFCRWYARRALELETVD 609

Query: 773  THR--PLGPVVCISPWNFPLAIFTGQVAAALAAGNTVLAKPAEQTPLIAAQAVRLLREAG 830
            R P V+ PWNFPLAI TG AALAAG VL KPA + + V L EAG
Sbjct: 610  GARFVPDRLVLVTPPWNFPLAIPTGSTVAALAAGAAVLHKPAPEVRRCSTAVVEALWEAG 669

Query: 831  VPAGAVQLLPGRGETVGAALVGDARVKGVMFTGSTEVARLLQRSVAGRLDAAGRPVPLIA 890
            VP +QL+ VG ALV V ++ TG+ E A+L GR PL A
Sbjct: 670  VPRDVLQLVDSVEGEVGQALVSHPGVDRIVLTGAYETAQLFGSWRPGR--------PLTA 721

Query: 891  ETGGQNAMIVDSSALAEQVVGDVVNSAFDSAGQRCSALRVLCLQEEVA--DRVLEMLKGA 948
            ET G+NAM++ SA ++ V D+V SAF AGQ+CSA + L VA R L A
Sbjct: 722  ETSGKNAMVITPSADRDRAVWDLVTSAFGHAGQKCSAASLGILVGSVARSPRFRRQLLDA 781

Query: 949  MDELTMGNPDRLSTDVGPVIDEEARGNIVR---HIDAMRAKGRRVHQADPNGAL-----S 1000
            + + P L VGP IDE G + R +D Q D G L
Sbjct: 782  ASSMVVDWPQNLQATVGPTIDEPG-GKLRRALTRLDPGEEWLLEPRQLDDTGRLWRPGIK 840

Query: 1001 AACRNGTFVSPTLIELDSIEELQREVFGPVLHVVRYPRAGLDTLLAQINGTGYGLTMGIH 1060
            G+F T E FGPVL ++ + L L N YGLT G+H
Sbjct: 841  TGVAPGSFYHRT------------ECFGPVLGLME--ASDLHEALELQNAVHYGLTAGLH 886

Query: 1061 TRIDETIEHIVERAEVGNLYVNRNIVGAVVGVQPFGGEGLS--GTGPKAGGPLYLHRLLS 1118
            + E I V+ E GNLYVNR I GA+V QPFGG S G G KAGGP YL +L S
Sbjct: 887  SLDPEEISTWVDLVEAGNLYVNRGITGAIVQRQPFGGWKRSSVGLGAKAGGPNYLVQLGS 946

Query: 1119 VCPLDAVARVVRASDTVGGADETGPVRRTLTETLATLKEWAQRESAALPGLVAACERFAA 1178
            A V A + GA GP+ L L+ AA A ++
Sbjct: 947  W--QKAPFAEVPAGQSGTGARPAGPI----AHHLEALRPLVPEADAAWLTACAEQDQAWW 1000

Query: 1179 ASAAGLSVTLPGPTGERNTYTLLPRAAVLCLAQQE--TD-LAVQLAAVLAAGSQAVWVES 1235
            G G E N + PR V+ A+ TD + V LAAV + S+AV V +
Sbjct: 1001 EREFGRVHDASGLACEANLFRYRPRDVVVRAAEDAPLTDVIRVGLAAVRSC-SRAVLVPA 1059

Query: 1236 P----MARALFARL--PKAVQSRVRLVADWS 1260
            R LF + P ++ R+ ADWS
Sbjct: 1060 AEVPGEVRELFRSVLDPCGDEAFARVAADWS 1090

Lambda K H
0.318 0.133 0.383
Gapped
Lambda K H
0.267 0.0410 0.140
Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 3198
Number of extensions: 153
Number of successful extensions: 10
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1333
Length of database: 1178
Length adjustment: 48
Effective length of query: 1285
Effective length of database: 1130
Effective search space: 1452050
Effective search space used: 1452050
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 59 (27.3 bits)
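The score and Expect value reported with the alignment can be related to the gapped Lambda and K values and the effective search space listed in the statistics. The sketch below uses the standard Karlin-Altschul formulas, bits = (lambda * S - ln K) / ln 2 and E = K * m * n * exp(-lambda * S). The function names are hypothetical, and the inputs are the rounded values printed in the report, so the results are approximate.

```python
import math


def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to a bit score (Karlin-Altschul)."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def expect_value(raw_score, lam, k, search_space):
    """Expected number of chance hits scoring at least raw_score."""
    return k * search_space * math.exp(-lam * raw_score)


# Values copied from the report (gapped Lambda and K, raw score in parentheses).
LAMBDA_GAPPED = 0.267
K_GAPPED = 0.0410
RAW_SCORE = 733

# Effective search space = effective query length * effective database length.
SEARCH_SPACE = 1285 * 1130

bits = bit_score(RAW_SCORE, LAMBDA_GAPPED, K_GAPPED)
evalue = expect_value(RAW_SCORE, LAMBDA_GAPPED, K_GAPPED, SEARCH_SPACE)

print(f"search space: {SEARCH_SPACE}")
print(f"bit score:    {bits:.2f}")
print(f"E-value:      {evalue:.1e}")
```

With these rounded inputs the bit score comes out just under 287, while the report prints 286, which would be consistent with the report showing only the integer part. The computed E-value agrees with the reported 6e-81 to one significant figure.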
This GapMind analysis is from Sep 24 2021. The underlying query database was built on Sep 17 2021.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
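As an illustration of the quantities these confidence levels depend on, the sketch below derives percent identity and coverage from the numbers in the alignment above. The helper names are hypothetical, and it is an assumption that coverage is the aligned span divided by the sequence length; the page does not state which sequence's coverage GapMind uses or exactly how it is computed. Only the 50% threshold comes from the text.

```python
def percent_identity(identical, aligned_columns):
    """Percent of alignment columns with identical residues."""
    return 100.0 * identical / aligned_columns


def coverage(start, end, length):
    """Fraction of a sequence spanned by the alignment (1-based, inclusive)."""
    return (end - start + 1) / length


# Numbers taken from the alignment header and the first/last alignment rows.
identity = percent_identity(323, 1051)       # Identities = 323/1051
query_cov = coverage(257, 1260, 1333)        # characterized protein (query)
subject_cov = coverage(119, 1090, 1178)      # candidate WP_062735451.1

# Assumed reading of the "at least 50% coverage" condition.
meets_coverage_floor = query_cov >= 0.5

print(f"identity:         {identity:.1f}%")
print(f"query coverage:   {query_cov:.1%}")
print(f"subject coverage: {subject_cov:.1%}")
print(f"coverage >= 50%:  {meets_coverage_floor}")
```

Under these assumptions the alignment spans roughly three quarters of the characterized protein at about 31% identity. Whether that makes the candidate low confidence or better depends on the criteria for the higher levels, which this sketch does not attempt to model.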
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory