GapMind for catabolism of small carbon sources

 

Alignments for a candidate for gdhA in Cupriavidus basilensis 4G11

Align Glutamate dehydrogenase; EC 1.4.1.2 (characterized, see rationale)
to candidate RR42_RS07270 RR42_RS07270 NAD-glutamate dehydrogenase

Query= uniprot:G8AE86
         (1618 letters)



>FitnessBrowser__Cup4G11:RR42_RS07270
          Length = 1626

 Score = 1552 bits (4019), Expect = 0.0
 Identities = 864/1646 (52%), Positives = 1084/1646 (65%), Gaps = 64/1646 (3%)

Query: 1    MALRAEQLKDELTEEVVRQVRERLGRSRAAPAERFVRQFYDNVPPDDIIQAPAEQLYGAA 60
            M   +E  + +L  ++++  + RL       AE F+R +YD    +D+++     LYGA 
Sbjct: 1    MLAESEARQAQLLADLMQFAQGRLPGDMFVQAEPFLRHYYDLADAEDLLKRNVADLYGAV 60

Query: 61   LAMWQWGQQREATDRAKVRVYNPRVEEHGWQSHRTVVEIVNDDMPFLVDSVTAELNRQGL 120
            +A WQ  Q R     A++RVYNP +E+HGW S  +VVEIVNDDMPFLVDSVT E+NR GL
Sbjct: 61   MAHWQTAQ-RFVPGNARLRVYNPNLEQHGWHSDHSVVEIVNDDMPFLVDSVTMEINRLGL 119

Query: 121  TVHLVIHPVVRVKRDADGQLAEL----------YEPAAAPTDAAPESFMHVEVGAVTGAA 170
             +H  IHPV RV R+ADG +A++           EPA  P     ESF+H EV      A
Sbjct: 120  ALHSAIHPVFRVWRNADGTIAKVGLGGEGDTATSEPAGGPR---LESFIHFEVDRCGDTA 176

Query: 171  ALDQAREGLERVLADVRAAVADWRAMRQQVRAAIVEADCARAAVPAIIPDDEVDEAKAFL 230
              +  R G+ RVL DVRAAV DW AM    RA I   D          PD +  EA AFL
Sbjct: 177  TQEALRNGIARVLGDVRAAVQDWPAMTAITRATI---DMLAQGPDGNQPDTQ--EACAFL 231

Query: 231  SWADDDHFTFLGYREYRFESGADGADSSLGLVAGSGLGIL-------RDDSVTVFDGLRN 283
             W  D+HFTFLG R+Y   +  D     L  V GSG GIL        +D++T       
Sbjct: 232  QWMLDEHFTFLGQRDYELVARDDRF--YLRGVPGSGTGILSEILHPPEEDTLT------- 282

Query: 284  YATLPPDVRDFLRNPRVLMVTKGNRPSPVHRAVPMDAFLIKRFDAEGRIIGERLVAGLFT 343
               LP      + +   + VTK N  S VHR   +D   IK  DA G++ GER   GL+T
Sbjct: 283  --ELPAAATSVIEDTSPIFVTKANSRSTVHRPGYLDYVGIKLKDANGKLFGERRFVGLYT 340

Query: 344  SVAYNRSPREIPYLRRKVAEVMELAGFDPQGHDGKALLHILETYPRDELFQIQVPELLDI 403
            S  Y  S  +IP +RRK   ++  AGF  +GH  K+L+ I+E YPRDELFQ +  EL  I
Sbjct: 341  STTYMMSCEDIPLVRRKFGNILTRAGFLTKGHLYKSLVTIIEQYPRDELFQAEEDELFHI 400

Query: 404  AVGILHLQERQRLALFVRKDPFERFASCLVYVPRDRYDTTLRRRIQSILEAAYDGTCTGF 463
             +GIL LQE QR+ LFVR+D F+RF SCLV+VPRD+Y+T LR+RIQ +L AA+ G    F
Sbjct: 401  TLGILRLQEHQRIRLFVRRDRFDRFVSCLVFVPRDKYNTDLRQRIQKLLMAAFCGNTCEF 460

Query: 464  TTQLTESVLARLHFIIRTEPGRVPTVDATDLEARLVQASRGWDDHLRDALVEAHGEEQGR 523
            T QL+ES LAR+  I+R EPG +P V+  +LE R+VQA+R W D L  AL+++ GEEQG 
Sbjct: 461  TPQLSESPLARIQLIVRGEPGTMPEVNPDELEERIVQAARRWQDDLAAALLDSTGEEQGN 520

Query: 524  TLFRRYADAFPTAYREEFNAEAAVFDIERIEKATAQGTLGINLYRPLEAEGDELHVKIYH 583
             L RRY D+FP  YRE++ A  AV DIE +E A A G + +NLYRP+EA       K+Y 
Sbjct: 521  RLLRRYGDSFPAGYREDYPARIAVRDIELMEAAQATGGIAMNLYRPIEAAPGAFRFKVYR 580

Query: 584  EGRPVPLSDVLPMLEHMDLKVITEAPFEIAIAGHAAPVWIHDFTARSQNG------LPID 637
               P+ LS  LPMLEH+ ++V  E P+ I   G AAPVWIHDF     +G      L  D
Sbjct: 581  AREPIALSLSLPMLEHLGVRVDEERPYLIEPNG-AAPVWIHDFGLEMADGVAAGIDLGAD 639

Query: 638  CAMVKEKFQDAFAAVWDGRMEDDGFNRLVLRAGLTAREVTVLRAYAKYLRQARIPYGQDV 697
             A +K  F+DAFA  W+G +E+D FNRLVLRA L AR+VT+LRAYA+YLRQ    +    
Sbjct: 640  IARIKALFEDAFARAWNGEIENDDFNRLVLRAELAARDVTILRAYARYLRQVGSTFSDAY 699

Query: 698  VESTLAGHPAIARKLVALFHSRFDPAR---RSQNDPGLAAEIERALDGVKNLDEDRILRR 754
            +E  L G+PAIA + V LF +RFDPA    R+     L   I   LD V NLDEDRILR 
Sbjct: 700  IERALTGNPAIASRFVELFVARFDPATENARAARCERLQQAIGTELDQVPNLDEDRILRL 759

Query: 755  FLNLLCNTLRTNAYQNGADGRPKTYLSFKIDSRNIDDLPLPRPMVEVFVYSPRMEGVHLR 814
            FL ++  T+RTN +++G DG P+ YLSFK +   +  LP PRPM E++VYSPR+EGVHLR
Sbjct: 760  FLGVINATVRTNYFRHGPDGGPRPYLSFKFNPALVPGLPEPRPMFEIWVYSPRVEGVHLR 819

Query: 815  GGKVARGGIRWSDRREDFRTEILGLMKAQMVKNTVIVPVGSKGGFVVKRPPPPSAGREAQ 874
            GG VARGG+RWSDRREDFRTE+LGLMKAQMVKNTVIVPVGSKGGFVVKRPPP +  R+A 
Sbjct: 820  GGPVARGGLRWSDRREDFRTEVLGLMKAQMVKNTVIVPVGSKGGFVVKRPPPAN-DRDAF 878

Query: 875  LAEGIECYKTLMRGLLDITDNLDAQGAVVPPPEVVRHDGDDPYLVVAADKGTATFSDIAN 934
            L EGI CY+T +RGLLD+TDN  A G +VPPPEVVR DGDDPYLVVAADKGTA+FSD AN
Sbjct: 879  LQEGIACYQTFLRGLLDVTDNRVA-GTLVPPPEVVREDGDDPYLVVAADKGTASFSDYAN 937

Query: 935  SVSVDHGFWLGDAFASGGSAGYDHKKMGITARGAWESVKRHFRELGHDTQTQDFTVVGVG 994
            ++S ++GFWL DAFASGGS GYDHKKM ITARGAWESVKRHFRE+G D Q+ DFTV G+G
Sbjct: 938  AISAEYGFWLSDAFASGGSVGYDHKKMAITARGAWESVKRHFREMGVDIQSTDFTVAGIG 997

Query: 995  DMSGDVFGNGMLLSKHIRLLAAFDHRHIFIDPDPDAARSWEERQRLFDLPRSSWADYDAS 1054
            DMSGDVFGNGMLLS HI+L+AAFDHRHIF+DP+PD A S  ERQR+F+LPRSSWADYD S
Sbjct: 998  DMSGDVFGNGMLLSPHIKLVAAFDHRHIFLDPNPDPAASLRERQRMFELPRSSWADYDMS 1057

Query: 1055 LLSAGGRVFDRSAKSLELTPEIRQRFGIAKDHVTPLELMQTLLKAEVDLLWFGGIGTYLK 1114
            L+SAGG +F R+AK++ +TP+++   GI+   ++P EL+  +L A VDLL+ GGIGTY+K
Sbjct: 1058 LVSAGGGLFPRTAKTIAITPQVQASLGISASVLSPAELVHAILLAPVDLLYNGGIGTYVK 1117

Query: 1115 AAEETNAEVGDKANDALRIDGRDVRAKVIGEGANLGVTQRGRIEAAQHGVRLNTDAIDNS 1174
            ++ E++ +VGD+ANDA+R++G ++R KV+GEG NLG TQ GRIE A +G R+NTDAIDNS
Sbjct: 1118 SSRESHLQVGDRANDAVRVNGAELRCKVVGEGGNLGFTQLGRIEFALNGGRINTDAIDNS 1177

Query: 1175 AGVDTSDHEVNIKILLNDVVVRGDMTLKQRDQLLAAMTDEVAGLVLADNYLQSQALTVAR 1234
            AGVD SDHEVNIKILL  VV  G+MT KQR++LLA MTDEV  LVL DNY QSQAL+VA 
Sbjct: 1178 AGVDCSDHEVNIKILLGLVVADGEMTEKQRNKLLAQMTDEVGLLVLEDNYYQSQALSVAG 1237

Query: 1235 AQGPDALEAQARLIRSLEKAGRLNRAIEYLPDEEELSARMANREGLTRPELAVLLAYAKI 1294
               P  L+A+ RLIR LE+AGRL RA+E+LP E+EL  R A   GLT PE AVLLAY+K+
Sbjct: 1238 RNAPALLDAEGRLIRWLERAGRLKRALEFLPTEDELGERKAAGLGLTSPERAVLLAYSKM 1297

Query: 1295 TLYDDLLASDLPDDPFMADDLTRYFPKPLRKAHAEAVGRHRLRREIIATSVTNSLVNRTG 1354
             LYD+LL S LP+DP +A  L  YFP+PLR+ HA+ + RH LRREI+AT +TN+LVNR G
Sbjct: 1298 WLYDELLTSALPEDPLVAGLLPAYFPQPLRERHADTMLRHPLRREILATHLTNTLVNRIG 1357

Query: 1355 PTFVKEMMEKTGMGPADVARAYTIVRDAFGLRSLWTGIEDLDTVVPAALQTSMILETVRH 1414
             TFV  M E+T   PAD+ RA  I RD FGL  LW  I+ LD  V   +Q  M     R 
Sbjct: 1358 ATFVHRMQEETDARPADIVRACLIARDVFGLDPLWLRIDALDNQVDDDVQARMFAAVGRL 1417

Query: 1415 MERAAAWFL----ASCQQPLDIARETEA---FRPGIETLLAGLDNVLDAEETARLTARVA 1467
            ++ A+ WFL    A      D AR  EA     P +  LLAG      AE T  +  R  
Sbjct: 1418 LDHASLWFLRHPHAGGPTDGDSARYAEAAAWLTPQLPALLAG------AEATTLMQWR-Q 1470

Query: 1468 SYQEQGVPAELARRMAALPVLAAAPDLVRIAGRTGRGVADVAAVYFMLGRRFGLEWLRDK 1527
               + GV  ELA R+AA  + AAA D+  +A  T R +A VA VYF L   F   WLR++
Sbjct: 1471 DLTQSGVDDELALRVAAGEICAAALDIADVAAATQRSLALVAGVYFALDTEFSFSWLRER 1530

Query: 1528 AAAAKAENHWQKQAVAALVDDLFAHQTALTTRVLEAVDQLPAEAP-VEAWIAHRRPVVER 1586
            A A  A++HW   A    ++DL   + ALT  VL    +L   A  +EAW   R+  +ER
Sbjct: 1531 ALALPADSHWDLLARTTTLEDLGRLKRALTVSVLAQPQELDTPAQLIEAWRGERQAQIER 1590

Query: 1587 VEQLLSELRTQPNVDLSMLAVANRQL 1612
              ++L++ R      LSML+VA R++
Sbjct: 1591 FSRMLADQRASGAAGLSMLSVAVREI 1616


Lambda     K      H
   0.320    0.136    0.399 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 6068
Number of extensions: 317
Number of successful extensions: 13
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1618
Length of database: 1626
Length adjustment: 52
Effective length of query: 1566
Effective length of database: 1574
Effective search space:  2464884
Effective search space used:  2464884
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 61 (28.1 bits)
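
The summary statistics above can be cross-checked against each other. The sketch below assumes the standard Karlin-Altschul relations used by BLAST (bit score S' = (Lambda * S - ln K) / ln 2, and E = m * n * 2^(-S') over the effective search space) and uses only numbers copied from this report. The variable names are illustrative and are not part of GapMind or BLAST. Because Lambda and K are printed with limited precision, the recomputed bit score is expected to agree with the reported 1552 bits only approximately.

```python
import math

# Values copied from the BLAST report above.
raw_score = 4019        # raw score, shown in parentheses on the "Score" line
lambda_gapped = 0.267   # gapped Lambda (rounded in the report)
k_gapped = 0.0410       # gapped K (rounded in the report)
query_len = 1618        # "Length of query"
db_len = 1626           # "Length of database"
length_adjustment = 52  # "Length adjustment"

# Bit score: S' = (Lambda * S - ln K) / ln 2
bit_score = (lambda_gapped * raw_score - math.log(k_gapped)) / math.log(2)

# Effective lengths and effective search space (m * n).
eff_query = query_len - length_adjustment
eff_db = db_len - length_adjustment
search_space = eff_query * eff_db

# E = m * n * 2^(-S'). The value is far too small for a float,
# so work with log10(E) instead of E itself.
log10_e = math.log10(search_space) - bit_score * math.log10(2)

print(f"bit score: about {bit_score:.1f}")
print(f"effective query length: {eff_query}, database length: {eff_db}")
print(f"effective search space: {search_space}")
print(f"log10(E): about {log10_e:.0f}")
```

The effective lengths and search space reproduce the reported 1566, 1574, and 2464884 exactly, and the very negative log10(E) is consistent with the report printing Expect = 0.0.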

This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
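
As a rough illustration of the coverage criterion, the sketch below computes alignment coverage and percent identity for the alignment shown on this page, using the coordinates and counts from the BLAST report (query 1 to 1612 of 1618 letters, subject 1 to 1616 of 1626, 864 identities over 1646 aligned columns). Whether GapMind measures coverage on the curated protein or on the candidate is an assumption here, so both are computed; the `coverage` helper and `MIN_COVERAGE` name are hypothetical, and only the 50% threshold comes from the text above.

```python
def coverage(start: int, end: int, length: int) -> float:
    """Fraction of a sequence spanned by an alignment (1-based, inclusive)."""
    return (end - start + 1) / length

# Coordinates and lengths copied from the alignment on this page.
query_cov = coverage(1, 1612, 1618)    # curated protein uniprot:G8AE86
subject_cov = coverage(1, 1616, 1626)  # candidate RR42_RS07270

# Counts from the "Identities" line of the report.
identities, aligned_columns = 864, 1646
pct_identity = 100 * identities / aligned_columns

# "At least 50% coverage" is the low-confidence cutoff stated in the text.
MIN_COVERAGE = 0.50
meets_low_confidence_coverage = query_cov >= MIN_COVERAGE

print(f"query coverage:   {query_cov:.1%}")
print(f"subject coverage: {subject_cov:.1%}")
print(f"identity:         {pct_identity:.1f}%")
print(f"meets 50% coverage cutoff: {meets_low_confidence_coverage}")
```

The report prints identity as 52%, which matches the integer part of the value computed here; the other percentages on that line (65% positives, 3% gaps) also appear to be truncated rather than rounded.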

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

For more information, see the paper from 2019 on GapMind for amino acid biosynthesis, the paper from 2022 on GapMind for carbon sources, or view the source code.

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory