GapMind for Amino acid biosynthesis

 

Alignments for a candidate for carB in Cupriavidus basilensis 4G11

Align carbamoyl-phosphate synthase (glutamine-hydrolysing) (EC 6.3.5.5) (characterized)
to candidate RR42_RS13510 RR42_RS13510 carbamoyl phosphate synthase large subunit

Query= BRENDA::P00968
         (1073 letters)



>FitnessBrowser__Cup4G11:RR42_RS13510
          Length = 1082

 Score = 1449 bits (3750), Expect = 0.0
 Identities = 750/1087 (68%), Positives = 867/1087 (79%), Gaps = 21/1087 (1%)

Query: 1    MPKRTDIKSILILGAGPIVIGQACEFDYSGAQACKALREEGYRVILVNSNPATIMTDPEM 60
            MPKRTDIKSILI+GAGPI+IGQACEFDYSGAQACKALREEG++V+LVNSNPATIMTDP  
Sbjct: 1    MPKRTDIKSILIIGAGPIIIGQACEFDYSGAQACKALREEGFKVVLVNSNPATIMTDPNT 60

Query: 61   ADATYIEPIHWEVVRKIIEKERPDAVLPTMGGQTALNCALELERQGVLEEFGVTMIGATA 120
            AD TYIEPI WEVV +II KERPDA+LPTMGGQTALNCAL+L R GVL ++ V +IGA+ 
Sbjct: 61   ADVTYIEPITWEVVERIIAKERPDAILPTMGGQTALNCALDLHRHGVLAKYNVELIGASP 120

Query: 121  DAIDKAEDRRRFDVAMKKIGLETARSGIAHTMEEALAVAADV-------GFPCIIRPSFT 173
            +AIDKAEDR++F  AM KIGL +A+SGIAH+MEEALAV   +       G+P +IRPSFT
Sbjct: 121  EAIDKAEDRQKFKEAMTKIGLGSAKSGIAHSMEEALAVQTQIAKETATGGYPIVIRPSFT 180

Query: 174  MGGSGGGIAYNREEFEEICARGLDLSPTKELLIDESLIGWKEYEMEVVRDKNDNCIIVCS 233
            +GGSGGGIAYNREEFEEIC RGLDLSPT+ELLI+ESL+GWKEYEMEVVRDK DNCII+CS
Sbjct: 181  LGGSGGGIAYNREEFEEICKRGLDLSPTRELLIEESLLGWKEYEMEVVRDKKDNCIIICS 240

Query: 234  IENFDAMGIHTGDSITVAPAQTLTDKEYQIMRNASMAVLREIGVETGGSNVQFAVNPKNG 293
            IEN D MGIHTGDSITVAPAQTLTDKEYQI+RNAS+AVLREIGV+TGGSNVQF++NP +G
Sbjct: 241  IENLDPMGIHTGDSITVAPAQTLTDKEYQILRNASLAVLREIGVDTGGSNVQFSINPADG 300

Query: 294  RLIVIEMNPRVSRSSALASKATGFPIAKVAAKLAVGYTLDELMNDITGGRTPASFEPSID 353
            R+IVIEMNPRVSRSSALASKATGFPIAKVAAKLAVGYTLDEL N+ITGG TPASFEPSID
Sbjct: 301  RMIVIEMNPRVSRSSALASKATGFPIAKVAAKLAVGYTLDELKNEITGGATPASFEPSID 360

Query: 354  YVVTKIPRFNFEKFAGANDRLTTQMKSVGEVMAIGRTQQESLQKALRGLEVGATGFDPKV 413
            YVVTK+PRF FEKF  A+  LTTQMKSVGEVMA+GRT QES QKALRGLEVG  G D K 
Sbjct: 361  YVVTKVPRFAFEKFPQADSHLTTQMKSVGEVMAMGRTFQESFQKALRGLEVGVDGLDDKS 420

Query: 414  SLDDPEALTKIRRELKDAGADRIWYIADAFRAGLSVDGVFNLTNIDRWFLVQIEELVRLE 473
            +  D     +I  E+ +AG DRIWY+ DAFR G+S++ V   T ID WFL QIE++V+ E
Sbjct: 421  TDRD-----EIVEEIGEAGPDRIWYVGDAFRIGMSLEEVHAETAIDPWFLAQIEDIVKTE 475

Query: 474  EKVAEVGITGLNADFLRQLKRKGFADARLAKLAGVREAEIRKLRDQYDLHPVYKRVDTCA 533
              V    +  L+A  LR LK+KGF+D RLAKL G   A +R  R    + PVYKRVDTCA
Sbjct: 476  TLVKARKLDSLSAAELRHLKQKGFSDRRLAKLMGAEPAAVRVARHAAGVRPVYKRVDTCA 535

Query: 534  AEFATDTAYMYSTYEE---ECEANPSTDREKIMVLGGGPNRIGQGIEFDYCCVHASLALR 590
            AEFAT+TAYMY TYE    ECEA+P+ +R KIMVLGGGPNRIGQGIEFDYCCVHA+LALR
Sbjct: 536  AEFATNTAYMYGTYEAEHGECEADPTANR-KIMVLGGGPNRIGQGIEFDYCCVHAALALR 594

Query: 591  EDGYETIMVNCNPETVSTDYDTSDRLYFEPVTLEDVLEIVRIEKPKGVIVQYGGQTPLKL 650
            EDGYETIMVNCNPETVSTDYDTSDRLYFEP+TLEDVLEIV  EKP GVIVQYGGQTPLKL
Sbjct: 595  EDGYETIMVNCNPETVSTDYDTSDRLYFEPLTLEDVLEIVDREKPVGVIVQYGGQTPLKL 654

Query: 651  ARALEAAGVPVIGTSPDAIDRAEDRERFQHAVERLKLKQPANATVTAIEMAVEKAKEIGY 710
            A  LEA GVP+IGTSPD ID AEDRERFQ  +  L L+QP N T  A + A+  A EIGY
Sbjct: 655  ALDLEANGVPIIGTSPDMIDAAEDRERFQKLLHELGLRQPPNRTARAEDEALRLATEIGY 714

Query: 711  PLVVRPSYVLGGRAMEIVYDEADLRRYFQTAVSVSNDAPVLLDHFLDDAVEVDVDAICDG 770
            PLVVRPSYVLGGRAMEIV++  DL RY + AV VS+D+PVLLD FL+DA+E DVDA+CDG
Sbjct: 715  PLVVRPSYVLGGRAMEIVHEPRDLERYMREAVKVSHDSPVLLDRFLNDAIECDVDALCDG 774

Query: 771  EMVLIGGIMEHIEQAGVHSGDSACSLPAYTLSQEIQDVMRQQVQKLAFELQVRGLMNVQF 830
            + V IGG+MEHIEQAGVHSGDSACSLP Y+L+Q   D +++Q   +A  L V GLMNVQF
Sbjct: 775  QRVFIGGVMEHIEQAGVHSGDSACSLPPYSLAQATVDELKRQTAAMAKALNVIGLMNVQF 834

Query: 831  AVK--NNE--VYLIEVNPRAARTVPFVSKATGVPLAKVAARVMAGKSLAEQGVTKEVIPP 886
            A++  N E  VY++EVNPRA+RTVP+VSKATG+ LAK+AAR MAG++L  QGV  EV+PP
Sbjct: 835  AIQQVNGEDIVYVLEVNPRASRTVPYVSKATGLSLAKIAARCMAGQTLDSQGVFDEVVPP 894

Query: 887  YYSVKEVVLPFNKFPGVDPLLGPEMRSTGEVMGVGRTFAEAFAKAQLGSNSTMKKHGRAL 946
            Y+SVKE V PFNKFPGVDP+LGPEMRSTGEVMGVG+TF EA  K+QL + S + + G  L
Sbjct: 895  YFSVKEAVFPFNKFPGVDPVLGPEMRSTGEVMGVGKTFGEALFKSQLAAGSRLPEKGTVL 954

Query: 947  LSVREGDKERVVDLAAKLLKQGFELDATHGTAIVLGEAGINPRLVNKVHEGRPHIQDRIK 1006
            L+V++ DK   V +A  L   G+ + AT GTA  +  AGI  R+VNKV +GRPHI D +K
Sbjct: 955  LTVKDSDKPHAVGVARMLHDMGYPIVATRGTASAIEAAGIPVRVVNKVKDGRPHIVDMLK 1014

Query: 1007 NGEYTYIINTT-SGRRAIEDSRVIRRSALQYKVHYDTTLNGGFATAMALNADATEKVISV 1065
            NGE   +  T    R AI DSR IR SAL  +V Y TT+ G  A    L    + +V  +
Sbjct: 1015 NGELALVFTTVDETRTAIADSRSIRISALASRVPYYTTIAGARAAVEGLKHMQSLEVYDL 1074

Query: 1066 QEMHAQI 1072
            Q +HA +
Sbjct: 1075 QSLHASL 1081


Lambda     K      H
   0.318    0.135    0.383 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 3119
Number of extensions: 137
Number of successful extensions: 20
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1073
Length of database: 1082
Length adjustment: 46
Effective length of query: 1027
Effective length of database: 1036
Effective search space:  1063972
Effective search space used:  1063972
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 58 (26.9 bits)
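
The summary figures in the BLAST report above can be cross-checked with a little arithmetic. The sketch below is not part of GapMind. It assumes the standard Karlin-Altschul bit-score formula and the usual effective-length correction, neither of which is stated in the report itself, and it assumes the reported percentages are rounded down, which is what the numbers suggest. All constants are copied from the report; the function names are illustrative.

```python
import math

# Values copied from the BLAST report above.
RAW_SCORE = 3750         # "Score = 1449 bits (3750)"
LAMBDA_GAPPED = 0.267    # gapped Lambda
K_GAPPED = 0.0410        # gapped K
QUERY_LEN = 1073         # "Length of query"
DB_LEN = 1082            # "Length of database"
LENGTH_ADJUSTMENT = 46   # "Length adjustment"
NUM_SEQUENCES = 1        # "Number of Sequences"
ALIGNMENT_LEN = 1087     # denominator of Identities/Positives/Gaps


def bit_score(raw, lam, k):
    """Assumed formula: S' = (lambda * S - ln K) / ln 2."""
    return (lam * raw - math.log(k)) / math.log(2)


def effective_search_space(query_len, db_len, adjustment, num_sequences):
    """Assumed formula: both lengths are shortened by the length adjustment."""
    return (query_len - adjustment) * (db_len - num_sequences * adjustment)


def percent(count, total):
    """Integer percentage, rounded down (assumed to match the report)."""
    return 100 * count // total


bits = bit_score(RAW_SCORE, LAMBDA_GAPPED, K_GAPPED)
space = effective_search_space(QUERY_LEN, DB_LEN, LENGTH_ADJUSTMENT, NUM_SEQUENCES)

print(f"bit score ~ {bits:.1f}")
print(f"effective search space = {space}")
print(f"identities = {percent(750, ALIGNMENT_LEN)}%")
print(f"positives  = {percent(867, ALIGNMENT_LEN)}%")
print(f"gaps       = {percent(21, ALIGNMENT_LEN)}%")
```

Because Lambda and K are printed to only three significant figures, the recomputed bit score agrees with the reported 1449 bits only after rounding.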

Align candidate RR42_RS13510 RR42_RS13510 (carbamoyl phosphate synthase large subunit)
to HMM TIGR01369 (carB: carbamoyl-phosphate synthase, large subunit (EC 6.3.5.5))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/TIGR01369.hmm
# target sequence database:        /tmp/gapView.22699.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01369  [M=1052]
Accession:   TIGR01369
Description: CPSaseII_lrg: carbamoyl-phosphate synthase, large subunit
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                 Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                 -----------
          0 1533.3   0.0          0 1533.2   0.0    1.0  1  lcl|FitnessBrowser__Cup4G11:RR42_RS13510  RR42_RS13510 carbamoyl phosphate


Domain annotation for each sequence (and alignments):
>> lcl|FitnessBrowser__Cup4G11:RR42_RS13510  RR42_RS13510 carbamoyl phosphate synthase large subunit
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 ! 1533.2   0.0         0         0       1    1050 [.       2    1061 ..       2    1063 .. 0.98

  Alignments for each domain:
  == domain 1  score: 1533.2 bits;  conditional E-value: 0
                                 TIGR01369    1 pkredikkvlviGsGpivigqAaEFDYsGsqalkalkeegievvLvnsniAtvmtdeeladkvYieP 67  
                                                pkr+dik++l+iG+Gpi+igqA+EFDYsG+qa+kal+eeg++vvLvnsn+At+mtd++ ad +YieP
  lcl|FitnessBrowser__Cup4G11:RR42_RS13510    2 PKRTDIKSILIIGAGPIIIGQACEFDYSGAQACKALREEGFKVVLVNSNPATIMTDPNTADVTYIEP 68  
                                                689**************************************************************** PP

                                 TIGR01369   68 ltveavekiiekErpDailltlGGqtaLnlaveleekGvLekygvkllGtkveaikkaedRekFkea 134 
                                                +t+e+ve+ii kErpDail+t+GGqtaLn+a++l+++GvL+ky+v+l+G++ eai+kaedR+kFkea
  lcl|FitnessBrowser__Cup4G11:RR42_RS13510   69 ITWEVVERIIAKERPDAILPTMGGQTALNCALDLHRHGVLAKYNVELIGASPEAIDKAEDRQKFKEA 135 
                                                ******************************************************************* PP

                                 TIGR01369  135 lkeineevakseivesveealeaaeei.......gyPvivRaaftlgGtGsgiaeneeelkelveka 194 
                                                +++i++  aks i++s+eeal+++++i       gyP+++R++ftlgG+G+gia+n+ee++e+++++
  lcl|FitnessBrowser__Cup4G11:RR42_RS13510  136 MTKIGLGSAKSGIAHSMEEALAVQTQIaketatgGYPIVIRPSFTLGGSGGGIAYNREEFEEICKRG 202 
                                                ********************999887766666668******************************** PP

                                 TIGR01369  195 lkaspikqvlvekslagwkEiEyEvvRDskdnciivcniEnlDplGvHtGdsivvaPsqtLtdkeyq 261 
                                                l++sp++++l+e+sl gwkE+E+EvvRD+kdncii+c+iEnlDp+G+HtGdsi+vaP+qtLtdkeyq
  lcl|FitnessBrowser__Cup4G11:RR42_RS13510  203 LDLSPTRELLIEESLLGWKEYEMEVVRDKKDNCIIICSIENLDPMGIHTGDSITVAPAQTLTDKEYQ 269 
                                                ******************************************************************* PP

                                 TIGR01369  262 llRdaslkiirelgvege.cnvqfaldPeskryvviEvnpRvsRssALAskAtGyPiAkvaaklavG 327 
                                                +lR+asl+++re+gv+++ +nvqf+++P + r++viE+npRvsRssALAskAtG+PiAkvaaklavG
  lcl|FitnessBrowser__Cup4G11:RR42_RS13510  270 ILRNASLAVLREIGVDTGgSNVQFSINPADGRMIVIEMNPRVSRSSALASKATGFPIAKVAAKLAVG 336 
                                                ***************9988************************************************ PP

                                 TIGR01369  328 ysLdelkndvtk.etvAsfEPslDYvvvkiPrwdldkfekvdrklgtqmksvGEvmaigrtfeealq 393 
                                                y+Ldelkn++t+  t+AsfEPs+DYvv+k+Pr++++kf ++d++l+tqmksvGEvma+grtf+e++q
  lcl|FitnessBrowser__Cup4G11:RR42_RS13510  337 YTLDELKNEITGgATPASFEPSIDYVVTKVPRFAFEKFPQADSHLTTQMKSVGEVMAMGRTFQESFQ 403 
                                                ***********878***************************************************** PP

                                 TIGR01369  394 kalrsleekllglklkekeaesdeeleealkkpndrRlfaiaealrrgvsveevyeltkidrfflek 460 
                                                kalr le ++ gl+ k    ++++e+ e++ ++ ++R++++ +a+r g+s+eev+  t id +fl +
  lcl|FitnessBrowser__Cup4G11:RR42_RS13510  404 KALRGLEVGVDGLDDK---STDRDEIVEEIGEAGPDRIWYVGDAFRIGMSLEEVHAETAIDPWFLAQ 467 
                                                ********99998776...778888999*************************************** PP

                                 TIGR01369  461 lkklvelekeleeeklkelkkellkkakklGfsdeqiaklvkvseaevrklrkelgivpvvkrvDtv 527 
                                                ++++v++e+ ++  kl+ l++ +l+++k++Gfsd+++akl++ + a+vr +r+++g+ pv+krvDt+
  lcl|FitnessBrowser__Cup4G11:RR42_RS13510  468 IEDIVKTETLVKARKLDSLSAAELRHLKQKGFSDRRLAKLMGAEPAAVRVARHAAGVRPVYKRVDTC 534 
                                                ******************************************************************* PP

                                 TIGR01369  528 aaEfeaktpYlYstyeeekddve..vtekkkvlvlGsGpiRigqgvEFDycavhavlalreagykti 592 
                                                aaEf ++t+Y+Y tye+e+ + e   t ++k++vlG+Gp+Rigqg+EFDyc+vha+lalre gy+ti
  lcl|FitnessBrowser__Cup4G11:RR42_RS13510  535 AAEFATNTAYMYGTYEAEHGECEadPTANRKIMVLGGGPNRIGQGIEFDYCCVHAALALREDGYETI 601 
                                                *****************9665541156779************************************* PP

                                 TIGR01369  593 linynPEtvstDydiadrLyFeeltvedvldiiekekvegvivqlgGqtalnlakeleeagvkilGt 659 
                                                ++n+nPEtvstDyd++drLyFe+lt+edvl+i+++ek+ gvivq+gGqt+l+la +le++gv+i+Gt
  lcl|FitnessBrowser__Cup4G11:RR42_RS13510  602 MVNCNPETVSTDYDTSDRLYFEPLTLEDVLEIVDREKPVGVIVQYGGQTPLKLALDLEANGVPIIGT 668 
                                                ******************************************************************* PP

                                 TIGR01369  660 saesidraEdRekFsklldelgikqpkgkeatsveeakeiakeigyPvlvRpsyvlgGrameivene 726 
                                                s++ id aEdRe+F+kll+elg+ qp +++a++ +ea ++a+eigyP++vRpsyvlgGrameiv++ 
  lcl|FitnessBrowser__Cup4G11:RR42_RS13510  669 SPDMIDAAEDRERFQKLLHELGLRQPPNRTARAEDEALRLATEIGYPLVVRPSYVLGGRAMEIVHEP 735 
                                                ******************************************************************* PP

                                 TIGR01369  727 eeleryleeavevskekPvlidkyledavEvdvDavadgeevliagileHiEeaGvHsGDstlvlpp 793 
                                                 +lery++eav+vs+++Pvl+d++l+da+E+dvDa++dg++v+i g++eHiE+aGvHsGDs+++lpp
  lcl|FitnessBrowser__Cup4G11:RR42_RS13510  736 RDLERYMREAVKVSHDSPVLLDRFLNDAIECDVDALCDGQRVFIGGVMEHIEQAGVHSGDSACSLPP 802 
                                                ******************************************************************* PP

                                 TIGR01369  794 qklseevkkkikeivkkiakelkvkGllniqfvvkd....eevyviEvnvRasRtvPfvskalgvpl 856 
                                                 +l + +++++k++++++ak+l+v Gl+n+qf++++    + vyv+Evn+RasRtvP+vska+g++l
  lcl|FitnessBrowser__Cup4G11:RR42_RS13510  803 YSLAQATVDELKRQTAAMAKALNVIGLMNVQFAIQQvngeDIVYVLEVNPRASRTVPYVSKATGLSL 869 
                                                *********************************9875543569************************ PP

                                 TIGR01369  857 vklavkvllgkkleelekgvkkekksklvavkaavfsfsklagvdvvlgpemkstGEvmgigrdlee 923 
                                                +k+a+++++g++l +  +gv  e  + +++vk+avf+f+k+ gvd+vlgpem+stGEvmg+g+++ e
  lcl|FitnessBrowser__Cup4G11:RR42_RS13510  870 AKIAARCMAGQTLDS--QGVFDEVVPPYFSVKEAVFPFNKFPGVDPVLGPEMRSTGEVMGVGKTFGE 934 
                                                **************9..889*********************************************** PP

                                 TIGR01369  924 allkallaskakikkkgsvllsvkdkdkeellelakklaekglkvyategtakvleeagikaevvlk 990 
                                                al+k++la+++++++kg+vll+vkd+dk +++ +a++l+++g+ ++at+gta+++e agi ++vv+k
  lcl|FitnessBrowser__Cup4G11:RR42_RS13510  935 ALFKSQLAAGSRLPEKGTVLLTVKDSDKPHAVGVARMLHDMGYPIVATRGTASAIEAAGIPVRVVNK 1001
                                                ******************************************************************* PP

                                 TIGR01369  991 vseeaekilellkeeeielvinltskkkkaaekgykirreaveykvplvteletaealle 1050
                                                v++ +++i+++lk++e+ lv+++ +++++a  ++ +ir +a+  +vp+ t++++a+a++e
  lcl|FitnessBrowser__Cup4G11:RR42_RS13510 1002 VKDGRPHIVDMLKNGELALVFTTVDETRTAIADSRSIRISALASRVPYYTTIAGARAAVE 1061
                                                *****************************************************9999876 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (1052 nodes)
Target sequences:                          1  (1082 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.09u 0.05s 00:00:00.14 Elapsed: 00:00:00.12
# Mc/sec: 8.76
//
[ok]
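
As a rough illustration of how the domain table above can be read, the sketch below splits the single domain row into fields and computes how much of the HMM and of the candidate protein the alignment spans. It is not GapMind's own code. The column order is taken from the table header above, the row is copied verbatim, and the helper names are hypothetical.

```python
# Domain row copied from the hmmsearch output above.
ROW = ("   1 ! 1533.2   0.0         0         0       1    1050 [."
       "       2    1061 ..       2    1063 .. 0.98")
MODEL_LEN = 1052   # "Query: TIGR01369 [M=1052]"
SEQ_LEN = 1082     # "Target sequences: 1 (1082 residues searched)"


def parse_domain_row(row):
    """Split one row of the per-domain table into named fields."""
    f = row.split()
    return {
        "score": float(f[2]),
        "hmm_from": int(f[6]),
        "hmm_to": int(f[7]),
        "ali_from": int(f[9]),
        "ali_to": int(f[10]),
        "env_from": int(f[12]),
        "env_to": int(f[13]),
        "acc": float(f[15]),
    }


def coverage(start, end, total):
    """Fraction of a length-`total` sequence or model spanned by start..end."""
    return (end - start + 1) / total


dom = parse_domain_row(ROW)
model_cov = coverage(dom["hmm_from"], dom["hmm_to"], MODEL_LEN)
seq_cov = coverage(dom["ali_from"], dom["ali_to"], SEQ_LEN)

print(f"model coverage:    {model_cov:.3f}")
print(f"sequence coverage: {seq_cov:.3f}")
```

Splitting on whitespace is enough for this row; hmmsearch's `--domtblout` option produces a format intended for machine parsing and would be the sturdier choice in a real pipeline.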

This GapMind analysis is from Apr 09 2024. The underlying query database was built on Apr 09 2024.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
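
The definition of a gap given above (a step with no high- or medium-confidence candidate) can be sketched as follows. This is a minimal illustration, not GapMind's implementation: it assumes each candidate has already been assigned a confidence label, and every step name other than carB is a hypothetical placeholder.

```python
def find_gaps(candidates_by_step):
    """Return the steps that have no high- or medium-confidence candidate.

    `candidates_by_step` maps a step name to the list of confidence labels
    ("high", "medium" or "low") of its candidates.
    """
    supported = {"high", "medium"}
    return sorted(
        step
        for step, levels in candidates_by_step.items()
        if not supported.intersection(levels)
    )


# Hypothetical input: only carB comes from this page, the rest are placeholders.
example = {
    "carB": ["high"],
    "stepA": ["low", "low"],
    "stepB": [],
    "stepC": ["medium", "low"],
}

gaps = find_gaps(example)
print(gaps)
```

A step with only low-confidence candidates counts as a gap here, the same as a step with no candidates at all.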

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory