GapMind for Amino acid biosynthesis

 

Alignments for a candidate for carB in Sulfurimonas denitrificans DSM 1251

Align carbamoyl-phosphate synthase (glutamine-hydrolysing) (EC 6.3.5.5) (characterized)
to candidate WP_011373096.1 SUDEN_RS07665 carbamoyl-phosphate synthase large subunit

Query= BRENDA::P00968
         (1073 letters)



>NCBI__GCF_000012965.1:WP_011373096.1
          Length = 1085

 Score = 1159 bits (2999), Expect = 0.0
 Identities = 609/1078 (56%), Positives = 773/1078 (71%), Gaps = 34/1078 (3%)

Query: 1    MPKRTDIKSILILGAGPIVIGQACEFDYSGAQACKALREEGYRVILVNSNPATIMTDPEM 60
            MPKRTDI +IL++G+GPI+IGQACEFDYSG QA K L+E GYRV+L+NSNPATIMTDPE 
Sbjct: 1    MPKRTDIHTILLIGSGPIIIGQACEFDYSGTQAVKTLKELGYRVVLINSNPATIMTDPEF 60

Query: 61   ADATYIEPIHWEVVRKIIEKERPDAVLPTMGGQTALNCALELERQGVLEEFGVTMIGATA 120
            AD TYIEPI  +++ KII+ E+ DAVLPTMGGQTALN A  + ++G+LE  GV  +GA+ 
Sbjct: 61   ADRTYIEPIREDIIAKIIKDEKVDAVLPTMGGQTALNVATSMYKKGMLE--GVEFLGASP 118

Query: 121  DAIDKAEDRRRFDVAMKKIGLETARSGIAHTMEEALAVAADVGFPCIIRPSFTMGGSGGG 180
            +AI K EDR  F+ AM KIG++  +S  A+++EEAL VA ++GFP I R SFT+ G G G
Sbjct: 119  EAIHKGEDRSAFNKAMIKIGMDLPKSRNAYSVEEALEVALEIGFPVISRASFTLAGGGSG 178

Query: 181  IAYNREEFEEICARGLDLSPTKELLIDESLIGWKEYEMEVVRDKNDNCIIVCSIENFDAM 240
            +AYN EEF+ +   G+  SP  E+ I ES++GWKEYEMEV+RDK DNCIIVCSIENFD M
Sbjct: 179  VAYNMEEFKILAQEGISASPVSEIEIMESMLGWKEYEMEVIRDKADNCIIVCSIENFDPM 238

Query: 241  GIHTGDSITVAPAQTLTDKEYQIMRNASMAVLREIGVETGGSNVQFAVNPKNGRLIVIEM 300
            G+HTGDSITVAPA TLTDKEYQ MR+AS  +LREIGV+TGGSNVQF+++PK GR+IVIEM
Sbjct: 239  GVHTGDSITVAPALTLTDKEYQRMRDASFDILREIGVDTGGSNVQFSIDPKTGRMIVIEM 298

Query: 301  NPRVSRSSALASKATGFPIAKVAAKLAVGYTLDELMNDITGGRTPASFEPSIDYVVTKIP 360
            NPRVSRSSALASKATG+PIAKVA  LAVG+TLDE+ NDITG  TPASFEP IDYVVTKIP
Sbjct: 299  NPRVSRSSALASKATGYPIAKVATLLAVGFTLDEITNDITG--TPASFEPVIDYVVTKIP 356

Query: 361  RFNFEKFAGANDRLTTQMKSVGEVMAIGRTQQESLQKALRGLEVGATGFDPKVSLDDPEA 420
            RF FEKF  A   L+T MKSVGEVMAIGRT +ES+QKAL  LE G  GFDP +  D    
Sbjct: 357  RFTFEKFPEAQSTLSTSMKSVGEVMAIGRTFKESIQKALCSLETGLCGFDP-IDAD---- 411

Query: 421  LTKIRRELKDAGADRIWYIADAFRAGLSVDGVFNLTNIDRWFLVQIEELVRLEEKVAEVG 480
               I+ E++   ADRI Y+A+ FR G+S++ +F+  NID WFL QIEE++++E  + +  
Sbjct: 412  FDFIKHEIRRPNADRILYVAEGFRRGMSIEEMFDTCNIDPWFLYQIEEMIKVESIIDKKI 471

Query: 481  ITGLNADFLRQLKRKGFADARLAKLAG------VREAEIRKLRDQYDLHPVYKRVDTCAA 534
            ++  +  F+R +K  GF+D R+A+L        + E ++ K +    ++  Y  VDTCAA
Sbjct: 472  LS--DETFMRSVKVDGFSDKRIAQLISQKSETKITEDDVYKAKKTLGVNLEYNEVDTCAA 529

Query: 535  EFATDTAYMYSTYEEECEANPS---TDREKIMVLGGGPNRIGQGIEFDYCCVHASLALRE 591
            EF   T Y+YST       N     ++ +K+++LGGGPNRIGQGIEFDYCCVHA+ AL+E
Sbjct: 530  EFEALTPYLYSTTNITKLPNVKNRVSEAKKVLILGGGPNRIGQGIEFDYCCVHAAFALKE 589

Query: 592  DGYETIMVNCNPETVSTDYDTSDRLYFEPVTLEDVLEIVRIEKPKGVIVQYGGQTPLKLA 651
             G ETIM NCNPETVSTDYDTSD LYFEP+  E V E++  EKP GVIV +GGQTPLKLA
Sbjct: 590  MGIETIMYNCNPETVSTDYDTSDVLYFEPIDFEHVREVIENEKPDGVIVHFGGQTPLKLA 649

Query: 652  RALEAAGVPVIGTSPDAIDRAEDRERFQHAVERLKLKQPANATVTAIEMAVEKAKEIGYP 711
             AL   G  + GT    ID AEDRE+F + V    LKQPAN      + A + A  +G+P
Sbjct: 650  NALTKIGANIAGTPSHVIDLAEDREQFSNFVNSHGLKQPANGLARTKDEAHDIALRLGFP 709

Query: 712  LVVRPSYVLGGRAMEIVYDEADLRRYFQTAVSVSNDAPVLLDHFLDDAVEVDVDAICDGE 771
            ++VRPSYVLGGR M IVY + +LR+Y   AV VSNDAPVL+D FLD A+E+DVDAICDG 
Sbjct: 710  VLVRPSYVLGGRGMRIVYSQEELRQYMDLAVLVSNDAPVLVDKFLDQAIELDVDAICDGV 769

Query: 772  MVLIGGIMEHIEQAGVHSGDSACSLPAYTLSQEIQDVMRQQVQKLAFELQVRGLMNVQFA 831
             V IG +M+HIE+AG+HSGDSACSLP  +LS+E+ D +  Q + +A  L VRGLMNVQ+A
Sbjct: 770  DVYIGSVMQHIEEAGIHSGDSACSLPPVSLSKELIDQVEAQTKTIALGLGVRGLMNVQYA 829

Query: 832  VKNNEVYLIEVNPRAARTVPFVSKATGVPLAKVAARVMAGKSL--------------AEQ 877
            +  +E+YLIEVNPRA+RTVPFVSKATG+PLAKVA RVM G++L               E 
Sbjct: 830  IYQDEIYLIEVNPRASRTVPFVSKATGMPLAKVATRVMVGETLKNALNYYDKYNIVMEEN 889

Query: 878  GVTKEVIPPYYSVKEVVLPFNKFPGVDPLLGPEMRSTGEVMGVGRTFAEAFAKAQLGSNS 937
            G+ K  +  + SVKE V PF+K  G D +LGPEM+STGEVMG+   F  +FAKAQ+ + +
Sbjct: 890  GLLKPRLKGHISVKEAVFPFHKLYGADLVLGPEMKSTGEVMGISSNFGISFAKAQIAAGN 949

Query: 938  TMKKHGRALLSVREGDKERVVDLAAKLLKQGFELDATHGTAIVLGEAGINPRLVNKVHEG 997
             +   G   LS  + DK+   ++A+ L + GF+L AT GT   + EAGI   +V K+ EG
Sbjct: 950  RIVTEGTCFLSFVDTDKKYASEIASALHRHGFKLLATKGTQASIEEAGIPCEVVLKISEG 1009

Query: 998  RPHIQDRIKNGEYTYIINTTSGRRAIEDSRVIRRSALQYKVHYDTTLNGGFATAMALN 1055
            RP+I+D +KN      INT+    + +D+ VIR+  L+  + Y TTL+   A  +AL+
Sbjct: 1010 RPNIEDSMKNDAIDMAINTSDNNTSKKDAIVIRQEVLKRNIPYFTTLSAARALILALD 1067


Lambda     K      H
   0.318    0.135    0.383 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 2895
Number of extensions: 126
Number of successful extensions: 18
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1073
Length of database: 1085
Length adjustment: 46
Effective length of query: 1027
Effective length of database: 1039
Effective search space:  1067053
Effective search space used:  1067053
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 58 (26.9 bits)
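The summary figures in the BLAST report above can be re-derived from one another. The following is a minimal sketch, not part of GapMind or BLAST; all variable and function names are illustrative. It assumes the effective lengths are the raw lengths minus the length adjustment (applied once per database sequence) and that the percentages are rounded down, which is consistent with the values shown (for example 773/1078 is reported as 71%).

```python
# Figures copied from the BLAST report above.
query_len, db_len, length_adjustment = 1073, 1085, 46
num_db_sequences = 1

# Assumption: effective length = raw length minus the length adjustment,
# applied once per sequence in the database.
effective_query = query_len - length_adjustment
effective_db = db_len - num_db_sequences * length_adjustment
search_space = effective_query * effective_db

identities, positives, gaps, aligned_columns = 609, 773, 34, 1078


def truncated_percent(count: int, total: int) -> int:
    """Percentage rounded down (an assumption that matches the report)."""
    return 100 * count // total


print("effective query length:", effective_query)
print("effective database length:", effective_db)
print("effective search space:", search_space)
print("identity %:", truncated_percent(identities, aligned_columns))
print("positives %:", truncated_percent(positives, aligned_columns))
print("gaps %:", truncated_percent(gaps, aligned_columns))
```

The printed values agree with the "Effective length", "Effective search space", and "Identities / Positives / Gaps" lines of the report.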

Align candidate WP_011373096.1 SUDEN_RS07665 (carbamoyl-phosphate synthase large subunit)
to HMM TIGR01369 (carB: carbamoyl-phosphate synthase, large subunit (EC 6.3.5.5))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/TIGR01369.hmm
# target sequence database:        /tmp/gapView.17996.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01369  [M=1052]
Accession:   TIGR01369
Description: CPSaseII_lrg: carbamoyl-phosphate synthase, large subunit
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                 Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                 -----------
          0 1452.0   2.1          0 1451.8   2.1    1.0  1  lcl|NCBI__GCF_000012965.1:WP_011373096.1  SUDEN_RS07665 carbamoyl-phosphat


Domain annotation for each sequence (and alignments):
>> lcl|NCBI__GCF_000012965.1:WP_011373096.1  SUDEN_RS07665 carbamoyl-phosphate synthase large subunit
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 ! 1451.8   2.1         0         0       1    1051 [.       2    1065 ..       2    1066 .. 0.97

  Alignments for each domain:
  == domain 1  score: 1451.8 bits;  conditional E-value: 0
                                 TIGR01369    1 pkredikkvlviGsGpivigqAaEFDYsGsqalkalkeegievvLvnsniAtvmtdeeladkvYieP 67  
                                                pkr+di+++l+iGsGpi+igqA+EFDYsG+qa+k+lke g++vvL+nsn+At+mtd+e+ad++YieP
  lcl|NCBI__GCF_000012965.1:WP_011373096.1    2 PKRTDIHTILLIGSGPIIIGQACEFDYSGTQAVKTLKELGYRVVLINSNPATIMTDPEFADRTYIEP 68  
                                                689**************************************************************** PP

                                 TIGR01369   68 ltveavekiiekErpDailltlGGqtaLnlaveleekGvLekygvkllGtkveaikkaedRekFkea 134 
                                                + + +++kii+ E++Da+l+t+GGqtaLn+a ++ +kG+Le   v+ lG++ eai+k+edR +F++a
  lcl|NCBI__GCF_000012965.1:WP_011373096.1   69 IREDIIAKIIKDEKVDAVLPTMGGQTALNVATSMYKKGMLEG--VEFLGASPEAIHKGEDRSAFNKA 133 
                                                ****************************************96..*********************** PP

                                 TIGR01369  135 lkeineevakseivesveealeaaeeigyPvivRaaftlgGtGsgiaeneeelkelvekalkaspik 201 
                                                + +i+++++ks+ + sveeale+a eig+Pvi Ra+ftl+G Gsg+a+n+ee+k l++++++asp++
  lcl|NCBI__GCF_000012965.1:WP_011373096.1  134 MIKIGMDLPKSRNAYSVEEALEVALEIGFPVISRASFTLAGGGSGVAYNMEEFKILAQEGISASPVS 200 
                                                ******************************************************************* PP

                                 TIGR01369  202 qvlvekslagwkEiEyEvvRDskdnciivcniEnlDplGvHtGdsivvaPsqtLtdkeyqllRdasl 268 
                                                ++ + +s+ gwkE+E+Ev+RD++dnciivc+iEn+Dp+GvHtGdsi+vaP+ tLtdkeyq++Rdas+
  lcl|NCBI__GCF_000012965.1:WP_011373096.1  201 EIEIMESMLGWKEYEMEVIRDKADNCIIVCSIENFDPMGVHTGDSITVAPALTLTDKEYQRMRDASF 267 
                                                ******************************************************************* PP

                                 TIGR01369  269 kiirelgvege.cnvqfaldPeskryvviEvnpRvsRssALAskAtGyPiAkvaaklavGysLdelk 334 
                                                 i+re+gv+++ +nvqf++dP++ r++viE+npRvsRssALAskAtGyPiAkva+ lavG++Lde++
  lcl|NCBI__GCF_000012965.1:WP_011373096.1  268 DILREIGVDTGgSNVQFSIDPKTGRMIVIEMNPRVSRSSALASKATGYPIAKVATLLAVGFTLDEIT 334 
                                                ********9988******************************************************* PP

                                 TIGR01369  335 ndvtketvAsfEPslDYvvvkiPrwdldkfekvdrklgtqmksvGEvmaigrtfeealqkalrslee 401 
                                                nd+t+ t+AsfEP +DYvv+kiPr+ ++kf +++++l+t+mksvGEvmaigrtf+e++qkal+sle+
  lcl|NCBI__GCF_000012965.1:WP_011373096.1  335 NDITG-TPASFEPVIDYVVTKIPRFTFEKFPEAQSTLSTSMKSVGEVMAIGRTFKESIQKALCSLET 400 
                                                ****9.99*********************************************************** PP

                                 TIGR01369  402 kllglklkekeaesdeeleealkkpndrRlfaiaealrrgvsveevyeltkidrffleklkklvele 468 
                                                +l g++       + + +++++++pn++R++++ae +rrg+s+ee++++++id +fl+++++++++e
  lcl|NCBI__GCF_000012965.1:WP_011373096.1  401 GLCGFDPI---DADFDFIKHEIRRPNADRILYVAEGFRRGMSIEEMFDTCNIDPWFLYQIEEMIKVE 464 
                                                ****9986...4455556789********************************************** PP

                                 TIGR01369  469 keleeeklkelkkellkkakklGfsdeqiaklvkvseae......vrklrkelgivpvvkrvDtvaa 529 
                                                + ++++ l+  ++  ++++k  Gfsd++ia+l+++++++      v k++k+lg+   +++vDt+aa
  lcl|NCBI__GCF_000012965.1:WP_011373096.1  465 SIIDKKILS--DETFMRSVKVDGFSDKRIAQLISQKSETkiteddVYKAKKTLGVNLEYNEVDTCAA 529 
                                                ***977776..8899*****************9955444455555********************** PP

                                 TIGR01369  530 EfeaktpYlYstyeee...kddvevtekkkvlvlGsGpiRigqgvEFDycavhavlalreagyktil 593 
                                                Efea tpYlYst + +   + +++v+e kkvl+lG+Gp+Rigqg+EFDyc+vha+ al+e+g++ti+
  lcl|NCBI__GCF_000012965.1:WP_011373096.1  530 EFEALTPYLYSTTNITklpNVKNRVSEAKKVLILGGGPNRIGQGIEFDYCCVHAAFALKEMGIETIM 596 
                                                ************99884433456677889************************************** PP

                                 TIGR01369  594 inynPEtvstDydiadrLyFeeltvedvldiiekekvegvivqlgGqtalnlakeleeagvkilGts 660 
                                                 n+nPEtvstDyd++d LyFe++ +e+v ++ie+ek++gviv++gGqt+l+la++l++ g++i Gt 
  lcl|NCBI__GCF_000012965.1:WP_011373096.1  597 YNCNPETVSTDYDTSDVLYFEPIDFEHVREVIENEKPDGVIVHFGGQTPLKLANALTKIGANIAGTP 663 
                                                ******************************************************************* PP

                                 TIGR01369  661 aesidraEdRekFsklldelgikqpkgkeatsveeakeiakeigyPvlvRpsyvlgGrameivenee 727 
                                                 ++id aEdRe+Fs+++++ g+kqp++  a++++ea++ia ++g+PvlvRpsyvlgGr+m+iv+++e
  lcl|NCBI__GCF_000012965.1:WP_011373096.1  664 SHVIDLAEDREQFSNFVNSHGLKQPANGLARTKDEAHDIALRLGFPVLVRPSYVLGGRGMRIVYSQE 730 
                                                ******************************************************************* PP

                                 TIGR01369  728 eleryleeavevskekPvlidkyledavEvdvDavadgeevliagileHiEeaGvHsGDstlvlppq 794 
                                                el++y++ av vs+++Pvl+dk+l++a+E+dvDa++dg +v+i ++++HiEeaG+HsGDs+++lpp 
  lcl|NCBI__GCF_000012965.1:WP_011373096.1  731 ELRQYMDLAVLVSNDAPVLVDKFLDQAIELDVDAICDGVDVYIGSVMQHIEEAGIHSGDSACSLPPV 797 
                                                ******************************************************************* PP

                                 TIGR01369  795 klseevkkkikeivkkiakelkvkGllniqfvvkdeevyviEvnvRasRtvPfvskalgvplvklav 861 
                                                +ls+e+ +++++++k+ia  l v+Gl+n+q+++ ++e+y+iEvn+RasRtvPfvska+g+pl+k+a+
  lcl|NCBI__GCF_000012965.1:WP_011373096.1  798 SLSKELIDQVEAQTKTIALGLGVRGLMNVQYAIYQDEIYLIEVNPRASRTVPFVSKATGMPLAKVAT 864 
                                                ******************************************************************* PP

                                 TIGR01369  862 kvllgkkleele............kgvkkekksklvavkaavfsfsklagvdvvlgpemkstGEvmg 916 
                                                +v++g++l+++             +g+ k   + +++vk+avf+f+kl+g+d+vlgpemkstGEvmg
  lcl|NCBI__GCF_000012965.1:WP_011373096.1  865 RVMVGETLKNALnyydkynivmeeNGLLKPRLKGHISVKEAVFPFHKLYGADLVLGPEMKSTGEVMG 931 
                                                **********88999***999999899999999********************************** PP

                                 TIGR01369  917 igrdleeallkallaskakikkkgsvllsvkdkdkeellelakklaekglkvyategtakvleeagi 983 
                                                i++++  +++ka++a++++i ++g+++ls  d+dk+++ e+a+ l+++g+k++at+gt++ +eeagi
  lcl|NCBI__GCF_000012965.1:WP_011373096.1  932 ISSNFGISFAKAQIAAGNRIVTEGTCFLSFVDTDKKYASEIASALHRHGFKLLATKGTQASIEEAGI 998 
                                                ******************************************************************* PP

                                 TIGR01369  984 kaevvlkvseeaekilellkeeeielvinltskkkkaaekgykirreaveykvplvteletaealle 1050
                                                 +evvlk+se +++i + +k++ i+++in+++ ++++++++ +ir+e++++++p++t+l++a+al+ 
  lcl|NCBI__GCF_000012965.1:WP_011373096.1  999 PCEVVLKISEGRPNIEDSMKNDAIDMAINTSD-NNTSKKDAIVIRQEVLKRNIPYFTTLSAARALIL 1064
                                                *****************************998.55688899*********************99887 PP

                                 TIGR01369 1051 a 1051
                                                a
  lcl|NCBI__GCF_000012965.1:WP_011373096.1 1065 A 1065
                                                6 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (1052 nodes)
Target sequences:                          1  (1085 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.09u 0.04s 00:00:00.13 Elapsed: 00:00:00.13
# Mc/sec: 8.74
//
[ok]
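The domain table in the hmmsearch output above gives the model and sequence coordinates of the hit, from which coverage can be computed. The following is a minimal sketch under stated assumptions: the domain line is copied from the report, the column positions follow the header row shown there, and the names (`model_coverage`, `sequence_coverage`) are illustrative rather than anything GapMind defines.

```python
# Model length from "Query: TIGR01369 [M=1052]" and target length from
# "Target sequences: 1 (1085 residues searched)".
model_len = 1052
target_len = 1085

# Domain line copied from the hmmsearch report above.
domain_line = (
    "   1 ! 1451.8   2.1         0         0       1    1051 [.  "
    "     2    1065 ..       2    1066 .. 0.97"
)

# Whitespace-separated columns, per the header row:
# #, flag, score, bias, c-Evalue, i-Evalue, hmmfrom, hmm to, bounds,
# alifrom, ali to, bounds, envfrom, env to, bounds, acc
fields = domain_line.split()
hmm_from, hmm_to = int(fields[6]), int(fields[7])
ali_from, ali_to = int(fields[9]), int(fields[10])

# Coordinates are 1-based and inclusive, hence the +1.
model_coverage = (hmm_to - hmm_from + 1) / model_len
sequence_coverage = (ali_to - ali_from + 1) / target_len

print(f"model coverage:    {model_coverage:.3f}")
print(f"sequence coverage: {sequence_coverage:.3f}")
```

Here the hit spans positions 1 to 1051 of the 1052-node model and residues 2 to 1065 of the 1085-residue candidate.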

This GapMind analysis is from Apr 10 2024. The underlying query database was built on Apr 09 2024.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFAMs). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory