GapMind for Amino acid biosynthesis

 

Alignments for a candidate for carB in Caldicellulosiruptor kronotskyensis 2002

Align Carbamoyl-phosphate synthase pyrimidine-specific large chain; Carbamoyl-phosphate synthetase ammonia chain; EC 6.3.5.5 (characterized)
to candidate WP_013430297.1 CALKRO_RS06735 carbamoyl-phosphate synthase large subunit

Query= SwissProt::P25994
         (1071 letters)



>NCBI__GCF_000166775.1:WP_013430297.1
          Length = 1075

 Score = 1249 bits (3233), Expect = 0.0
 Identities = 624/1053 (59%), Positives = 811/1053 (77%), Gaps = 5/1053 (0%)

Query: 1    MPKRVDINKILVIGSGPIIIGQAAEFDYAGTQACLALKEEGYEVILVNSNPATIMTDTEM 60
            MP R DI K+LVIGSGPIIIGQAAEFDY+G+QAC ALKEEG EVIL+NSNPATIMTD  M
Sbjct: 1    MPLRKDIKKVLVIGSGPIIIGQAAEFDYSGSQACKALKEEGIEVILINSNPATIMTDKTM 60

Query: 61   ADRVYIEPLTPEFLTRIIRKERPDAILPTLGGQTGLNLAVELSERGVLAECGVEVLGTKL 120
            AD +YIEP+T E + +II+KER DAILPTLGGQTGLN AVEL + G+L +  V+V+GT +
Sbjct: 61   ADSIYIEPITCEIIEKIIQKERVDAILPTLGGQTGLNTAVELYKSGILDKYNVKVIGTNI 120

Query: 121  SAIQQAEDRDLFRTLMNELNEPVPESEIIHSLEEAEKFVSQIGFPVIVRPAYTLGGTGGG 180
             AI+ AEDR LF+ LM ++ EPV  SE+++ +E+   F  +IGFPVI+RPAYTLGGTGGG
Sbjct: 121  EAIEFAEDRQLFKQLMIKIGEPVVPSEVVNCVEDGLAFAKKIGFPVIIRPAYTLGGTGGG 180

Query: 181  ICSNETELKEIVENGLKLSPVHQCLLEKSIAGYKEIEYEVMRDSQDHAIVVCNMENIDPV 240
            I +NE E  EI   GL  SPVHQ L+EKSI G+KEIEYEVMRDS    I VCNMENIDPV
Sbjct: 181  IANNEEEFVEIARRGLSYSPVHQILVEKSIKGWKEIEYEVMRDSNGCLITVCNMENIDPV 240

Query: 241  GIHTGDSIVVAPSQTLSDREYQLLRNVSLKLIRALGIEGGCNVQLALDPDSFQYYIIEVN 300
            GIHTGDSIV+APSQTLSD+EYQ+LR+ +LK+I AL IEGGCNVQ AL+PDSF+Y +IEVN
Sbjct: 241  GIHTGDSIVIAPSQTLSDKEYQMLRSSALKIIDALKIEGGCNVQFALNPDSFEYAVIEVN 300

Query: 301  PRVSRSSALASKATGYPIAKLAAKIAVGLSLDEMMNPVTGKTYAAFEPALDYVVSKIPRW 360
            PRVSRSSALASKATGYPIA++AAKIA+G +LDE+ N +T  TYA+FEPALDYVV KIPRW
Sbjct: 301  PRVSRSSALASKATGYPIARIAAKIALGYTLDEIENAITKMTYASFEPALDYVVLKIPRW 360

Query: 361  PFDKFESANRKLGTQMKATGEVMAIGRTLEESLLKAVRSLEADVYHLELKDAADISDELL 420
            PFDKF  ANRKLGTQMKATGEVMAIGRT EESLLK +RSL+  + +L+L +   + +E L
Sbjct: 361  PFDKFTYANRKLGTQMKATGEVMAIGRTFEESLLKGIRSLDIGLDYLDLPELKSLDNESL 420

Query: 421  EKRIKKAGDERLFYLAEAYRRGYTVEDLHEFSAIDVFFLHKLFGIVQFEKELKANAGDTD 480
             + I +A D R+F LAEA RR Y VE L+  S +D FFLHK+  I++ E+ ++    ++ 
Sbjct: 421  SQLIIEADDRRIFALAEAIRRRYEVEYLYRISKVDRFFLHKIKNIIEMEERIRKEDLNSS 480

Query: 481  VLRRAKELGFSDQYISREWKMKESELYSLRKQAGIAPVFKMVDTCAAEFESETPYFYSTY 540
            +L  AK++GFSD+ I+   ++ E+++ SLRK   I PV+KMVDTCAAEFE++TPY+YSTY
Sbjct: 481  ILLEAKKMGFSDKTIASLKEISENDVRSLRKSLNITPVYKMVDTCAAEFEAKTPYYYSTY 540

Query: 541  EEENESVVTD----KKSVMVLGSGPIRIGQGVEFDYATVHSVWAIKQAGYEAIIVNNNPE 596
            E EN+  V+     ++ ++VLGSGPIRIGQG+EFDY +VHSV+A+ + G +++I+NNNPE
Sbjct: 541  ERENDVAVSQTSYTQRKIVVLGSGPIRIGQGIEFDYTSVHSVYALSKLGIKSVIINNNPE 600

Query: 597  TVSTDFSISDKLYFEPLTIEDVMHIIDLEQPMGVVVQFGGQTAINLADELSARGVKILGT 656
            TVSTDF  SD L+FEPLT EDV+++I+  +  GV+VQFGGQTAI L+ +L+  G+KI GT
Sbjct: 601  TVSTDFDTSDMLFFEPLTKEDVLNVIETVKAEGVIVQFGGQTAIKLSQQLAKEGIKIFGT 660

Query: 657  SLEDLDRAEDRDKFEQALGELGVPQPLGKTATSVNQAVSIASDIGYPVLVRPSYVLGGRA 716
            S E +D AEDR++F++ L +L + +P G T  ++ +A+ IA+ +GYPVLVRPSYVLGG+ 
Sbjct: 661  SAEGIDIAEDRERFDKILNKLNIKRPPGYTCYTLQEALRIANSLGYPVLVRPSYVLGGQG 720

Query: 717  MEIVYHEEELLHYMKNAVKINPQHPVLIDRYLTGKEIEVDAVSDGETVVIPGIMEHIERA 776
            M+I + +++++  +  A  +N  HP+LID+Y+ GKEIEVDA+SDGE ++IPGIMEHIERA
Sbjct: 721  MKIAFDDDDIVEMLSYAKNLN-NHPILIDKYIVGKEIEVDAISDGEDILIPGIMEHIERA 779

Query: 777  GVHSGDSIAVYPPQSLTEDIKKKIEQYTIALAKGLNIVGLLNIQFVLSQGEVYVLEVNPR 836
            G+HSGDSI++YP +++++ I++KI +YT+ +A+ L   GL+N+QF++   E+YV+EVNPR
Sbjct: 780  GIHSGDSISLYPARNISKYIEEKIVEYTLKIARELECKGLMNVQFIVQNEELYVIEVNPR 839

Query: 837  SSRTVPFLSKITGIPMANLATKIILGQKLAAFGYTEGLQPEQQGVFVKAPVFSFAKLRRV 896
             SRTVPFLSK+TG+PM  LAT + LG KL     T GL P++     K PVFSF KL  V
Sbjct: 840  GSRTVPFLSKVTGVPMVELATMVSLGYKLKDLVNTVGLLPKKDFYAFKVPVFSFEKLPDV 899

Query: 897  DITLGPEMKSTGEVMGKDSTLEKALYKALIASGIQIPNYGSVLLTVADKDKEEGLAIAKR 956
            +++LGPEMKSTGEVMG       ALYK L+ASG ++P  G VL TVAD DK E + IA++
Sbjct: 900  EVSLGPEMKSTGEVMGISKDYYVALYKGLVASGTKLPLEGGVLFTVADPDKNEIIPIAEK 959

Query: 957  FHAIGYNILATEGTAGYLKEASIPAKVVGKIGQDGPNLLDVIRNGEAQFVINTLTKGKQP 1016
            F  +G+ I AT  TA +L    + A  V K+ +  PN++D+IR GE   VINT TKG+QP
Sbjct: 960  FEKLGFKIYATSKTAKHLNFYQVAANYVKKVSEGSPNIIDLIRKGEINIVINTPTKGRQP 1019

Query: 1017 ARDGFRIRRESVENGVACLTSLDTAEAILRVLE 1049
             RDGF IRR +VEN V   TS+DTA+A++ ++E
Sbjct: 1020 QRDGFLIRRFAVENKVPIFTSVDTAKAVVEIIE 1052


Lambda     K      H
   0.316    0.135    0.377 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 2961
Number of extensions: 123
Number of successful extensions: 13
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1071
Length of database: 1075
Length adjustment: 45
Effective length of query: 1026
Effective length of database: 1030
Effective search space:  1056780
Effective search space used:  1056780
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 58 (26.9 bits)
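
The whole-percent figures and the effective search space in the BLAST report above can be reproduced with a few lines of arithmetic. The following is a minimal sketch, not part of GapMind or BLAST: the function names are hypothetical, and truncating percentages to whole numbers is an assumption (rounding gives the same values for this alignment).

```python
def whole_percent(count: int, aligned_columns: int) -> int:
    """Percentage of alignment columns, truncated to a whole number (assumed convention)."""
    return 100 * count // aligned_columns


def effective_search_space(query_len: int, db_len: int, length_adjustment: int) -> int:
    """Product of the effective query and database lengths for a one-sequence database."""
    return (query_len - length_adjustment) * (db_len - length_adjustment)


# Values copied from the report above.
identities = whole_percent(624, 1053)   # reported as 59%
positives = whole_percent(811, 1053)    # reported as 77%
gaps = whole_percent(5, 1053)           # reported as 0%
space = effective_search_space(1071, 1075, 45)  # reported as 1056780
```

With a length adjustment of 45, the effective lengths are 1026 and 1030, and their product is the reported effective search space.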

Align candidate WP_013430297.1 CALKRO_RS06735 (carbamoyl-phosphate synthase large subunit)
to HMM TIGR01369 (carB: carbamoyl-phosphate synthase, large subunit (EC 6.3.5.5))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/TIGR01369.hmm
# target sequence database:        /tmp/gapView.815045.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01369  [M=1052]
Accession:   TIGR01369
Description: CPSaseII_lrg: carbamoyl-phosphate synthase, large subunit
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
          0 1556.3   7.3          0 1556.1   7.3    1.0  1  NCBI__GCF_000166775.1:WP_013430297.1  


Domain annotation for each sequence (and alignments):
>> NCBI__GCF_000166775.1:WP_013430297.1  
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 ! 1556.1   7.3         0         0       2    1051 ..       3    1050 ..       2    1051 .. 0.99

  Alignments for each domain:
  == domain 1  score: 1556.1 bits;  conditional E-value: 0
                             TIGR01369    2 kredikkvlviGsGpivigqAaEFDYsGsqalkalkeegievvLvnsniAtvmtdeeladkvYiePltvea 72  
                                             r+dikkvlviGsGpi+igqAaEFDYsGsqa+kalkeegiev+L+nsn+At+mtd+++ad++YieP+t e+
  NCBI__GCF_000166775.1:WP_013430297.1    3 LRKDIKKVLVIGSGPIIIGQAAEFDYSGSQACKALKEEGIEVILINSNPATIMTDKTMADSIYIEPITCEI 73  
                                            689******************************************************************** PP

                             TIGR01369   73 vekiiekErpDailltlGGqtaLnlaveleekGvLekygvkllGtkveaikkaedRekFkealkeineeva 143 
                                            +ekii+kEr+Dail+tlGGqt+Ln avel ++G+L+ky+vk++Gt++eai+ aedR++Fk+++ +i+e+v 
  NCBI__GCF_000166775.1:WP_013430297.1   74 IEKIIQKERVDAILPTLGGQTGLNTAVELYKSGILDKYNVKVIGTNIEAIEFAEDRQLFKQLMIKIGEPVV 144 
                                            *********************************************************************** PP

                             TIGR01369  144 kseivesveealeaaeeigyPvivRaaftlgGtGsgiaeneeelkelvekalkaspikqvlvekslagwkE 214 
                                             se+v+ ve+ l++a++ig+Pvi+R+a+tlgGtG+gia+neee+ e+++++l+ sp++q+lveks++gwkE
  NCBI__GCF_000166775.1:WP_013430297.1  145 PSEVVNCVEDGLAFAKKIGFPVIIRPAYTLGGTGGGIANNEEEFVEIARRGLSYSPVHQILVEKSIKGWKE 215 
                                            *********************************************************************** PP

                             TIGR01369  215 iEyEvvRDskdnciivcniEnlDplGvHtGdsivvaPsqtLtdkeyqllRdaslkiirelgvegecnvqfa 285 
                                            iEyEv+RDs++  i+vcn+En+Dp+G+HtGdsiv+aPsqtL+dkeyq+lR+++lkii +l++eg+cnvqfa
  NCBI__GCF_000166775.1:WP_013430297.1  216 IEYEVMRDSNGCLITVCNMENIDPVGIHTGDSIVIAPSQTLSDKEYQMLRSSALKIIDALKIEGGCNVQFA 286 
                                            *********************************************************************** PP

                             TIGR01369  286 ldPeskryvviEvnpRvsRssALAskAtGyPiAkvaaklavGysLdelkndvtketvAsfEPslDYvvvki 356 
                                            l+P+s +y viEvnpRvsRssALAskAtGyPiA++aak+a+Gy+Lde++n +tk t+AsfEP+lDYvv ki
  NCBI__GCF_000166775.1:WP_013430297.1  287 LNPDSFEYAVIEVNPRVSRSSALASKATGYPIARIAAKIALGYTLDEIENAITKMTYASFEPALDYVVLKI 357 
                                            *********************************************************************** PP

                             TIGR01369  357 PrwdldkfekvdrklgtqmksvGEvmaigrtfeealqkalrsleekllglklkekeaesdeeleealkkpn 427 
                                            Prw++dkf+ ++rklgtqmk++GEvmaigrtfee+l+k++rsl+++l  l+l+e +  ++e+l + + +++
  NCBI__GCF_000166775.1:WP_013430297.1  358 PRWPFDKFTYANRKLGTQMKATGEVMAIGRTFEESLLKGIRSLDIGLDYLDLPELKSLDNESLSQLIIEAD 428 
                                            *********************************************************************** PP

                             TIGR01369  428 drRlfaiaealrrgvsveevyeltkidrffleklkklvelekeleeeklkelkkellkkakklGfsdeqia 498 
                                            drR+fa+aea+rr ++ve +y+++k+drffl+k+k+++e+e++++   +++l++  l +akk+Gfsd++ia
  NCBI__GCF_000166775.1:WP_013430297.1  429 DRRIFALAEAIRRRYEVEYLYRISKVDRFFLHKIKNIIEMEERIR---KEDLNSSILLEAKKMGFSDKTIA 496 
                                            ********************************************7...67788****************** PP

                             TIGR01369  499 klvkvseaevrklrkelgivpvvkrvDtvaaEfeaktpYlYstyeeekddvevtek....kkvlvlGsGpi 565 
                                            +l + se++vr+lrk+l+i+pv+k+vDt+aaEfeaktpY+Ystye+e +dv+v+++    +k++vlGsGpi
  NCBI__GCF_000166775.1:WP_013430297.1  497 SLKEISENDVRSLRKSLNITPVYKMVDTCAAEFEAKTPYYYSTYERE-NDVAVSQTsytqRKIVVLGSGPI 566 
                                            ***********************************************.8877766545568********** PP

                             TIGR01369  566 RigqgvEFDycavhavlalreagyktilinynPEtvstDydiadrLyFeeltvedvldiiekekvegvivq 636 
                                            Rigqg+EFDy++vh+v al++ g+k+++in+nPEtvstD+d++d L+Fe+lt edvl++ie+ k egvivq
  NCBI__GCF_000166775.1:WP_013430297.1  567 RIGQGIEFDYTSVHSVYALSKLGIKSVIINNNPETVSTDFDTSDMLFFEPLTKEDVLNVIETVKAEGVIVQ 637 
                                            *********************************************************************** PP

                             TIGR01369  637 lgGqtalnlakeleeagvkilGtsaesidraEdRekFsklldelgikqpkgkeatsveeakeiakeigyPv 707 
                                            +gGqta++l+++l+++g+ki+Gtsae id+aEdRe+F+k+l++l+ik+p g +  +++ea +ia+++gyPv
  NCBI__GCF_000166775.1:WP_013430297.1  638 FGGQTAIKLSQQLAKEGIKIFGTSAEGIDIAEDRERFDKILNKLNIKRPPGYTCYTLQEALRIANSLGYPV 708 
                                            *********************************************************************** PP

                             TIGR01369  708 lvRpsyvlgGrameiveneeeleryleeavevskekPvlidkyledavEvdvDavadgeevliagileHiE 778 
                                            lvRpsyvlgG++m+i+ +++++ ++l+ a +++ ++P+lidky+  ++E++vDa++dge++li+gi+eHiE
  NCBI__GCF_000166775.1:WP_013430297.1  709 LVRPSYVLGGQGMKIAFDDDDIVEMLSYAKNLN-NHPILIDKYIV-GKEIEVDAISDGEDILIPGIMEHIE 777 
                                            ***************************998776.9**********.************************* PP

                             TIGR01369  779 eaGvHsGDstlvlppqklseevkkkikeivkkiakelkvkGllniqfvvkdeevyviEvnvRasRtvPfvs 849 
                                            +aG+HsGDs+ ++p++++s+ +++ki e++ kia+el+ kGl+n+qf+v++ee+yviEvn+R sRtvPf+s
  NCBI__GCF_000166775.1:WP_013430297.1  778 RAGIHSGDSISLYPARNISKYIEEKIVEYTLKIARELECKGLMNVQFIVQNEELYVIEVNPRGSRTVPFLS 848 
                                            *********************************************************************** PP

                             TIGR01369  850 kalgvplvklavkvllgkkleelekgvkkekksklvavkaavfsfsklagvdvvlgpemkstGEvmgigrd 920 
                                            k++gvp+v+la+ v lg kl++l ++v   +k++++a k++vfsf+kl +v+v lgpemkstGEvmgi++d
  NCBI__GCF_000166775.1:WP_013430297.1  849 KVTGVPMVELATMVSLGYKLKDLVNTVGLLPKKDFYAFKVPVFSFEKLPDVEVSLGPEMKSTGEVMGISKD 919 
                                            *********************************************************************** PP

                             TIGR01369  921 leeallkallaskakikkkgsvllsvkdkdkeellelakklaekglkvyategtakvleeagikaevvlkv 991 
                                             + al+k l+as++k++ +g vl +v+d dk+e+ ++a+k++++g+k+yat++tak l+  ++ a+ v+kv
  NCBI__GCF_000166775.1:WP_013430297.1  920 YYVALYKGLVASGTKLPLEGGVLFTVADPDKNEIIPIAEKFEKLGFKIYATSKTAKHLNFYQVAANYVKKV 990 
                                            *********************************************************************** PP

                             TIGR01369  992 seeaekilellkeeeielvinltskkkkaaekgykirreaveykvplvteletaeallea 1051
                                            se +++i++l++++ei++vin+++k+++ +++g+ irr ave+kvp++t+++ta+a++e+
  NCBI__GCF_000166775.1:WP_013430297.1  991 SEGSPNIIDLIRKGEINIVINTPTKGRQPQRDGFLIRRFAVENKVPIFTSVDTAKAVVEI 1050
                                            ********************************************************9986 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (1052 nodes)
Target sequences:                          1  (1075 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.01u 0.02s 00:00:00.03 Elapsed: 00:00:00.03
# Mc/sec: 32.12
//
[ok]
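
The domain table above reports where the alignment starts and ends on both the model (hmmfrom, hmm to) and the protein (alifrom, ali to). As a rough illustration of how much of each is covered, here is a short sketch; the helper name is hypothetical, and defining coverage as aligned span divided by total length is an assumption rather than GapMind's documented formula.

```python
def span_coverage(start: int, end: int, total_length: int) -> float:
    """Fraction of a sequence or model covered by a 1-based, inclusive span."""
    if not (1 <= start <= end <= total_length):
        raise ValueError("span must lie within 1..total_length")
    return (end - start + 1) / total_length


# Values copied from the hmmsearch domain table above.
model_coverage = span_coverage(2, 1051, 1052)    # TIGR01369 has M=1052 nodes
protein_coverage = span_coverage(3, 1050, 1075)  # candidate is 1075 residues

print(f"model coverage:   {model_coverage:.3f}")
print(f"protein coverage: {protein_coverage:.3f}")
```

Under this definition the single domain spans nearly the entire model and most of the candidate protein.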

This GapMind analysis is from Jul 26 2024. The underlying query database was built on Jul 25 2024.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFAMs). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
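
To make the 50% coverage threshold concrete, the sketch below checks it against the BLAST alignment reported on this page. It illustrates only the coverage test: the function names are hypothetical, the definition of coverage as aligned query span over query length is an assumption, and the high- and medium-confidence rules are not modeled here.

```python
LOW_CONFIDENCE_MIN_COVERAGE = 0.5  # "at least 50% coverage", from the text above


def query_coverage(query_start: int, query_end: int, query_length: int) -> float:
    """Fraction of the query covered by a 1-based, inclusive aligned span (assumed definition)."""
    return (query_end - query_start + 1) / query_length


def meets_low_confidence_coverage(query_start: int, query_end: int, query_length: int) -> bool:
    """True if the hit covers at least half of the query."""
    return query_coverage(query_start, query_end, query_length) >= LOW_CONFIDENCE_MIN_COVERAGE


# The alignment above spans query positions 1..1049 of the 1071-residue P25994.
coverage = query_coverage(1, 1049, 1071)
passes = meets_low_confidence_coverage(1, 1049, 1071)
```

A hit that aligns to only a short stretch of the query, for example positions 1..400 of 1071, would fall below the threshold under this definition.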

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
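
For readers unfamiliar with the term, a six-frame translation translates a DNA sequence in all three reading frames on each strand, so that a gene missed by gene prediction can still be found by protein search. The following is a minimal sketch of the idea and not the code used by GapMind or Curated BLAST; all names are hypothetical, it uses the standard codon assignments, and it writes unknown codons as "X" and stop codons as "*".

```python
from itertools import product

BASES = "TCAG"
# Standard codon assignments, ordered with T, C, A, G at each position.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna: str) -> str:
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna: str) -> str:
    """Translate complete codons from the start of dna; a trailing partial codon is ignored."""
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frame_translation(dna: str) -> dict[str, str]:
    """Translate both strands in all three frames, keyed '+1'..'+3' and '-1'..'-3'."""
    dna = dna.upper()
    rc = reverse_complement(dna)
    frames = {}
    for offset in range(3):
        frames[f"+{offset + 1}"] = translate(dna[offset:])
        frames[f"-{offset + 1}"] = translate(rc[offset:])
    return frames


frames = six_frame_translation("ATGGCC")
```

Each of the six resulting strings can then be searched with a protein query, which is the kind of search the predicted-protein set alone does not cover.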

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory