GapMind for Amino acid biosynthesis

 

Alignments for a candidate for carB in Caldicellulosiruptor kronotskyensis 2002

Align Carbamoyl-phosphate synthase pyrimidine-specific large chain; Carbamoyl-phosphate synthetase ammonia chain; EC 6.3.5.5 (characterized)
to candidate WP_013430148.1 CALKRO_RS05915 carbamoyl-phosphate synthase (glutamine-hydrolyzing) large subunit

Query= SwissProt::P25994
         (1071 letters)



>NCBI__GCF_000166775.1:WP_013430148.1
          Length = 1077

 Score = 1221 bits (3158), Expect = 0.0
 Identities = 626/1056 (59%), Positives = 791/1056 (74%), Gaps = 13/1056 (1%)

Query: 1    MPKRVDINKILVIGSGPIIIGQAAEFDYAGTQACLALKEEGYEVILVNSNPATIMTDTEM 60
            MPKR DI K+L+IGSGPI+IGQAAEFDY+GTQAC ALKEEG EV+LVNSNPATIMTDTE+
Sbjct: 1    MPKRKDIKKVLIIGSGPIVIGQAAEFDYSGTQACRALKEEGIEVVLVNSNPATIMTDTEI 60

Query: 61   ADRVYIEPLTPEFLTRIIRKERPDAILPTLGGQTGLNLAVELSERGVLAECGVEVLGTKL 120
            ADRVYIEP++ +++  II+KERP  +L  LGGQT LN+A EL+E G+L + GV +LGT L
Sbjct: 61   ADRVYIEPISVDYIEEIIKKERPQGLLAGLGGQTALNMAFELAEAGILEKYGVCLLGTSL 120

Query: 121  SAIQQAEDRDLFRTLMNELNEPVPESEIIHSLEEAEKFVSQIGFPVIVRPAYTLGGTGGG 180
              I++AEDR+LF+  M E+ EPVP+S I HS++EA +F  ++G+PVIVRPAYTLGGTGGG
Sbjct: 121  ETIKKAEDRELFKKTMIEIGEPVPKSIIAHSVQEAIEFAREVGYPVIVRPAYTLGGTGGG 180

Query: 181  ICSNETELKEIVENGLKLSPVHQCLLEKSIAGYKEIEYEVMRDSQDHAIVVCNMENIDPV 240
            I  NE EL+ I   GLKLS +HQ L+E+S+ G+KEIEYEVMRDS D+ I VCNMENIDPV
Sbjct: 181  IAYNEEELRYIASKGLKLSLIHQVLIEQSVLGWKEIEYEVMRDSNDNCITVCNMENIDPV 240

Query: 241  GIHTGDSIVVAPSQTLSDREYQLLRNVSLKLIRALGIEGGCNVQLALDPDSFQYYIIEVN 300
            GIHTGDSIVVAPSQTLSD+EYQ+LR+ SL +IR+L IEGGCNVQ AL+P++ +Y +IEVN
Sbjct: 241  GIHTGDSIVVAPSQTLSDKEYQMLRSASLNIIRSLKIEGGCNVQFALNPNNMEYVVIEVN 300

Query: 301  PRVSRSSALASKATGYPIAKLAAKIAVGLSLDEMMNPVTGKTYAAFEPALDYVVSKIPRW 360
            PRVSRSSALASKATGYPIA++AAKIA+GL+LDE++NP+T  TYA+FEP++DYVV K+PRW
Sbjct: 301  PRVSRSSALASKATGYPIARIAAKIAIGLTLDEIINPITQNTYASFEPSIDYVVVKVPRW 360

Query: 361  PFDKFESANRKLGTQMKATGEVMAIGRTLEESLLKAVRSLEADV-YHLELKDAADISDEL 419
            PFDKFE A+R+LGTQMK+TGEVMAIGRT EE+ LKA+ SL+  + Y L LK   ++ D+ 
Sbjct: 361  PFDKFEKADRRLGTQMKSTGEVMAIGRTFEEAFLKAIDSLDVKINYQLGLKKFEEMPDDQ 420

Query: 420  LEKRIKKAGDERLFYLAEAYRRGYTVEDLHEFSAIDVFFLHKLFGIVQFEKELKANAGDT 479
            L + IK   DER+F + EA  R Y  + + + S ID FF+ K   IV   K+LK    ++
Sbjct: 421  LLEYIKTPNDERVFAICEALSRNYDCKFISDLSKIDYFFIEKFKNIVDMSKQLKKYDIES 480

Query: 480  ---DVLRRAKELGFSDQYISREWKMKESELYSLRKQAGIAPVFKMVDTCAAEFESETPYF 536
               D+L++AK LGF D YI+   K    E+  +R++  + P FKMVDTCA EFE++TPYF
Sbjct: 481  LPYDLLQKAKRLGFGDSYIANLLKEDVDEVIEIREKCKLKPSFKMVDTCAGEFEAKTPYF 540

Query: 537  YSTYEEENESVVTDKKSVMVLGSGPIRIGQGVEFDYATVHSVWAIKQAGYEAIIVNNNPE 596
            YSTYE+E + VV+ K   +V+GSGPIRIGQG+EFDY  VHS++A+K+ G EAII+NNNPE
Sbjct: 541  YSTYEKETDLVVSSKPKAIVIGSGPIRIGQGIEFDYCCVHSIFALKEEGVEAIIINNNPE 600

Query: 597  TVSTDFSISDKLYFEPLTIEDVMHIIDLEQPMGVVVQFGGQTAINLADELSARGVKILGT 656
            TVSTDF  SDKL+FEPLT E V+ II  E+PMGV+VQFGGQTAIN+A  L+  GVKILGT
Sbjct: 601  TVSTDFDTSDKLFFEPLTKECVLDIIKQEKPMGVIVQFGGQTAINMASYLAKNGVKILGT 660

Query: 657  SLEDLDRAEDRDKFEQALGELGVPQPLGKTATSVNQAVSIASDIGYPVLVRPSYVLGGRA 716
            S+E +D AEDRDKF   L  L +P P G  A S+  AV +A  IGYPVLVRPSYVLGGRA
Sbjct: 661  SMESIDTAEDRDKFLNLLKNLNIPYPPGGAAYSLEDAVKVAQQIGYPVLVRPSYVLGGRA 720

Query: 717  MEIVYHEEELLHYMKNAVKINPQHPVLIDRYLTGKEIEVDAVSDGETVVIPGIMEHIERA 776
            MEIVY  EEL  Y+K A++I+ +HP+LID+Y+ GKE EVD +SDGE V+IPGIMEHIERA
Sbjct: 721  MEIVYSREELEKYIKAAIEISIKHPILIDKYILGKEAEVDGISDGEDVLIPGIMEHIERA 780

Query: 777  GVHSGDSIAVYPPQSLTEDIKKKIEQYTIALAKGLNIVGLLNIQFVLSQGE-VYVLEVNP 835
            GVHSGDS+AV+PP +L+E +K+KI  YTI LA+ L +VGL NIQFV+ + E VYV+EVNP
Sbjct: 781  GVHSGDSMAVFPPHTLSEKVKEKIIDYTIKLARALRVVGLFNIQFVIDKDENVYVIEVNP 840

Query: 836  RSSRTVPFLSKITGIPMANLATKIILGQKLAAFGYTEGLQPEQQGVFVKAPVFSFAKLRR 895
            R+SRTVP LSK+TGIPM  +ATK+ILG+KL   GY  GL  E     VKAPVFSF+KL +
Sbjct: 841  RASRTVPILSKVTGIPMIKIATKLILGKKLKDLGYQTGLVKEPDFFAVKAPVFSFSKLSK 900

Query: 896  VDITLGPEMKSTGEVMGKDSTLEKALYKALIASGIQIPNYGSVLLTVADKDKEEGLAIAK 955
            VD  LGPEMKSTGEV+G    L+ ALYKA I+S  +    GS L+   + +K+    I +
Sbjct: 901  VDAYLGPEMKSTGEVLGISKNLKVALYKAFISSNHKFMKNGSCLILAPESEKDAIQQIIR 960

Query: 956  RFHAIGYNILATEGTAGYLKEASIPAKVVGKIGQDGPNLLDVIRNGEAQFVINTLTKGKQ 1015
            + + + Y +   +    Y+K  ++          D      ++   +  FVIN  +K K 
Sbjct: 961  KLYEVNYKVFLVDSMKDYIKGLNVEF-------IDKETAQKLLLEDKFSFVINIPSKDKM 1013

Query: 1016 PARDGFRIRRESVENGVACLTSLDTAEAILRVLESM 1051
                GF +RR SVE G+  LTS+DTA   + VL S+
Sbjct: 1014 -QEFGFVLRRLSVEFGITTLTSIDTALYYVDVLSSL 1048


Lambda     K      H
   0.316    0.135    0.377 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 2980
Number of extensions: 108
Number of successful extensions: 16
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1071
Length of database: 1077
Length adjustment: 45
Effective length of query: 1026
Effective length of database: 1032
Effective search space:  1058832
Effective search space used:  1058832
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 58 (26.9 bits)
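
The figures in the BLAST report above can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of GapMind or BLAST. It assumes the standard Karlin-Altschul conversion from raw score to bit score, S' = (lambda * S - ln K) / ln 2, using the gapped lambda and K printed above, and it assumes the effective search space is the product of the two length-adjusted sequence lengths. Truncating percentages to whole numbers is inferred from this report only, and all variable names are illustrative.

```python
import math

# Values copied from the BLAST report above.
raw_score = 3158
gapped_lambda, gapped_k = 0.267, 0.0410
query_len, subject_len, length_adjustment = 1071, 1077, 45
identities, positives, gaps, aligned_columns = 626, 791, 13, 1056

# Bit score under the assumed Karlin-Altschul conversion.
bit_score = (gapped_lambda * raw_score - math.log(gapped_k)) / math.log(2)

# Effective search space as the product of the length-adjusted lengths.
effective_space = (query_len - length_adjustment) * (subject_len - length_adjustment)

# log10 of E = effective_space * 2**(-bit_score); working in logs avoids
# floating-point underflow for very small E-values.
log10_evalue = math.log10(effective_space) - bit_score * math.log10(2)


def percent(count, total):
    """Whole-number percentage, truncated (assumption based on this report)."""
    return 100 * count // total


print(f"bit score: {bit_score:.1f}")
print(f"effective search space: {effective_space}")
print(f"log10(E-value): {log10_evalue:.1f}")
print(f"identities: {percent(identities, aligned_columns)}%")
print(f"positives: {percent(positives, aligned_columns)}%")
print(f"gaps: {percent(gaps, aligned_columns)}%")
```

Under these assumptions the computed bit score rounds to the reported 1221 bits, the search space matches the reported 1058832, and the E-value is far too small to print as anything other than 0.0.
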

Align candidate WP_013430148.1 CALKRO_RS05915 (carbamoyl-phosphate synthase (glutamine-hydrolyzing) large subunit)
to HMM TIGR01369 (carB: carbamoyl-phosphate synthase, large subunit (EC 6.3.5.5))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/TIGR01369.hmm
# target sequence database:        /tmp/gapView.493526.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01369  [M=1052]
Accession:   TIGR01369
Description: CPSaseII_lrg: carbamoyl-phosphate synthase, large subunit
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
          0 1567.2   5.4          0 1567.0   5.4    1.0  1  NCBI__GCF_000166775.1:WP_013430148.1  


Domain annotation for each sequence (and alignments):
>> NCBI__GCF_000166775.1:WP_013430148.1  
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 ! 1567.0   5.4         0         0       1    1050 [.       2    1043 ..       2    1045 .. 0.98

  Alignments for each domain:
  == domain 1  score: 1567.0 bits;  conditional E-value: 0
                             TIGR01369    1 pkredikkvlviGsGpivigqAaEFDYsGsqalkalkeegievvLvnsniAtvmtdeeladkvYiePltve 71  
                                            pkr+dikkvl+iGsGpivigqAaEFDYsG+qa++alkeegievvLvnsn+At+mtd+e+ad+vYieP+ v 
  NCBI__GCF_000166775.1:WP_013430148.1    2 PKRKDIKKVLIIGSGPIVIGQAAEFDYSGTQACRALKEEGIEVVLVNSNPATIMTDTEIADRVYIEPISVD 72  
                                            689******************************************************************** PP

                             TIGR01369   72 avekiiekErpDailltlGGqtaLnlaveleekGvLekygvkllGtkveaikkaedRekFkealkeineev 142 
                                            ++e+ii+kErp ++l++lGGqtaLn+a el e+G+Lekygv llGt++e+ikkaedRe+Fk+++ ei+e+v
  NCBI__GCF_000166775.1:WP_013430148.1   73 YIEEIIKKERPQGLLAGLGGQTALNMAFELAEAGILEKYGVCLLGTSLETIKKAEDRELFKKTMIEIGEPV 143 
                                            *********************************************************************** PP

                             TIGR01369  143 akseivesveealeaaeeigyPvivRaaftlgGtGsgiaeneeelkelvekalkaspikqvlvekslagwk 213 
                                            +ks i++sv+ea+e+a+e+gyPvivR+a+tlgGtG+gia+neeel+ +++k+lk+s i+qvl+e+s+ gwk
  NCBI__GCF_000166775.1:WP_013430148.1  144 PKSIIAHSVQEAIEFAREVGYPVIVRPAYTLGGTGGGIAYNEEELRYIASKGLKLSLIHQVLIEQSVLGWK 214 
                                            *********************************************************************** PP

                             TIGR01369  214 EiEyEvvRDskdnciivcniEnlDplGvHtGdsivvaPsqtLtdkeyqllRdaslkiirelgvegecnvqf 284 
                                            EiEyEv+RDs+dnci+vcn+En+Dp+G+HtGdsivvaPsqtL+dkeyq+lR+asl+iir+l++eg+cnvqf
  NCBI__GCF_000166775.1:WP_013430148.1  215 EIEYEVMRDSNDNCITVCNMENIDPVGIHTGDSIVVAPSQTLSDKEYQMLRSASLNIIRSLKIEGGCNVQF 285 
                                            *********************************************************************** PP

                             TIGR01369  285 aldPeskryvviEvnpRvsRssALAskAtGyPiAkvaaklavGysLdelkndvtketvAsfEPslDYvvvk 355 
                                            al+P++ +yvviEvnpRvsRssALAskAtGyPiA++aak+a+G++Lde+ n++t++t+AsfEPs+DYvvvk
  NCBI__GCF_000166775.1:WP_013430148.1  286 ALNPNNMEYVVIEVNPRVSRSSALASKATGYPIARIAAKIAIGLTLDEIINPITQNTYASFEPSIDYVVVK 356 
                                            *********************************************************************** PP

                             TIGR01369  356 iPrwdldkfekvdrklgtqmksvGEvmaigrtfeealqkalrsleekllg.lklkekeaesdeeleealkk 425 
                                            +Prw++dkfek+dr+lgtqmks+GEvmaigrtfeea++ka+ sl+ k+   l lk+ e+++d++l e +k+
  NCBI__GCF_000166775.1:WP_013430148.1  357 VPRWPFDKFEKADRRLGTQMKSTGEVMAIGRTFEEAFLKAIDSLDVKINYqLGLKKFEEMPDDQLLEYIKT 427 
                                            **********************************************98764889999999*********** PP

                             TIGR01369  426 pndrRlfaiaealrrgvsveevyeltkidrffleklkklvelekeleeeklkelkkellkkakklGfsdeq 496 
                                            pnd+R+fai+eal+r+++ + + +l+kid ff+ek+k++v+++k+l++  ++ l+ +ll+kak+lGf d+ 
  NCBI__GCF_000166775.1:WP_013430148.1  428 PNDERVFAICEALSRNYDCKFISDLSKIDYFFIEKFKNIVDMSKQLKKYDIESLPYDLLQKAKRLGFGDSY 498 
                                            *********************************************************************** PP

                             TIGR01369  497 iaklvkvseaevrklrkelgivpvvkrvDtvaaEfeaktpYlYstyeeekddvevtekkkvlvlGsGpiRi 567 
                                            ia+l+k + +ev ++r++ ++ p++k+vDt+a+EfeaktpY+Ystye+e +d  v++k k +v+GsGpiRi
  NCBI__GCF_000166775.1:WP_013430148.1  499 IANLLKEDVDEVIEIREKCKLKPSFKMVDTCAGEFEAKTPYFYSTYEKE-TDLVVSSKPKAIVIGSGPIRI 568 
                                            *************************************************.999999999************ PP

                             TIGR01369  568 gqgvEFDycavhavlalreagyktilinynPEtvstDydiadrLyFeeltvedvldiiekekvegvivqlg 638 
                                            gqg+EFDyc+vh++ al+e+g ++i+in+nPEtvstD+d++d+L+Fe+lt e+vldii++ek++gvivq+g
  NCBI__GCF_000166775.1:WP_013430148.1  569 GQGIEFDYCCVHSIFALKEEGVEAIIINNNPETVSTDFDTSDKLFFEPLTKECVLDIIKQEKPMGVIVQFG 639 
                                            *********************************************************************** PP

                             TIGR01369  639 GqtalnlakeleeagvkilGtsaesidraEdRekFsklldelgikqpkgkeatsveeakeiakeigyPvlv 709 
                                            Gqta+n+a+ l+++gvkilGts+esid+aEdR+kF +ll++l+i+ p g +a s+e+a+++a++igyPvlv
  NCBI__GCF_000166775.1:WP_013430148.1  640 GQTAINMASYLAKNGVKILGTSMESIDTAEDRDKFLNLLKNLNIPYPPGGAAYSLEDAVKVAQQIGYPVLV 710 
                                            *********************************************************************** PP

                             TIGR01369  710 RpsyvlgGrameiveneeeleryleeavevskekPvlidkyledavEvdvDavadgeevliagileHiEea 780 
                                            RpsyvlgGrameiv+++eele+y++ a+e+s ++P+lidky+  ++E++vD ++dge+vli+gi+eHiE+a
  NCBI__GCF_000166775.1:WP_013430148.1  711 RPSYVLGGRAMEIVYSREELEKYIKAAIEISIKHPILIDKYIL-GKEAEVDGISDGEDVLIPGIMEHIERA 780 
                                            ******************************************9.*************************** PP

                             TIGR01369  781 GvHsGDstlvlppqklseevkkkikeivkkiakelkvkGllniqfvvkd.eevyviEvnvRasRtvPfvsk 850 
                                            GvHsGDs++v+pp++lse+vk+ki +++ k+a++l+v+Gl+niqfv+++ e+vyviEvn+RasRtvP++sk
  NCBI__GCF_000166775.1:WP_013430148.1  781 GVHSGDSMAVFPPHTLSEKVKEKIIDYTIKLARALRVVGLFNIQFVIDKdENVYVIEVNPRASRTVPILSK 851 
                                            **********************************************976599******************* PP

                             TIGR01369  851 algvplvklavkvllgkkleelekgvkkekksklvavkaavfsfsklagvdvvlgpemkstGEvmgigrdl 921 
                                            ++g+p++k+a+k++lgkkl++l++     k+++++avka+vfsfskl++vd +lgpemkstGEv gi+++l
  NCBI__GCF_000166775.1:WP_013430148.1  852 VTGIPMIKIATKLILGKKLKDLGYQTGLVKEPDFFAVKAPVFSFSKLSKVDAYLGPEMKSTGEVLGISKNL 922 
                                            *********************************************************************** PP

                             TIGR01369  922 eeallkallaskakikkkgsvllsvkdkdkeellelakklaekglkvyategtakvleeagikaevvlkvs 992 
                                            + al+ka+++s++k+ k+gs+l+   +++k++++++++kl e  +kv+  + ++++++    ++e +    
  NCBI__GCF_000166775.1:WP_013430148.1  923 KVALYKAFISSNHKFMKNGSCLILAPESEKDAIQQIIRKLYEVNYKVFLVDSMKDYIKG--LNVEFI---- 987 
                                            ***************************************************88766543..333333.... PP

                             TIGR01369  993 eeaekilellkeeeielvinltskkkkaaekgykirreaveykvplvteletaealle 1050
                                             ++e++++ll e+++++vin++sk+k ++e g+++rr +ve++++++t+++ta  ++ 
  NCBI__GCF_000166775.1:WP_013430148.1  988 -DKETAQKLLLEDKFSFVINIPSKDK-MQEFGFVLRRLSVEFGITTLTSIDTALYYVD 1043
                                            .3456778899***********9655.8999********************9988776 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (1052 nodes)
Target sequences:                          1  (1077 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.02u 0.01s 00:00:00.03 Elapsed: 00:00:00.02
# Mc/sec: 41.25
//
[ok]
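
The domain table above lists the coordinates needed to estimate how much of the HMM and how much of the candidate protein the alignment covers. The following is a minimal sketch that assumes coverage means the aligned span divided by the total length. That definition and the helper name `span_coverage` are illustrative and are not taken from GapMind's code.

```python
def span_coverage(start, end, total_length):
    """Fraction of a sequence or model covered by an inclusive, 1-based span."""
    return (end - start + 1) / total_length


# hmmfrom/hmm to and the model length M come from the hmmsearch output above.
hmm_coverage = span_coverage(1, 1050, 1052)

# alifrom/ali to and the candidate length (1077 residues) from the same output.
seq_coverage = span_coverage(2, 1043, 1077)

print(f"model coverage: {hmm_coverage:.3f}")
print(f"candidate coverage: {seq_coverage:.3f}")
```

Both values come out close to 1, which is consistent with a near full-length match between the candidate and the TIGR01369 model.
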

This GapMind analysis is from Jul 26 2024. The underlying query database was built on Jul 25 2024.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
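
The 50% coverage floor stated above can be checked against a BLAST alignment in a few lines. The following is a hedged sketch. It assumes coverage is measured on the characterized (query) protein as the aligned span over its full length, which this page does not spell out, and the function names are hypothetical. It tests only the low-confidence floor, because the high- and medium-confidence thresholds are not listed here.

```python
LOW_CONFIDENCE_MIN_COVERAGE = 0.5  # the "at least 50% coverage" rule above


def query_coverage(query_start, query_end, query_length):
    """Fraction of the query covered by an inclusive, 1-based aligned span."""
    return (query_end - query_start + 1) / query_length


def passes_low_confidence_floor(coverage):
    """True if a hit has enough coverage to count as at least low confidence."""
    return coverage >= LOW_CONFIDENCE_MIN_COVERAGE


# The BLAST alignment on this page spans query positions 1 to 1051 of 1071.
coverage = query_coverage(1, 1051, 1071)
print(f"coverage: {coverage:.3f}")
print(f"passes floor: {passes_low_confidence_floor(coverage)}")
```

Passing this floor says nothing about whether a hit also qualifies for a higher tier. That depends on the identity and score criteria of the high- and medium-confidence rules.
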

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory