GapMind for Amino acid biosynthesis

 

Alignments for a candidate for carB in Clostridium kluyveri DSM 555

Align Carbamoyl-phosphate synthase pyrimidine-specific large chain; Carbamoyl-phosphate synthetase ammonia chain; EC 6.3.5.5 (characterized)
to candidate WP_012102719.1 CKL_RS11625 carbamoyl-phosphate synthase (glutamine-hydrolyzing) large subunit

Query= SwissProt::P25994
         (1071 letters)



>NCBI__GCF_000016505.1:WP_012102719.1
          Length = 1072

 Score = 1165 bits (3013), Expect = 0.0
 Identities = 568/1042 (54%), Positives = 780/1042 (74%), Gaps = 5/1042 (0%)

Query: 1    MPKRVDINKILVIGSGPIIIGQAAEFDYAGTQACLALKEEGYEVILVNSNPATIMTDTEM 60
            MP R ++ K+L+IGSGPIIIGQAAEFDY+GTQAC A+K+EG E +LVNSNPATIMTD  +
Sbjct: 1    MPLRENLKKVLIIGSGPIIIGQAAEFDYSGTQACEAIKKEGIETVLVNSNPATIMTDKNI 60

Query: 61   ADRVYIEPLTPEFLTRIIRKERPDAILPTLGGQTGLNLAVELSERGVLAECGVEVLGTKL 120
            A + Y+EPLT E L  II++ERPD +L   GGQT LNLA+EL + G+L +  VE+LG K 
Sbjct: 61   AHKTYVEPLTVESLEAIIKRERPDGVLAGFGGQTALNLAMELGKLGILKKYNVELLGIKT 120

Query: 121  SAIQQAEDRDLFRTLMNELNEPVPESEIIHSLEEAEKFVSQIGFPVIVRPAYTLGGTGGG 180
             +I+ AEDR+ F+ LM E+ EP+  S I   LE+ + F+ ++  P+I+RPAYTLGGTGGG
Sbjct: 121  ESIKNAEDRESFKNLMEEIEEPIALSTIATDLEQCKSFLDKVSLPIIIRPAYTLGGTGGG 180

Query: 181  ICSNETELKEIVENGLKLSPVHQCLLEKSIAGYKEIEYEVMRDSQDHAIVVCNMENIDPV 240
            I  N  E  EI +NGL+ SP++Q LLE+S+AG+KE+EYE+MRD +D+ +VVCNMEN+DPV
Sbjct: 181  IADNYEEYLEICKNGLEESPINQILLEQSLAGWKELEYEIMRDKKDNCMVVCNMENLDPV 240

Query: 241  GIHTGDSIVVAPSQTLSDREYQLLRNVSLKLIRALGIEGGCNVQLALDPDSFQYYIIEVN 300
            GIHTGDSIVVAPSQTL+DREYQ+LR  S+K+IR L IEGGCN+Q AL+P   +Y +IEVN
Sbjct: 241  GIHTGDSIVVAPSQTLTDREYQMLRRSSIKIIRKLKIEGGCNIQFALNPSGNEYMVIEVN 300

Query: 301  PRVSRSSALASKATGYPIAKLAAKIAVGLSLDEMMNPVTGKTYAAFEPALDYVVSKIPRW 360
            PRVSRSSALASKA GYPIAK+AAKIA+G +LDE+ N VTG + A FEPALDY V K+P+W
Sbjct: 301  PRVSRSSALASKAAGYPIAKIAAKIALGYTLDELKNYVTGNSSALFEPALDYCVVKMPKW 360

Query: 361  PFDKFESANRKLGTQMKATGEVMAIGRTLEESLLKAVRSLEADVYHLELKDAADISDELL 420
            PFDKF++ANR L TQMKATGEVMAI R+ E +LLKAV SLE  +  L+L    D++   +
Sbjct: 361  PFDKFKTANRTLKTQMKATGEVMAIDRSFESALLKAVISLEGKIVGLKLDKFEDMNLSQI 420

Query: 421  EKRIKKAGDERLFYLAEAYRRGYTVEDLHEFSAIDVFFLHKLFGIVQFEKELKANAGDTD 480
              ++KK  DERLF LAEA R+G +V++L+E + ID +F++ +  I+  E +L +N  + D
Sbjct: 421  IDKLKKEDDERLFALAEALRKGISVDELYEITKIDKWFIYGVKNIIDMENKLVSNVPNVD 480

Query: 481  VLRRAKELGFSDQYISREWKMKESELYSLRKQAGIAPVFKMVDTCAAEFESETPYFYSTY 540
            ++ +A+ +GF+D+YI     MK  +L  LR+  GI  V+KMVDTC+ EFE++T Y+YS Y
Sbjct: 481  IIHQAELMGFTDEYICNLMGMKLEDLKQLREVNGIRVVYKMVDTCSGEFEAKTSYYYSCY 540

Query: 541  EEENESVVTDKKSVMVLGSGPIRIGQGVEFDYATVHSVWAIKQAGYEAIIVNNNPETVST 600
            + EN++V++D K ++V+GSGPIRIGQG+EFDY  V+ VWAIK+AGYEAII+NNNPETVST
Sbjct: 541  DLENDNVISDNKKILVIGSGPIRIGQGIEFDYCCVNGVWAIKKAGYEAIIINNNPETVST 600

Query: 601  DFSISDKLYFEPLTIEDVMHIIDLEQPMGVVVQFGGQTAINLADELSARGVKILGTSLED 660
            DF ISDKLYF+PL I+DVM++I+ E+  GV+VQFGGQTA+NL+ +L+ RGV +LGTS E 
Sbjct: 601  DFDISDKLYFDPLYIDDVMNVINEEKVDGVIVQFGGQTALNLSKKLNDRGVNLLGTSFES 660

Query: 661  LDRAEDRDKFEQALGELGVPQPLGKTATSVNQAVSIASDIGYPVLVRPSYVLGGRAMEIV 720
            +D AEDR+KF   L +L +  P+G + TS+ +A  + S+IGYPV+VRPSYV+GGRAM++V
Sbjct: 661  IDLAEDREKFRILLKKLNINSPIGGSVTSLKEAYKLVSEIGYPVIVRPSYVIGGRAMKVV 720

Query: 721  YHEEELLHYMKNAVKINPQHPVLIDRYLTGKEIEVDAVSDGETVVIPGIMEHIERAGVHS 780
            Y+ EEL  Y+K AV ++ +HPVL+D+Y+ G+EIEVDA+SDG+ ++IPGIMEH+ER GVHS
Sbjct: 721  YNPEELERYLKEAVNLSKEHPVLVDKYILGREIEVDAISDGQDLIIPGIMEHVERTGVHS 780

Query: 781  GDSIAVYPPQSLTEDIKKKIEQYTIALAKGLNIVGLLNIQFVLSQGEVYVLEVNPRSSRT 840
            GDSIA+YP   L E + ++IE+YT+ +A+ LN+ GLLN+Q+     ++YV+EVNPR+SRT
Sbjct: 781  GDSIAIYPASDLPEKVCQRIEEYTVNIARELNVKGLLNVQYAFDGDKIYVIEVNPRASRT 840

Query: 841  VPFLSKITGIPMANLATKIILGQKLAAFGYTEGLQPEQQGVFVKAPVFSFAKLRRVDITL 900
            VP LSK+T +PM  +A +++LG+K+  F Y + +        VK PVFS  KL  VD+ L
Sbjct: 841  VPILSKVTDVPMVEIAVEVMLGKKIKEFNYKQDMYKYSNIFAVKMPVFSSKKLPGVDVAL 900

Query: 901  GPEMKSTGEVMGKDSTLEKALYKALIASGIQIPNYGSVLLTVADKDKEEGLAIAKRFHAI 960
            GPEMKSTGEV+G D   +KA+YKA  A+G++I   G++ + + D+DK   L + K+++++
Sbjct: 901  GPEMKSTGEVLGVDYDKDKAIYKAFKAAGVEIFKKGNLYVCINDRDKCSSLEVIKKYNSL 960

Query: 961  GYNILATEGTAGYLKEASIPAKVVGKIGQDGPNLLDVIRNGEAQFVINTLTKGKQPARDG 1020
             +NI+A+ GT  +LKE  I    +        + +  I+  +   VIN  T+G    R+G
Sbjct: 961  NFNIIASSGTFKFLKENGIKCSKLSI-----EDAISYIKEDKIDIVINIPTQGYDGTREG 1015

Query: 1021 FRIRRESVENGVACLTSLDTAE 1042
            F++R  ++ +     T +DTA+
Sbjct: 1016 FKLRHMALAHDKVVFTCIDTAD 1037


Lambda     K      H
   0.316    0.135    0.377 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 2874
Number of extensions: 112
Number of successful extensions: 12
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1071
Length of database: 1072
Length adjustment: 45
Effective length of query: 1026
Effective length of database: 1027
Effective search space:  1053702
Effective search space used:  1053702
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 58 (26.9 bits)
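
The summary figures in this report can be cross-checked by hand. The sketch below is illustrative and is not part of the BLAST output. It recomputes the bit score from the raw score (3013) with the gapped Lambda and K printed above, using the standard conversion S' = (lambda * S - ln K) / ln 2, and it recomputes the effective search space and the identity percentages. Lambda and K are rounded in the report, so the bit score is only reproduced approximately. Truncating the percentages rather than rounding them is an assumption inferred from the report itself (568/1042 is about 54.5% and is shown as 54%).

```python
import math

# Values copied from the BLAST report above.
raw_score = 3013
gapped_lambda, gapped_k = 0.267, 0.0410
identities, positives, gaps, aln_len = 568, 780, 5, 1042
query_len, db_len, length_adjustment = 1071, 1072, 45

# Bit score: S' = (lambda * S - ln K) / ln 2
bit_score = (gapped_lambda * raw_score - math.log(gapped_k)) / math.log(2)

# Effective lengths subtract the length adjustment from each side.
# The database holds a single sequence here, so one adjustment applies.
eff_query = query_len - length_adjustment
eff_db = db_len - length_adjustment
search_space = eff_query * eff_db

# Integer percentages, truncated (assumption: matches how the report prints them).
percentages = {
    "identities": 100 * identities // aln_len,
    "positives": 100 * positives // aln_len,
    "gaps": 100 * gaps // aln_len,
}

print(f"bit score ~ {bit_score:.1f}")
print(f"effective search space = {search_space}")
print(percentages)
```

The recomputed values agree with the reported 1165 bits, the effective search space of 1053702, and the 54% / 74% / 0% figures.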

Align candidate WP_012102719.1 CKL_RS11625 (carbamoyl-phosphate synthase (glutamine-hydrolyzing) large subunit)
to HMM TIGR01369 (carB: carbamoyl-phosphate synthase, large subunit (EC 6.3.5.5))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/TIGR01369.hmm
# target sequence database:        /tmp/gapView.30583.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01369  [M=1052]
Accession:   TIGR01369
Description: CPSaseII_lrg: carbamoyl-phosphate synthase, large subunit
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                 Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                 -----------
          0 1523.4   9.5          0 1523.3   9.5    1.0  1  lcl|NCBI__GCF_000016505.1:WP_012102719.1  CKL_RS11625 carbamoyl-phosphate 


Domain annotation for each sequence (and alignments):
>> lcl|NCBI__GCF_000016505.1:WP_012102719.1  CKL_RS11625 carbamoyl-phosphate synthase (glutamine-hydrolyzing) large subu
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 ! 1523.3   9.5         0         0       2    1051 ..       3    1042 ..       2    1043 .. 0.99

  Alignments for each domain:
  == domain 1  score: 1523.3 bits;  conditional E-value: 0
                                 TIGR01369    2 kredikkvlviGsGpivigqAaEFDYsGsqalkalkeegievvLvnsniAtvmtdeeladkvYiePl 68  
                                                 re++kkvl+iGsGpi+igqAaEFDYsG+qa+ a+k+egie+vLvnsn+At+mtd+++a+k+Y+ePl
  lcl|NCBI__GCF_000016505.1:WP_012102719.1    3 LRENLKKVLIIGSGPIIIGQAAEFDYSGTQACEAIKKEGIETVLVNSNPATIMTDKNIAHKTYVEPL 69  
                                                5899*************************************************************** PP

                                 TIGR01369   69 tveavekiiekErpDailltlGGqtaLnlaveleekGvLekygvkllGtkveaikkaedRekFkeal 135 
                                                tve++e ii++ErpD++l+++GGqtaLnla+el + G+L+ky+v+llG+k e+ik+aedRe Fk+++
  lcl|NCBI__GCF_000016505.1:WP_012102719.1   70 TVESLEAIIKRERPDGVLAGFGGQTALNLAMELGKLGILKKYNVELLGIKTESIKNAEDRESFKNLM 136 
                                                ******************************************************************* PP

                                 TIGR01369  136 keineevakseivesveealeaaeeigyPvivRaaftlgGtGsgiaeneeelkelvekalkaspikq 202 
                                                +ei+e++a s+i+++ e++ ++ +++ +P+i+R+a+tlgGtG+gia+n ee  e+++++l+ spi+q
  lcl|NCBI__GCF_000016505.1:WP_012102719.1  137 EEIEEPIALSTIATDLEQCKSFLDKVSLPIIIRPAYTLGGTGGGIADNYEEYLEICKNGLEESPINQ 203 
                                                ******************************************************************* PP

                                 TIGR01369  203 vlvekslagwkEiEyEvvRDskdnciivcniEnlDplGvHtGdsivvaPsqtLtdkeyqllRdaslk 269 
                                                +l+e+slagwkE+EyE++RD+kdnc++vcn+EnlDp+G+HtGdsivvaPsqtLtd+eyq+lR +s+k
  lcl|NCBI__GCF_000016505.1:WP_012102719.1  204 ILLEQSLAGWKELEYEIMRDKKDNCMVVCNMENLDPVGIHTGDSIVVAPSQTLTDREYQMLRRSSIK 270 
                                                ******************************************************************* PP

                                 TIGR01369  270 iirelgvegecnvqfaldPeskryvviEvnpRvsRssALAskAtGyPiAkvaaklavGysLdelknd 336 
                                                iir+l++eg+cn+qfal+P+ ++y+viEvnpRvsRssALAskA+GyPiAk+aak+a+Gy+Ldelkn 
  lcl|NCBI__GCF_000016505.1:WP_012102719.1  271 IIRKLKIEGGCNIQFALNPSGNEYMVIEVNPRVSRSSALASKAAGYPIAKIAAKIALGYTLDELKNY 337 
                                                ******************************************************************* PP

                                 TIGR01369  337 vtketvAsfEPslDYvvvkiPrwdldkfekvdrklgtqmksvGEvmaigrtfeealqkalrsleekl 403 
                                                vt++++A fEP+lDY+vvk+P+w++dkf++++r+l tqmk++GEvmai+r+fe+al+ka+ sle k+
  lcl|NCBI__GCF_000016505.1:WP_012102719.1  338 VTGNSSALFEPALDYCVVKMPKWPFDKFKTANRTLKTQMKATGEVMAIDRSFESALLKAVISLEGKI 404 
                                                ******************************************************************* PP

                                 TIGR01369  404 lglklkekeaesdeeleealkkpndrRlfaiaealrrgvsveevyeltkidrffleklkklveleke 470 
                                                +glkl++ e ++ +++ ++lkk +d+Rlfa+aealr+g+sv+e+ye+tkid++f++ +k+++++e++
  lcl|NCBI__GCF_000016505.1:WP_012102719.1  405 VGLKLDKFEDMNLSQIIDKLKKEDDERLFALAEALRKGISVDELYEITKIDKWFIYGVKNIIDMENK 471 
                                                ******************************************************************* PP

                                 TIGR01369  471 leeeklkelkkellkkakklGfsdeqiaklvkvseaevrklrkelgivpvvkrvDtvaaEfeaktpY 537 
                                                l ++     + + +++a  +Gf+de i +l++++ +++++lr+ +gi  v+k+vDt+++Efeakt+Y
  lcl|NCBI__GCF_000016505.1:WP_012102719.1  472 LVSNVP---NVDIIHQAELMGFTDEYICNLMGMKLEDLKQLREVNGIRVVYKMVDTCSGEFEAKTSY 535 
                                                *85444...589******************************************************* PP

                                 TIGR01369  538 lYstyeeekddvevtekkkvlvlGsGpiRigqgvEFDycavhavlalreagyktilinynPEtvstD 604 
                                                +Ys y  e +d+ ++++kk+lv+GsGpiRigqg+EFDyc+v++v+a+++agy++i+in+nPEtvstD
  lcl|NCBI__GCF_000016505.1:WP_012102719.1  536 YYSCYDLE-NDNVISDNKKILVIGSGPIRIGQGIEFDYCCVNGVWAIKKAGYEAIIINNNPETVSTD 601 
                                                ********.999999999************************************************* PP

                                 TIGR01369  605 ydiadrLyFeeltvedvldiiekekvegvivqlgGqtalnlakeleeagvkilGtsaesidraEdRe 671 
                                                +di+d+LyF++l+++dv+++i++ekv+gvivq+gGqtalnl+k+l+++gv++lGts+esid aEdRe
  lcl|NCBI__GCF_000016505.1:WP_012102719.1  602 FDISDKLYFDPLYIDDVMNVINEEKVDGVIVQFGGQTALNLSKKLNDRGVNLLGTSFESIDLAEDRE 668 
                                                ******************************************************************* PP

                                 TIGR01369  672 kFsklldelgikqpkgkeatsveeakeiakeigyPvlvRpsyvlgGrameiveneeeleryleeave 738 
                                                kF  ll++l+i+ p g ++ts++ea+++++eigyPv+vRpsyv+gGram++v+n eeleryl+eav+
  lcl|NCBI__GCF_000016505.1:WP_012102719.1  669 KFRILLKKLNINSPIGGSVTSLKEAYKLVSEIGYPVIVRPSYVIGGRAMKVVYNPEELERYLKEAVN 735 
                                                ******************************************************************* PP

                                 TIGR01369  739 vskekPvlidkyledavEvdvDavadgeevliagileHiEeaGvHsGDstlvlppqklseevkkkik 805 
                                                +ske+Pvl+dky+  + E++vDa++dg++++i+gi+eH+E++GvHsGDs++++p+++l e+v ++i+
  lcl|NCBI__GCF_000016505.1:WP_012102719.1  736 LSKEHPVLVDKYIL-GREIEVDAISDGQDLIIPGIMEHVERTGVHSGDSIAIYPASDLPEKVCQRIE 801 
                                                *************9.**************************************************** PP

                                 TIGR01369  806 eivkkiakelkvkGllniqfvvkdeevyviEvnvRasRtvPfvskalgvplvklavkvllgkkleel 872 
                                                e++ +ia+el+vkGlln+q++ +++++yviEvn+RasRtvP++sk+++vp+v++av+v+lgkk++e 
  lcl|NCBI__GCF_000016505.1:WP_012102719.1  802 EYTVNIARELNVKGLLNVQYAFDGDKIYVIEVNPRASRTVPILSKVTDVPMVEIAVEVMLGKKIKEF 868 
                                                ******************************************************************* PP

                                 TIGR01369  873 ekgvkkekksklvavkaavfsfsklagvdvvlgpemkstGEvmgigrdleeallkallaskakikkk 939 
                                                ++     k s+++avk++vfs +kl gvdv lgpemkstGEv g+++d+++a++ka++a++ +i kk
  lcl|NCBI__GCF_000016505.1:WP_012102719.1  869 NYKQDMYKYSNIFAVKMPVFSSKKLPGVDVALGPEMKSTGEVLGVDYDKDKAIYKAFKAAGVEIFKK 935 
                                                **9999999********************************************************** PP

                                 TIGR01369  940 gsvllsvkdkdkeellelakklaekglkvyategtakvleeagikaevvlkvseeaekilellkeee 1006
                                                g++++ ++d+dk + le++kk++++ ++++a++gt k+l+e+gik++++       e++++ +ke++
  lcl|NCBI__GCF_000016505.1:WP_012102719.1  936 GNLYVCINDRDKCSSLEVIKKYNSLNFNIIASSGTFKFLKENGIKCSKLS-----IEDAISYIKEDK 997 
                                                ***********************************************887.....4567889***** PP

                                 TIGR01369 1007 ielvinltskkkkaaekgykirreaveykvplvteletaeallea 1051
                                                i++vin+++++ + +++g+k+r+ a+ ++  ++t+++ta +++ a
  lcl|NCBI__GCF_000016505.1:WP_012102719.1  998 IDIVINIPTQGYDGTREGFKLRHMALAHDKVVFTCIDTADVYADA 1042
                                                **************************************9998876 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (1052 nodes)
Target sequences:                          1  (1072 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.06u 0.03s 00:00:00.09 Elapsed: 00:00:00.08
# Mc/sec: 13.04
//
[ok]
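
As a quick sanity check on the HMM hit, the fraction of the model and of the candidate protein covered by the single domain can be computed from the domain table row above. This is a minimal sketch, not GapMind's own code: the row is pasted in as a string, the field positions assume the whitespace-separated layout of the hmmsearch domain table shown here, and the variable names are hypothetical.

```python
# Domain table row, model length (M=1052) and target length (1072 residues),
# all copied from the hmmsearch output above.
domain_row = (
    "   1 ! 1523.3   9.5         0         0       2    1051 .."
    "       3    1042 ..       2    1043 .. 0.99"
)
model_len = 1052
target_len = 1072

fields = domain_row.split()
# Layout: # ! score bias c-Evalue i-Evalue hmmfrom hmmto .. alifrom alito .. ...
hmm_from, hmm_to = int(fields[6]), int(fields[7])
ali_from, ali_to = int(fields[9]), int(fields[10])

# Coordinates are 1-based and inclusive, hence the +1.
model_coverage = (hmm_to - hmm_from + 1) / model_len
target_coverage = (ali_to - ali_from + 1) / target_len

print(f"model coverage:  {model_coverage:.3f}")
print(f"target coverage: {target_coverage:.3f}")
```

Here the alignment spans 1050 of the 1052 model positions and 1040 of the 1072 residues in the candidate, so the hit covers nearly the full length of both.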

This GapMind analysis is from Apr 10 2024. The underlying query database was built on Apr 09 2024.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory