GapMind for Amino acid biosynthesis

 

Alignments for a candidate for carB in Desulfallas geothermicus DSM 3669

Align Carbamoyl-phosphate synthase pyrimidine-specific large chain; Carbamoyl-phosphate synthetase ammonia chain; EC 6.3.5.5 (characterized)
to candidate WP_092483169.1 BM299_RS08945 carbamoyl-phosphate synthase large subunit

Query= SwissProt::P25994
         (1071 letters)



>NCBI__GCF_900115975.1:WP_092483169.1
          Length = 1070

 Score = 1352 bits (3499), Expect = 0.0
 Identities = 663/1051 (63%), Positives = 839/1051 (79%), Gaps = 3/1051 (0%)

Query: 1    MPKRVDINKILVIGSGPIIIGQAAEFDYAGTQACLALKEEGYEVILVNSNPATIMTDTEM 60
            MP +  + K+LV+GSGPIIIGQAAEFDYAGTQAC AL+EEG EV+L+NSNPATIMTD  M
Sbjct: 1    MPIKKGLQKVLVVGSGPIIIGQAAEFDYAGTQACRALREEGLEVVLINSNPATIMTDANM 60

Query: 61   ADRVYIEPLTPEFLTRIIRKERPDAILPTLGGQTGLNLAVELSERGVLAECGVEVLGTKL 120
            ADR+YIEP+TPEF+TR+I KE+PD  LP+LGGQ GLN+A++LSE GVL + GV++LGT L
Sbjct: 61   ADRIYIEPITPEFVTRVIAKEKPDGFLPSLGGQVGLNMALQLSEMGVLEQYGVQLLGTPL 120

Query: 121  SAIQQAEDRDLFRTLMNELNEPVPESEIIHSLEEAEKFVSQIGFPVIVRPAYTLGGTGGG 180
             AI++AEDR+ F+  M  +NEPVPES I+ S++EA +F  QIGFP++VRPAYTLGGTGGG
Sbjct: 121  DAIKRAEDREQFKDTMERINEPVPESTIVSSVDEAVEFAKQIGFPLVVRPAYTLGGTGGG 180

Query: 181  ICSNETELKEIVENGLKLSPVHQCLLEKSIAGYKEIEYEVMRDSQDHAIVVCNMENIDPV 240
            +  N  EL +    GLK S +HQ L+E+S+ G+KEIE+EVMRD  D+ I +C+MEN+DP+
Sbjct: 181  MVYNMNELIDTCTRGLKASIIHQALIERSVVGWKEIEFEVMRDGADNCITICSMENLDPM 240

Query: 241  GIHTGDSIVVAPSQTLSDREYQLLRNVSLKLIRALGIEGGCNVQLALDPDSFQYYIIEVN 300
            GIHTGDSIVVAP+QTLSDREYQ+LR+ +LK+IRALG+EGGCN+Q ALDP+S+QYY+IEVN
Sbjct: 241  GIHTGDSIVVAPTQTLSDREYQMLRSAALKIIRALGVEGGCNIQYALDPNSYQYYVIEVN 300

Query: 301  PRVSRSSALASKATGYPIAKLAAKIAVGLSLDEMMNPVTGKTYAAFEPALDYVVSKIPRW 360
            PRVSRSSALASKATGYPIAK+A+KIA+GL+LDE+ N VTGKTYA FEP++DY V K PRW
Sbjct: 301  PRVSRSSALASKATGYPIAKVASKIAIGLNLDEIKNAVTGKTYACFEPSIDYTVIKFPRW 360

Query: 361  PFDKFESANRKLGTQMKATGEVMAIGRTLEESLLKAVRSLEADVYHLELKDAADISDELL 420
            PFDKF +A+R LGTQMKATGEVMAI RTLE +LLKAVRSLE  V  L       ++DE +
Sbjct: 361  PFDKFAAADRTLGTQMKATGEVMAIDRTLEGALLKAVRSLEIGVPGLIYPGLERLTDEEI 420

Query: 421  EKRIKKAGDERLFYLAEAYRRGYTVEDLHEFSAIDVFFLHKLFGIVQFEKELKANAG--- 477
            E+++ +A DERLF LAEA RRG   E +H  + +D FF+HK++ +VQ E  L+   G   
Sbjct: 421  EQKLARANDERLFVLAEALRRGMLFERIHNLTKMDYFFIHKIYNVVQLEDRLRREGGAAL 480

Query: 478  DTDVLRRAKELGFSDQYISREWKMKESELYSLRKQAGIAPVFKMVDTCAAEFESETPYFY 537
               +LR AK++G +D Y+++   +   E+ +LRK  G+ PV+KMVDTCAAEFE+ TPY+Y
Sbjct: 481  TAGLLREAKQMGMADAYLAQAAGVTVQEVRTLRKAHGVEPVYKMVDTCAAEFEAVTPYYY 540

Query: 538  STYEEENESVVTDKKSVMVLGSGPIRIGQGVEFDYATVHSVWAIKQAGYEAIIVNNNPET 597
            S Y+ E+E+  T  + V+VLG GPIRIGQG+EFDY +VHS WA+K+ G EAII+NNNPET
Sbjct: 541  SCYDSEDEAEPTGNRKVVVLGGGPIRIGQGIEFDYCSVHSTWALKELGIEAIIINNNPET 600

Query: 598  VSTDFSISDKLYFEPLTIEDVMHIIDLEQPMGVVVQFGGQTAINLADELSARGVKILGTS 657
            VSTDF  +D+LYFEPL  EDV++I++ E+P GV+VQFGGQT INLA  L   G+KILGTS
Sbjct: 601  VSTDFDTADRLYFEPLLPEDVLNILEKEKPEGVIVQFGGQTPINLAGHLDRAGIKILGTS 660

Query: 658  LEDLDRAEDRDKFEQALGELGVPQPLGKTATSVNQAVSIASDIGYPVLVRPSYVLGGRAM 717
            ++D+DRAEDR +F+  L +L +P+P G TATSV +A +I+ +IG+PVLVRPSYVLGGRAM
Sbjct: 661  MDDIDRAEDRKRFDAMLNDLDIPRPPGGTATSVAEAEAISRNIGFPVLVRPSYVLGGRAM 720

Query: 718  EIVYHEEELLHYMKNAVKINPQHPVLIDRYLTGKEIEVDAVSDGETVVIPGIMEHIERAG 777
            EIVY+ E+L +YM+NAVK+ P+HPVL+D+Y  G+EIEVDA++DGETV+IPGIM+H+ERAG
Sbjct: 721  EIVYNIEDLRNYMENAVKVTPEHPVLVDKYFLGEEIEVDAIADGETVLIPGIMKHVERAG 780

Query: 778  VHSGDSIAVYPPQSLTEDIKKKIEQYTIALAKGLNIVGLLNIQFVLSQGEVYVLEVNPRS 837
            VHSGDSIAVYP   L   ++++I  YT  LA  LN+ G++NIQ+VL  G++YVLEVNPR+
Sbjct: 781  VHSGDSIAVYPANHLDRTVREQIVDYTTRLALELNVRGMINIQYVLYNGQIYVLEVNPRA 840

Query: 838  SRTVPFLSKITGIPMANLATKIILGQKLAAFGYTEGLQPEQQGVFVKAPVFSFAKLRRVD 897
            SRTVP++SKITGIPM NLATKII+GQKL+  GY  GL PE + V VKAPVFSFAKL +VD
Sbjct: 841  SRTVPYMSKITGIPMINLATKIIMGQKLSDMGYRGGLYPETKLVGVKAPVFSFAKLLQVD 900

Query: 898  ITLGPEMKSTGEVMGKDSTLEKALYKALIASGIQIPNYGSVLLTVADKDKEEGLAIAKRF 957
            I+LGPEMKSTGEV+G D++   ALYKAL+ASG+  P  G++L+TVADKDKEE L I K F
Sbjct: 901  ISLGPEMKSTGEVLGVDASYPVALYKALLASGVVFPRQGNILVTVADKDKEEALPIVKGF 960

Query: 958  HAIGYNILATEGTAGYLKEASIPAKVVGKIGQDGPNLLDVIRNGEAQFVINTLTKGKQPA 1017
              +GYNI AT GTA YL+E  +    V K+ +  P++ D++R G+   VINTLT+GK P 
Sbjct: 961  AGLGYNIFATAGTARYLEEHGVAVTRVNKVREGSPHIDDLLRKGDIHLVINTLTRGKAPE 1020

Query: 1018 RDGFRIRRESVENGVACLTSLDTAEAILRVL 1048
            RDGF IRR +VE  V CLTSLDTA AIL VL
Sbjct: 1021 RDGFVIRRATVELAVPCLTSLDTARAILEVL 1051


Lambda     K      H
   0.316    0.135    0.377 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 2989
Number of extensions: 114
Number of successful extensions: 12
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1071
Length of database: 1070
Length adjustment: 45
Effective length of query: 1026
Effective length of database: 1025
Effective search space:  1051650
Effective search space used:  1051650
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.6 bits)
S2: 58 (26.9 bits)
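
The summary figures in the BLAST report above can be cross-checked with a little arithmetic. The following is a minimal sketch, not part of GapMind: it assumes the standard Karlin-Altschul normalization for the bit score, S' = (lambda * S - ln K) / ln 2, applied with the "Gapped" Lambda and K values, and it assumes the reported percentages are truncated rather than rounded. All function and variable names are illustrative.

```python
import math

# Values copied from the BLAST report above.
raw_score = 3499                 # raw score shown in parentheses after the bit score
gapped_lambda = 0.267            # "Gapped" Lambda
gapped_k = 0.0410                # "Gapped" K
query_len, db_len = 1071, 1070   # "Length of query", "Length of database"
length_adjustment = 45           # "Length adjustment"
identities, positives, aligned_cols = 663, 839, 1051
query_start, query_end = 1, 1048  # first and last aligned query positions


def bit_score(raw, lam, k):
    """Normalize a raw score to bits (assumed Karlin-Altschul formula)."""
    return (lam * raw - math.log(k)) / math.log(2)


def effective_search_space(qlen, dlen, adjust):
    """Shorten both lengths by the length adjustment, then multiply.

    Valid as written for a database holding a single sequence, as here.
    """
    return (qlen - adjust) * (dlen - adjust)


bits = bit_score(raw_score, gapped_lambda, gapped_k)
space = effective_search_space(query_len, db_len, length_adjustment)
pct_identity = int(100 * identities / aligned_cols)   # truncated, as assumed above
pct_positive = int(100 * positives / aligned_cols)
query_coverage = (query_end - query_start + 1) / query_len

print(f"bit score      ~ {bits:.1f}")
print(f"search space   = {space}")
print(f"identity       = {pct_identity}%")
print(f"positives      = {pct_positive}%")
print(f"query coverage = {query_coverage:.1%}")
```

Because Lambda and K are printed to only a few significant digits, the recomputed bit score agrees with the reported 1352 bits only to within rounding.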

Align candidate WP_092483169.1 BM299_RS08945 (carbamoyl-phosphate synthase large subunit)
to HMM TIGR01369 (carB: carbamoyl-phosphate synthase, large subunit (EC 6.3.5.5))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/TIGR01369.hmm
# target sequence database:        /tmp/gapView.19556.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01369  [M=1052]
Accession:   TIGR01369
Description: CPSaseII_lrg: carbamoyl-phosphate synthase, large subunit
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                 Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                 -----------
          0 1562.1   0.0          0 1561.9   0.0    1.0  1  lcl|NCBI__GCF_900115975.1:WP_092483169.1  BM299_RS08945 carbamoyl-phosphat


Domain annotation for each sequence (and alignments):
>> lcl|NCBI__GCF_900115975.1:WP_092483169.1  BM299_RS08945 carbamoyl-phosphate synthase large subunit
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 ! 1561.9   0.0         0         0       3    1051 ..       4    1050 ..       2    1051 .. 1.00

  Alignments for each domain:
  == domain 1  score: 1561.9 bits;  conditional E-value: 0
                                 TIGR01369    3 redikkvlviGsGpivigqAaEFDYsGsqalkalkeegievvLvnsniAtvmtdeeladkvYiePlt 69  
                                                ++ ++kvlv+GsGpi+igqAaEFDY+G+qa++al+eeg+evvL+nsn+At+mtd ++ad++YieP+t
  lcl|NCBI__GCF_900115975.1:WP_092483169.1    4 KKGLQKVLVVGSGPIIIGQAAEFDYAGTQACRALREEGLEVVLINSNPATIMTDANMADRIYIEPIT 70  
                                                67899************************************************************** PP

                                 TIGR01369   70 veavekiiekErpDailltlGGqtaLnlaveleekGvLekygvkllGtkveaikkaedRekFkealk 136 
                                                +e+v+++i kE+pD+ l++lGGq +Ln+a++l+e+GvLe+ygv+llGt+++aik+aedRe+Fk++++
  lcl|NCBI__GCF_900115975.1:WP_092483169.1   71 PEFVTRVIAKEKPDGFLPSLGGQVGLNMALQLSEMGVLEQYGVQLLGTPLDAIKRAEDREQFKDTME 137 
                                                ******************************************************************* PP

                                 TIGR01369  137 eineevakseivesveealeaaeeigyPvivRaaftlgGtGsgiaeneeelkelvekalkaspikqv 203 
                                                 ine+v++s+iv+sv+ea+e+a++ig+P++vR+a+tlgGtG+g+++n++el  ++ ++lkas i+q 
  lcl|NCBI__GCF_900115975.1:WP_092483169.1  138 RINEPVPESTIVSSVDEAVEFAKQIGFPLVVRPAYTLGGTGGGMVYNMNELIDTCTRGLKASIIHQA 204 
                                                ******************************************************************* PP

                                 TIGR01369  204 lvekslagwkEiEyEvvRDskdnciivcniEnlDplGvHtGdsivvaPsqtLtdkeyqllRdaslki 270 
                                                l+e+s+ gwkEiE+Ev+RD +dnci++c++EnlDp+G+HtGdsivvaP+qtL+d+eyq+lR+a+lki
  lcl|NCBI__GCF_900115975.1:WP_092483169.1  205 LIERSVVGWKEIEFEVMRDGADNCITICSMENLDPMGIHTGDSIVVAPTQTLSDREYQMLRSAALKI 271 
                                                ******************************************************************* PP

                                 TIGR01369  271 irelgvegecnvqfaldPeskryvviEvnpRvsRssALAskAtGyPiAkvaaklavGysLdelkndv 337 
                                                ir+lgveg+cn+q+aldP+s++y+viEvnpRvsRssALAskAtGyPiAkva+k+a+G++Lde+kn v
  lcl|NCBI__GCF_900115975.1:WP_092483169.1  272 IRALGVEGGCNIQYALDPNSYQYYVIEVNPRVSRSSALASKATGYPIAKVASKIAIGLNLDEIKNAV 338 
                                                ******************************************************************* PP

                                 TIGR01369  338 tketvAsfEPslDYvvvkiPrwdldkfekvdrklgtqmksvGEvmaigrtfeealqkalrsleekll 404 
                                                t++t+A+fEPs+DY v+k+Prw++dkf+ +dr+lgtqmk++GEvmai+rt+e al+ka+rsle+++ 
  lcl|NCBI__GCF_900115975.1:WP_092483169.1  339 TGKTYACFEPSIDYTVIKFPRWPFDKFAAADRTLGTQMKATGEVMAIDRTLEGALLKAVRSLEIGVP 405 
                                                ******************************************************************* PP

                                 TIGR01369  405 glklkekeaesdeeleealkkpndrRlfaiaealrrgvsveevyeltkidrffleklkklvelekel 471 
                                                gl  +  e  +dee+e++l ++nd+Rlf++aealrrg+  e++++ltk+d ff++k+ ++v+le +l
  lcl|NCBI__GCF_900115975.1:WP_092483169.1  406 GLIYPGLERLTDEEIEQKLARANDERLFVLAEALRRGMLFERIHNLTKMDYFFIHKIYNVVQLEDRL 472 
                                                ******************************************************************* PP

                                 TIGR01369  472 eeeklkelkkellkkakklGfsdeqiaklvkvseaevrklrkelgivpvvkrvDtvaaEfeaktpYl 538 
                                                ++e    l++ ll++ak++G++d+ +a++++v+ +evr+lrk+ g+ pv+k+vDt+aaEfea tpY+
  lcl|NCBI__GCF_900115975.1:WP_092483169.1  473 RREGGAALTAGLLREAKQMGMADAYLAQAAGVTVQEVRTLRKAHGVEPVYKMVDTCAAEFEAVTPYY 539 
                                                ******************************************************************* PP

                                 TIGR01369  539 YstyeeekddvevtekkkvlvlGsGpiRigqgvEFDycavhavlalreagyktilinynPEtvstDy 605 
                                                Ys y +e d++e t ++kv+vlG+GpiRigqg+EFDyc+vh+++al+e g+++i+in+nPEtvstD+
  lcl|NCBI__GCF_900115975.1:WP_092483169.1  540 YSCYDSE-DEAEPTGNRKVVVLGGGPIRIGQGIEFDYCSVHSTWALKELGIEAIIINNNPETVSTDF 605 
                                                *******.999999999************************************************** PP

                                 TIGR01369  606 diadrLyFeeltvedvldiiekekvegvivqlgGqtalnlakeleeagvkilGtsaesidraEdRek 672 
                                                d+adrLyFe+l  edvl+i+ekek+egvivq+gGqt++nla +l++ag+kilGts+++idraEdR++
  lcl|NCBI__GCF_900115975.1:WP_092483169.1  606 DTADRLYFEPLLPEDVLNILEKEKPEGVIVQFGGQTPINLAGHLDRAGIKILGTSMDDIDRAEDRKR 672 
                                                ******************************************************************* PP

                                 TIGR01369  673 FsklldelgikqpkgkeatsveeakeiakeigyPvlvRpsyvlgGrameiveneeeleryleeavev 739 
                                                F+++l++l+i++p g +atsv ea+ i ++ig+PvlvRpsyvlgGrameiv+n e+l++y+e+av+v
  lcl|NCBI__GCF_900115975.1:WP_092483169.1  673 FDAMLNDLDIPRPPGGTATSVAEAEAISRNIGFPVLVRPSYVLGGRAMEIVYNIEDLRNYMENAVKV 739 
                                                ******************************************************************* PP

                                 TIGR01369  740 skekPvlidkyledavEvdvDavadgeevliagileHiEeaGvHsGDstlvlppqklseevkkkike 806 
                                                ++e+Pvl+dky+  + E++vDa+adge+vli+gi++H+E+aGvHsGDs++v+p+++l++ v ++i +
  lcl|NCBI__GCF_900115975.1:WP_092483169.1  740 TPEHPVLVDKYFL-GEEIEVDAIADGETVLIPGIMKHVERAGVHSGDSIAVYPANHLDRTVREQIVD 805 
                                                ************9.***************************************************** PP

                                 TIGR01369  807 ivkkiakelkvkGllniqfvvkdeevyviEvnvRasRtvPfvskalgvplvklavkvllgkkleele 873 
                                                +++++a el+v+G++niq+v+ ++++yv+Evn+RasRtvP++sk++g+p+++la+k+++g+kl++++
  lcl|NCBI__GCF_900115975.1:WP_092483169.1  806 YTTRLALELNVRGMINIQYVLYNGQIYVLEVNPRASRTVPYMSKITGIPMINLATKIIMGQKLSDMG 872 
                                                ******************************************************************* PP

                                 TIGR01369  874 kgvkkekksklvavkaavfsfsklagvdvvlgpemkstGEvmgigrdleeallkallaskakikkkg 940 
                                                +     +++klv+vka+vfsf+kl +vd+ lgpemkstGEv g++ +   al+kallas+ +++++g
  lcl|NCBI__GCF_900115975.1:WP_092483169.1  873 YRGGLYPETKLVGVKAPVFSFAKLLQVDISLGPEMKSTGEVLGVDASYPVALYKALLASGVVFPRQG 939 
                                                *999*************************************************************** PP

                                 TIGR01369  941 svllsvkdkdkeellelakklaekglkvyategtakvleeagikaevvlkvseeaekilellkeeei 1007
                                                ++l++v+dkdkee+l+++k +a +g++++at+gta++lee+g+ ++ v+kv+e +++i +ll++++i
  lcl|NCBI__GCF_900115975.1:WP_092483169.1  940 NILVTVADKDKEEALPIVKGFAGLGYNIFATAGTARYLEEHGVAVTRVNKVREGSPHIDDLLRKGDI 1006
                                                ******************************************************************* PP

                                 TIGR01369 1008 elvinltskkkkaaekgykirreaveykvplvteletaeallea 1051
                                                +lvin+ +++k+ +++g++irr++ve +vp++t+l+ta+a+le+
  lcl|NCBI__GCF_900115975.1:WP_092483169.1 1007 HLVINTLTRGKAPERDGFVIRRATVELAVPCLTSLDTARAILEV 1050
                                                ****************************************9987 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (1052 nodes)
Target sequences:                          1  (1070 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.07u 0.03s 00:00:00.10 Elapsed: 00:00:00.09
# Mc/sec: 12.11
//
[ok]
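
The domain table in the hmmsearch output above gives the coordinates needed to work out how much of the model and of the protein the hit covers. The sketch below is illustrative only: it splits the single domain row on whitespace and assumes the column order shown in the table header (`hmmfrom`, `hmm to`, `alifrom`, `ali to`). The model length comes from `[M=1052]` and the protein length from "1070 residues searched". The helper name `span_fraction` is hypothetical.

```python
# Domain row copied from the hmmsearch domain table above.
row = ("   1 ! 1561.9   0.0         0         0       3    1051 ..       "
       "4    1050 ..       2    1051 .. 1.00")

model_len = 1052    # from "Query: TIGR01369 [M=1052]"
protein_len = 1070  # from "Target sequences: 1 (1070 residues searched)"

fields = row.split()
# Whitespace-split layout for this row:
# 0:#  1:!  2:score  3:bias  4:c-Evalue  5:i-Evalue
# 6:hmmfrom  7:hmm to  8:..  9:alifrom  10:ali to  11:..  12:envfrom  13:env to  14:..  15:acc
score = float(fields[2])
hmm_from, hmm_to = int(fields[6]), int(fields[7])
ali_from, ali_to = int(fields[9]), int(fields[10])


def span_fraction(start, end, total):
    """Fraction of `total` positions covered by the inclusive range start..end."""
    return (end - start + 1) / total


model_coverage = span_fraction(hmm_from, hmm_to, model_len)
protein_coverage = span_fraction(ali_from, ali_to, protein_len)

print(f"domain score     = {score}")
print(f"model coverage   = {model_coverage:.1%}")
print(f"protein coverage = {protein_coverage:.1%}")
```

Here the alignment spans 1049 of the 1052 model positions, so the hit covers nearly the whole model.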

This GapMind analysis is from Apr 10 2024. The underlying query database was built on Apr 09 2024.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
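
The definition of a gap given above (a step with no high- or medium-confidence candidates) can be sketched in a few lines. This is an illustration of that rule only, not GapMind's implementation. The input format, the function name `find_gaps`, and the step names other than `carB` are hypothetical, and the confidence labels are taken as already assigned.

```python
def find_gaps(candidates_by_step):
    """Return the steps that have no high- or medium-confidence candidate.

    `candidates_by_step` maps a step name to the list of confidence labels
    ("high", "medium" or "low") of its candidates; an empty list means no
    candidate was found at all.
    """
    return sorted(
        step
        for step, levels in candidates_by_step.items()
        if not any(level in ("high", "medium") for level in levels)
    )


# Hypothetical input: only carB is taken from the report above.
example = {
    "carB": ["high"],
    "stepX": ["medium", "low"],
    "stepY": ["low"],   # only a low-confidence candidate: counts as a gap
    "stepZ": [],        # no candidate at all: counts as a gap
}

print(find_gaps(example))
```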

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory