GapMind for Amino acid biosynthesis

 

Alignments for a candidate for carB in Acidimicrobium ferrooxidans DSM 10331

Align carbamoyl-phosphate synthase (glutamine-hydrolysing) (EC 6.3.5.5) (characterized)
to candidate WP_015799094.1 AFER_RS08800 carbamoyl-phosphate synthase large subunit

Query= BRENDA::P00968
         (1073 letters)



>NCBI__GCF_000023265.1:WP_015799094.1
          Length = 1100

 Score = 1048 bits (2709), Expect = 0.0
 Identities = 573/1090 (52%), Positives = 740/1090 (67%), Gaps = 34/1090 (3%)

Query: 1    MPKRTDIKSILILGAGPIVIGQACEFDYSGAQACKALREEGYRVILVNSNPATIMTDPEM 60
            MP+   ++S+L++G+GPIVIGQA EFDYSG QAC+ LREEG RVIL NSNPATIMTDPE 
Sbjct: 1    MPRDPKVESVLVIGSGPIVIGQASEFDYSGVQACRVLREEGLRVILANSNPATIMTDPEF 60

Query: 61   ADATYIEPIHWEVVRKIIEKERPDAVLPTMGGQTALNCALELERQGVLEEFGVTMIGATA 120
            ADATYIEP+  EV+ +IIE ERPDAVLPT+GGQTALN A+EL+  GVLE  GV M+GA  
Sbjct: 61   ADATYIEPLTLEVLERIIEAERPDAVLPTLGGQTALNLAMELDASGVLERSGVRMLGARP 120

Query: 121  DAIDKAEDRRRFDVAMKKIGLETA-RSGIAHTMEEALAVAADVGFPCIIRPSFTMGGSGG 179
             +I+ AE+R  F   +  I  + A R  +  ++EE   VA ++G+P ++RPS+ +GG+G 
Sbjct: 121  ASIELAENRDAFRQLLMSIDEQLAVRGRLVRSLEEGRDVADELGYPLMLRPSYILGGAGT 180

Query: 180  GIAYNREEFEEICARGLDLSPTKELLIDESLIGWKEYEMEVVRDKNDNCIIVCSIENFDA 239
            GIA +   FE +   GL  SP  E+L++ES+ GWKE+E+EV+RD NDNC++VCSIEN D 
Sbjct: 181  GIATDPSSFEAMLRAGLIASPVGEVLVEESIAGWKEFELEVMRDANDNCVVVCSIENVDP 240

Query: 240  MGIHTGDSITVAPAQTLTDKEYQIMRNASMAVLREIGVETGGSNVQFAVNPKNGRLIVIE 299
            MG+HTGDSITVAPAQTLTD EYQ MR+ S  +LR +GVETGGSNVQFAV P +GR++V+E
Sbjct: 241  MGVHTGDSITVAPAQTLTDLEYQRMRSLSFEILRRVGVETGGSNVQFAVEPTSGRMVVVE 300

Query: 300  MNPRVSRSSALASKATGFPIAKVAAKLAVGYTLDELMNDITGGRTPASFEPSIDYVVTKI 359
            MNPRVSRSSALASKATGFPIAK+A +LA+GYTLDE+MNDIT G TPASFEP++DYVV K+
Sbjct: 301  MNPRVSRSSALASKATGFPIAKIATRLAIGYTLDEIMNDIT-GVTPASFEPALDYVVVKV 359

Query: 360  PRFNFEKFAGANDRLTTQMKSVGEVMAIGRTQQESLQKALRGLEVGATGF--DPKVSLDD 417
            PR+ FEKF GA   L T+M+SVGE MAIGR+  E+LQKALRG+E    GF  DP      
Sbjct: 360  PRWVFEKFEGAEGILGTRMQSVGETMAIGRSFAEALQKALRGIERSRGGFGADPAEVTWQ 419

Query: 418  PEALTKIRRELKDAGADRIWYIADAFRAGLSVDGVFNLTNIDRWFLVQIEELVRLEEKVA 477
              +   +   +      R++ + +A R G S++ V  L+ ID WF+ ++  +V     + 
Sbjct: 420  AYSDDALATLVAVPTEQRVFAVGEALRRGWSIERVAELSRIDPWFIGEMAGIVARAADIR 479

Query: 478  EVGITGLNADFLRQLKRKGFADARLAKLAGVREAEIRKLRDQYDLHPVYKRVDTCAAEFA 537
               +  L AD L  LKR GF+D +LA L GV E  +R+ R    +  VYK VDTCA EF 
Sbjct: 480  GRDLASLGADELLDLKRWGFSDLQLAWLLGVDETAVREHRHTVGVRAVYKAVDTCAGEFP 539

Query: 538  TDTAYMYSTYEEECEANPSTDREKIMVLGGGPNRIGQGIEFDYCCVHASLALREDGYETI 597
              T Y Y TYEEE E    ++R  ++++G GPNRIGQGIEFDYCCVHA+ ALRE G + I
Sbjct: 540  ARTPYYYGTYEEESE-TVGSNRPSVIIIGAGPNRIGQGIEFDYCCVHAAFALREAGVDAI 598

Query: 598  MVNCNPETVSTDYDTSDRLYFEPVTLEDVLEIV----RIEKPKGVIVQYGGQTPLKLARA 653
            MVN NPETVSTDYDTS RLY EP+  E VL+++    R+   +GVIV  GGQTPLKLAR 
Sbjct: 599  MVNSNPETVSTDYDTSSRLYVEPLVTEHVLDVIAEEQRLGSLQGVIVSLGGQTPLKLARD 658

Query: 654  LEAAGVPVIGTSPDAIDRAEDRERFQHAVERLKLKQPANATVTAIEMAVEKAKEIGYPLV 713
            ++ +   V+GTSPD+ID AEDR R+    ERL ++QP   TVT++  A      IG P++
Sbjct: 659  IDPS--LVLGTSPDSIDVAEDRRRWSALCERLGIRQPPGGTVTSLAEAEAVVAAIGLPVL 716

Query: 714  VRPSYVLGGRAMEIVYDEADLRRYFQTAV------SVSNDAPVLLDHFLDDAVEVDVDAI 767
            VRPSYVLGGRAMEIVY E +LR  F   V      ++S D P+L+D FL+ A+EVDVDA+
Sbjct: 717  VRPSYVLGGRAMEIVYSEDELRSAFSRLVDLAAEGAISQDRPILIDRFLEGAIEVDVDAV 776

Query: 768  CDGE-MVLIGGIMEHIEQAGVHSGDSACSLPAYTLSQEIQDVMRQQVQKLAFELQVRGLM 826
             D E    IG +MEH+E+AGVHSGDSAC++P  +L+  +   +  Q + +A  L V GL+
Sbjct: 777  RDREGACWIGAVMEHVEEAGVHSGDSACTIPPVSLAPALVAEIEAQTRAIADALDVVGLI 836

Query: 827  NVQFAVKNNEVYLIEVNPRAARTVPFVSKATGVPLAKVAARVMAGKSLAE---QGV-TKE 882
            NVQFAV +  V++IE NPRA+RTVPFV+KATGV L K+A R+M G +L +   +G+    
Sbjct: 837  NVQFAVADGTVFVIEANPRASRTVPFVAKATGVALVKIATRLMLGSTLRDLEREGLFVPR 896

Query: 883  VIPPYYSVKEVVLPFNKFPGVDPLLGPEMRSTGEVMGVGRTFAEAFAKAQLGSNSTMKKH 942
             +  Y +VKE VLPF +F G D +LGPEMRSTGEVMG+ RT   AFAKAQL + + + + 
Sbjct: 897  RVTSYVAVKEAVLPFGRFRGADSILGPEMRSTGEVMGIDRTMPMAFAKAQLAAGTRLPQQ 956

Query: 943  GRALLSVREGDKERVVDLAAKLLKQGFELDATHGTAIVLGEAGIN-PRLVNKV------- 994
            G  L++V + DKE +V +A +L+  GF+L AT GTA  L   G+   RLV KV       
Sbjct: 957  GTVLVTVADRDKEALVPIAERLVAHGFDLAATEGTASWLQAHGVPVARLVAKVTEEEDLG 1016

Query: 995  ----HEGRPHIQDRIKNGEYTYIINTTSGRRAIEDSRVIRRSALQYKVHYDTTLNGGFAT 1050
                 EG       +++GE   IINT  G     D   IR +AL+ K+   TTL    A 
Sbjct: 1017 ERAAPEGFVDAVRLVRSGEVALIINTPRGSGPRRDGYRIRTAALEAKIPLITTLEAARAA 1076

Query: 1051 AMALNADATE 1060
              A+ A A +
Sbjct: 1077 VAAVEALARQ 1086


Lambda     K      H
   0.318    0.135    0.383 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 2828
Number of extensions: 140
Number of successful extensions: 23
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1073
Length of database: 1100
Length adjustment: 46
Effective length of query: 1027
Effective length of database: 1054
Effective search space:  1082458
Effective search space used:  1082458
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 58 (26.9 bits)
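
The summary statistics above can be cross-checked against each other. The sketch below assumes the standard Karlin-Altschul conversion from raw score to bit score, S' = (Lambda * S - ln K) / ln 2, using the gapped Lambda and K values, and assumes that each effective length is the raw length minus the length adjustment (applied once per database sequence). The function names are illustrative and are not part of BLAST or GapMind.

```python
import math


def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to bits (Karlin-Altschul normalization)."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def effective_search_space(query_len, db_len, length_adjustment, n_seqs=1):
    """Product of effective query and database lengths.

    Assumption: the adjustment is subtracted once from the query and
    once per database sequence.
    """
    eff_query = query_len - length_adjustment
    eff_db = db_len - n_seqs * length_adjustment
    return eff_query * eff_db


# Values copied from the report above (gapped Lambda and K, raw score 2709).
bits = bit_score(2709, lam=0.267, k=0.0410)
space = effective_search_space(1073, 1100, length_adjustment=46)

print(f"bit score ~ {bits:.1f}")
print(f"effective search space = {space}")
```

With these inputs the bit score comes out near the reported 1048 bits, and the search space is 1027 x 1054 = 1082458, matching the report. Because Lambda and K are printed to only three significant figures, the recomputed bit score agrees to within rounding rather than exactly.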

Align candidate WP_015799094.1 AFER_RS08800 (carbamoyl-phosphate synthase large subunit)
to HMM TIGR01369 (carB: carbamoyl-phosphate synthase, large subunit (EC 6.3.5.5))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/TIGR01369.hmm
# target sequence database:        /tmp/gapView.16457.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01369  [M=1052]
Accession:   TIGR01369
Description: CPSaseII_lrg: carbamoyl-phosphate synthase, large subunit
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                 Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                 -----------
          0 1383.1   0.0          0 1382.9   0.0    1.0  1  lcl|NCBI__GCF_000023265.1:WP_015799094.1  AFER_RS08800 carbamoyl-phosphate


Domain annotation for each sequence (and alignments):
>> lcl|NCBI__GCF_000023265.1:WP_015799094.1  AFER_RS08800 carbamoyl-phosphate synthase large subunit
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 ! 1382.9   0.0         0         0       3    1051 ..       4    1079 ..       2    1080 .. 0.95

  Alignments for each domain:
  == domain 1  score: 1382.9 bits;  conditional E-value: 0
                                 TIGR01369    3 redikkvlviGsGpivigqAaEFDYsGsqalkalkeegievvLvnsniAtvmtdeeladkvYiePlt 69  
                                                + ++++vlviGsGpivigqA+EFDYsG qa++ l+eeg++v+L nsn+At+mtd+e+ad++YiePlt
  lcl|NCBI__GCF_000023265.1:WP_015799094.1    4 DPKVESVLVIGSGPIVIGQASEFDYSGVQACRVLREEGLRVILANSNPATIMTDPEFADATYIEPLT 70  
                                                67899************************************************************** PP

                                 TIGR01369   70 veavekiiekErpDailltlGGqtaLnlaveleekGvLekygvkllGtkveaikkaedRekFkealk 136 
                                                 e++e+iie ErpDa+l+tlGGqtaLnla+el+ +GvLe+ gv++lG++  +i+ ae+R++F+++l 
  lcl|NCBI__GCF_000023265.1:WP_015799094.1   71 LEVLERIIEAERPDAVLPTLGGQTALNLAMELDASGVLERSGVRMLGARPASIELAENRDAFRQLLM 137 
                                                ******************************************************************* PP

                                 TIGR01369  137 eineeva.kseivesveealeaaeeigyPvivRaaftlgGtGsgiaeneeelkelvekalkaspikq 202 
                                                 i+e++a +++ v+s ee  ++a+e+gyP+++R+++ lgG+G+gia++ + ++++ + +l asp+ +
  lcl|NCBI__GCF_000023265.1:WP_015799094.1  138 SIDEQLAvRGRLVRSLEEGRDVADELGYPLMLRPSYILGGAGTGIATDPSSFEAMLRAGLIASPVGE 204 
                                                ****98747899******************************************************* PP

                                 TIGR01369  203 vlvekslagwkEiEyEvvRDskdnciivcniEnlDplGvHtGdsivvaPsqtLtdkeyqllRdaslk 269 
                                                vlve+s+agwkE+E+Ev+RD++dnc++vc+iEn+Dp+GvHtGdsi+vaP+qtLtd eyq++R+ s++
  lcl|NCBI__GCF_000023265.1:WP_015799094.1  205 VLVEESIAGWKEFELEVMRDANDNCVVVCSIENVDPMGVHTGDSITVAPAQTLTDLEYQRMRSLSFE 271 
                                                ******************************************************************* PP

                                 TIGR01369  270 iirelgvege.cnvqfaldPeskryvviEvnpRvsRssALAskAtGyPiAkvaaklavGysLdelkn 335 
                                                i+r++gve++ +nvqfa++P+s r+vv+E+npRvsRssALAskAtG+PiAk+a++la+Gy+Lde++n
  lcl|NCBI__GCF_000023265.1:WP_015799094.1  272 ILRRVGVETGgSNVQFAVEPTSGRMVVVEMNPRVSRSSALASKATGFPIAKIATRLAIGYTLDEIMN 338 
                                                ********988******************************************************** PP

                                 TIGR01369  336 dvtketvAsfEPslDYvvvkiPrwdldkfekvdrklgtqmksvGEvmaigrtfeealqkalrsleek 402 
                                                d+t+ t+AsfEP+lDYvvvk+Prw ++kfe+++  lgt+m+svGE maigr+f ealqkalr +e +
  lcl|NCBI__GCF_000023265.1:WP_015799094.1  339 DITGVTPASFEPALDYVVVKVPRWVFEKFEGAEGILGTRMQSVGETMAIGRSFAEALQKALRGIERS 405 
                                                ******************************************************************* PP

                                 TIGR01369  403 llglklkek....eaesdeeleealkkpndrRlfaiaealrrgvsveevyeltkidrffleklkklv 465 
                                                  g+  + +    +a sd++l + +  p+++R+fa+ ealrrg s+e+v el++id +f+ +++ +v
  lcl|NCBI__GCF_000023265.1:WP_015799094.1  406 RGGFGADPAevtwQAYSDDALATLVAVPTEQRVFAVGEALRRGWSIERVAELSRIDPWFIGEMAGIV 472 
                                                99965543301114567788899999***************************************** PP

                                 TIGR01369  466 elekeleeeklkelkkellkkakklGfsdeqiaklvkvseaevrklrkelgivpvvkrvDtvaaEfe 532 
                                                + + +++   l  l +++l  +k+ Gfsd q+a l++v+e++vr+ r+++g+  v+k vDt+a+Ef+
  lcl|NCBI__GCF_000023265.1:WP_015799094.1  473 ARAADIRGRDLASLGADELLDLKRWGFSDLQLAWLLGVDETAVREHRHTVGVRAVYKAVDTCAGEFP 539 
                                                ******************************************************************* PP

                                 TIGR01369  533 aktpYlYstyeeekddvevtekkkvlvlGsGpiRigqgvEFDycavhavlalreagyktilinynPE 599 
                                                a+tpY+Y tyeee  ++  +++ +v+++G+Gp+Rigqg+EFDyc+vha+ alreag  +i++n+nPE
  lcl|NCBI__GCF_000023265.1:WP_015799094.1  540 ARTPYYYGTYEEE-SETVGSNRPSVIIIGAGPNRIGQGIEFDYCCVHAAFALREAGVDAIMVNSNPE 605 
                                                *************.777777788******************************************** PP

                                 TIGR01369  600 tvstDydiadrLyFeeltvedvldiiekek....vegvivqlgGqtalnlakeleeagvkilGtsae 662 
                                                tvstDyd++ rLy e+l +e+vld+i +e+     +gviv+lgGqt+l+la++++ +   +lGts++
  lcl|NCBI__GCF_000023265.1:WP_015799094.1  606 TVSTDYDTSSRLYVEPLVTEHVLDVIAEEQrlgsLQGVIVSLGGQTPLKLARDIDPS--LVLGTSPD 670 
                                                **************************99974444699*****************987..58****** PP

                                 TIGR01369  663 sidraEdRekFsklldelgikqpkgkeatsveeakeiakeigyPvlvRpsyvlgGrameiveneeel 729 
                                                sid aEdR+++s+l+++lgi qp g ++ts+ ea+ +++ ig+PvlvRpsyvlgGrameiv++e+el
  lcl|NCBI__GCF_000023265.1:WP_015799094.1  671 SIDVAEDRRRWSALCERLGIRQPPGGTVTSLAEAEAVVAAIGLPVLVRPSYVLGGRAMEIVYSEDEL 737 
                                                ******************************************************************* PP

                                 TIGR01369  730 eryle......eavevskekPvlidkyledavEvdvDavadge.evliagileHiEeaGvHsGDstl 789 
                                                ++ ++       + ++s++ P+lid++le a+EvdvDav d e    i +++eH+EeaGvHsGDs++
  lcl|NCBI__GCF_000023265.1:WP_015799094.1  738 RSAFSrlvdlaAEGAISQDRPILIDRFLEGAIEVDVDAVRDREgACWIGAVMEHVEEAGVHSGDSAC 804 
                                                9987744443344579************************9772567889***************** PP

                                 TIGR01369  790 vlppqklseevkkkikeivkkiakelkvkGllniqfvvkdeevyviEvnvRasRtvPfvskalgvpl 856 
                                                ++pp +l   ++ +i++++++ia +l+v+Gl+n+qf+v d++v viE+n+RasRtvPfv+ka+gv l
  lcl|NCBI__GCF_000023265.1:WP_015799094.1  805 TIPPVSLAPALVAEIEAQTRAIADALDVVGLINVQFAVADGTVFVIEANPRASRTVPFVAKATGVAL 871 
                                                ******************************************************************* PP

                                 TIGR01369  857 vklavkvllgkkleele.kgv.kkekksklvavkaavfsfsklagvdvvlgpemkstGEvmgigrdl 921 
                                                vk+a++++lg++l++le +g+   +  +++vavk+av++f +++g+d +lgpem+stGEvmgi+r++
  lcl|NCBI__GCF_000023265.1:WP_015799094.1  872 VKIATRLMLGSTLRDLErEGLfVPRRVTSYVAVKEAVLPFGRFRGADSILGPEMRSTGEVMGIDRTM 938 
                                                *************99874344145667799************************************* PP

                                 TIGR01369  922 eeallkallaskakikkkgsvllsvkdkdkeellelakklaekglkvyategtakvleeagikaev. 987 
                                                  a++ka+la++++++++g+vl++v+d+dke+l+++a++l+++g+ + ategta+ l+ +g+ +   
  lcl|NCBI__GCF_000023265.1:WP_015799094.1  939 PMAFAKAQLAAGTRLPQQGTVLVTVADRDKEALVPIAERLVAHGFDLAATEGTASWLQAHGVPVARl 1005
                                                **************************************************************98651 PP

                                 TIGR01369  988 vlkvseea...........ekilellkeeeielvinltskkkkaaekgykirreaveykvplvtele 1043
                                                v kv+ee+            +++ l++++e+ l+in+++ ++  +++gy+ir +a+e k+pl+t+le
  lcl|NCBI__GCF_000023265.1:WP_015799094.1 1006 VAKVTEEEdlgeraapegfVDAVRLVRSGEVALIINTPR-GSGPRRDGYRIRTAALEAKIPLITTLE 1071
                                                56666666899999997776677899**********996.999************************ PP

                                 TIGR01369 1044 taeallea 1051
                                                +a+a++ a
  lcl|NCBI__GCF_000023265.1:WP_015799094.1 1072 AARAAVAA 1079
                                                **998876 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (1052 nodes)
Target sequences:                          1  (1100 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.08u 0.03s 00:00:00.11 Elapsed: 00:00:00.10
# Mc/sec: 10.75
//
[ok]
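
The single domain row in the hmmsearch report above carries the coordinates needed to judge how much of the model and of the candidate were aligned. The following is a minimal sketch that splits that row on whitespace and computes both fractions. The field positions are taken from the column header shown in the report; the helper names are hypothetical, and the lengths (1052 model nodes, 1100 residues) are copied from the report.

```python
def parse_domain_row(row):
    """Parse one row of the hmmsearch per-domain table into a dict.

    Field order follows the header in the report:
    #, flag, score, bias, c-Evalue, i-Evalue, hmmfrom, hmm to, '..',
    alifrom, ali to, '..', envfrom, env to, '..', acc
    """
    f = row.split()
    return {
        "score": float(f[2]),
        "hmm_from": int(f[6]),
        "hmm_to": int(f[7]),
        "ali_from": int(f[9]),
        "ali_to": int(f[10]),
        "acc": float(f[15]),
    }


def span_coverage(start, end, total_len):
    """Fraction of a sequence or model covered by an inclusive 1-based span."""
    return (end - start + 1) / total_len


row = ("   1 ! 1382.9   0.0         0         0       3    1051 .."
       "       4    1079 ..       2    1080 .. 0.95")
dom = parse_domain_row(row)

model_cov = span_coverage(dom["hmm_from"], dom["hmm_to"], 1052)
target_cov = span_coverage(dom["ali_from"], dom["ali_to"], 1100)

print(f"model coverage:  {model_cov:.3f}")
print(f"target coverage: {target_cov:.3f}")
```

Splitting on whitespace is enough here because none of the fields in this row contain spaces. For anything beyond a one-off check, hmmsearch's tabular output options would be a more robust source than the human-readable report.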

This GapMind analysis is from Apr 10 2024. The underlying query database was built on Apr 09 2024.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
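
As an illustration of the quantities these rules depend on, the sketch below computes percent identity and coverage for the alignment between BRENDA::P00968 and WP_015799094.1 shown earlier on this page. It assumes that coverage means the aligned span of the characterized protein divided by its full length, which is a guess about how GapMind computes it. Only the 50% coverage floor stated above is checked, because the high- and medium-confidence criteria are not listed on this page. The function name is hypothetical.

```python
def hsp_summary(identities, positives, align_len, q_start, q_end, q_len):
    """Return percent identity, percent positives, and percent query coverage."""
    pct_identity = 100.0 * identities / align_len
    pct_positives = 100.0 * positives / align_len
    pct_coverage = 100.0 * (q_end - q_start + 1) / q_len
    return pct_identity, pct_positives, pct_coverage


# Values copied from the pairwise report: 573/1090 identities,
# 740/1090 positives, query residues 1..1060 of 1073.
pct_id, pct_pos, cov = hsp_summary(573, 740, 1090, 1, 1060, 1073)

# The report prints 52% and 67%, consistent with rounding down.
print(f"identity:  {pct_id:.2f}% (reported as {int(pct_id)}%)")
print(f"positives: {pct_pos:.2f}% (reported as {int(pct_pos)}%)")
print(f"coverage:  {cov:.1f}%")

# Assumed reading of the 50% coverage floor for a reportable hit.
meets_coverage_floor = cov >= 50.0
print(f"meets 50% coverage floor: {meets_coverage_floor}")
```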

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory