GapMind for Amino acid biosynthesis

 

Alignments for a candidate for carB in Paraburkholderia bryophila 376MFSha3.1

Align carbamoyl-phosphate synthase (glutamine-hydrolysing) (EC 6.3.5.5) (characterized)
to candidate H281DRAFT_04403 H281DRAFT_04403 carbamoyl-phosphate synthase large subunit

Query= BRENDA::P00968
         (1073 letters)



>lcl|FitnessBrowser__Burk376:H281DRAFT_04403 H281DRAFT_04403
            carbamoyl-phosphate synthase large subunit
          Length = 1084

 Score = 1454 bits (3765), Expect = 0.0
 Identities = 749/1089 (68%), Positives = 871/1089 (79%), Gaps = 23/1089 (2%)

Query: 1    MPKRTDIKSILILGAGPIVIGQACEFDYSGAQACKALREEGYRVILVNSNPATIMTDPEM 60
            MPKRTDIKSILI+GAGPI+IGQACEFDYSGAQACKALREEGY+VILVNSNPATIMTDP  
Sbjct: 1    MPKRTDIKSILIIGAGPIIIGQACEFDYSGAQACKALREEGYKVILVNSNPATIMTDPNT 60

Query: 61   ADATYIEPIHWEVVRKIIEKERPDAVLPTMGGQTALNCALELERQGVLEEFGVTMIGATA 120
            AD TYIEPI WEVV +II KERPDA+LPTMGGQTALNCAL+L   GVLE++ V +IGA+ 
Sbjct: 61   ADVTYIEPITWEVVERIIAKERPDAILPTMGGQTALNCALDLHAHGVLEKYKVELIGASP 120

Query: 121  DAIDKAEDRRRFDVAMKKIGLETARSGIAHTMEEALAVAADV-------GFPCIIRPSFT 173
            +AIDKAEDR++F  AM KIGL +A+SG AH+MEEAL V A +       G+P +IRPSFT
Sbjct: 121  EAIDKAEDRQKFKDAMTKIGLGSAKSGTAHSMEEALQVQARIAVETGSGGYPVVIRPSFT 180

Query: 174  MGGSGGGIAYNREEFEEICARGLDLSPTKELLIDESLIGWKEYEMEVVRDKNDNCIIVCS 233
            +GGSGGGIAYNREEFEEIC RGLDLSPT+ELLI+ESL+GWKEYEMEVVRDK DNCIIVCS
Sbjct: 181  LGGSGGGIAYNREEFEEICKRGLDLSPTRELLIEESLLGWKEYEMEVVRDKKDNCIIVCS 240

Query: 234  IENFDAMGIHTGDSITVAPAQTLTDKEYQIMRNASMAVLREIGVETGGSNVQFAVNPKNG 293
            IEN D MGIHTGDSITVAPAQTLTDKEYQ++RNAS+AVLREIGV+TGGSNVQF++NPK+G
Sbjct: 241  IENLDPMGIHTGDSITVAPAQTLTDKEYQVLRNASLAVLREIGVDTGGSNVQFSINPKDG 300

Query: 294  RLIVIEMNPRVSRSSALASKATGFPIAKVAAKLAVGYTLDELMNDITGGRTPASFEPSID 353
            R+IVIEMNPRVSRSSALASKATGFPIAKVAAKLAVGYTLDEL N+ITGG+TPASFEP+ID
Sbjct: 301  RMIVIEMNPRVSRSSALASKATGFPIAKVAAKLAVGYTLDELKNEITGGQTPASFEPTID 360

Query: 354  YVVTKIPRFNFEKFAGANDRLTTQMKSVGEVMAIGRTQQESLQKALRGLEVGATGFDPKV 413
            YVVTKIPRF FEKF  A+ RLTTQMKSVGEVMAIGRT QES QKALRGLEVG  G D K 
Sbjct: 361  YVVTKIPRFAFEKFREADSRLTTQMKSVGEVMAIGRTFQESFQKALRGLEVGVDGLDEKT 420

Query: 414  SLDDPEALTKIRRELKDAGADRIWYIADAFRAGLSVDGVFNLTNIDRWFLVQIEELVRLE 473
               D     +I RE+ +AG DRIWY+ DAFR G++   +F  T+ID WFL QIE+++  E
Sbjct: 421  DNRD-----EIIREIGEAGPDRIWYVGDAFRIGMTAQEIFEETSIDPWFLAQIEQIILKE 475

Query: 474  EKVAEVGITGLNADFLRQLKRKGFADARLAKLAGVREAEIRKLRDQYDLHPVYKRVDTCA 533
            + +A   +  L+ + L+ LK+ GF+D RLAKL G + AE+R  R + ++ PVYKRVDTCA
Sbjct: 476  KALAGRTLASLSKEELKYLKQSGFSDRRLAKLVGAKPAEVRARRIELNVRPVYKRVDTCA 535

Query: 534  AEFATDTAYMYSTYEEECEANPSTDREKIMVLGGGPNRIGQGIEFDYCCVHASLALREDG 593
            AEFAT TAYMYSTYEEECEANP T+ +KIMVLGGGPNRIGQGIEFDYCCVHA+LA+REDG
Sbjct: 536  AEFATKTAYMYSTYEEECEANP-TNNKKIMVLGGGPNRIGQGIEFDYCCVHAALAMREDG 594

Query: 594  YETIMVNCNPETVSTDYDTSDRLYFEPVTLEDVLEIVRIEKPKGVIVQYGGQTPLKLARA 653
            YETIMVNCNPETVSTDYDTSDRLYFE +TLEDVLEIV  EKP GVIVQYGGQTPLKLA  
Sbjct: 595  YETIMVNCNPETVSTDYDTSDRLYFESLTLEDVLEIVDKEKPVGVIVQYGGQTPLKLALD 654

Query: 654  LEAAGVPVIGTSPDAIDRAEDRERFQHAVERLKLKQPANATVTAIEMAVEKAKEIGYPLV 713
            LEA GVP+IGTSPD ID AEDRERFQ  ++ L L+QP N T  A + A++ A EIGYPLV
Sbjct: 655  LEANGVPIIGTSPDMIDAAEDRERFQKLLQDLGLRQPPNRTARAEDEALKLADEIGYPLV 714

Query: 714  VRPSYVLGGRAMEIVYDEADLRRYFQTAVSVSNDAPVLLDHFLDDAVEVDVDAICDGEMV 773
            VRPSYVLGGRAMEIV++  DL RY + AV VSND+PVLLD FL+DA+E DVD I DGE V
Sbjct: 715  VRPSYVLGGRAMEIVHEPRDLERYMREAVKVSNDSPVLLDRFLNDAIECDVDCISDGEAV 774

Query: 774  LIGGIMEHIEQAGVHSGDSACSLPAYTLSQEIQDVMRQQVQKLAFELQVRGLMNVQFAV- 832
             IGG+MEHIEQAGVHSGDSACSLP Y+LS E    +++Q   +A  L V GLMNVQFA+ 
Sbjct: 775  FIGGVMEHIEQAGVHSGDSACSLPPYSLSGETIAELKRQTGAMAKALNVVGLMNVQFAIQ 834

Query: 833  --------KNNEVYLIEVNPRAARTVPFVSKATGVPLAKVAARVMAGKSLAEQGVTKEVI 884
                    K + +Y++EVNPRA+RTVP+VSKAT +PLAK+AAR M G+ LA QGVTKE+ 
Sbjct: 835  QVPQADGSKQDIIYVLEVNPRASRTVPYVSKATSLPLAKIAARAMVGQKLAAQGVTKEID 894

Query: 885  PPYYSVKEVVLPFNKFPGVDPLLGPEMRSTGEVMGVGRTFAEAFAKAQLGSNSTMKKHGR 944
            PPY+SVKE V PF KFP VDP+LGPEMRSTGEVMGVG+TF EA  K+QL + S + + G 
Sbjct: 895  PPYFSVKEAVFPFVKFPAVDPVLGPEMRSTGEVMGVGQTFGEALFKSQLAAGSRLPESGT 954

Query: 945  ALLSVREGDKERVVDLAAKLLKQGFELDATHGTAIVLGEAGINPRLVNKVHEGRPHIQDR 1004
             LL+V + DK + V++A  L + G+ + AT GTA  +  AG+  ++VNKV +GRPHI D 
Sbjct: 955  VLLTVMDADKPKAVEVARMLHELGYPIVATKGTAAAIEAAGVPVKVVNKVKDGRPHIVDM 1014

Query: 1005 IKNGEYTYIINTT-SGRRAIEDSRVIRRSALQYKVHYDTTLNGGFATAMALNADATEKVI 1063
            IKNGE   +  T    R AI DSR IR SA   KV Y TT++G  A    L      +V 
Sbjct: 1015 IKNGEIALVFTTVDETRAAIADSRSIRMSAQANKVTYYTTMSGARAAVEGLRYLKDLEVY 1074

Query: 1064 SVQEMHAQI 1072
             +Q +HA++
Sbjct: 1075 DLQGLHARL 1083


Lambda     K      H
   0.318    0.135    0.383 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 3133
Number of extensions: 134
Number of successful extensions: 18
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1073
Length of database: 1084
Length adjustment: 46
Effective length of query: 1027
Effective length of database: 1038
Effective search space:  1066026
Effective search space used:  1066026
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 58 (26.9 bits)
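The summary figures in the BLAST report above can be re-derived from its raw counts. The following is a minimal sketch, not part of GapMind itself; the variable names are illustrative, and the numbers are copied by hand from the report. It assumes that the effective lengths are the raw lengths minus the reported length adjustment, which matches the values printed above.

```python
# Values copied from the BLAST report above (not parsed automatically).
identities, positives, gaps, aligned_columns = 749, 871, 23, 1089
query_length, database_length, length_adjustment = 1073, 1084, 46
query_start, query_end = 1, 1072  # first and last aligned query positions


def percent(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


pct_identity = percent(identities, aligned_columns)
pct_positive = percent(positives, aligned_columns)
pct_gaps = percent(gaps, aligned_columns)

# Fraction of the query covered by the alignment.
query_coverage = percent(query_end - query_start + 1, query_length)

# Assumption: effective length = raw length - length adjustment.
effective_query = query_length - length_adjustment
effective_database = database_length - length_adjustment
effective_search_space = effective_query * effective_database

print(f"identity  {pct_identity:.2f}%")
print(f"positives {pct_positive:.2f}%")
print(f"gaps      {pct_gaps:.2f}%")
print(f"query coverage {query_coverage:.2f}%")
print(f"effective search space {effective_search_space}")
```

The unrounded percentages come out slightly higher than the report's 68%, 79%, and 2%, which suggests the report truncates rather than rounds.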

Align candidate H281DRAFT_04403 H281DRAFT_04403 (carbamoyl-phosphate synthase large subunit)
to HMM TIGR01369 (carB: carbamoyl-phosphate synthase, large subunit (EC 6.3.5.5))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/TIGR01369.hmm
# target sequence database:        /tmp/gapView.15932.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01369  [M=1052]
Accession:   TIGR01369
Description: CPSaseII_lrg: carbamoyl-phosphate synthase, large subunit
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                    Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                    -----------
          0 1538.4   0.0          0 1538.2   0.0    1.0  1  lcl|FitnessBrowser__Burk376:H281DRAFT_04403  H281DRAFT_04403 carbamoyl-phosph


Domain annotation for each sequence (and alignments):
>> lcl|FitnessBrowser__Burk376:H281DRAFT_04403  H281DRAFT_04403 carbamoyl-phosphate synthase large subunit
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 ! 1538.2   0.0         0         0       1    1050 [.       2    1063 ..       2    1065 .. 0.98

  Alignments for each domain:
  == domain 1  score: 1538.2 bits;  conditional E-value: 0
                                    TIGR01369    1 pkredikkvlviGsGpivigqAaEFDYsGsqalkalkeegievvLvnsniAtvmtdeeladkvY 64  
                                                   pkr+dik++l+iG+Gpi+igqA+EFDYsG+qa+kal+eeg++v+Lvnsn+At+mtd++ ad +Y
  lcl|FitnessBrowser__Burk376:H281DRAFT_04403    2 PKRTDIKSILIIGAGPIIIGQACEFDYSGAQACKALREEGYKVILVNSNPATIMTDPNTADVTY 65  
                                                   689************************************************************* PP

                                    TIGR01369   65 iePltveavekiiekErpDailltlGGqtaLnlaveleekGvLekygvkllGtkveaikkaedR 128 
                                                   ieP+t+e+ve+ii kErpDail+t+GGqtaLn+a++l+ +GvLeky v+l+G++ eai+kaedR
  lcl|FitnessBrowser__Burk376:H281DRAFT_04403   66 IEPITWEVVERIIAKERPDAILPTMGGQTALNCALDLHAHGVLEKYKVELIGASPEAIDKAEDR 129 
                                                   **************************************************************** PP

                                    TIGR01369  129 ekFkealkeineevakseivesveealeaaeei.......gyPvivRaaftlgGtGsgiaenee 185 
                                                   +kFk+a+++i++  aks +++s+eeal+++++i       gyPv++R++ftlgG+G+gia+n+e
  lcl|FitnessBrowser__Burk376:H281DRAFT_04403  130 QKFKDAMTKIGLGSAKSGTAHSMEEALQVQARIavetgsgGYPVVIRPSFTLGGSGGGIAYNRE 193 
                                                   ****************************9998766666668*********************** PP

                                    TIGR01369  186 elkelvekalkaspikqvlvekslagwkEiEyEvvRDskdnciivcniEnlDplGvHtGdsivv 249 
                                                   e++e+++++l++sp++++l+e+sl gwkE+E+EvvRD+kdnciivc+iEnlDp+G+HtGdsi+v
  lcl|FitnessBrowser__Burk376:H281DRAFT_04403  194 EFEEICKRGLDLSPTRELLIEESLLGWKEYEMEVVRDKKDNCIIVCSIENLDPMGIHTGDSITV 257 
                                                   **************************************************************** PP

                                    TIGR01369  250 aPsqtLtdkeyqllRdaslkiirelgvege.cnvqfaldPeskryvviEvnpRvsRssALAskA 312 
                                                   aP+qtLtdkeyq lR+asl+++re+gv+++ +nvqf+++P++ r++viE+npRvsRssALAskA
  lcl|FitnessBrowser__Burk376:H281DRAFT_04403  258 APAQTLTDKEYQVLRNASLAVLREIGVDTGgSNVQFSINPKDGRMIVIEMNPRVSRSSALASKA 321 
                                                   ***************************9988********************************* PP

                                    TIGR01369  313 tGyPiAkvaaklavGysLdelkndvtk.etvAsfEPslDYvvvkiPrwdldkfekvdrklgtqm 375 
                                                   tG+PiAkvaaklavGy+Ldelkn++t+ +t+AsfEP++DYvv+kiPr++++kf+++d++l+tqm
  lcl|FitnessBrowser__Burk376:H281DRAFT_04403  322 TGFPIAKVAAKLAVGYTLDELKNEITGgQTPASFEPTIDYVVTKIPRFAFEKFREADSRLTTQM 385 
                                                   **************************879*********************************** PP

                                    TIGR01369  376 ksvGEvmaigrtfeealqkalrsleekllglklkekeaesdeeleealkkpndrRlfaiaealr 439 
                                                   ksvGEvmaigrtf+e++qkalr le ++ gl++k   + +++e+ +++ ++ ++R++++ +a+r
  lcl|FitnessBrowser__Burk376:H281DRAFT_04403  386 KSVGEVMAIGRTFQESFQKALRGLEVGVDGLDEK---TDNRDEIIREIGEAGPDRIWYVGDAFR 446 
                                                   **************************99998886...667778899****************** PP

                                    TIGR01369  440 rgvsveevyeltkidrffleklkklvelekeleeeklkelkkellkkakklGfsdeqiaklvkv 503 
                                                    g++ +e++e t id +fl ++++++  ek+l+  +l  l+ke+lk +k+ Gfsd+++aklv+ 
  lcl|FitnessBrowser__Burk376:H281DRAFT_04403  447 IGMTAQEIFEETSIDPWFLAQIEQIILKEKALAGRTLASLSKEELKYLKQSGFSDRRLAKLVGA 510 
                                                   **************************************************************** PP

                                    TIGR01369  504 seaevrklrkelgivpvvkrvDtvaaEfeaktpYlYstyeeekddvevtekkkvlvlGsGpiRi 567 
                                                   + aevr+ r el++ pv+krvDt+aaEf +kt+Y+Ystyeee++ +++ ++kk++vlG+Gp+Ri
  lcl|FitnessBrowser__Burk376:H281DRAFT_04403  511 KPAEVRARRIELNVRPVYKRVDTCAAEFATKTAYMYSTYEEECEANPT-NNKKIMVLGGGPNRI 573 
                                                   ******************************************666655.555************ PP

                                    TIGR01369  568 gqgvEFDycavhavlalreagyktilinynPEtvstDydiadrLyFeeltvedvldiiekekve 631 
                                                   gqg+EFDyc+vha+la+re gy+ti++n+nPEtvstDyd++drLyFe+lt+edvl+i++kek+ 
  lcl|FitnessBrowser__Burk376:H281DRAFT_04403  574 GQGIEFDYCCVHAALAMREDGYETIMVNCNPETVSTDYDTSDRLYFESLTLEDVLEIVDKEKPV 637 
                                                   **************************************************************** PP

                                    TIGR01369  632 gvivqlgGqtalnlakeleeagvkilGtsaesidraEdRekFsklldelgikqpkgkeatsvee 695 
                                                   gvivq+gGqt+l+la +le++gv+i+Gts++ id aEdRe+F+kll++lg+ qp +++a++ +e
  lcl|FitnessBrowser__Burk376:H281DRAFT_04403  638 GVIVQYGGQTPLKLALDLEANGVPIIGTSPDMIDAAEDRERFQKLLQDLGLRQPPNRTARAEDE 701 
                                                   **************************************************************** PP

                                    TIGR01369  696 akeiakeigyPvlvRpsyvlgGrameiveneeeleryleeavevskekPvlidkyledavEvdv 759 
                                                   a ++a+eigyP++vRpsyvlgGrameiv++  +lery++eav+vs+++Pvl+d++l+da+E+dv
  lcl|FitnessBrowser__Burk376:H281DRAFT_04403  702 ALKLADEIGYPLVVRPSYVLGGRAMEIVHEPRDLERYMREAVKVSNDSPVLLDRFLNDAIECDV 765 
                                                   **************************************************************** PP

                                    TIGR01369  760 DavadgeevliagileHiEeaGvHsGDstlvlppqklseevkkkikeivkkiakelkvkGllni 823 
                                                   D ++dge v+i g++eHiE+aGvHsGDs+++lpp +ls e+  ++k+++ ++ak+l+v+Gl+n+
  lcl|FitnessBrowser__Burk376:H281DRAFT_04403  766 DCISDGEAVFIGGVMEHIEQAGVHSGDSACSLPPYSLSGETIAELKRQTGAMAKALNVVGLMNV 829 
                                                   **************************************************************** PP

                                    TIGR01369  824 qfvv.........kdeevyviEvnvRasRtvPfvskalgvplvklavkvllgkkleelekgvkk 878 
                                                   qf++         k++ +yv+Evn+RasRtvP+vska+++pl+k+a+++++g+kl+   +gv+k
  lcl|FitnessBrowser__Burk376:H281DRAFT_04403  830 QFAIqqvpqadgsKQDIIYVLEVNPRASRTVPYVSKATSLPLAKIAARAMVGQKLAA--QGVTK 891 
                                                   **9833333222234669**************************************9..889** PP

                                    TIGR01369  879 ekksklvavkaavfsfsklagvdvvlgpemkstGEvmgigrdleeallkallaskakikkkgsv 942 
                                                   e ++ +++vk+avf+f k+  vd+vlgpem+stGEvmg+g+++ eal+k++la+++++++ g+v
  lcl|FitnessBrowser__Burk376:H281DRAFT_04403  892 EIDPPYFSVKEAVFPFVKFPAVDPVLGPEMRSTGEVMGVGQTFGEALFKSQLAAGSRLPESGTV 955 
                                                   **************************************************************** PP

                                    TIGR01369  943 llsvkdkdkeellelakklaekglkvyategtakvleeagikaevvlkvseeaekilellkeee 1006
                                                   ll+v d+dk +++e+a++l+e+g+ ++at+gta+++e ag+ ++vv+kv++ +++i++++k++e
  lcl|FitnessBrowser__Burk376:H281DRAFT_04403  956 LLTVMDADKPKAVEVARMLHELGYPIVATKGTAAAIEAAGVPVKVVNKVKDGRPHIVDMIKNGE 1019
                                                   **************************************************************** PP

                                    TIGR01369 1007 ielvinltskkkkaaekgykirreaveykvplvteletaealle 1050
                                                   i lv+++ +++++a  ++ +ir +a  +kv++ t++++a+a++e
  lcl|FitnessBrowser__Burk376:H281DRAFT_04403 1020 IALVFTTVDETRAAIADSRSIRMSAQANKVTYYTTMSGARAAVE 1063
                                                   **************************************999876 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (1052 nodes)
Target sequences:                          1  (1084 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.08u 0.02s 00:00:00.10 Elapsed: 00:00:00.11
# Mc/sec: 10.34
//
[ok]
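The domain row in the hmmsearch output above gives the coordinates needed to work out how much of the model and of the candidate the alignment covers. Below is a minimal sketch, assuming the row is split on whitespace and the columns follow the header printed above (score, bias, c-Evalue, i-Evalue, hmmfrom, hmm to, alifrom, ali to, envfrom, env to, acc). The function name is hypothetical, and the model length (1052) and sequence length (1084) are copied from the output rather than parsed.

```python
# Domain row copied verbatim from the hmmsearch output above.
DOMAIN_ROW = (
    "   1 ! 1538.2   0.0         0         0       1    1050 [.       2"
    "    1063 ..       2    1065 .. 0.98"
)
MODEL_LENGTH = 1052     # from "Query: TIGR01369 [M=1052]"
SEQUENCE_LENGTH = 1084  # from "Target sequences: 1 (1084 residues searched)"


def parse_domain_row(row):
    """Split one hmmsearch domain row into named fields (illustrative helper)."""
    f = row.split()
    return {
        "score": float(f[2]),
        "hmm_from": int(f[6]),
        "hmm_to": int(f[7]),
        "ali_from": int(f[9]),
        "ali_to": int(f[10]),
        "env_from": int(f[12]),
        "env_to": int(f[13]),
        "acc": float(f[15]),
    }


domain = parse_domain_row(DOMAIN_ROW)

# Fraction of the HMM and of the candidate protein spanned by the alignment.
hmm_coverage = (domain["hmm_to"] - domain["hmm_from"] + 1) / MODEL_LENGTH
seq_coverage = (domain["ali_to"] - domain["ali_from"] + 1) / SEQUENCE_LENGTH

print(f"score {domain['score']} bits")
print(f"HMM coverage      {hmm_coverage:.3f}")
print(f"sequence coverage {seq_coverage:.3f}")
```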

This GapMind analysis is from Aug 03 2021. The underlying query database was built on Aug 03 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
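The definition of a gap given above (a step with no high- or medium-confidence candidate) can be sketched in a few lines. This is only an illustration of that rule, not GapMind's actual code. The function name, the data layout, and the step names "stepX" and "stepY" are hypothetical; only carB comes from this page.

```python
def find_gaps(candidates_by_step):
    """Return the steps that have no high- or medium-confidence candidate.

    candidates_by_step maps a step name to the list of confidence levels
    ("high", "medium", or "low") of its candidates.
    """
    return sorted(
        step
        for step, levels in candidates_by_step.items()
        if not any(level in ("high", "medium") for level in levels)
    )


# Hypothetical input: carB has a high-confidence candidate, stepX has only
# a low-confidence candidate, and stepY has no candidates at all.
example = {
    "carB": ["high"],
    "stepX": ["low"],
    "stepY": [],
}

gaps = find_gaps(example)
print(gaps)
```

A step with only low-confidence candidates is counted as a gap here, in line with the wording above.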

For more information, see the 2019 paper on GapMind for amino acid biosynthesis, the 2022 paper on GapMind for carbon sources, the source code, or the changes to Amino acid biosynthesis since the publication.

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory