GapMind for Amino acid biosynthesis

 

Alignments for a candidate for carB in Burkholderia phytofirmans PsJN

Align carbamoyl-phosphate synthase (glutamine-hydrolysing) (EC 6.3.5.5) (characterized)
to candidate BPHYT_RS14190 BPHYT_RS14190 carbamoyl phosphate synthase large subunit

Query= BRENDA::P00968
         (1073 letters)



>FitnessBrowser__BFirm:BPHYT_RS14190
          Length = 1084

 Score = 1460 bits (3780), Expect = 0.0
 Identities = 747/1089 (68%), Positives = 876/1089 (80%), Gaps = 23/1089 (2%)

Query: 1    MPKRTDIKSILILGAGPIVIGQACEFDYSGAQACKALREEGYRVILVNSNPATIMTDPEM 60
            MPKRTDIKSILI+GAGPI+IGQACEFDYSGAQACKALREEGY+VILVNSNPATIMTDP  
Sbjct: 1    MPKRTDIKSILIIGAGPIIIGQACEFDYSGAQACKALREEGYKVILVNSNPATIMTDPNT 60

Query: 61   ADATYIEPIHWEVVRKIIEKERPDAVLPTMGGQTALNCALELERQGVLEEFGVTMIGATA 120
            AD TYIEPI WEVV +II KERPDA+LPTMGGQTALNCAL+L   GVLE++ V +IGA+ 
Sbjct: 61   ADVTYIEPITWEVVERIIAKERPDAILPTMGGQTALNCALDLHAHGVLEKYKVELIGASP 120

Query: 121  DAIDKAEDRRRFDVAMKKIGLETARSGIAHTMEEALAVAADV-------GFPCIIRPSFT 173
            +AIDKAEDR++F  AM KIGL +A+SG AH+MEEALAV AD+       G+P +IRPSFT
Sbjct: 121  EAIDKAEDRQKFKDAMTKIGLGSAKSGTAHSMEEALAVQADIASQTGSGGYPVVIRPSFT 180

Query: 174  MGGSGGGIAYNREEFEEICARGLDLSPTKELLIDESLIGWKEYEMEVVRDKNDNCIIVCS 233
            +GGSGGGIAYNR+EFEEIC RGLDLSPT+ELLI+ESL+GWKEYEMEVVRDK DNCIIVCS
Sbjct: 181  LGGSGGGIAYNRDEFEEICKRGLDLSPTRELLIEESLLGWKEYEMEVVRDKKDNCIIVCS 240

Query: 234  IENFDAMGIHTGDSITVAPAQTLTDKEYQIMRNASMAVLREIGVETGGSNVQFAVNPKNG 293
            IEN D MGIHTGDSITVAPAQTLTDKEYQI+RNAS+AVLREIGV+TGGSNVQF++NPK+G
Sbjct: 241  IENLDPMGIHTGDSITVAPAQTLTDKEYQILRNASLAVLREIGVDTGGSNVQFSINPKDG 300

Query: 294  RLIVIEMNPRVSRSSALASKATGFPIAKVAAKLAVGYTLDELMNDITGGRTPASFEPSID 353
            R++VIEMNPRVSRSSALASKATGFPIAK+AAKLAVGY+LDEL N+ITGG+TPASFEP+ID
Sbjct: 301  RMVVIEMNPRVSRSSALASKATGFPIAKIAAKLAVGYSLDELKNEITGGQTPASFEPTID 360

Query: 354  YVVTKIPRFNFEKFAGANDRLTTQMKSVGEVMAIGRTQQESLQKALRGLEVGATGFDPKV 413
            YVVTKIPRF FEKF  A+ RLTTQMKSVGEVMAIGRT QES QKALRGLEVG  G D K 
Sbjct: 361  YVVTKIPRFAFEKFREADSRLTTQMKSVGEVMAIGRTFQESFQKALRGLEVGVDGLDEKT 420

Query: 414  SLDDPEALTKIRRELKDAGADRIWYIADAFRAGLSVDGVFNLTNIDRWFLVQIEELVRLE 473
               D     +I RE+ +AG DRIWY+ DAFR G++ + +F  T+ID WFL QIE+++  E
Sbjct: 421  DNRD-----EIIREIGEAGPDRIWYVGDAFRVGMTAEEIFEETSIDPWFLAQIEQIILKE 475

Query: 474  EKVAEVGITGLNADFLRQLKRKGFADARLAKLAGVREAEIRKLRDQYDLHPVYKRVDTCA 533
            + +A   +  L+ + L+ LK+ GF+D RLAKL G + AE+R+ R + ++ PVYKRVDTCA
Sbjct: 476  KALAGRTLASLSKEELKYLKQSGFSDRRLAKLLGAKPAEVRQRRIELNVRPVYKRVDTCA 535

Query: 534  AEFATDTAYMYSTYEEECEANPSTDREKIMVLGGGPNRIGQGIEFDYCCVHASLALREDG 593
            AEFAT TAYMYSTYEEECEANP T  +KIMVLGGGPNRIGQGIEFDYCCVHA+LA+REDG
Sbjct: 536  AEFATKTAYMYSTYEEECEANP-TSNKKIMVLGGGPNRIGQGIEFDYCCVHAALAMREDG 594

Query: 594  YETIMVNCNPETVSTDYDTSDRLYFEPVTLEDVLEIVRIEKPKGVIVQYGGQTPLKLARA 653
            YETIMVNCNPETVSTDYDTSDRLYFE +TLEDVLEIV  EKP GVIVQYGGQTPLKLA  
Sbjct: 595  YETIMVNCNPETVSTDYDTSDRLYFESLTLEDVLEIVDKEKPVGVIVQYGGQTPLKLALD 654

Query: 654  LEAAGVPVIGTSPDAIDRAEDRERFQHAVERLKLKQPANATVTAIEMAVEKAKEIGYPLV 713
            LEA GVP++GTSPD ID AEDRERFQ  ++ L L+QP N T  A + A++ A EIGYPLV
Sbjct: 655  LEANGVPIVGTSPDMIDAAEDRERFQKLLQDLGLRQPPNRTARAEDEALKLADEIGYPLV 714

Query: 714  VRPSYVLGGRAMEIVYDEADLRRYFQTAVSVSNDAPVLLDHFLDDAVEVDVDAICDGEMV 773
            VRPSYVLGGRAMEIV++  DL RY + AV VSND+PVLLD FL+DA+E DVD I DGE V
Sbjct: 715  VRPSYVLGGRAMEIVHEPRDLERYMREAVKVSNDSPVLLDRFLNDAIECDVDCISDGEAV 774

Query: 774  LIGGIMEHIEQAGVHSGDSACSLPAYTLSQEIQDVMRQQVQKLAFELQVRGLMNVQFAV- 832
             IGG+MEHIEQAGVHSGDSACSLP Y+LS+E    +++Q   +A  L V GLMNVQFA+ 
Sbjct: 775  FIGGVMEHIEQAGVHSGDSACSLPPYSLSKETVAELKRQTGAMAKALNVIGLMNVQFAIQ 834

Query: 833  --------KNNEVYLIEVNPRAARTVPFVSKATGVPLAKVAARVMAGKSLAEQGVTKEVI 884
                    K + +Y++EVNPRA+RTVP+VSKAT +PLAK+AAR M G+ LA+QGVTKE+ 
Sbjct: 835  QVPQADGSKEDVIYVLEVNPRASRTVPYVSKATSLPLAKIAARAMVGQKLADQGVTKEID 894

Query: 885  PPYYSVKEVVLPFNKFPGVDPLLGPEMRSTGEVMGVGRTFAEAFAKAQLGSNSTMKKHGR 944
            PPY+SVKE V PF KFP VDP+LGPEMRSTGEVMGVG+TF EA  K+QL + S + + G 
Sbjct: 895  PPYFSVKEAVFPFVKFPAVDPVLGPEMRSTGEVMGVGQTFGEALFKSQLAAGSRLPESGT 954

Query: 945  ALLSVREGDKERVVDLAAKLLKQGFELDATHGTAIVLGEAGINPRLVNKVHEGRPHIQDR 1004
             LL+V + DK + V++A  L + G+ + AT GTA  +  AG+  ++VNKV +GRPHI D 
Sbjct: 955  VLLTVMDADKPKAVEVARMLHELGYPIVATKGTAAAIEAAGVPVKVVNKVKDGRPHIVDM 1014

Query: 1005 IKNGEYTYIINTT-SGRRAIEDSRVIRRSALQYKVHYDTTLNGGFATAMALNADATEKVI 1063
            IKNGE   +  T    R AI DSR IR SA   KV Y TT++G  A    L      +V 
Sbjct: 1015 IKNGEIALVFTTVDETRAAIADSRSIRMSAQANKVTYYTTMSGARAAVEGLRYLKDLEVY 1074

Query: 1064 SVQEMHAQI 1072
             +Q +HA++
Sbjct: 1075 DLQGLHARL 1083


Lambda     K      H
   0.318    0.135    0.383 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 3146
Number of extensions: 132
Number of successful extensions: 18
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1073
Length of database: 1084
Length adjustment: 46
Effective length of query: 1027
Effective length of database: 1038
Effective search space:  1066026
Effective search space used:  1066026
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 58 (26.9 bits)
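
The summary figures in the BLAST report above can be cross-checked with a little arithmetic. The sketch below is illustrative only and is not part of GapMind or BLAST; the function and constant names are hypothetical. It assumes that the percentages are truncated to whole numbers (which matches the values shown), that the effective search space is the product of the two effective lengths, and that the bit score follows the standard Karlin-Altschul conversion using the gapped Lambda and K. Because Lambda and K are printed in rounded form, the recomputed bit score only approximates the reported 1460 bits.

```python
import math

# Values copied from the BLAST report above.
ALIGNMENT_COLUMNS = 1089
IDENTITIES, POSITIVES, GAPS = 747, 876, 23
QUERY_LENGTH, DATABASE_LENGTH, LENGTH_ADJUSTMENT = 1073, 1084, 46
RAW_SCORE = 3780
GAPPED_LAMBDA, GAPPED_K = 0.267, 0.0410  # rounded as printed in the report


def truncated_percent(count: int, total: int) -> int:
    """Whole-number percentage, truncated (assumption: matches the report)."""
    return count * 100 // total


def effective_search_space(query_len: int, db_len: int, adjustment: int) -> int:
    """Product of the effective query and effective database lengths."""
    return (query_len - adjustment) * (db_len - adjustment)


def bit_score(raw_score: int, lam: float, k: float) -> float:
    """Standard conversion: bits = (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


print(truncated_percent(IDENTITIES, ALIGNMENT_COLUMNS))  # 68
print(truncated_percent(POSITIVES, ALIGNMENT_COLUMNS))   # 80
print(truncated_percent(GAPS, ALIGNMENT_COLUMNS))        # 2
print(effective_search_space(QUERY_LENGTH, DATABASE_LENGTH, LENGTH_ADJUSTMENT))  # 1066026
print(bit_score(RAW_SCORE, GAPPED_LAMBDA, GAPPED_K))     # close to the reported 1460
```
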

Align candidate BPHYT_RS14190 BPHYT_RS14190 (carbamoyl phosphate synthase large subunit)
to HMM TIGR01369 (carB: carbamoyl-phosphate synthase, large subunit (EC 6.3.5.5))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/TIGR01369.hmm
# target sequence database:        /tmp/gapView.13869.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01369  [M=1052]
Accession:   TIGR01369
Description: CPSaseII_lrg: carbamoyl-phosphate synthase, large subunit
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                -----------
          0 1539.5   0.0          0 1539.3   0.0    1.0  1  lcl|FitnessBrowser__BFirm:BPHYT_RS14190  BPHYT_RS14190 carbamoyl phosphat


Domain annotation for each sequence (and alignments):
>> lcl|FitnessBrowser__BFirm:BPHYT_RS14190  BPHYT_RS14190 carbamoyl phosphate synthase large subunit
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 ! 1539.3   0.0         0         0       1    1050 [.       2    1063 ..       2    1065 .. 0.98

  Alignments for each domain:
  == domain 1  score: 1539.3 bits;  conditional E-value: 0
                                TIGR01369    1 pkredikkvlviGsGpivigqAaEFDYsGsqalkalkeegievvLvnsniAtvmtdeeladkvYiePl 68  
                                               pkr+dik++l+iG+Gpi+igqA+EFDYsG+qa+kal+eeg++v+Lvnsn+At+mtd++ ad +YieP+
  lcl|FitnessBrowser__BFirm:BPHYT_RS14190    2 PKRTDIKSILIIGAGPIIIGQACEFDYSGAQACKALREEGYKVILVNSNPATIMTDPNTADVTYIEPI 69  
                                               689***************************************************************** PP

                                TIGR01369   69 tveavekiiekErpDailltlGGqtaLnlaveleekGvLekygvkllGtkveaikkaedRekFkealk 136 
                                               t+e+ve+ii kErpDail+t+GGqtaLn+a++l+ +GvLeky v+l+G++ eai+kaedR+kFk+a++
  lcl|FitnessBrowser__BFirm:BPHYT_RS14190   70 TWEVVERIIAKERPDAILPTMGGQTALNCALDLHAHGVLEKYKVELIGASPEAIDKAEDRQKFKDAMT 137 
                                               ******************************************************************** PP

                                TIGR01369  137 eineevakseivesveealeaaeei.......gyPvivRaaftlgGtGsgiaeneeelkelvekalka 197 
                                               +i++  aks +++s+eeal+++++i       gyPv++R++ftlgG+G+gia+n++e++e+++++l++
  lcl|FitnessBrowser__BFirm:BPHYT_RS14190  138 KIGLGSAKSGTAHSMEEALAVQADIasqtgsgGYPVVIRPSFTLGGSGGGIAYNRDEFEEICKRGLDL 205 
                                               *******************99887766666668*********************************** PP

                                TIGR01369  198 spikqvlvekslagwkEiEyEvvRDskdnciivcniEnlDplGvHtGdsivvaPsqtLtdkeyqllRd 265 
                                               sp++++l+e+sl gwkE+E+EvvRD+kdnciivc+iEnlDp+G+HtGdsi+vaP+qtLtdkeyq+lR+
  lcl|FitnessBrowser__BFirm:BPHYT_RS14190  206 SPTRELLIEESLLGWKEYEMEVVRDKKDNCIIVCSIENLDPMGIHTGDSITVAPAQTLTDKEYQILRN 273 
                                               ******************************************************************** PP

                                TIGR01369  266 aslkiirelgvege.cnvqfaldPeskryvviEvnpRvsRssALAskAtGyPiAkvaaklavGysLde 332 
                                               asl+++re+gv+++ +nvqf+++P++ r+vviE+npRvsRssALAskAtG+PiAk+aaklavGysLde
  lcl|FitnessBrowser__BFirm:BPHYT_RS14190  274 ASLAVLREIGVDTGgSNVQFSINPKDGRMVVIEMNPRVSRSSALASKATGFPIAKIAAKLAVGYSLDE 341 
                                               ***********9988***************************************************** PP

                                TIGR01369  333 lkndvtk.etvAsfEPslDYvvvkiPrwdldkfekvdrklgtqmksvGEvmaigrtfeealqkalrsl 399 
                                               lkn++t+ +t+AsfEP++DYvv+kiPr++++kf+++d++l+tqmksvGEvmaigrtf+e++qkalr l
  lcl|FitnessBrowser__BFirm:BPHYT_RS14190  342 LKNEITGgQTPASFEPTIDYVVTKIPRFAFEKFREADSRLTTQMKSVGEVMAIGRTFQESFQKALRGL 409 
                                               ******879*********************************************************** PP

                                TIGR01369  400 eekllglklkekeaesdeeleealkkpndrRlfaiaealrrgvsveevyeltkidrffleklkklvel 467 
                                               e ++ gl++k   + +++e+ +++ ++ ++R++++ +a+r g++ ee++e t id +fl ++++++  
  lcl|FitnessBrowser__BFirm:BPHYT_RS14190  410 EVGVDGLDEK---TDNRDEIIREIGEAGPDRIWYVGDAFRVGMTAEEIFEETSIDPWFLAQIEQIILK 474 
                                               **99998886...667778899********************************************** PP

                                TIGR01369  468 ekeleeeklkelkkellkkakklGfsdeqiaklvkvseaevrklrkelgivpvvkrvDtvaaEfeakt 535 
                                               ek+l+  +l  l+ke+lk +k+ Gfsd+++akl++ + aevr+ r el++ pv+krvDt+aaEf +kt
  lcl|FitnessBrowser__BFirm:BPHYT_RS14190  475 EKALAGRTLASLSKEELKYLKQSGFSDRRLAKLLGAKPAEVRQRRIELNVRPVYKRVDTCAAEFATKT 542 
                                               ******************************************************************** PP

                                TIGR01369  536 pYlYstyeeekddvevtekkkvlvlGsGpiRigqgvEFDycavhavlalreagyktilinynPEtvst 603 
                                               +Y+Ystyeee++ +++++k k++vlG+Gp+Rigqg+EFDyc+vha+la+re gy+ti++n+nPEtvst
  lcl|FitnessBrowser__BFirm:BPHYT_RS14190  543 AYMYSTYEEECEANPTSNK-KIMVLGGGPNRIGQGIEFDYCCVHAALAMREDGYETIMVNCNPETVST 609 
                                               **********766666655.************************************************ PP

                                TIGR01369  604 DydiadrLyFeeltvedvldiiekekvegvivqlgGqtalnlakeleeagvkilGtsaesidraEdRe 671 
                                               Dyd++drLyFe+lt+edvl+i++kek+ gvivq+gGqt+l+la +le++gv+i+Gts++ id aEdRe
  lcl|FitnessBrowser__BFirm:BPHYT_RS14190  610 DYDTSDRLYFESLTLEDVLEIVDKEKPVGVIVQYGGQTPLKLALDLEANGVPIVGTSPDMIDAAEDRE 677 
                                               ******************************************************************** PP

                                TIGR01369  672 kFsklldelgikqpkgkeatsveeakeiakeigyPvlvRpsyvlgGrameiveneeeleryleeavev 739 
                                               +F+kll++lg+ qp +++a++ +ea ++a+eigyP++vRpsyvlgGrameiv++  +lery++eav+v
  lcl|FitnessBrowser__BFirm:BPHYT_RS14190  678 RFQKLLQDLGLRQPPNRTARAEDEALKLADEIGYPLVVRPSYVLGGRAMEIVHEPRDLERYMREAVKV 745 
                                               ******************************************************************** PP

                                TIGR01369  740 skekPvlidkyledavEvdvDavadgeevliagileHiEeaGvHsGDstlvlppqklseevkkkikei 807 
                                               s+++Pvl+d++l+da+E+dvD ++dge v+i g++eHiE+aGvHsGDs+++lpp +ls+e++ ++k++
  lcl|FitnessBrowser__BFirm:BPHYT_RS14190  746 SNDSPVLLDRFLNDAIECDVDCISDGEAVFIGGVMEHIEQAGVHSGDSACSLPPYSLSKETVAELKRQ 813 
                                               ******************************************************************** PP

                                TIGR01369  808 vkkiakelkvkGllniqfvv.........kdeevyviEvnvRasRtvPfvskalgvplvklavkvllg 866 
                                               + ++ak+l+v Gl+n+qf++         k++ +yv+Evn+RasRtvP+vska+++pl+k+a+++++g
  lcl|FitnessBrowser__BFirm:BPHYT_RS14190  814 TGAMAKALNVIGLMNVQFAIqqvpqadgsKEDVIYVLEVNPRASRTVPYVSKATSLPLAKIAARAMVG 881 
                                               ******************9833333222234679********************************** PP

                                TIGR01369  867 kkleelekgvkkekksklvavkaavfsfsklagvdvvlgpemkstGEvmgigrdleeallkallaska 934 
                                               +kl++  +gv+ke ++ +++vk+avf+f k+  vd+vlgpem+stGEvmg+g+++ eal+k++la+++
  lcl|FitnessBrowser__BFirm:BPHYT_RS14190  882 QKLAD--QGVTKEIDPPYFSVKEAVFPFVKFPAVDPVLGPEMRSTGEVMGVGQTFGEALFKSQLAAGS 947 
                                               ****9..889********************************************************** PP

                                TIGR01369  935 kikkkgsvllsvkdkdkeellelakklaekglkvyategtakvleeagikaevvlkvseeaekilell 1002
                                               ++++ g+vll+v d+dk +++e+a++l+e+g+ ++at+gta+++e ag+ ++vv+kv++ +++i++++
  lcl|FitnessBrowser__BFirm:BPHYT_RS14190  948 RLPESGTVLLTVMDADKPKAVEVARMLHELGYPIVATKGTAAAIEAAGVPVKVVNKVKDGRPHIVDMI 1015
                                               ******************************************************************** PP

                                TIGR01369 1003 keeeielvinltskkkkaaekgykirreaveykvplvteletaealle 1050
                                               k++ei lv+++ +++++a  ++ +ir +a  +kv++ t++++a+a++e
  lcl|FitnessBrowser__BFirm:BPHYT_RS14190 1016 KNGEIALVFTTVDETRAAIADSRSIRMSAQANKVTYYTTMSGARAAVE 1063
                                               ******************************************999876 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (1052 nodes)
Target sequences:                          1  (1084 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.08u 0.03s 00:00:00.11 Elapsed: 00:00:00.10
# Mc/sec: 10.94
//
[ok]
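
The domain table in the hmmsearch report above gives the model and sequence coordinates of the hit, from which coverage can be worked out. The following is a minimal sketch, not GapMind's own code: the helper name is hypothetical, the column positions are those of the domain row shown above, and defining coverage as the aligned span divided by the full length is an assumption.

```python
# Domain row and lengths copied from the hmmsearch report above.
DOMAIN_ROW = (
    "   1 ! 1539.3   0.0         0         0       1    1050 [.       2"
    "    1063 ..       2    1065 .. 0.98"
)
MODEL_LENGTH = 1052     # from "Query: TIGR01369 [M=1052]"
SEQUENCE_LENGTH = 1084  # from "Target sequences: 1 (1084 residues searched)"


def span_coverage(start: int, end: int, length: int) -> float:
    """Fraction of `length` spanned by 1-based inclusive coordinates."""
    return (end - start + 1) / length


fields = DOMAIN_ROW.split()
# Whitespace-split fields: 0 '#', 1 '!', 2 score, 3 bias, 4 c-Evalue,
# 5 i-Evalue, 6 hmmfrom, 7 hmm to, 8 '[.', 9 alifrom, 10 ali to, ...
hmm_from, hmm_to = int(fields[6]), int(fields[7])
ali_from, ali_to = int(fields[9]), int(fields[10])

model_coverage = span_coverage(hmm_from, hmm_to, MODEL_LENGTH)
sequence_coverage = span_coverage(ali_from, ali_to, SEQUENCE_LENGTH)

print(f"model coverage:    {model_coverage:.3f}")
print(f"sequence coverage: {sequence_coverage:.3f}")
```

Under these assumptions the hit spans nearly all of the TIGR01369 model (1050 of 1052 nodes) and most of the candidate protein.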

This GapMind analysis is from Apr 09 2024. The underlying query database was built on Apr 09 2024.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
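
As a rough illustration of the 50% coverage rule, the sketch below computes coverage for the BLAST alignment shown earlier on this page (query positions 1 to 1072 of 1073). It is not GapMind's implementation: the function names are hypothetical, and treating coverage as the aligned span of the characterized protein divided by its full length is an assumption.

```python
def alignment_coverage(start: int, end: int, length: int) -> float:
    """Fraction of a sequence spanned by an alignment (1-based, inclusive)."""
    return (end - start + 1) / length


def meets_low_confidence_coverage(coverage: float, minimum: float = 0.5) -> bool:
    """True if coverage reaches the 50% cutoff quoted in the text."""
    return coverage >= minimum


# Coordinates of the characterized query (BRENDA::P00968) in the alignment.
query_coverage = alignment_coverage(1, 1072, 1073)

print(f"query coverage: {query_coverage:.3f}")
print(meets_low_confidence_coverage(query_coverage))
```
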

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory