GapMind for Amino acid biosynthesis

 

Alignments for a candidate for carB in Epibacterium ulvae U95

Align carbamoyl-phosphate synthase (glutamine-hydrolysing) (EC 6.3.5.5) (characterized)
to candidate WP_090215936.1 CV091_RS06665 carbamoyl-phosphate synthase large subunit

Query= BRENDA::P00968
         (1073 letters)



>NCBI__GCF_002796795.1:WP_090215936.1
          Length = 1120

 Score = 1261 bits (3264), Expect = 0.0
 Identities = 674/1112 (60%), Positives = 818/1112 (73%), Gaps = 57/1112 (5%)

Query: 1    MPKRTDIKSILILGAGPIVIGQACEFDYSGAQACKALREEGYRVILVNSNPATIMTDPEM 60
            MPKRTDI+SI+I+GAGPIVIGQACEFDYSGAQACKALREEGYRVILVNSNPATIMTDP +
Sbjct: 1    MPKRTDIQSIMIIGAGPIVIGQACEFDYSGAQACKALREEGYRVILVNSNPATIMTDPGL 60

Query: 61   ADATYIEPIHWEVVRKIIEKERPDAVLPTMGGQTALNCALELERQGVLEEFGVTMIGATA 120
            ADATYIEPI  EVV KIIEKERPDA+LPTMGGQT LN +L LE  GVLE+F V MIGAT 
Sbjct: 61   ADATYIEPITPEVVAKIIEKERPDALLPTMGGQTGLNTSLALEEMGVLEKFDVEMIGATR 120

Query: 121  DAIDKAEDRRRFDVAMKKIGLETARSGIAHTMEE-------------ALAVAADVGFPCI 167
            DAI+ AEDR+ F  AM ++G+E  R+ I    ++             AL    D+G P I
Sbjct: 121  DAIEMAEDRKLFREAMDRLGIENPRASIVTAPKKDNGDADLDEGIRLALQELEDIGLPAI 180

Query: 168  IRPSFTMGGSGGGIAYNREEFEEICARGLDLSPTKELLIDESLIGWKEYEMEVVRDKNDN 227
            IRP+FTMGG+GGG+AYNRE++   C  G+D SP  ++L+DESL+GWKEYEMEVVRDK DN
Sbjct: 181  IRPAFTMGGTGGGVAYNREDYIHYCRSGMDASPVNQILVDESLLGWKEYEMEVVRDKADN 240

Query: 228  CIIVCSIENFDAMGIHTGDSITVAPAQTLTDKEYQIMRNASMAVLREIGVETGGSNVQFA 287
             IIVCSIEN D MG+HTGDSITVAPA TLTDKEYQIMR  S+ VLREIGVETGGSNVQ+A
Sbjct: 241  AIIVCSIENVDPMGVHTGDSITVAPALTLTDKEYQIMRTHSINVLREIGVETGGSNVQWA 300

Query: 288  VNPKNGRLIVIEMNPRVSRSSALASKATGFPIAKVAAKLAVGYTLDELMNDITGGRTPAS 347
            VNP +GR++VIEMNPRVSRSSALASKATGFPIAK+AAKLAVGYTLDEL NDIT   TPAS
Sbjct: 301  VNPADGRMVVIEMNPRVSRSSALASKATGFPIAKIAAKLAVGYTLDELDNDITKV-TPAS 359

Query: 348  FEPSIDYVVTKIPRFNFEKFAGANDRLTTQMKSVGEVMAIGRTQQESLQKALRGLEVGAT 407
            FEPSIDYVVTKIP+F FEKF G+   LTT MKSVGE M+IGRT  ESLQKAL  +E G T
Sbjct: 360  FEPSIDYVVTKIPKFAFEKFPGSEPYLTTAMKSVGEAMSIGRTIHESLQKALASMESGLT 419

Query: 408  GFD----PKVSL---DDPEALTK--IRRELKDAGADRIWYIADAFRAGLSVDGVFNLTNI 458
            GFD    P V++   +D  A  K  + + +     DR+  IA A R GL+ D + N+T  
Sbjct: 420  GFDEIAIPGVTVGLWEDAAATDKAAVIKAISQTTPDRMRTIAQAMRHGLTNDEINNVTAF 479

Query: 459  DRWFLVQIEELVRLEEKVAEVGITGLNADFLRQLKRKGFADARLAKLAGVREAEIRKLRD 518
            D WFL +I E+V +E ++ + G+  +  D LR +K  GF DARL  L G  E  +R+ R 
Sbjct: 480  DPWFLDRIREIVDMEREIRKNGLP-VREDELRAVKMLGFTDARLGALTGRDEDNVRRARH 538

Query: 519  QYDLHPVYKRVDTCAAEFATDTAYMYSTYEE------ECEANPSTDREKIMVLGGGPNRI 572
               +  V+KR+DTCAAEF   T YMYSTYE       ECEA PS DR+K+++LGGGPNRI
Sbjct: 539  NLGVKAVFKRIDTCAAEFEAQTPYMYSTYETPMMGEAECEARPS-DRKKVVILGGGPNRI 597

Query: 573  GQGIEFDYCCVHASLALREDGYETIMVNCNPETVSTDYDTSDRLYFEPVTLEDVLEIVRI 632
            GQGIEFDYCC HA  AL + GYETIMVNCNPETVSTDYDTSDRLYFEP+TLE V+EI+R+
Sbjct: 598  GQGIEFDYCCCHACYALTDAGYETIMVNCNPETVSTDYDTSDRLYFEPLTLEHVMEILRV 657

Query: 633  EKPKG----VIVQYGGQTPLKLARALEAAGVPVIGTSPDAIDRAEDRERFQHAVERLKLK 688
            E+ KG    VIVQ+GGQTPLKLA ALE+ G+P++GTSPDAID AEDRERFQ  V  L LK
Sbjct: 658  EQEKGTLHGVIVQFGGQTPLKLANALESEGIPILGTSPDAIDLAEDRERFQALVNELGLK 717

Query: 689  QPANATVTAIEMAVEKAKEIGYPLVVRPSYVLGGRAMEIVYDEADLRRYFQTAVSVSNDA 748
            QP N   +    A++ A++IG+PLV+RPSYVLGGRAMEIV D   L+RY + AV VS D+
Sbjct: 718  QPKNGIASTDAQAIDIAEKIGFPLVIRPSYVLGGRAMEIVRDMDQLKRYIKEAVVVSGDS 777

Query: 749  PVLLDHFLDDAVEVDVDAICDGEMVLIGGIMEHIEQAGVHSGDSACSLPAYTLSQEIQDV 808
            PVLLD +L  AVE+DVDAICDG+ V + GIM+HIE+AGVHSGDSACSLP Y+LS+++ D 
Sbjct: 778  PVLLDSYLAGAVELDVDAICDGKDVHVAGIMQHIEEAGVHSGDSACSLPPYSLSKDVIDR 837

Query: 809  MRQQVQKLAFELQVRGLMNVQFAVKNNEVYLIEVNPRAARTVPFVSKATGVPLAKVAARV 868
            ++ Q   LA  L V GLMNVQFA+K++E+YLIEVNPRA+RTVPFV+KAT   +A ++ARV
Sbjct: 838  IKDQAHSLATALNVVGLMNVQFAIKDDEIYLIEVNPRASRTVPFVAKATDSAIASISARV 897

Query: 869  MAGKSLAEQGVTKEVIP---------------------PYYSVKEVVLPFNKFPGVDPLL 907
            MAG+ L+   +     P                     P++SVKE VLPF +FPGVD +L
Sbjct: 898  MAGEPLSNFPMRAPYGPDAGYDVNTPIADPMTLADPDMPWFSVKEAVLPFARFPGVDTIL 957

Query: 908  GPEMRSTGEVMGVGRTFAEAFAKAQLGSNSTMKKHGRALLSVREGDK-ERVVDLAAKLLK 966
            GPEMRSTGEVMG  R+FA AF KAQ+G+   + + G A +S+++ DK +++ + A  L+ 
Sbjct: 958  GPEMRSTGEVMGWDRSFAGAFLKAQMGAGMVLPRKGCAFVSIKDSDKSDQMREAAQTLVD 1017

Query: 967  QGFELDATHGTAIVLGEAGINPRLVNKVHEGRPHIQDRIKNGEYTYIINTTSGRRAIEDS 1026
             GF L AT GT   L   GI   L NKV+EGRPH+ D +K+G+   ++NTT G +A+EDS
Sbjct: 1018 LGFTLVATQGTQAWLDGQGIPCGLTNKVYEGRPHVVDLLKDGKVQILMNTTEGTQAVEDS 1077

Query: 1027 RVIRRSALQYKVHYDTTLNGGFATAMALNADA 1058
            + +R  AL  ++ Y TT  G  A A+A+ A +
Sbjct: 1078 KEMRSVALYGRIPYFTTAAGAHAAALAIKAQS 1109


Lambda     K      H
   0.318    0.135    0.383 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 3137
Number of extensions: 157
Number of successful extensions: 22
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1073
Length of database: 1120
Length adjustment: 46
Effective length of query: 1027
Effective length of database: 1074
Effective search space:  1102998
Effective search space used:  1102998
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 58 (26.9 bits)
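
The summary figures in the BLAST report above can be re-derived from its own numbers. The following is a minimal Python sketch, not part of GapMind or BLAST: the helper names `parse_summary` and `effective_search_space` are hypothetical, and the search-space formula is the simple single-sequence case that matches this report (one database sequence, length adjustment 46).

```python
import re

# Summary line copied from the alignment report above.
SUMMARY = "Identities = 674/1112 (60%), Positives = 818/1112 (73%), Gaps = 57/1112 (5%)"

def parse_summary(line):
    """Return {name: (count, alignment_length)} for each 'Name = a/b' field."""
    fields = re.findall(r"(\w+) = (\d+)/(\d+)", line)
    return {name.lower(): (int(a), int(b)) for name, a, b in fields}

def effective_search_space(query_len, db_len, length_adjustment):
    """Product of query and database lengths after subtracting the adjustment.

    Assumes a database holding a single sequence, as in this report.
    """
    return (query_len - length_adjustment) * (db_len - length_adjustment)

stats = parse_summary(SUMMARY)
identical, aligned = stats["identities"]
percent_identity = 100.0 * identical / aligned   # fraction of aligned columns that match
query_coverage = (1058 - 1 + 1) / 1073           # query positions 1..1058 of 1073 letters
space = effective_search_space(1073, 1120, 46)   # lengths and adjustment from the report

print(f"identity {percent_identity:.1f}%, query coverage {query_coverage:.3f}, search space {space}")
```

With the lengths reported above, the product (1073 - 46) x (1120 - 46) reproduces the "Effective search space" value of 1102998.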

Align candidate WP_090215936.1 CV091_RS06665 (carbamoyl-phosphate synthase large subunit)
to HMM TIGR01369 (carB: carbamoyl-phosphate synthase, large subunit (EC 6.3.5.5))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/TIGR01369.hmm
# target sequence database:        /tmp/gapView.3108060.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01369  [M=1052]
Accession:   TIGR01369
Description: CPSaseII_lrg: carbamoyl-phosphate synthase, large subunit
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
          0 1438.2   0.0          0 1438.0   0.0    1.0  1  NCBI__GCF_002796795.1:WP_090215936.1  


Domain annotation for each sequence (and alignments):
>> NCBI__GCF_002796795.1:WP_090215936.1  
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 ! 1438.0   0.0         0         0       1    1050 [.       2    1103 ..       2    1105 .. 0.95

  Alignments for each domain:
  == domain 1  score: 1438.0 bits;  conditional E-value: 0
                             TIGR01369    1 pkredikkvlviGsGpivigqAaEFDYsGsqalkalkeegievvLvnsniAtvmtdeeladkvYiePltve 71  
                                            pkr+di+++++iG+GpivigqA+EFDYsG+qa+kal+eeg++v+Lvnsn+At+mtd+ lad++YieP+t+e
  NCBI__GCF_002796795.1:WP_090215936.1    2 PKRTDIQSIMIIGAGPIVIGQACEFDYSGAQACKALREEGYRVILVNSNPATIMTDPGLADATYIEPITPE 72  
                                            689******************************************************************** PP

                             TIGR01369   72 avekiiekErpDailltlGGqtaLnlaveleekGvLekygvkllGtkveaikkaedRekFkealkeineev 142 
                                            +v+kiiekErpDa+l+t+GGqt+Ln ++ lee+GvLek++v+++G++ +ai+ aedR++F+ea++ +++e 
  NCBI__GCF_002796795.1:WP_090215936.1   73 VVAKIIEKERPDALLPTMGGQTGLNTSLALEEMGVLEKFDVEMIGATRDAIEMAEDRKLFREAMDRLGIEN 143 
                                            *********************************************************************** PP

                             TIGR01369  143 akseivesvee.............aleaaeeigyPvivRaaftlgGtGsgiaeneeelkelvekalkaspi 200 
                                            ++++iv+  ++             al+  e+ig+P i+R+aft+gGtG+g+a+n+e+  + ++++++asp+
  NCBI__GCF_002796795.1:WP_090215936.1  144 PRASIVTAPKKdngdadldegirlALQELEDIGLPAIIRPAFTMGGTGGGVAYNREDYIHYCRSGMDASPV 214 
                                            **9999754321122222223333677789***************************************** PP

                             TIGR01369  201 kqvlvekslagwkEiEyEvvRDskdnciivcniEnlDplGvHtGdsivvaPsqtLtdkeyqllRdaslkii 271 
                                            +q+lv++sl gwkE+E+EvvRD++dn+iivc+iEn+Dp+GvHtGdsi+vaP+ tLtdkeyq++R+ s++++
  NCBI__GCF_002796795.1:WP_090215936.1  215 NQILVDESLLGWKEYEMEVVRDKADNAIIVCSIENVDPMGVHTGDSITVAPALTLTDKEYQIMRTHSINVL 285 
                                            *********************************************************************** PP

                             TIGR01369  272 relgvege.cnvqfaldPeskryvviEvnpRvsRssALAskAtGyPiAkvaaklavGysLdelkndvtket 341 
                                            re+gve++ +nvq+a++P + r+vviE+npRvsRssALAskAtG+PiAk+aaklavGy+Ldel nd+tk t
  NCBI__GCF_002796795.1:WP_090215936.1  286 REIGVETGgSNVQWAVNPADGRMVVIEMNPRVSRSSALASKATGFPIAKIAAKLAVGYTLDELDNDITKVT 356 
                                            ******988************************************************************** PP

                             TIGR01369  342 vAsfEPslDYvvvkiPrwdldkfekvdrklgtqmksvGEvmaigrtfeealqkalrsleekllglkl.... 408 
                                            +AsfEPs+DYvv+kiP+++++kf + +  l+t mksvGE m+igrt++e+lqkal+s+e++l+g+++    
  NCBI__GCF_002796795.1:WP_090215936.1  357 PASFEPSIDYVVTKIPKFAFEKFPGSEPYLTTAMKSVGEAMSIGRTIHESLQKALASMESGLTGFDEiaip 427 
                                            **************************************************************995532222 PP

                             TIGR01369  409 .......kekeaesdeeleealkkpndrRlfaiaealrrgvsveevyeltkidrffleklkklvelekele 472 
                                                   ++++a+++ ++ +a+ +++++R+ +ia+a+r+g++ +e++++t +d +fl++++++v++e+e++
  NCBI__GCF_002796795.1:WP_090215936.1  428 gvtvglwEDAAATDKAAVIKAISQTTPDRMRTIAQAMRHGLTNDEINNVTAFDPWFLDRIREIVDMEREIR 498 
                                            2222223567788889999***************************************************9 PP

                             TIGR01369  473 eeklkelkkellkkakklGfsdeqiaklvkvseaevrklrkelgivpvvkrvDtvaaEfeaktpYlYstye 543 
                                            ++ l  +++++l+ +k lGf+d+++++l++ +e++vr++r++lg+  v+kr+Dt+aaEfea+tpY+Ystye
  NCBI__GCF_002796795.1:WP_090215936.1  499 KNGLP-VREDELRAVKMLGFTDARLGALTGRDEDNVRRARHNLGVKAVFKRIDTCAAEFEAQTPYMYSTYE 568 
                                            77776.9***************************************************************9 PP

                             TIGR01369  544 ee.....kddvevtekkkvlvlGsGpiRigqgvEFDycavhavlalreagyktilinynPEtvstDydiad 609 
                                            +      + +++ +++kkv++lG+Gp+Rigqg+EFDyc+ ha+ al +agy+ti++n+nPEtvstDyd++d
  NCBI__GCF_002796795.1:WP_090215936.1  569 TPmmgeaECEARPSDRKKVVILGGGPNRIGQGIEFDYCCCHACYALTDAGYETIMVNCNPETVSTDYDTSD 639 
                                            76444334566677889****************************************************** PP

                             TIGR01369  610 rLyFeeltvedvldiiekek....vegvivqlgGqtalnlakeleeagvkilGtsaesidraEdRekFskl 676 
                                            rLyFe+lt+e+v++i++ e+     +gvivq+gGqt+l+la++le++g++ilGts+++id aEdRe+F++l
  NCBI__GCF_002796795.1:WP_090215936.1  640 RLYFEPLTLEHVMEILRVEQekgtLHGVIVQFGGQTPLKLANALESEGIPILGTSPDAIDLAEDRERFQAL 710 
                                            ***************998872233568******************************************** PP

                             TIGR01369  677 ldelgikqpkgkeatsveeakeiakeigyPvlvRpsyvlgGrameiveneeeleryleeavevskekPvli 747 
                                            ++elg+kqpk+ +a++  +a +ia++ig+P+++RpsyvlgGrameiv+++++l+ry++eav vs ++Pvl+
  NCBI__GCF_002796795.1:WP_090215936.1  711 VNELGLKQPKNGIASTDAQAIDIAEKIGFPLVIRPSYVLGGRAMEIVRDMDQLKRYIKEAVVVSGDSPVLL 781 
                                            *********************************************************************** PP

                             TIGR01369  748 dkyledavEvdvDavadgeevliagileHiEeaGvHsGDstlvlppqklseevkkkikeivkkiakelkvk 818 
                                            d yl  avE+dvDa++dg++v +agi++HiEeaGvHsGDs+++lpp +ls++v ++ik++++++a++l+v+
  NCBI__GCF_002796795.1:WP_090215936.1  782 DSYLAGAVELDVDAICDGKDVHVAGIMQHIEEAGVHSGDSACSLPPYSLSKDVIDRIKDQAHSLATALNVV 852 
                                            *********************************************************************** PP

                             TIGR01369  819 GllniqfvvkdeevyviEvnvRasRtvPfvskalgvplvklavkvllgkkleele................ 873 
                                            Gl+n+qf++kd+e+y+iEvn+RasRtvPfv+ka++  +++++++v++g+ l++                  
  NCBI__GCF_002796795.1:WP_090215936.1  853 GLMNVQFAIKDDEIYLIEVNPRASRTVPFVAKATDSAIASISARVMAGEPLSNFPmrapygpdagydvntp 923 
                                            *****************************************************9999************99 PP

                             TIGR01369  874 ...kgvkkekksklvavkaavfsfsklagvdvvlgpemkstGEvmgigrdleeallkallaskakikkkgs 941 
                                                    ++++  ++vk+av++f+++ gvd +lgpem+stGEvmg +r+++ a+lka++ +++++++kg 
  NCBI__GCF_002796795.1:WP_090215936.1  924 iadPMTLADPDMPWFSVKEAVLPFARFPGVDTILGPEMRSTGEVMGWDRSFAGAFLKAQMGAGMVLPRKGC 994 
                                            9985567889999********************************************************** PP

                             TIGR01369  942 vllsvkdkdk.eellelakklaekglkvyategtakvleeagikaevvlkvseeaekilellkeeeielvi 1011
                                            +++s+kd+dk +++ e+a++l+++g++++at+gt++ l  +gi + + +kv e ++++++llk++++++++
  NCBI__GCF_002796795.1:WP_090215936.1  995 AFVSIKDSDKsDQMREAAQTLVDLGFTLVATQGTQAWLDGQGIPCGLTNKVYEGRPHVVDLLKDGKVQILM 1065
                                            *******999678999******************************************************* PP

                             TIGR01369 1012 nltskkkkaaekgykirreaveykvplvteletaealle 1050
                                            n+t+ +++a+e+++ +r  a+  ++p++t++++a+a++ 
  NCBI__GCF_002796795.1:WP_090215936.1 1066 NTTE-GTQAVEDSKEMRSVALYGRIPYFTTAAGAHAAAL 1103
                                            *997.78899999******************99988765 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (1052 nodes)
Target sequences:                          1  (1120 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.03u 0.03s 00:00:00.06 Elapsed: 00:00:00.06
# Mc/sec: 18.28
//
[ok]
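
How much of the HMM and of the candidate protein the domain hit covers can be read off the domain table row above. The following is a rough Python sketch under stated assumptions: the function `parse_domain_row` is hypothetical, the column order is taken from the table header shown in this report, and the model length (1052) and protein length (1120) are copied from the output. Which coverage threshold to compare against is left to the reader.

```python
# Domain table row copied from the hmmsearch output above.
ROW = "   1 ! 1438.0   0.0         0         0       1    1050 [.       2    1103 ..       2    1105 .. 0.95"

def parse_domain_row(row):
    """Split a per-domain row into named fields, following the header order above."""
    t = row.split()
    return {
        "score": float(t[2]),
        "hmm_from": int(t[6]), "hmm_to": int(t[7]),
        "ali_from": int(t[9]), "ali_to": int(t[10]),
        "env_from": int(t[12]), "env_to": int(t[13]),
        "acc": float(t[15]),
    }

MODEL_LENGTH = 1052    # "[M=1052]" in the Query line
PROTEIN_LENGTH = 1120  # "1120 residues searched"

dom = parse_domain_row(ROW)
# Coordinates are 1-based and inclusive, hence the +1.
model_coverage = (dom["hmm_to"] - dom["hmm_from"] + 1) / MODEL_LENGTH
protein_coverage = (dom["ali_to"] - dom["ali_from"] + 1) / PROTEIN_LENGTH

print(f"model coverage {model_coverage:.3f}, protein coverage {protein_coverage:.3f}")
```

For this hit, the aligned region spans 1050 of the 1052 model positions and 1102 of the 1120 residues of the candidate.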

This GapMind analysis is from Jul 26 2024. The underlying query database was built on Jul 25 2024.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory