GapMind for Amino acid biosynthesis

 

Alignments for a candidate for carB in Rhodospirillum rubrum ATCC 11170

Align carbamoyl-phosphate synthase (glutamine-hydrolysing) (EC 6.3.5.5) (characterized)
to candidate WP_011390636.1 RRU_RS14890 carbamoyl-phosphate synthase large subunit

Query= BRENDA::P00968
         (1073 letters)



>NCBI__GCF_000013085.1:WP_011390636.1
          Length = 1082

 Score = 1292 bits (3344), Expect = 0.0
 Identities = 675/1072 (62%), Positives = 819/1072 (76%), Gaps = 24/1072 (2%)

Query: 1    MPKRTDIKSILILGAGPIVIGQACEFDYSGAQACKALREEGYRVILVNSNPATIMTDPEM 60
            MPKRTDIKSILI+GAGPIVIGQACEFDYSGAQACKALR EGYRVILVNSNPATIMTDPE 
Sbjct: 1    MPKRTDIKSILIIGAGPIVIGQACEFDYSGAQACKALRAEGYRVILVNSNPATIMTDPET 60

Query: 61   ADATYIEPIHWEVVRKIIEKERPDAVLPTMGGQTALNCALELERQGVLEEFGVTMIGATA 120
            ADATYIEPI  E+V  IIE+ERPDA+LPTMGGQTALN A+ L  +GVL ++GV MI A  
Sbjct: 61   ADATYIEPITPEIVEAIIERERPDALLPTMGGQTALNTAMALADRGVLTKYGVEMIAANK 120

Query: 121  DAIDKAEDRRRFDVAMKKIGLETARSGIAHTMEEALAVAADVGFPCIIRPSFTMGGSGGG 180
            + I KAEDR  F  AM+KIGL+  RS + H++EE+     ++G P IIRPSFT+GG GGG
Sbjct: 121  EVIAKAEDRLLFRDAMRKIGLDCPRSALVHSIEESRQALEEIGLPVIIRPSFTLGGQGGG 180

Query: 181  IAYNREEFEEICARGLDLSPTKELLIDESLIGWKEYEMEVVRDKNDNCIIVCSIENFDAM 240
            +A+NREE++ I A GL  SP +++L++ES++GWKEYEMEVVRD+ DNCIIVCSIEN D M
Sbjct: 181  MAFNREEYDRIVASGLAASPVRQILVEESVLGWKEYEMEVVRDRADNCIIVCSIENIDPM 240

Query: 241  GIHTGDSITVAPAQTLTDKEYQIMRNASMAVLREIGVETGGSNVQFAVNPKNGRLIVIEM 300
            G+HTGDSITVAPA TLTDKEYQ+MRNAS+A LREIGVETGGSNVQFAVNPK+GRL+VIEM
Sbjct: 241  GVHTGDSITVAPALTLTDKEYQVMRNASIACLREIGVETGGSNVQFAVNPKDGRLVVIEM 300

Query: 301  NPRVSRSSALASKATGFPIAKVAAKLAVGYTLDELMNDITGGRTPASFEPSIDYVVTKIP 360
            NPRVSRSSALASKATGFPIAK+AAKLAVGYTLDEL NDIT G TPASFEP+IDYVVTK+P
Sbjct: 301  NPRVSRSSALASKATGFPIAKIAAKLAVGYTLDELSNDIT-GVTPASFEPTIDYVVTKLP 359

Query: 361  RFNFEKFAGANDRLTTQMKSVGEVMAIGRTQQESLQKALRGLEVGATGFDP-----KVSL 415
            RF FEKF      L++ MKSVGE MAIGRT +ESLQK LR LE+G  G D          
Sbjct: 360  RFTFEKFPDTEALLSSSMKSVGEAMAIGRTFKESLQKGLRSLEIGLDGLDEVEIPGSAGQ 419

Query: 416  DDPEALTKIRRELKDAGADRIWYIADAFRAGLSVDGVFNLTNIDRWFLVQIEELVRLEEK 475
            D  +A   IR  L  A  DRI  IA A R G +V+ V  +   D WFL QI+E+V  E +
Sbjct: 420  DGKDA---IRAALSKARPDRILIIAQALRQGFTVEEVRAICYYDPWFLEQIKEIVDEERR 476

Query: 476  VAEVGITGLNADFLRQLKRKGFADARLAKLAGVREAEIRKLRDQYDLHPVYKRVDTCAAE 535
            + E G+ G +A  L ++K+ GF+DARLAKL G    E+   R   ++HPVYKR+DTCAAE
Sbjct: 477  LRENGLPG-DAVSLHRVKKMGFSDARLAKLTGKTVTEVSFRRQVLNVHPVYKRIDTCAAE 535

Query: 536  FATDTAYMYSTYE------EECEANPSTDREKIMVLGGGPNRIGQGIEFDYCCVHASLAL 589
            FA+ T YMYS YE       ECEA  S DR KI++LGGGPNRIGQGIEFDYCCVHA+ AL
Sbjct: 536  FASRTPYMYSCYEGDGLTPAECEAEVS-DRTKIIILGGGPNRIGQGIEFDYCCVHAAYAL 594

Query: 590  REDGYETIMVNCNPETVSTDYDTSDRLYFEPVTLEDVLEIVRIEKPKGV----IVQYGGQ 645
             + G+ETIMVNCNPETVSTDYDTSDRLYFEP+T+EDV+E+ R E+ +G     IVQYGGQ
Sbjct: 595  SDAGFETIMVNCNPETVSTDYDTSDRLYFEPLTIEDVVELARKEQARGTLLGCIVQYGGQ 654

Query: 646  TPLKLARALEAAGVPVIGTSPDAIDRAEDRERFQHAVERLKLKQPANATVTAIEMAVEKA 705
            TPLKLAR LEAAG+PV+GTSPDAID AEDR+RFQ  + +L L+QP N T  ++E A   A
Sbjct: 655  TPLKLARGLEAAGIPVLGTSPDAIDLAEDRDRFQKLIAKLALRQPRNGTALSVEQARAIA 714

Query: 706  KEIGYPLVVRPSYVLGGRAMEIVYDEADLRRYFQTAVSVSNDAPVLLDHFLDDAVEVDVD 765
              +GYP+V+RPSYVLGGRAM+IV+DEA L  Y   AV VS D PVL+D++L  A+EVDVD
Sbjct: 715  TRVGYPVVIRPSYVLGGRAMQIVHDEAQLNDYMVNAVKVSGDDPVLIDNYLSGAIEVDVD 774

Query: 766  AICDGEMVLIGGIMEHIEQAGVHSGDSACSLPAYTLSQEIQDVMRQQVQKLAFELQVRGL 825
            AI DGE   I GIM+HIE+AG+HSGDSACSLP Y+L +     + +Q + LA  L VRGL
Sbjct: 775  AIADGETTHIAGIMQHIEEAGIHSGDSACSLPPYSLDEATIAELTKQTEALAKGLNVRGL 834

Query: 826  MNVQFAVKNNEVYLIEVNPRAARTVPFVSKATGVPLAKVAARVMAGKSLAEQG-VTKEVI 884
            MN+QFA+K+ ++Y++EVNPRA+RTVPFV+KATGV +AK+AARVMAG+SLA  G VTK + 
Sbjct: 835  MNIQFAIKDGDIYILEVNPRASRTVPFVAKATGVAVAKIAARVMAGESLASFGLVTKRL- 893

Query: 885  PPYYSVKEVVLPFNKFPGVDPLLGPEMRSTGEVMGVGRTFAEAFAKAQLGSNSTMKKHGR 944
              + +VKE V PF +FPGVD +LGPEM+STGEVMG+  TFA AFAK+QLG+  T+ + G 
Sbjct: 894  -AHVAVKEAVFPFARFPGVDIVLGPEMKSTGEVMGIDTTFARAFAKSQLGAGVTLPEGGT 952

Query: 945  ALLSVREGDKERVVDLAAKLLKQGFELDATHGTAIVLGEAGINPRLVNKVHEGRPHIQDR 1004
            A +SVR+GDK  ++ +A +L + GF L AT GTA +L E G++  ++NKV EGRPH  D 
Sbjct: 953  AFISVRDGDKAAIMPIARELTELGFRLVATRGTAALLAENGLSVEVINKVLEGRPHCVDA 1012

Query: 1005 IKNGEYTYIINTTSGRRAIEDSRVIRRSALQYKVHYDTTLNGGFATAMALNA 1056
            + +G+   + NTT G ++ +DS  IR +AL   + + TT+ G  A   A+ A
Sbjct: 1013 MISGDIHLVFNTTEGIQSQKDSFDIRHTALMRNIPHYTTVAGATAAVKAMTA 1064


Lambda     K      H
   0.318    0.135    0.383 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 3102
Number of extensions: 153
Number of successful extensions: 17
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1073
Length of database: 1082
Length adjustment: 46
Effective length of query: 1027
Effective length of database: 1036
Effective search space:  1063972
Effective search space used:  1063972
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 58 (26.9 bits)
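
As a sanity check on the BLAST summary above, the reported percentages, effective lengths, and search space can be recomputed from the raw counts. The following is a minimal sketch and not part of GapMind; the variable names are illustrative. The bit score uses the standard Karlin-Altschul conversion and only approximately reproduces the reported 1292 bits, because Lambda and K are printed in rounded form.

```python
import math

# Values copied from the BLAST report above.
identities, positives, gaps, aligned_columns = 675, 819, 24, 1072
query_length, subject_length, length_adjustment = 1073, 1082, 46
raw_score = 3344
gapped_lambda, gapped_k = 0.267, 0.0410  # rounded, as printed in the report


def percent(count, total):
    """Integer percentage by truncation, which matches the figures in this report."""
    return count * 100 // total


pct_identity = percent(identities, aligned_columns)
pct_positives = percent(positives, aligned_columns)
pct_gaps = percent(gaps, aligned_columns)

# Effective lengths subtract the length adjustment from each sequence length.
effective_query = query_length - length_adjustment
effective_subject = subject_length - length_adjustment
search_space = effective_query * effective_subject

# Karlin-Altschul bit score: (lambda * S - ln K) / ln 2 (approximate here).
bit_score = (gapped_lambda * raw_score - math.log(gapped_k)) / math.log(2)

print(f"identity {pct_identity}%, positives {pct_positives}%, gaps {pct_gaps}%")
print(f"effective search space: {search_space}")
print(f"approximate bit score: {bit_score:.1f}")
```

The recomputed search space is the product of the two effective lengths, which agrees with the "Effective search space" line in the report.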

Align candidate WP_011390636.1 RRU_RS14890 (carbamoyl-phosphate synthase large subunit)
to HMM TIGR01369 (carB: carbamoyl-phosphate synthase, large subunit (EC 6.3.5.5))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/TIGR01369.hmm
# target sequence database:        /tmp/gapView.2704099.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01369  [M=1052]
Accession:   TIGR01369
Description: CPSaseII_lrg: carbamoyl-phosphate synthase, large subunit
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                             Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                             -----------
          0 1494.8   0.0          0 1494.6   0.0    1.0  1  NCBI__GCF_000013085.1:WP_011390636.1  


Domain annotation for each sequence (and alignments):
>> NCBI__GCF_000013085.1:WP_011390636.1  
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 ! 1494.6   0.0         0         0       1    1052 []       2    1062 ..       2    1062 .. 0.97

  Alignments for each domain:
  == domain 1  score: 1494.6 bits;  conditional E-value: 0
                             TIGR01369    1 pkredikkvlviGsGpivigqAaEFDYsGsqalkalkeegievvLvnsniAtvmtdeeladkvYiePltve 71  
                                            pkr+dik++l+iG+GpivigqA+EFDYsG+qa+kal+ eg++v+Lvnsn+At+mtd+e ad++YieP+t+e
  NCBI__GCF_000013085.1:WP_011390636.1    2 PKRTDIKSILIIGAGPIVIGQACEFDYSGAQACKALRAEGYRVILVNSNPATIMTDPETADATYIEPITPE 72  
                                            689******************************************************************** PP

                             TIGR01369   72 avekiiekErpDailltlGGqtaLnlaveleekGvLekygvkllGtkveaikkaedRekFkealkeineev 142 
                                            +ve iie+ErpDa+l+t+GGqtaLn a+ l ++GvL kygv+++ ++ e+i+kaedR +F++a+++i+++ 
  NCBI__GCF_000013085.1:WP_011390636.1   73 IVEAIIERERPDALLPTMGGQTALNTAMALADRGVLTKYGVEMIAANKEVIAKAEDRLLFRDAMRKIGLDC 143 
                                            *********************************************************************** PP

                             TIGR01369  143 akseivesveealeaaeeigyPvivRaaftlgGtGsgiaeneeelkelvekalkaspikqvlvekslagwk 213 
                                            ++s+ v+s+ee+ +a eeig+Pvi+R++ftlgG+G+g+a n+ee  ++v+++l+asp++q+lve+s+ gwk
  NCBI__GCF_000013085.1:WP_011390636.1  144 PRSALVHSIEESRQALEEIGLPVIIRPSFTLGGQGGGMAFNREEYDRIVASGLAASPVRQILVEESVLGWK 214 
                                            *********************************************************************** PP

                             TIGR01369  214 EiEyEvvRDskdnciivcniEnlDplGvHtGdsivvaPsqtLtdkeyqllRdaslkiirelgvege.cnvq 283 
                                            E+E+EvvRD++dnciivc+iEn+Dp+GvHtGdsi+vaP+ tLtdkeyq +R+as++ +re+gve++ +nvq
  NCBI__GCF_000013085.1:WP_011390636.1  215 EYEMEVVRDRADNCIIVCSIENIDPMGVHTGDSITVAPALTLTDKEYQVMRNASIACLREIGVETGgSNVQ 285 
                                            ****************************************************************988**** PP

                             TIGR01369  284 faldPeskryvviEvnpRvsRssALAskAtGyPiAkvaaklavGysLdelkndvtketvAsfEPslDYvvv 354 
                                            fa++P++ r+vviE+npRvsRssALAskAtG+PiAk+aaklavGy+Ldel+nd+t+ t+AsfEP++DYvv+
  NCBI__GCF_000013085.1:WP_011390636.1  286 FAVNPKDGRLVVIEMNPRVSRSSALASKATGFPIAKIAAKLAVGYTLDELSNDITGVTPASFEPTIDYVVT 356 
                                            *********************************************************************** PP

                             TIGR01369  355 kiPrwdldkfekvdrklgtqmksvGEvmaigrtfeealqkalrsleekllg...lklkeke.aesdeelee 421 
                                            k+Pr+ ++kf  ++  l+++mksvGE maigrtf+e+lqk+lrsle++l g   ++++ ++ +  +++++ 
  NCBI__GCF_000013085.1:WP_011390636.1  357 KLPRFTFEKFPDTEALLSSSMKSVGEAMAIGRTFKESLQKGLRSLEIGLDGldeVEIPGSAgQDGKDAIRA 427 
                                            **************************************************944444444441344566788 PP

                             TIGR01369  422 alkkpndrRlfaiaealrrgvsveevyeltkidrffleklkklvelekeleeeklkelkkellkkakklGf 492 
                                            al k+ ++R+++ia+alr+g++veev  ++ +d +fle++k++v+ e++l+e+ l   ++ +l+++kk+Gf
  NCBI__GCF_000013085.1:WP_011390636.1  428 ALSKARPDRILIIAQALRQGFTVEEVRAICYYDPWFLEQIKEIVDEERRLRENGLP-GDAVSLHRVKKMGF 497 
                                            999***********************************************987777.89999********* PP

                             TIGR01369  493 sdeqiaklvkvseaevrklrkelgivpvvkrvDtvaaEfeaktpYlYstyeee.....kddvevtekkkvl 558 
                                            sd+++akl++++ +ev   r+ l+++pv+kr+Dt+aaEf ++tpY+Ys ye++     + ++ev+++ k++
  NCBI__GCF_000013085.1:WP_011390636.1  498 SDARLAKLTGKTVTEVSFRRQVLNVHPVYKRIDTCAAEFASRTPYMYSCYEGDgltpaECEAEVSDRTKII 568 
                                            ***************************************************9988877779999999**** PP

                             TIGR01369  559 vlGsGpiRigqgvEFDycavhavlalreagyktilinynPEtvstDydiadrLyFeeltvedvldiiekek 629 
                                            +lG+Gp+Rigqg+EFDyc+vha+ al++ag++ti++n+nPEtvstDyd++drLyFe+lt+edv+++ +ke+
  NCBI__GCF_000013085.1:WP_011390636.1  569 ILGGGPNRIGQGIEFDYCCVHAAYALSDAGFETIMVNCNPETVSTDYDTSDRLYFEPLTIEDVVELARKEQ 639 
                                            *********************************************************************** PP

                             TIGR01369  630 ve....gvivqlgGqtalnlakeleeagvkilGtsaesidraEdRekFsklldelgikqpkgkeatsveea 696 
                                             +    g+ivq+gGqt+l+la+ le+ag+++lGts+++id aEdR++F+kl+ +l + qp++ +a sve+a
  NCBI__GCF_000013085.1:WP_011390636.1  640 ARgtllGCIVQYGGQTPLKLARGLEAAGIPVLGTSPDAIDLAEDRDRFQKLIAKLALRQPRNGTALSVEQA 710 
                                            98333358*************************************************************** PP

                             TIGR01369  697 keiakeigyPvlvRpsyvlgGrameiveneeeleryleeavevskekPvlidkyledavEvdvDavadgee 767 
                                            + ia+++gyPv++RpsyvlgGram+iv++e++l++y+ +av+vs + Pvlid+yl+ a+EvdvDa+adge+
  NCBI__GCF_000013085.1:WP_011390636.1  711 RAIATRVGYPVVIRPSYVLGGRAMQIVHDEAQLNDYMVNAVKVSGDDPVLIDNYLSGAIEVDVDAIADGET 781 
                                            *********************************************************************** PP

                             TIGR01369  768 vliagileHiEeaGvHsGDstlvlppqklseevkkkikeivkkiakelkvkGllniqfvvkdeevyviEvn 838 
                                              iagi++HiEeaG+HsGDs+++lpp +l+e +  +++++++++ak l+v+Gl+niqf++kd+++y++Evn
  NCBI__GCF_000013085.1:WP_011390636.1  782 THIAGIMQHIEEAGIHSGDSACSLPPYSLDEATIAELTKQTEALAKGLNVRGLMNIQFAIKDGDIYILEVN 852 
                                            *********************************************************************** PP

                             TIGR01369  839 vRasRtvPfvskalgvplvklavkvllgkkleelekgvkkekksklvavkaavfsfsklagvdvvlgpemk 909 
                                            +RasRtvPfv+ka+gv ++k+a++v++g++l++ +     +k+  +vavk+avf+f+++ gvd+vlgpemk
  NCBI__GCF_000013085.1:WP_011390636.1  853 PRASRTVPFVAKATGVAVAKIAARVMAGESLASFGL---VTKRLAHVAVKEAVFPFARFPGVDIVLGPEMK 920 
                                            *********************************886...666778************************** PP

                             TIGR01369  910 stGEvmgigrdleeallkallaskakikkkgsvllsvkdkdkeellelakklaekglkvyategtakvlee 980 
                                            stGEvmgi++++++a++k++l ++ ++++ g++++sv+d dk +++++a++l+e+g++++at+gta+ l+e
  NCBI__GCF_000013085.1:WP_011390636.1  921 STGEVMGIDTTFARAFAKSQLGAGVTLPEGGTAFISVRDGDKAAIMPIARELTELGFRLVATRGTAALLAE 991 
                                            *********************************************************************** PP

                             TIGR01369  981 agikaevvlkvseeaekilellkeeeielvinltskkkkaaekgykirreaveykvplvteletaeallea 1051
                                            +g ++ev++kv e ++++++++ +++i+lv+n+t+ + +++++++ ir++a+++++p+ t++++a+a+++a
  NCBI__GCF_000013085.1:WP_011390636.1  992 NGLSVEVINKVLEGRPHCVDAMISGDIHLVFNTTE-GIQSQKDSFDIRHTALMRNIPHYTTVAGATAAVKA 1061
                                            ********************************997.888999***********************999988 PP

                             TIGR01369 1052 l 1052
                                            +
  NCBI__GCF_000013085.1:WP_011390636.1 1062 M 1062
                                            5 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (1052 nodes)
Target sequences:                          1  (1082 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.04u 0.08s 00:00:00.12 Elapsed: 00:00:00.11
# Mc/sec: 9.84
//
[ok]
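
The domain row in the hmmsearch output above shows how much of the model and of the candidate protein the alignment covers. The following is a minimal sketch, assuming the whitespace-delimited column order shown in the "Domain annotation" table; the function and field names are illustrative and are not part of HMMER or GapMind.

```python
# Domain row copied from the hmmsearch "Domain annotation" table above.
domain_row = (
    "   1 ! 1494.6   0.0         0         0       1    1052 []"
    "       2    1062 ..       2    1062 .. 0.97"
)
model_length = 1052   # M=1052 from the "Query:" line
target_length = 1082  # residues searched, from the pipeline summary


def parse_domain_row(row):
    """Split one whitespace-delimited domain row into named fields."""
    fields = row.split()
    return {
        "score": float(fields[2]),
        "hmm_from": int(fields[6]),
        "hmm_to": int(fields[7]),
        "ali_from": int(fields[9]),
        "ali_to": int(fields[10]),
        "env_from": int(fields[12]),
        "env_to": int(fields[13]),
        "acc": float(fields[15]),
    }


dom = parse_domain_row(domain_row)

# Coverage is the aligned span divided by the full length (inclusive coordinates).
model_coverage = (dom["hmm_to"] - dom["hmm_from"] + 1) / model_length
target_coverage = (dom["ali_to"] - dom["ali_from"] + 1) / target_length

print(f"model coverage:  {model_coverage:.1%}")
print(f"target coverage: {target_coverage:.1%}")
```

Here the alignment spans model positions 1 to 1052, so the whole model is covered.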

This GapMind analysis is from Jul 25 2024. The underlying query database was built on Jul 25 2024.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFAMs). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
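
The gap rule stated earlier (a step may be considered a gap when it has no high- or medium-confidence candidate) can be sketched as follows. This is an illustration only: the step names and confidence labels are hypothetical placeholders, not results from a GapMind run, and `find_gaps` is not a GapMind function.

```python
# Hypothetical input: each step maps to the confidence labels of its candidates.
# These step names and labels are placeholders for illustration.
candidates_by_step = {
    "step_1": ["high"],
    "step_2": ["medium", "low"],
    "step_3": ["low"],
    "step_4": [],
}


def find_gaps(step_candidates):
    """Return the steps that have no high- or medium-confidence candidate."""
    return sorted(
        step
        for step, labels in step_candidates.items()
        if not any(label in ("high", "medium") for label in labels)
    )


gaps = find_gaps(candidates_by_step)
print(gaps)
```

A step with only low-confidence candidates is treated the same as a step with no candidates at all, which follows the definition in the text.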

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory