Align Carbamoyl-phosphate synthase large chain, chloroplastic; Carbamoyl-phosphate synthetase ammonia chain; Protein VENOSA 3; EC 6.3.5.5 (characterized)
to candidate 3609253 Dshi_2639 carbamoyl-phosphate synthase, large subunit (RefSeq)
Query= SwissProt::Q42601 (1187 letters) >FitnessBrowser__Dino:3609253 Length = 1105 Score = 1178 bits (3047), Expect = 0.0 Identities = 627/1120 (55%), Positives = 795/1120 (70%), Gaps = 57/1120 (5%) Query: 94 KRTDLKKIMILGAGPIVIGQACEFDYSGTQACKALREEGYEVILINSNPATIMTDPETAN 153 KRTD+K IMI+GAGPI+IGQACEFDYSG QACKALREEGY VIL+NSNPATIMTDP A+ Sbjct: 3 KRTDIKSIMIIGAGPIIIGQACEFDYSGAQACKALREEGYRVILVNSNPATIMTDPGLAD 62 Query: 154 RTYIAPMTPELVEQVIEKERPDALLPTMGGQTALNLAVALAESGALEKYGVELIGAKLGA 213 TYI P+TPE+V ++IEKERPDALLPTMGGQT LN ++AL E G L KYGVE+IGAK A Sbjct: 63 ATYIEPITPEIVAKIIEKERPDALLPTMGGQTGLNTSLALEEMGVLAKYGVEMIGAKREA 122 Query: 214 IKKAEDRELFKDAMKNIGLKTPPSGIGTTLDECFDIAEKIGEFPLIIRPAFTLGGTGGGI 273 I+ AEDR+LF++AM +G++ P + I TT+DEC + IG P IIRPAFTLGGTGGG+ Sbjct: 123 IEMAEDRKLFREAMDRLGIENPRATIATTMDECMAALDDIG-LPAIIRPAFTLGGTGGGV 181 Query: 274 AYNKEEFESICKSGLAASATSQVLVEKSLLGWKEYELEVMRDLADNVVIICSIENIDPMG 333 AYN++++E CKSGL AS +Q+L+++SLLGWKE+E+EV+RD ADN +I+C+IEN+DPMG Sbjct: 182 AYNRDDYEHFCKSGLDASPVNQILIDESLLGWKEFEMEVVRDKADNAIIVCAIENVDPMG 241 Query: 334 VHTGDSITVAPAQTLTDREYQRLRDYSIAIIREIGVECGGSNVQFAVNPVDGEVMIIEMN 393 VHTGDSITVAPA TLTD+EYQ +R+ SIA++REIGVE GGSNVQ+AVNP DG +++IEMN Sbjct: 242 VHTGDSITVAPALTLTDKEYQIMRNGSIAVLREIGVETGGSNVQWAVNPADGRMVVIEMN 301 Query: 394 PRVSRSSALASKATGFPIAKMAAKLSVGYTLDQIPNDITRKTPASFEPSIDYVVTKIPRF 453 PRVSRSSALASKATGFPIAK+AAKL+VGYTLD++ NDIT+ TPASFEP+IDYVVTKIPRF Sbjct: 302 PRVSRSSALASKATGFPIAKIAAKLAVGYTLDELDNDITKVTPASFEPTIDYVVTKIPRF 361 Query: 454 AFEKFPGSQPLLTTQMKSVGESMALGRTFQESFQKALRSLECGFSGWGCAKIKELDWDWD 513 AFEKFPG++P LTT MKSVGE+M++GRTF ES QKAL S+E G +G+ I + D Sbjct: 362 AFEKFPGAEPNLTTAMKSVGEAMSIGRTFHESVQKALASMETGLTGFDEIAIPGISADHR 421 Query: 514 Q-------LKYSLRVPNPDRIHAIYAAMKKGMKIDEIYELSMVDKWFLTQLKELVDVEQY 566 + +L PDR+ I AM+ G+ DEI + D WFL +++E+V+ E Sbjct: 422 SDAPDTAAVVKALARQTPDRLRVIAQAMRHGLSDDEIQAATSYDPWFLARIREIVETEAQ 481 Query: 567 LMSGTLSEITKEDLYEVKKRGFSDKQIAFATKTTEEEVRTKRISLGVVPSYKRVDTCAAE 626 + L + E L ++K GF+D ++A T E +VR R LGV +KR+DTCAAE Sbjct: 482 
VRRDGL-PLEAEGLRKLKMMGFTDARLAKLTGRDEGQVRRARTRLGVTAQFKRIDTCAAE 540 Query: 627 FEAHTPYMYSSY------DVECESAPNNKKKVLILGGGPNRIGQGIEFDYCCCHTSFALQ 680 FEA TPYMYS+Y + ECES P + KV+ILGGGPNRIGQGIEFDYCCCH FAL Sbjct: 541 FEAQTPYMYSTYETPVMGEAECESRPTDATKVVILGGGPNRIGQGIEFDYCCCHACFALT 600 Query: 681 DAGYETIMLNSNPETVSTDYDTSDRLYFEPLTIEDVLNVIDLEKPD----GIIVQFGGQT 736 +AGYETIM+N NPETVSTDYDTSDRLYFEPLT E V+ ++ E+ + G+IVQFGGQT Sbjct: 601 EAGYETIMVNCNPETVSTDYDTSDRLYFEPLTFEHVMEILRAEQENGTLHGVIVQFGGQT 660 Query: 737 PLKLALPIKHYLDKHMPMSLSGAGPVRIWGTSPDSIDAAEDRERFNAILDELKIEQPKGG 796 PLKLA ++ A + I GT+PD+ID AEDRERF A++++L ++QP Sbjct: 661 PLKLANALE-------------AEGIPILGTTPDAIDLAEDRERFQALVNDLGLKQPHNA 707 Query: 797 IAKSEADALAIAKEVGYPVVVRPSYVLGGRAMEIVYDDSRLITYLENAVQVDPERPVLVD 856 IA ++A+A A A ++G+P+V+RPSYVLGGRAMEIV D +L Y+ AV V + PVL+D Sbjct: 708 IASTDAEAFAAAGDIGFPLVIRPSYVLGGRAMEIVRDMGQLERYIAEAVVVSGDSPVLLD 767 Query: 857 KYLSDAIEIDVDTLTDSYGNVVIGGIMEHIEQAGVHSGDSACMLPTQTIPASCLQTIRTW 916 YL+ A+E+DVD L D NV + GIM+HIE+AGVHSGDSAC LP ++ L IR Sbjct: 768 SYLAGAVELDVDALCDG-ENVHVAGIMQHIEEAGVHSGDSACSLPPYSLSDDVLARIRVQ 826 Query: 917 TTKLAKKLNVCGLMNCQYAITTSGDVFLLEANPRASRTVPFVSKAIGHPLAKYAALVMSG 976 T LA+ L V GLMN Q+AI +++L+E NPRASRTVPFV+KA +A AA +M+G Sbjct: 827 TEALARALRVKGLMNVQFAI-KDDEIYLIEVNPRASRTVPFVAKATDSAIASIAARLMAG 885 Query: 977 KSLKDLNF---------EKEVIP------------KHVSVKEAVFPFEKFQGCDVILGPE 1015 + L + E + +P SVKEAV PF +F G D ILGPE Sbjct: 886 EPLSNFPLRDPLPHDAPEDQHLPIGDPMTLAHPDTPWFSVKEAVLPFARFPGVDTILGPE 945 Query: 1016 MRSTGEVMSISSEFSSAFAMAQIAAGQKLPLSGTVFLSLNDMTKPH-LEKIAVSFLELGF 1074 MRSTGEVM F AF AQ+ AG LP GTVFLS+ + K L + A ELG Sbjct: 946 MRSTGEVMGWDRSFPRAFLKAQMGAGTVLPTEGTVFLSIKEADKTEMLVETAAMLTELGL 1005 Query: 1075 KIVATSGTAHFLELKGIPVERVLKLHEGRPHAADMVANGQIHLMLITSSGDALDQKDGRQ 1134 IVAT GTA FL+ GI + V K++EGRP DM+ +G+I L++ T+ G A D R+ Sbjct: 1006 DIVATRGTAAFLKDHGIASKVVNKVYEGRPDVVDMLKDGRIALVMNTTEG-AQAVNDSRE 1064 Query: 1135 LRQMALAYKVPVITTVAGALATAEGIKSLKSSAIKMTALQ 1174 +R +AL ++P TT+A + A A+ + + + I + ALQ Sbjct: 1065 
IRSVALYDRIPYFTTLAASHAAAQAMIARREGEIGVRALQ 1104 Score = 230 bits (587), Expect = 4e-64 Identities = 151/417 (36%), Positives = 220/417 (52%), Gaps = 28/417 (6%) Query: 90 EIVGKRTDLKKIMILGAGPIVIGQACEFDYSGTQACKALREEGYEVILINSNPATIMTDP 149 E + TD K++ILG GP IGQ EFDY AC AL E GYE I++N NP T+ TD Sbjct: 561 ECESRPTDATKVVILGGGPNRIGQGIEFDYCCCHACFALTEAGYETIMVNCNPETVSTDY 620 Query: 150 ETANRTYIAPMTPELVEQVIEKERPD----ALLPTMGGQTALNLAVALAESGALEKYGVE 205 +T++R Y P+T E V +++ E+ + ++ GGQT L LA ALE G+ Sbjct: 621 DTSDRLYFEPLTFEHVMEILRAEQENGTLHGVIVQFGGQTPLKLA------NALEAEGIP 674 Query: 206 LIGAKLGAIKKAEDRELFKDAMKNIGLKTPPSGIGTTLDECFDIAEKIGEFPLIIRPAFT 265 ++G AI AEDRE F+ + ++GLK P + I +T E F A IG FPL+IRP++ Sbjct: 675 ILGTTPDAIDLAEDRERFQALVNDLGLKQPHNAIASTDAEAFAAAGDIG-FPLVIRPSYV 733 Query: 266 LGGTGGGIAYNKEEFESICKSGLAASATSQVLVEKSLLGWKEYELEVMRDLADNVVIICS 325 LGG I + + E + S S VL++ L G E +++ + D +NV + Sbjct: 734 LGGRAMEIVRDMGQLERYIAEAVVVSGDSPVLLDSYLAGAVELDVDALCD-GENVHVAGI 792 Query: 326 IENIDPMGVHTGDSITVAPAQTLTDREYQRLRDYSIAIIREIGVECGGSNVQFAVNPVDG 385 +++I+ GVH+GDS P +L+D R+R + A+ R + V+ G NVQFA+ D Sbjct: 793 MQHIEEAGVHSGDSACSLPPYSLSDDVLARIRVQTEALARALRVK-GLMNVQFAIK--DD 849 Query: 386 EVMIIEMNPRVSRSSALASKATGFPIAKMAAKLSVGYTL------DQIPNDITRKTPASF 439 E+ +IE+NPR SR+ +KAT IA +AA+L G L D +P+D Sbjct: 850 EIYLIEVNPRASRTVPFVAKATDSAIASIAARLMAGEPLSNFPLRDPLPHDAPEDQHLPI 909 Query: 440 -------EPSIDYVVTKIPRFAFEKFPGSQPLLTTQMKSVGESMALGRTFQESFQKA 489 P + K F +FPG +L +M+S GE M R+F +F KA Sbjct: 910 GDPMTLAHPDTPWFSVKEAVLPFARFPGVDTILGPEMRSTGEVMGWDRSFPRAFLKA 966 Lambda K H 0.317 0.133 0.383 Gapped Lambda K H 0.267 0.0410 0.140 Matrix: BLOSUM62 Gap Penalties: Existence: 11, Extension: 1 Number of Sequences: 1 Number of Hits to DB: 3005 Number of extensions: 135 Number of successful extensions: 22 Number of sequences better than 1.0e-02: 1 Number of HSP's gapped: 2 Number of HSP's successfully gapped: 2 Length of query: 1187 Length of database: 1105 Length adjustment: 46 Effective length of query: 1141 Effective length of 
database: 1059 Effective search space: 1208319 Effective search space used: 1208319 Neighboring words threshold: 11 Window for multiple hits: 40 X1: 16 ( 7.3 bits) X2: 38 (14.6 bits) X3: 64 (24.7 bits) S1: 41 (21.7 bits) S2: 58 (26.9 bits)
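The statistics in the report above can be cross-checked with the standard Karlin-Altschul formulas. The following is a minimal sketch, not part of GapMind or BLAST: the function names are hypothetical, and the gapped Lambda and K values are copied from the report as printed, so they are rounded and the results agree with the reported bit scores and E-value only approximately.

```python
import math

# Gapped Karlin-Altschul parameters as printed in the report (rounded values).
GAPPED_LAMBDA = 0.267
GAPPED_K = 0.0410


def effective_search_space(query_len, db_len, length_adjustment):
    """Product of the effective query and database lengths."""
    return (query_len - length_adjustment) * (db_len - length_adjustment)


def bit_score(raw_score, lam=GAPPED_LAMBDA, k=GAPPED_K):
    """Convert a raw alignment score to a bit score."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def expect_value(raw_score, search_space, lam=GAPPED_LAMBDA, k=GAPPED_K):
    """Expected number of chance alignments scoring at least raw_score."""
    return k * search_space * math.exp(-lam * raw_score)


# Lengths and length adjustment taken from the report: 1187, 1105, 46.
space = effective_search_space(1187, 1105, 46)
print(space)  # 1208319, matching "Effective search space"

# Raw scores 3047 and 587 are the parenthesised values of the two HSPs.
print(bit_score(3047), bit_score(587))
print(expect_value(587, space))
```

Because the printed parameters are rounded, the computed bit scores and E-value should be read as close to the reported 1178 bits, 230 bits and 4e-64 rather than as exact reproductions.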
Align candidate 3609253 Dshi_2639 (carbamoyl-phosphate synthase, large subunit (RefSeq))
to HMM TIGR01369 (carB: carbamoyl-phosphate synthase, large subunit (EC 6.3.5.5))
# hmmsearch :: search profile(s) against a sequence database # HMMER 3.3.1 (Jul 2020); http://hmmer.org/ # Copyright (C) 2020 Howard Hughes Medical Institute. # Freely distributed under the BSD open source license. # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # query HMM file: ../tmp/path.aa/TIGR01369.hmm # target sequence database: /tmp/gapView.27196.genome.faa # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Query: TIGR01369 [M=1052] Accession: TIGR01369 Description: CPSaseII_lrg: carbamoyl-phosphate synthase, large subunit Scores for complete sequences (score includes all domains): --- full sequence --- --- best 1 domain --- -#dom- E-value score bias E-value score bias exp N Sequence Description ------- ------ ----- ------- ------ ----- ---- -- -------- ----------- 0 1459.4 0.0 0 1459.2 0.0 1.0 1 lcl|FitnessBrowser__Dino:3609253 Dshi_2639 carbamoyl-phosphate sy Domain annotation for each sequence (and alignments): >> lcl|FitnessBrowser__Dino:3609253 Dshi_2639 carbamoyl-phosphate synthase, large subunit (RefSeq) # score bias c-Evalue i-Evalue hmmfrom hmm to alifrom ali to envfrom env to acc --- ------ ----- --------- --------- ------- ------- ------- ------- ------- ------- ---- 1 ! 1459.2 0.0 0 0 1 1052 [] 2 1090 .. 2 1090 .. 
0.96 Alignments for each domain: == domain 1 score: 1459.2 bits; conditional E-value: 0 TIGR01369 1 pkredikkvlviGsGpivigqAaEFDYsGsqalkalkeegievvLvnsniAtvmtdeeladkvYiePltveavek 75 pkr+dik++++iG+Gpi+igqA+EFDYsG+qa+kal+eeg++v+Lvnsn+At+mtd+ lad++YieP+t+e+v+k lcl|FitnessBrowser__Dino:3609253 2 PKRTDIKSIMIIGAGPIIIGQACEFDYSGAQACKALREEGYRVILVNSNPATIMTDPGLADATYIEPITPEIVAK 76 689************************************************************************ PP TIGR01369 76 iiekErpDailltlGGqtaLnlaveleekGvLekygvkllGtkveaikkaedRekFkealkeineevakseives 150 iiekErpDa+l+t+GGqt+Ln ++ lee+GvL+kygv+++G+k eai+ aedR++F+ea++ +++e ++++i+++ lcl|FitnessBrowser__Dino:3609253 77 IIEKERPDALLPTMGGQTGLNTSLALEEMGVLAKYGVEMIGAKREAIEMAEDRKLFREAMDRLGIENPRATIATT 151 *************************************************************************** PP TIGR01369 151 veealeaaeeigyPvivRaaftlgGtGsgiaeneeelkelvekalkaspikqvlvekslagwkEiEyEvvRDskd 225 ++e+++a ++ig+P i+R+aftlgGtG+g+a+n+++ ++ ++++l+asp++q+l+++sl gwkE+E+EvvRD++d lcl|FitnessBrowser__Dino:3609253 152 MDECMAALDDIGLPAIIRPAFTLGGTGGGVAYNRDDYEHFCKSGLDASPVNQILIDESLLGWKEFEMEVVRDKAD 226 *************************************************************************** PP TIGR01369 226 nciivcniEnlDplGvHtGdsivvaPsqtLtdkeyqllRdaslkiirelgvege.cnvqfaldPeskryvviEvn 299 n+iivc+iEn+Dp+GvHtGdsi+vaP+ tLtdkeyq++R+ s++++re+gve++ +nvq+a++P + r+vviE+n lcl|FitnessBrowser__Dino:3609253 227 NAIIVCAIENVDPMGVHTGDSITVAPALTLTDKEYQIMRNGSIAVLREIGVETGgSNVQWAVNPADGRMVVIEMN 301 ****************************************************988******************** PP TIGR01369 300 pRvsRssALAskAtGyPiAkvaaklavGysLdelkndvtketvAsfEPslDYvvvkiPrwdldkfekvdrklgtq 374 pRvsRssALAskAtG+PiAk+aaklavGy+Ldel nd+tk t+AsfEP++DYvv+kiPr++++kf +++ +l+t lcl|FitnessBrowser__Dino:3609253 302 PRVSRSSALASKATGFPIAKIAAKLAVGYTLDELDNDITKVTPASFEPTIDYVVTKIPRFAFEKFPGAEPNLTTA 376 *************************************************************************** PP TIGR01369 375 mksvGEvmaigrtfeealqkalrsleekllg........lklkek.eaesdeeleealkkpndrRlfaiaealrr 
440 mksvGE m+igrtf+e++qkal+s+e++l+g ++ +++ a ++ ++ +al +++++Rl +ia+a+r+ lcl|FitnessBrowser__Dino:3609253 377 MKSVGEAMSIGRTFHESVQKALASMETGLTGfdeiaipgISADHRsDAPDTAAVVKALARQTPDRLRVIAQAMRH 451 *****************************9944433333222333033445667899****************** PP TIGR01369 441 gvsveevyeltkidrffleklkklvelekeleeeklkelkkellkkakklGfsdeqiaklvkvseaevrklrkel 515 g+s +e+ +t +d +fl +++++ve+e ++++ l l++e l+k+k +Gf+d+++akl++ +e +vr++r++l lcl|FitnessBrowser__Dino:3609253 452 GLSDDEIQAATSYDPWFLARIREIVETEAQVRRDGLP-LEAEGLRKLKMMGFTDARLAKLTGRDEGQVRRARTRL 525 *******************************977776.************************************* PP TIGR01369 516 givpvvkrvDtvaaEfeaktpYlYstyeee.....kddvevtekkkvlvlGsGpiRigqgvEFDycavhavlalr 585 g++ +kr+Dt+aaEfea+tpY+Ystye+ + +++ t+ kv++lG+Gp+Rigqg+EFDyc+ ha+ al lcl|FitnessBrowser__Dino:3609253 526 GVTAQFKRIDTCAAEFEAQTPYMYSTYETPvmgeaECESRPTDATKVVILGGGPNRIGQGIEFDYCCCHACFALT 600 ***************************986544334455666779****************************** PP TIGR01369 586 eagyktilinynPEtvstDydiadrLyFeeltvedvldiiekek....vegvivqlgGqtalnlakeleeagvki 656 eagy+ti++n+nPEtvstDyd++drLyFe+lt+e+v++i++ e+ +gvivq+gGqt+l+la++le++g++i lcl|FitnessBrowser__Dino:3609253 601 EAGYETIMVNCNPETVSTDYDTSDRLYFEPLTFEHVMEILRAEQengtLHGVIVQFGGQTPLKLANALEAEGIPI 675 ****************************************99873333578************************ PP TIGR01369 657 lGtsaesidraEdRekFsklldelgikqpkgkeatsveeakeiakeigyPvlvRpsyvlgGrameiveneeeler 731 lGt++++id aEdRe+F++l+++lg+kqp++++a++ ea +a +ig+P+++RpsyvlgGrameiv+++ +ler lcl|FitnessBrowser__Dino:3609253 676 LGTTPDAIDLAEDRERFQALVNDLGLKQPHNAIASTDAEAFAAAGDIGFPLVIRPSYVLGGRAMEIVRDMGQLER 750 *************************************************************************** PP TIGR01369 732 yleeavevskekPvlidkyledavEvdvDavadgeevliagileHiEeaGvHsGDstlvlppqklseevkkkike 806 y+ eav vs ++Pvl+d yl avE+dvDa++dge+v +agi++HiEeaGvHsGDs+++lpp +ls++v+ +i+ lcl|FitnessBrowser__Dino:3609253 751 YIAEAVVVSGDSPVLLDSYLAGAVELDVDALCDGENVHVAGIMQHIEEAGVHSGDSACSLPPYSLSDDVLARIRV 825 
*************************************************************************** PP TIGR01369 807 ivkkiakelkvkGllniqfvvkdeevyviEvnvRasRtvPfvskalgvplvklavkvllgkkleele........ 873 +++++a++l+vkGl+n+qf++kd+e+y+iEvn+RasRtvPfv+ka++ ++++a+++++g+ l++ lcl|FitnessBrowser__Dino:3609253 826 QTEALARALRVKGLMNVQFAIKDDEIYLIEVNPRASRTVPFVAKATDSAIASIAARLMAGEPLSNFPlrdplphd 900 *****************************************************************9999****** PP TIGR01369 874 ...........kgvkkekksklvavkaavfsfsklagvdvvlgpemkstGEvmgigrdleeallkallaskakik 937 ++++ ++vk+av++f+++ gvd +lgpem+stGEvmg +r++ +a+lka++ ++++++ lcl|FitnessBrowser__Dino:3609253 901 apedqhlpigdPMTLAHPDTPWFSVKEAVLPFARFPGVDTILGPEMRSTGEVMGWDRSFPRAFLKAQMGAGTVLP 975 ********9985577889999****************************************************** PP TIGR01369 938 kkgsvllsvkdkdkee.llelakklaekglkvyategtakvleeagikaevvlkvseeaekilellkeeeielvi 1011 ++g+v+ls+k++dk+e l+e+a++l+e+gl ++at+gta++l+++gi ++vv+kv e +++++++lk++ i lv+ lcl|FitnessBrowser__Dino:3609253 976 TEGTVFLSIKEADKTEmLVETAAMLTELGLDIVATRGTAAFLKDHGIASKVVNKVYEGRPDVVDMLKDGRIALVM 1050 ***********9996516899****************************************************** PP TIGR01369 1012 nltskkkkaaekgykirreaveykvplvteletaealleal 1052 n+t+ +++a++++ ir a+ ++p++t+l++ +a+++a+ lcl|FitnessBrowser__Dino:3609253 1051 NTTE-GAQAVNDSREIRSVALYDRIPYFTTLAASHAAAQAM 1090 *987.8889999*******************9999988875 PP Internal pipeline statistics summary: ------------------------------------- Query model(s): 1 (1052 nodes) Target sequences: 1 (1105 residues searched) Passed MSV filter: 1 (1); expected 0.0 (0.02) Passed bias filter: 1 (1); expected 0.0 (0.02) Passed Vit filter: 1 (1); expected 0.0 (0.001) Passed Fwd filter: 1 (1); expected 0.0 (1e-05) Initial search space (Z): 1 [actual number of targets] Domain search space (domZ): 1 [number of targets reported over threshold] # CPU time: 0.06u 0.03s 00:00:00.09 Elapsed: 00:00:00.09 # Mc/sec: 12.89 // [ok]
This GapMind analysis is from Apr 09 2024. The underlying query database was built on Apr 09 2024.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
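The 50% coverage threshold can be illustrated with the first alignment in this report. This is a sketch only: the helper names are hypothetical, and it assumes that coverage is measured as the aligned fraction of the curated query protein, which the text above does not state explicitly. It covers only the coverage test, not the high- or medium-confidence criteria.

```python
def coverage(start, end, total_length):
    """Fraction of a sequence spanned by an alignment (1-based, inclusive)."""
    return (end - start + 1) / total_length


def meets_coverage(start, end, total_length, threshold=0.5):
    """True if the alignment spans at least `threshold` of the sequence."""
    return coverage(start, end, total_length) >= threshold


# First HSP above: query residues 94..1174 of 1187, subject 3..1104 of 1105.
query_cov = coverage(94, 1174, 1187)
subject_cov = coverage(3, 1104, 1105)
print(f"query coverage:   {query_cov:.1%}")
print(f"subject coverage: {subject_cov:.1%}")
print(meets_coverage(94, 1174, 1187))
```

Under that assumption, the first alignment clears the 50% threshold by a wide margin on both sequences.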
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory