Align Carbamoyl-phosphate synthase large chain, chloroplastic; Carbamoyl-phosphate synthetase ammonia chain; Protein VENOSA 3; EC 6.3.5.5 (characterized)
to candidate WP_078716838.1 B5D49_RS06310 carbamoyl-phosphate synthase large subunit
Query= SwissProt::Q42601 (1187 letters) >NCBI__GCF_900167125.1:WP_078716838.1 Length = 1081 Score = 1236 bits (3198), Expect = 0.0 Identities = 642/1095 (58%), Positives = 809/1095 (73%), Gaps = 26/1095 (2%) Query: 94 KRTDLKKIMILGAGPIVIGQACEFDYSGTQACKALREEGYEVILINSNPATIMTDPETAN 153 KRTDLKKIM++G+GPIVIGQACEFDYSGTQA KAL+EEGYEV+L+NSNPATIMTDPE A+ Sbjct: 3 KRTDLKKIMLIGSGPIVIGQACEFDYSGTQALKALKEEGYEVVLVNSNPATIMTDPELAD 62 Query: 154 RTYIAPMTPELVEQVIEKERPDALLPTMGGQTALNLAVALAESGALEKYGVELIGAKLGA 213 RTY+ P+ PE V ++I +ERPDALLPT+GGQT LN A+A+AE G L++YGVELIGA Sbjct: 63 RTYVEPIEPETVARIIAQERPDALLPTLGGQTGLNTALAVAEMGVLKEYGVELIGANEAV 122 Query: 214 IKKAEDRELFKDAMKNIGLKTPPSGIGTTLDECFDIAEKIGEFPLIIRPAFTLGGTGGGI 273 I+KAE R+LF++AM+NIGLK P SGI ++D+ EKI FP+I+RPA+TLGGTGGG+ Sbjct: 123 IQKAESRQLFREAMENIGLKVPASGIARSMDDVRAWGEKIS-FPIIVRPAYTLGGTGGGV 181 Query: 274 AYNKEEFESICKSGLAASATSQVLVEKSLLGWKEYELEVMRDLADNVVIICSIENIDPMG 333 AYN EE E+I GLA S +++++E+S+LGWKE+ELEVMRD DN VIICSIEN+D MG Sbjct: 182 AYNMEELEAISSKGLALSMKNEIMLEQSVLGWKEFELEVMRDKKDNCVIICSIENLDAMG 241 Query: 334 VHTGDSITVAPAQTLTDREYQRLRDYSIAIIREIGVECGGSNVQFAVNPVDGEVMIIEMN 393 VHTGDSITVAPAQTLTDREYQ++RD ++AI+REIGVE GGSNVQFAVNP DGE++IIEMN Sbjct: 242 VHTGDSITVAPAQTLTDREYQQMRDAALAIMREIGVETGGSNVQFAVNPEDGELVIIEMN 301 Query: 394 PRVSRSSALASKATGFPIAKMAAKLSVGYTLDQIPNDITRKTPASFEPSIDYVVTKIPRF 453 PRVSRSSALASKATGFPIAK+AAKL+VGYTLD++PNDITR+T ASFEP+IDY V KIPRF Sbjct: 302 PRVSRSSALASKATGFPIAKIAAKLAVGYTLDELPNDITRETMASFEPTIDYCVIKIPRF 361 Query: 454 AFEKFPGSQPLLTTQMKSVGESMALGRTFQESFQKALRSLECGFSGWGCAKIKELDWDWD 513 FEKFPGS+ LTT MKSVGE+MA+GRTF+E+ QK LRSLE G G G D D Sbjct: 362 TFEKFPGSEDHLTTAMKSVGETMAIGRTFKEALQKGLRSLEVGMPGLG-KHFAPCPLDKD 420 Query: 514 QLKYSLRVPNPDRIHAIYAAMKKGMKIDEIYELSMVDKWFLTQLKELVDVEQYLMS-GTL 572 +L LR PN R++A+ AM+ G+ +EIY S +D WFL Q+++++++E L G Sbjct: 421 ELLTELRNPNSQRLYAVRNAMRCGVDDEEIYATSFIDPWFLRQIRQVLEMENTLQEFGKQ 480 Query: 573 SEITKED------LYEVKKRGFSDKQIAFATKTTEEEVRTKRISLGVVPSYKRVDTCAAE 626 I +D L K+ G+SD+Q+A KT+ ++R+ R +VP+Y VDTCAAE Sbjct: 
481 HGIENKDQELADILRRAKEYGYSDQQLATLWKTSPRKIRSLRKEWDIVPTYYLVDTCAAE 540 Query: 627 FEAHTPYMYSSYDVECESAPNNKKKVLILGGGPNRIGQGIEFDYCCCHTSFALQDAGYET 686 FEAHTPY YS+Y+ E P KK++ILGGGPNRIGQGIEFDYCCCH+SF L+D G ++ Sbjct: 541 FEAHTPYYYSTYESGSEITPAPGKKIIILGGGPNRIGQGIEFDYCCCHSSFQLRDMGIQS 600 Query: 687 IMLNSNPETVSTDYDTSDRLYFEPLTIEDVLNVIDLEKPDGIIVQFGGQTPLKLALPIKH 746 IM+NSNPETVSTDYDTSDRLYFEPLT EDVLN+++ E+PDG+I+QFGGQTPL LA Sbjct: 601 IMVNSNPETVSTDYDTSDRLYFEPLTFEDVLNIVEFEQPDGVIIQFGGQTPLNLA----- 655 Query: 747 YLDKHMPMSLSGAGPVRIWGTSPDSIDAAEDRERFNAILDELKIEQPKGGIAKSEADALA 806 +SL AG V + GTSPD+ID AEDRERF +L +L + QP G A S A Sbjct: 656 -------VSLMEAG-VPMIGTSPDAIDRAEDRERFKRLLKKLHLRQPLNGTAMSLVQAQE 707 Query: 807 IAKEVGYPVVVRPSYVLGGRAMEIVYDDSRLITYLENAVQVDPERPVLVDKYLSDAIEID 866 IA ++G+P+V+RPSYVLGGR M+IVY Y + V PE PVL+DK+L A+E+D Sbjct: 708 IAGKIGFPLVLRPSYVLGGRGMDIVYSMEEFERYFRESALVSPEHPVLIDKFLEHAVEVD 767 Query: 867 VDTLTDSYGNVVIGGIMEHIEQAGVHSGDSACMLPTQTIPASCLQTIRTWTTKLAKKLNV 926 VD L D + IGG+MEHIE+AG+HSGDSAC+LP +I +Q I T +A +L V Sbjct: 768 VDALADG-EDCYIGGVMEHIEEAGIHSGDSACVLPPNSISPDLIQEIERQTKAMALELGV 826 Query: 927 CGLMNCQYAITTSGDVFLLEANPRASRTVPFVSKAIGHPLAKYAALVMSGKSLKDLNFEK 986 GLMN QYAI +V+++E NPRASRTVPFVSKA G PLAK A VM G+ L D++ Sbjct: 827 VGLMNVQYAI-KDDEVYIIEVNPRASRTVPFVSKATGVPLAKLATRVMLGEKLNDIDPWS 885 Query: 987 EVIPKHVSVKEAVFPFEKFQGCDVILGPEMRSTGEVMSISSEFSSAFAMAQIAAGQKLPL 1046 V+VKEAVFPF +F DVILGPEMRSTGEVM I EF AF AQ+AAGQ LP Sbjct: 886 MRKKGWVAVKEAVFPFNRFPNVDVILGPEMRSTGEVMGIDYEFGPAFMKAQLAAGQVLPE 945 Query: 1047 SGTVFLSLNDMTKPHLEKIAVSFLELGFKIVATSGTAHFLELKGI-PVERVLKLHEGRPH 1105 GT+F+++ND KP + I F E+GF+++AT GTA L G+ VE +LK++EGRP+ Sbjct: 946 EGTIFVAVNDWDKPLILPIVQKFREMGFRVMATRGTATHLYDNGVTDVEPLLKVYEGRPN 1005 Query: 1106 AADMVANGQIHLMLITSSGDALDQKDGRQLRQMALAYKVPVITTVAGALATAEGIKSLKS 1165 D + N +I L++ T SG D + +RQ AL Y +P +TTVAGA AT + I+ ++ Sbjct: 1006 VVDHIKNRKISLVINTVSG-RKTVHDSKDIRQAALLYNIPYVTTVAGAKATVQAIEDVRK 1064 Query: 1166 SAIKMTALQDFFEVK 1180 + +++ LQ++ + K Sbjct: 1065 AGLQVRCLQEYHDNK 1079 
Lambda     K        H
0.317      0.133    0.383

Gapped
Lambda     K        H
0.267      0.0410   0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 3077
Number of extensions: 140
Number of successful extensions: 20
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1187
Length of database: 1081
Length adjustment: 46
Effective length of query: 1141
Effective length of database: 1035
Effective search space: 1180935
Effective search space used: 1180935
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 58 (26.9 bits)
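The statistics reported above can be cross-checked with a little arithmetic. The following is a minimal sketch, assuming the standard Karlin-Altschul relations (bit score S' = (lambda * S - ln K) / ln 2 and E = search space * 2^(-S')) and using the gapped Lambda and K values printed in the report. The function names are illustrative and are not part of BLAST or GapMind. Because the report prints Lambda and K rounded to a few digits, the recomputed bit score is only approximate.

```python
import math

# Values copied from the BLAST report above.
RAW_SCORE = 3198          # raw alignment score, shown in parentheses after the bit score
GAPPED_LAMBDA = 0.267     # gapped Lambda (rounded in the report)
GAPPED_K = 0.0410         # gapped K (rounded in the report)
QUERY_LEN = 1187          # "Length of query"
DATABASE_LEN = 1081       # "Length of database" (a single sequence here)
LENGTH_ADJUSTMENT = 46    # "Length adjustment"


def bit_score(raw_score: float, lam: float, k: float) -> float:
    """Convert a raw score to a bit score: (lambda * S - ln K) / ln 2."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def effective_search_space(query_len: int, db_len: int, adjustment: int) -> int:
    """Product of the effective query and database lengths."""
    return (query_len - adjustment) * (db_len - adjustment)


def expect_value(bits: float, search_space: int) -> float:
    """E-value: search space * 2^(-bit score)."""
    return search_space * 2.0 ** (-bits)


bits = bit_score(RAW_SCORE, GAPPED_LAMBDA, GAPPED_K)
space = effective_search_space(QUERY_LEN, DATABASE_LEN, LENGTH_ADJUSTMENT)
print(f"bit score ~ {bits:.1f}")
print(f"effective search space = {space}")
print(f"E-value = {expect_value(bits, space)}")
```

With a bit score this large, 2^(-S') underflows double precision, which is consistent with the report showing "Expect = 0.0".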
Align candidate WP_078716838.1 B5D49_RS06310 (carbamoyl-phosphate synthase large subunit)
to HMM TIGR01369 (carB: carbamoyl-phosphate synthase, large subunit (EC 6.3.5.5))
# hmmsearch :: search profile(s) against a sequence database # HMMER 3.3.1 (Jul 2020); http://hmmer.org/ # Copyright (C) 2020 Howard Hughes Medical Institute. # Freely distributed under the BSD open source license. # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - # query HMM file: ../tmp/path.aa/TIGR01369.hmm # target sequence database: /tmp/gapView.12030.genome.faa # - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - Query: TIGR01369 [M=1052] Accession: TIGR01369 Description: CPSaseII_lrg: carbamoyl-phosphate synthase, large subunit Scores for complete sequences (score includes all domains): --- full sequence --- --- best 1 domain --- -#dom- E-value score bias E-value score bias exp N Sequence Description ------- ------ ----- ------- ------ ----- ---- -- -------- ----------- 0 1503.4 0.0 0 1503.2 0.0 1.0 1 lcl|NCBI__GCF_900167125.1:WP_078716838.1 B5D49_RS06310 carbamoyl-phosphat Domain annotation for each sequence (and alignments): >> lcl|NCBI__GCF_900167125.1:WP_078716838.1 B5D49_RS06310 carbamoyl-phosphate synthase large subunit # score bias c-Evalue i-Evalue hmmfrom hmm to alifrom ali to envfrom env to acc --- ------ ----- --------- --------- ------- ------- ------- ------- ------- ------- ---- 1 ! 1503.2 0.0 0 0 1 1052 [] 2 1059 .. 2 1059 .. 
0.97 Alignments for each domain: == domain 1 score: 1503.2 bits; conditional E-value: 0 TIGR01369 1 pkredikkvlviGsGpivigqAaEFDYsGsqalkalkeegievvLvnsniAtvmtdeeladkvYieP 67 pkr+d+kk+++iGsGpivigqA+EFDYsG+qalkalkeeg+evvLvnsn+At+mtd+elad++Y+eP lcl|NCBI__GCF_900167125.1:WP_078716838.1 2 PKRTDLKKIMLIGSGPIVIGQACEFDYSGTQALKALKEEGYEVVLVNSNPATIMTDPELADRTYVEP 68 689**************************************************************** PP TIGR01369 68 ltveavekiiekErpDailltlGGqtaLnlaveleekGvLekygvkllGtkveaikkaedRekFkea 134 +++e+v++ii +ErpDa+l+tlGGqt+Ln a+ + e+GvL++ygv+l+G++ +i+kae+R++F+ea lcl|NCBI__GCF_900167125.1:WP_078716838.1 69 IEPETVARIIAQERPDALLPTLGGQTGLNTALAVAEMGVLKEYGVELIGANEAVIQKAESRQLFREA 135 ******************************************************************* PP TIGR01369 135 lkeineevakseivesveealeaaeeigyPvivRaaftlgGtGsgiaeneeelkelvekalkaspik 201 +++i+++v++s i++s+++ + e+i +P+ivR+a+tlgGtG+g+a+n+eel+++ +k+l++s+ + lcl|NCBI__GCF_900167125.1:WP_078716838.1 136 MENIGLKVPASGIARSMDDVRAWGEKISFPIIVRPAYTLGGTGGGVAYNMEELEAISSKGLALSMKN 202 ******************************************************************* PP TIGR01369 202 qvlvekslagwkEiEyEvvRDskdnciivcniEnlDplGvHtGdsivvaPsqtLtdkeyqllRdasl 268 ++++e+s+ gwkE+E+Ev+RD+kdnc+i+c+iEnlD++GvHtGdsi+vaP+qtLtd+eyq++Rda+l lcl|NCBI__GCF_900167125.1:WP_078716838.1 203 EIMLEQSVLGWKEFELEVMRDKKDNCVIICSIENLDAMGVHTGDSITVAPAQTLTDREYQQMRDAAL 269 ******************************************************************* PP TIGR01369 269 kiirelgvege.cnvqfaldPeskryvviEvnpRvsRssALAskAtGyPiAkvaaklavGysLdelk 334 +i+re+gve++ +nvqfa++Pe+ ++v+iE+npRvsRssALAskAtG+PiAk+aaklavGy+Ldel+ lcl|NCBI__GCF_900167125.1:WP_078716838.1 270 AIMREIGVETGgSNVQFAVNPEDGELVIIEMNPRVSRSSALASKATGFPIAKIAAKLAVGYTLDELP 336 *********988******************************************************* PP TIGR01369 335 ndvtketvAsfEPslDYvvvkiPrwdldkfekvdrklgtqmksvGEvmaigrtfeealqkalrslee 401 nd+t+et+AsfEP++DY+v+kiPr+ ++kf + +++l+t mksvGE maigrtf+ealqk+lrsle lcl|NCBI__GCF_900167125.1:WP_078716838.1 337 
NDITRETMASFEPTIDYCVIKIPRFTFEKFPGSEDHLTTAMKSVGETMAIGRTFKEALQKGLRSLEV 403 ******************************************************************* PP TIGR01369 402 kllglklk.ekeaesdeeleealkkpndrRlfaiaealrrgvsveevyeltkidrffleklkklvel 467 ++ gl ++ +++el ++l++pn +Rl+a+ +a+r gv+ ee+y ++ id +fl+++++++e+ lcl|NCBI__GCF_900167125.1:WP_078716838.1 404 GMPGLGKHfAPCPLDKDELLTELRNPNSQRLYAVRNAMRCGVDDEEIYATSFIDPWFLRQIRQVLEM 470 ****654415667889999************************************************ PP TIGR01369 468 ekeleee.klk.......elkkellkkakklGfsdeqiaklvkvseaevrklrkelgivpvvkrvDt 526 e++l+e k + + ++ l++ak++G+sd+q+a+l+k+s +++r+lrke +ivp++ vDt lcl|NCBI__GCF_900167125.1:WP_078716838.1 471 ENTLQEFgK-QhgienkdQELADILRRAKEYGYSDQQLATLWKTSPRKIRSLRKEWDIVPTYYLVDT 536 ****98852.323332223346789****************************************** PP TIGR01369 527 vaaEfeaktpYlYstyeeekddvevtekkkvlvlGsGpiRigqgvEFDycavhavlalreagyktil 593 +aaEfea+tpY+Ystye+ +++ kk+++lG+Gp+Rigqg+EFDyc+ h++ +lr++g+++i+ lcl|NCBI__GCF_900167125.1:WP_078716838.1 537 CAAEFEAHTPYYYSTYESG-SEITPAPGKKIIILGGGPNRIGQGIEFDYCCCHSSFQLRDMGIQSIM 602 ******************9.889999999************************************** PP TIGR01369 594 inynPEtvstDydiadrLyFeeltvedvldiiekekvegvivqlgGqtalnlakeleeagvkilGts 660 +n+nPEtvstDyd++drLyFe+lt+edvl+i+e e+++gvi+q+gGqt+lnla +l eagv+++Gts lcl|NCBI__GCF_900167125.1:WP_078716838.1 603 VNSNPETVSTDYDTSDRLYFEPLTFEDVLNIVEFEQPDGVIIQFGGQTPLNLAVSLMEAGVPMIGTS 669 ******************************************************************* PP TIGR01369 661 aesidraEdRekFsklldelgikqpkgkeatsveeakeiakeigyPvlvRpsyvlgGrameivenee 727 +++idraEdRe+F++ll++l++ qp + +a s+ +a+eia +ig+P+++RpsyvlgGr+m+iv+++e lcl|NCBI__GCF_900167125.1:WP_078716838.1 670 PDAIDRAEDRERFKRLLKKLHLRQPLNGTAMSLVQAQEIAGKIGFPLVLRPSYVLGGRGMDIVYSME 736 ******************************************************************* PP TIGR01369 728 eleryleeavevskekPvlidkyledavEvdvDavadgeevliagileHiEeaGvHsGDstlvlppq 794 e+ery++e + vs+e+Pvlidk+le+avEvdvDa+adge+ +i g++eHiEeaG+HsGDs++vlpp+ 
lcl|NCBI__GCF_900167125.1:WP_078716838.1 737 EFERYFRESALVSPEHPVLIDKFLEHAVEVDVDALADGEDCYIGGVMEHIEEAGIHSGDSACVLPPN 803 ******************************************************************* PP TIGR01369 795 klseevkkkikeivkkiakelkvkGllniqfvvkdeevyviEvnvRasRtvPfvskalgvplvklav 861 ++s ++ ++i++++k++a el v+Gl+n+q+++kd+evy+iEvn+RasRtvPfvska+gvpl+kla+ lcl|NCBI__GCF_900167125.1:WP_078716838.1 804 SISPDLIQEIERQTKAMALELGVVGLMNVQYAIKDDEVYIIEVNPRASRTVPFVSKATGVPLAKLAT 870 ******************************************************************* PP TIGR01369 862 kvllgkkleelekgvkkekksklvavkaavfsfsklagvdvvlgpemkstGEvmgigrdleeallka 928 +v+lg+kl++ + ++ k+ vavk+avf+f+++ +vdv+lgpem+stGEvmgi++++ a++ka lcl|NCBI__GCF_900167125.1:WP_078716838.1 871 RVMLGEKLNDIDP--WSMRKKGWVAVKEAVFPFNRFPNVDVILGPEMRSTGEVMGIDYEFGPAFMKA 935 **********887..6677778********************************************* PP TIGR01369 929 llaskakikkkgsvllsvkdkdkeellelakklaekglkvyategtakvleeagi.kaevvlkvsee 994 +la++++++++g++++ v+d dk +l++++k+ e+g++v+at+gta+ l ++g+ ++e +lkv e lcl|NCBI__GCF_900167125.1:WP_078716838.1 936 QLAAGQVLPEEGTIFVAVNDWDKPLILPIVQKFREMGFRVMATRGTATHLYDNGVtDVEPLLKVYEG 1002 ************************************************999998736899******* PP TIGR01369 995 aekilellkeeeielvinltskkkkaaekgykirreaveykvplvteletaealleal 1052 ++++++ +k+++i+lvin+ s ++k++++++ ir++a+ y++p+vt++++a+a+++a+ lcl|NCBI__GCF_900167125.1:WP_078716838.1 1003 RPNVVDHIKNRKISLVINTVS-GRKTVHDSKDIRQAALLYNIPYVTTVAGAKATVQAI 1059 ******************998.78899999***********************99985 PP Internal pipeline statistics summary: ------------------------------------- Query model(s): 1 (1052 nodes) Target sequences: 1 (1081 residues searched) Passed MSV filter: 1 (1); expected 0.0 (0.02) Passed bias filter: 1 (1); expected 0.0 (0.02) Passed Vit filter: 1 (1); expected 0.0 (0.001) Passed Fwd filter: 1 (1); expected 0.0 (1e-05) Initial search space (Z): 1 [actual number of targets] Domain search space (domZ): 1 [number of targets reported over threshold] # CPU time: 
0.08u 0.04s 00:00:00.12 Elapsed: 00:00:00.12 # Mc/sec: 9.35 // [ok]
This GapMind analysis is from Apr 10 2024. The underlying query database was built on Apr 09 2024.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
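As an illustration of the quantities involved, percent identity and coverage can be computed from the BLAST alignment shown earlier on this page. This is a sketch only: it assumes that "coverage" means the fraction of the characterized (query) protein spanned by the alignment, which is not spelled out here, and the helper names are hypothetical rather than part of GapMind. The only threshold used is the 50% coverage figure stated above.

```python
def percent(numerator: int, denominator: int) -> float:
    """Return numerator / denominator as a percentage."""
    return 100.0 * numerator / denominator


def query_coverage(query_start: int, query_end: int, query_len: int) -> float:
    """Fraction of the query spanned by the alignment (assumed definition)."""
    return (query_end - query_start + 1) / query_len


# Numbers taken from the alignment of Q42601 to WP_078716838.1 above.
identity_pct = percent(642, 1095)          # Identities = 642/1095
positives_pct = percent(809, 1095)         # Positives = 809/1095
coverage = query_coverage(94, 1180, 1187)  # query positions 94..1180 of 1187 letters

print(f"identity  = {identity_pct:.1f}%")
print(f"positives = {positives_pct:.1f}%")
print(f"coverage  = {coverage:.1%}")
print("meets the 50% coverage floor:", coverage >= 0.5)
```

The BLAST header reports these percentages as whole numbers (58% and 73%), so the values printed here differ slightly in the decimals.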
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
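The notion of a gap described above (a step with no high- or medium-confidence candidate) can be sketched as a simple filter. This is not GapMind's code: the data structure, the function name, and the step names other than carB are hypothetical and chosen only for illustration.

```python
from typing import Dict, List, Optional


def find_gaps(best_confidence: Dict[str, Optional[str]]) -> List[str]:
    """Return the steps whose best candidate is neither high nor medium confidence.

    best_confidence maps a step name to "high", "medium", "low",
    or None when the step has no candidate at all.
    """
    return [
        step
        for step, level in best_confidence.items()
        if level not in ("high", "medium")
    ]


# Hypothetical pathway: only carB comes from this page; the other names are made up.
example = {
    "carB": "high",
    "stepA": "low",
    "stepB": None,
    "stepC": "medium",
}
print(find_gaps(example))
```

A step with only a low-confidence candidate is still reported as a gap under this reading, since the definition counts only high- and medium-confidence candidates.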
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory