GapMind for Amino acid biosynthesis

 

Alignments for a candidate for carB in Desulfovibrio gracilis DSM 16080

Align Carbamoyl-phosphate synthase large chain, chloroplastic; Carbamoyl-phosphate synthetase ammonia chain; Protein VENOSA 3; EC 6.3.5.5 (characterized)
to candidate WP_078716838.1 B5D49_RS06310 carbamoyl-phosphate synthase large subunit

Query= SwissProt::Q42601
         (1187 letters)



>NCBI__GCF_900167125.1:WP_078716838.1
          Length = 1081

 Score = 1236 bits (3198), Expect = 0.0
 Identities = 642/1095 (58%), Positives = 809/1095 (73%), Gaps = 26/1095 (2%)

Query: 94   KRTDLKKIMILGAGPIVIGQACEFDYSGTQACKALREEGYEVILINSNPATIMTDPETAN 153
            KRTDLKKIM++G+GPIVIGQACEFDYSGTQA KAL+EEGYEV+L+NSNPATIMTDPE A+
Sbjct: 3    KRTDLKKIMLIGSGPIVIGQACEFDYSGTQALKALKEEGYEVVLVNSNPATIMTDPELAD 62

Query: 154  RTYIAPMTPELVEQVIEKERPDALLPTMGGQTALNLAVALAESGALEKYGVELIGAKLGA 213
            RTY+ P+ PE V ++I +ERPDALLPT+GGQT LN A+A+AE G L++YGVELIGA    
Sbjct: 63   RTYVEPIEPETVARIIAQERPDALLPTLGGQTGLNTALAVAEMGVLKEYGVELIGANEAV 122

Query: 214  IKKAEDRELFKDAMKNIGLKTPPSGIGTTLDECFDIAEKIGEFPLIIRPAFTLGGTGGGI 273
            I+KAE R+LF++AM+NIGLK P SGI  ++D+     EKI  FP+I+RPA+TLGGTGGG+
Sbjct: 123  IQKAESRQLFREAMENIGLKVPASGIARSMDDVRAWGEKIS-FPIIVRPAYTLGGTGGGV 181

Query: 274  AYNKEEFESICKSGLAASATSQVLVEKSLLGWKEYELEVMRDLADNVVIICSIENIDPMG 333
            AYN EE E+I   GLA S  +++++E+S+LGWKE+ELEVMRD  DN VIICSIEN+D MG
Sbjct: 182  AYNMEELEAISSKGLALSMKNEIMLEQSVLGWKEFELEVMRDKKDNCVIICSIENLDAMG 241

Query: 334  VHTGDSITVAPAQTLTDREYQRLRDYSIAIIREIGVECGGSNVQFAVNPVDGEVMIIEMN 393
            VHTGDSITVAPAQTLTDREYQ++RD ++AI+REIGVE GGSNVQFAVNP DGE++IIEMN
Sbjct: 242  VHTGDSITVAPAQTLTDREYQQMRDAALAIMREIGVETGGSNVQFAVNPEDGELVIIEMN 301

Query: 394  PRVSRSSALASKATGFPIAKMAAKLSVGYTLDQIPNDITRKTPASFEPSIDYVVTKIPRF 453
            PRVSRSSALASKATGFPIAK+AAKL+VGYTLD++PNDITR+T ASFEP+IDY V KIPRF
Sbjct: 302  PRVSRSSALASKATGFPIAKIAAKLAVGYTLDELPNDITRETMASFEPTIDYCVIKIPRF 361

Query: 454  AFEKFPGSQPLLTTQMKSVGESMALGRTFQESFQKALRSLECGFSGWGCAKIKELDWDWD 513
             FEKFPGS+  LTT MKSVGE+MA+GRTF+E+ QK LRSLE G  G G         D D
Sbjct: 362  TFEKFPGSEDHLTTAMKSVGETMAIGRTFKEALQKGLRSLEVGMPGLG-KHFAPCPLDKD 420

Query: 514  QLKYSLRVPNPDRIHAIYAAMKKGMKIDEIYELSMVDKWFLTQLKELVDVEQYLMS-GTL 572
            +L   LR PN  R++A+  AM+ G+  +EIY  S +D WFL Q+++++++E  L   G  
Sbjct: 421  ELLTELRNPNSQRLYAVRNAMRCGVDDEEIYATSFIDPWFLRQIRQVLEMENTLQEFGKQ 480

Query: 573  SEITKED------LYEVKKRGFSDKQIAFATKTTEEEVRTKRISLGVVPSYKRVDTCAAE 626
              I  +D      L   K+ G+SD+Q+A   KT+  ++R+ R    +VP+Y  VDTCAAE
Sbjct: 481  HGIENKDQELADILRRAKEYGYSDQQLATLWKTSPRKIRSLRKEWDIVPTYYLVDTCAAE 540

Query: 627  FEAHTPYMYSSYDVECESAPNNKKKVLILGGGPNRIGQGIEFDYCCCHTSFALQDAGYET 686
            FEAHTPY YS+Y+   E  P   KK++ILGGGPNRIGQGIEFDYCCCH+SF L+D G ++
Sbjct: 541  FEAHTPYYYSTYESGSEITPAPGKKIIILGGGPNRIGQGIEFDYCCCHSSFQLRDMGIQS 600

Query: 687  IMLNSNPETVSTDYDTSDRLYFEPLTIEDVLNVIDLEKPDGIIVQFGGQTPLKLALPIKH 746
            IM+NSNPETVSTDYDTSDRLYFEPLT EDVLN+++ E+PDG+I+QFGGQTPL LA     
Sbjct: 601  IMVNSNPETVSTDYDTSDRLYFEPLTFEDVLNIVEFEQPDGVIIQFGGQTPLNLA----- 655

Query: 747  YLDKHMPMSLSGAGPVRIWGTSPDSIDAAEDRERFNAILDELKIEQPKGGIAKSEADALA 806
                   +SL  AG V + GTSPD+ID AEDRERF  +L +L + QP  G A S   A  
Sbjct: 656  -------VSLMEAG-VPMIGTSPDAIDRAEDRERFKRLLKKLHLRQPLNGTAMSLVQAQE 707

Query: 807  IAKEVGYPVVVRPSYVLGGRAMEIVYDDSRLITYLENAVQVDPERPVLVDKYLSDAIEID 866
            IA ++G+P+V+RPSYVLGGR M+IVY       Y   +  V PE PVL+DK+L  A+E+D
Sbjct: 708  IAGKIGFPLVLRPSYVLGGRGMDIVYSMEEFERYFRESALVSPEHPVLIDKFLEHAVEVD 767

Query: 867  VDTLTDSYGNVVIGGIMEHIEQAGVHSGDSACMLPTQTIPASCLQTIRTWTTKLAKKLNV 926
            VD L D   +  IGG+MEHIE+AG+HSGDSAC+LP  +I    +Q I   T  +A +L V
Sbjct: 768  VDALADG-EDCYIGGVMEHIEEAGIHSGDSACVLPPNSISPDLIQEIERQTKAMALELGV 826

Query: 927  CGLMNCQYAITTSGDVFLLEANPRASRTVPFVSKAIGHPLAKYAALVMSGKSLKDLNFEK 986
             GLMN QYAI    +V+++E NPRASRTVPFVSKA G PLAK A  VM G+ L D++   
Sbjct: 827  VGLMNVQYAI-KDDEVYIIEVNPRASRTVPFVSKATGVPLAKLATRVMLGEKLNDIDPWS 885

Query: 987  EVIPKHVSVKEAVFPFEKFQGCDVILGPEMRSTGEVMSISSEFSSAFAMAQIAAGQKLPL 1046
                  V+VKEAVFPF +F   DVILGPEMRSTGEVM I  EF  AF  AQ+AAGQ LP 
Sbjct: 886  MRKKGWVAVKEAVFPFNRFPNVDVILGPEMRSTGEVMGIDYEFGPAFMKAQLAAGQVLPE 945

Query: 1047 SGTVFLSLNDMTKPHLEKIAVSFLELGFKIVATSGTAHFLELKGI-PVERVLKLHEGRPH 1105
             GT+F+++ND  KP +  I   F E+GF+++AT GTA  L   G+  VE +LK++EGRP+
Sbjct: 946  EGTIFVAVNDWDKPLILPIVQKFREMGFRVMATRGTATHLYDNGVTDVEPLLKVYEGRPN 1005

Query: 1106 AADMVANGQIHLMLITSSGDALDQKDGRQLRQMALAYKVPVITTVAGALATAEGIKSLKS 1165
              D + N +I L++ T SG      D + +RQ AL Y +P +TTVAGA AT + I+ ++ 
Sbjct: 1006 VVDHIKNRKISLVINTVSG-RKTVHDSKDIRQAALLYNIPYVTTVAGAKATVQAIEDVRK 1064

Query: 1166 SAIKMTALQDFFEVK 1180
            + +++  LQ++ + K
Sbjct: 1065 AGLQVRCLQEYHDNK 1079


Lambda     K      H
   0.317    0.133    0.383 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 3077
Number of extensions: 140
Number of successful extensions: 20
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1187
Length of database: 1081
Length adjustment: 46
Effective length of query: 1141
Effective length of database: 1035
Effective search space:  1180935
Effective search space used:  1180935
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 58 (26.9 bits)
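
The summary figures in the BLAST report above can be re-derived with a few lines of arithmetic. The sketch below copies the numbers from the report and recomputes the percentages, the span of each sequence covered by the alignment, and the effective search space. Two things are assumptions rather than part of the report: the variable names, and the definition of coverage as aligned span divided by full sequence length, which may differ from GapMind's own definition.

```python
# Values copied from the BLAST report above; variable names are illustrative.
identities, positives, gaps, align_len = 642, 809, 26, 1095
query_len, subject_len = 1187, 1081
query_start, query_end = 94, 1180    # first and last aligned query positions
sbjct_start, sbjct_end = 3, 1079     # first and last aligned subject positions
length_adjustment = 46


def pct(numerator, denominator):
    """Return numerator / denominator as a percentage."""
    return 100.0 * numerator / denominator


identity_pct = pct(identities, align_len)
positive_pct = pct(positives, align_len)
gap_pct = pct(gaps, align_len)

# Assumed definition: coverage = aligned span / full sequence length.
query_cov = pct(query_end - query_start + 1, query_len)
subject_cov = pct(sbjct_end - sbjct_start + 1, subject_len)

# Effective search space = effective query length * effective database length.
eff_query = query_len - length_adjustment
eff_db = subject_len - length_adjustment
eff_space = eff_query * eff_db

print(f"identity {identity_pct:.1f}%, positives {positive_pct:.1f}%, gaps {gap_pct:.1f}%")
print(f"query coverage {query_cov:.1f}%, subject coverage {subject_cov:.1f}%")
print(f"effective search space {eff_space}")
```

The report shows whole-number percentages (58%, 73%, 2%), which are these values with the fractional part dropped. The product of the effective lengths reproduces the reported search space of 1180935.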

Align candidate WP_078716838.1 B5D49_RS06310 (carbamoyl-phosphate synthase large subunit)
to HMM TIGR01369 (carB: carbamoyl-phosphate synthase, large subunit (EC 6.3.5.5))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/TIGR01369.hmm
# target sequence database:        /tmp/gapView.12030.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01369  [M=1052]
Accession:   TIGR01369
Description: CPSaseII_lrg: carbamoyl-phosphate synthase, large subunit
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                 Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                 -----------
          0 1503.4   0.0          0 1503.2   0.0    1.0  1  lcl|NCBI__GCF_900167125.1:WP_078716838.1  B5D49_RS06310 carbamoyl-phosphat


Domain annotation for each sequence (and alignments):
>> lcl|NCBI__GCF_900167125.1:WP_078716838.1  B5D49_RS06310 carbamoyl-phosphate synthase large subunit
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 ! 1503.2   0.0         0         0       1    1052 []       2    1059 ..       2    1059 .. 0.97

  Alignments for each domain:
  == domain 1  score: 1503.2 bits;  conditional E-value: 0
                                 TIGR01369    1 pkredikkvlviGsGpivigqAaEFDYsGsqalkalkeegievvLvnsniAtvmtdeeladkvYieP 67  
                                                pkr+d+kk+++iGsGpivigqA+EFDYsG+qalkalkeeg+evvLvnsn+At+mtd+elad++Y+eP
  lcl|NCBI__GCF_900167125.1:WP_078716838.1    2 PKRTDLKKIMLIGSGPIVIGQACEFDYSGTQALKALKEEGYEVVLVNSNPATIMTDPELADRTYVEP 68  
                                                689**************************************************************** PP

                                 TIGR01369   68 ltveavekiiekErpDailltlGGqtaLnlaveleekGvLekygvkllGtkveaikkaedRekFkea 134 
                                                +++e+v++ii +ErpDa+l+tlGGqt+Ln a+ + e+GvL++ygv+l+G++  +i+kae+R++F+ea
  lcl|NCBI__GCF_900167125.1:WP_078716838.1   69 IEPETVARIIAQERPDALLPTLGGQTGLNTALAVAEMGVLKEYGVELIGANEAVIQKAESRQLFREA 135 
                                                ******************************************************************* PP

                                 TIGR01369  135 lkeineevakseivesveealeaaeeigyPvivRaaftlgGtGsgiaeneeelkelvekalkaspik 201 
                                                +++i+++v++s i++s+++  +  e+i +P+ivR+a+tlgGtG+g+a+n+eel+++ +k+l++s+ +
  lcl|NCBI__GCF_900167125.1:WP_078716838.1  136 MENIGLKVPASGIARSMDDVRAWGEKISFPIIVRPAYTLGGTGGGVAYNMEELEAISSKGLALSMKN 202 
                                                ******************************************************************* PP

                                 TIGR01369  202 qvlvekslagwkEiEyEvvRDskdnciivcniEnlDplGvHtGdsivvaPsqtLtdkeyqllRdasl 268 
                                                ++++e+s+ gwkE+E+Ev+RD+kdnc+i+c+iEnlD++GvHtGdsi+vaP+qtLtd+eyq++Rda+l
  lcl|NCBI__GCF_900167125.1:WP_078716838.1  203 EIMLEQSVLGWKEFELEVMRDKKDNCVIICSIENLDAMGVHTGDSITVAPAQTLTDREYQQMRDAAL 269 
                                                ******************************************************************* PP

                                 TIGR01369  269 kiirelgvege.cnvqfaldPeskryvviEvnpRvsRssALAskAtGyPiAkvaaklavGysLdelk 334 
                                                +i+re+gve++ +nvqfa++Pe+ ++v+iE+npRvsRssALAskAtG+PiAk+aaklavGy+Ldel+
  lcl|NCBI__GCF_900167125.1:WP_078716838.1  270 AIMREIGVETGgSNVQFAVNPEDGELVIIEMNPRVSRSSALASKATGFPIAKIAAKLAVGYTLDELP 336 
                                                *********988******************************************************* PP

                                 TIGR01369  335 ndvtketvAsfEPslDYvvvkiPrwdldkfekvdrklgtqmksvGEvmaigrtfeealqkalrslee 401 
                                                nd+t+et+AsfEP++DY+v+kiPr+ ++kf + +++l+t mksvGE maigrtf+ealqk+lrsle 
  lcl|NCBI__GCF_900167125.1:WP_078716838.1  337 NDITRETMASFEPTIDYCVIKIPRFTFEKFPGSEDHLTTAMKSVGETMAIGRTFKEALQKGLRSLEV 403 
                                                ******************************************************************* PP

                                 TIGR01369  402 kllglklk.ekeaesdeeleealkkpndrRlfaiaealrrgvsveevyeltkidrffleklkklvel 467 
                                                ++ gl ++      +++el ++l++pn +Rl+a+ +a+r gv+ ee+y ++ id +fl+++++++e+
  lcl|NCBI__GCF_900167125.1:WP_078716838.1  404 GMPGLGKHfAPCPLDKDELLTELRNPNSQRLYAVRNAMRCGVDDEEIYATSFIDPWFLRQIRQVLEM 470 
                                                ****654415667889999************************************************ PP

                                 TIGR01369  468 ekeleee.klk.......elkkellkkakklGfsdeqiaklvkvseaevrklrkelgivpvvkrvDt 526 
                                                e++l+e  k +       +  ++ l++ak++G+sd+q+a+l+k+s +++r+lrke +ivp++  vDt
  lcl|NCBI__GCF_900167125.1:WP_078716838.1  471 ENTLQEFgK-QhgienkdQELADILRRAKEYGYSDQQLATLWKTSPRKIRSLRKEWDIVPTYYLVDT 536 
                                                ****98852.323332223346789****************************************** PP

                                 TIGR01369  527 vaaEfeaktpYlYstyeeekddvevtekkkvlvlGsGpiRigqgvEFDycavhavlalreagyktil 593 
                                                +aaEfea+tpY+Ystye+   +++    kk+++lG+Gp+Rigqg+EFDyc+ h++ +lr++g+++i+
  lcl|NCBI__GCF_900167125.1:WP_078716838.1  537 CAAEFEAHTPYYYSTYESG-SEITPAPGKKIIILGGGPNRIGQGIEFDYCCCHSSFQLRDMGIQSIM 602 
                                                ******************9.889999999************************************** PP

                                 TIGR01369  594 inynPEtvstDydiadrLyFeeltvedvldiiekekvegvivqlgGqtalnlakeleeagvkilGts 660 
                                                +n+nPEtvstDyd++drLyFe+lt+edvl+i+e e+++gvi+q+gGqt+lnla +l eagv+++Gts
  lcl|NCBI__GCF_900167125.1:WP_078716838.1  603 VNSNPETVSTDYDTSDRLYFEPLTFEDVLNIVEFEQPDGVIIQFGGQTPLNLAVSLMEAGVPMIGTS 669 
                                                ******************************************************************* PP

                                 TIGR01369  661 aesidraEdRekFsklldelgikqpkgkeatsveeakeiakeigyPvlvRpsyvlgGrameivenee 727 
                                                +++idraEdRe+F++ll++l++ qp + +a s+ +a+eia +ig+P+++RpsyvlgGr+m+iv+++e
  lcl|NCBI__GCF_900167125.1:WP_078716838.1  670 PDAIDRAEDRERFKRLLKKLHLRQPLNGTAMSLVQAQEIAGKIGFPLVLRPSYVLGGRGMDIVYSME 736 
                                                ******************************************************************* PP

                                 TIGR01369  728 eleryleeavevskekPvlidkyledavEvdvDavadgeevliagileHiEeaGvHsGDstlvlppq 794 
                                                e+ery++e + vs+e+Pvlidk+le+avEvdvDa+adge+ +i g++eHiEeaG+HsGDs++vlpp+
  lcl|NCBI__GCF_900167125.1:WP_078716838.1  737 EFERYFRESALVSPEHPVLIDKFLEHAVEVDVDALADGEDCYIGGVMEHIEEAGIHSGDSACVLPPN 803 
                                                ******************************************************************* PP

                                 TIGR01369  795 klseevkkkikeivkkiakelkvkGllniqfvvkdeevyviEvnvRasRtvPfvskalgvplvklav 861 
                                                ++s ++ ++i++++k++a el v+Gl+n+q+++kd+evy+iEvn+RasRtvPfvska+gvpl+kla+
  lcl|NCBI__GCF_900167125.1:WP_078716838.1  804 SISPDLIQEIERQTKAMALELGVVGLMNVQYAIKDDEVYIIEVNPRASRTVPFVSKATGVPLAKLAT 870 
                                                ******************************************************************* PP

                                 TIGR01369  862 kvllgkkleelekgvkkekksklvavkaavfsfsklagvdvvlgpemkstGEvmgigrdleeallka 928 
                                                +v+lg+kl++ +    ++ k+  vavk+avf+f+++ +vdv+lgpem+stGEvmgi++++  a++ka
  lcl|NCBI__GCF_900167125.1:WP_078716838.1  871 RVMLGEKLNDIDP--WSMRKKGWVAVKEAVFPFNRFPNVDVILGPEMRSTGEVMGIDYEFGPAFMKA 935 
                                                **********887..6677778********************************************* PP

                                 TIGR01369  929 llaskakikkkgsvllsvkdkdkeellelakklaekglkvyategtakvleeagi.kaevvlkvsee 994 
                                                +la++++++++g++++ v+d dk  +l++++k+ e+g++v+at+gta+ l ++g+ ++e +lkv e 
  lcl|NCBI__GCF_900167125.1:WP_078716838.1  936 QLAAGQVLPEEGTIFVAVNDWDKPLILPIVQKFREMGFRVMATRGTATHLYDNGVtDVEPLLKVYEG 1002
                                                ************************************************999998736899******* PP

                                 TIGR01369  995 aekilellkeeeielvinltskkkkaaekgykirreaveykvplvteletaealleal 1052
                                                ++++++ +k+++i+lvin+ s ++k++++++ ir++a+ y++p+vt++++a+a+++a+
  lcl|NCBI__GCF_900167125.1:WP_078716838.1 1003 RPNVVDHIKNRKISLVINTVS-GRKTVHDSKDIRQAALLYNIPYVTTVAGAKATVQAI 1059
                                                ******************998.78899999***********************99985 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (1052 nodes)
Target sequences:                          1  (1081 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.08u 0.04s 00:00:00.12 Elapsed: 00:00:00.12
# Mc/sec: 9.35
//
[ok]
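
The domain table in the hmmsearch output above can be read programmatically to see how much of the model and of the candidate the alignment spans. The following is a minimal sketch that splits the single domain row on whitespace. The field positions are taken from the column header shown in the report, the model and target lengths from the "M=1052" and "1081 residues searched" lines, and the variable names are illustrative assumptions.

```python
# Domain row copied from the hmmsearch report above.
domain_line = (
    "   1 ! 1503.2   0.0         0         0       1    1052 []"
    "       2    1059 ..       2    1059 .. 0.97"
)
model_len = 1052    # from "Query: TIGR01369 [M=1052]"
target_len = 1081   # from "Target sequences: 1 (1081 residues searched)"

fields = domain_line.split()
score = float(fields[2])
hmm_from, hmm_to = int(fields[6]), int(fields[7])
ali_from, ali_to = int(fields[9]), int(fields[10])
accuracy = float(fields[15])

# Fraction of the HMM and of the candidate protein spanned by the alignment.
hmm_cov = 100.0 * (hmm_to - hmm_from + 1) / model_len
target_cov = 100.0 * (ali_to - ali_from + 1) / target_len

print(f"score {score} bits, accuracy {accuracy}")
print(f"model coverage {hmm_cov:.1f}%, candidate coverage {target_cov:.1f}%")
```

Splitting on whitespace works here because every column in this row is populated; a general-purpose parser would more safely use a library such as Biopython's SearchIO.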

This GapMind analysis is from Apr 10 2024. The underlying query database was built on Apr 09 2024.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
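
The definition of a gap given earlier in this section (a step with no high- or medium-confidence candidates) can be expressed as a short filter. This is only a sketch of that one sentence, not GapMind's implementation: the function name, the data layout, and the step names are hypothetical, and the confidence labels are assumed to have been assigned already.

```python
def find_gaps(candidates_by_step):
    """Return the steps that have no high- or medium-confidence candidate.

    candidates_by_step maps a step name to a list of confidence labels
    ("high", "medium", or "low"), one per candidate protein.
    """
    return [
        step
        for step, levels in candidates_by_step.items()
        if not any(level in ("high", "medium") for level in levels)
    ]


# Hypothetical input: stepA has a high-confidence candidate, stepB has only
# a low-confidence one, and stepC has no candidates at all.
example = {"stepA": ["high", "low"], "stepB": ["low"], "stepC": []}
gaps = find_gaps(example)
print(gaps)
```

A step with only low-confidence candidates counts as a gap here, in the same way as a step with no candidates.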

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory