GapMind for Amino acid biosynthesis

 

Alignments for a candidate for carB in Desulfovibrio oxyclinae DSM 11498

Align Carbamoyl-phosphate synthase large chain, chloroplastic; Carbamoyl-phosphate synthetase ammonia chain; Protein VENOSA 3; EC 6.3.5.5 (characterized)
to candidate WP_018123537.1 B149_RS0102265 carbamoyl-phosphate synthase large subunit

Query= SwissProt::Q42601
         (1187 letters)



>NCBI__GCF_000375485.1:WP_018123537.1
          Length = 1077

 Score = 1239 bits (3207), Expect = 0.0
 Identities = 635/1092 (58%), Positives = 814/1092 (74%), Gaps = 26/1092 (2%)

Query: 94   KRTDLKKIMILGAGPIVIGQACEFDYSGTQACKALREEGYEVILINSNPATIMTDPETAN 153
            KR+DLKKIM++G+GPIVIGQACEFDYSGTQA KAL+EEGYEV+L+NSNPA+IMTDPE A+
Sbjct: 3    KRSDLKKIMLIGSGPIVIGQACEFDYSGTQALKALKEEGYEVVLVNSNPASIMTDPELAD 62

Query: 154  RTYIAPMTPELVEQVIEKERPDALLPTMGGQTALNLAVALAESGALEKYGVELIGAKLGA 213
            RTYI P+ PE V ++IEKERPDALLPT+GGQT LN A+A+AE G L+K+GVELIGA L A
Sbjct: 63   RTYIEPIEPETVARIIEKERPDALLPTLGGQTGLNTALAVAEMGVLDKFGVELIGADLDA 122

Query: 214  IKKAEDRELFKDAMKNIGLKTPPSGIGTTLDECFDIAEKIGEFPLIIRPAFTLGGTGGGI 273
            I+KAE RELF+ AM+NIGL  P SGI    D+  +  EK+  FP+IIRPAFTLGGTGGG+
Sbjct: 123  IQKAESRELFRKAMENIGLTVPLSGIARNQDDIREWGEKLS-FPIIIRPAFTLGGTGGGV 181

Query: 274  AYNKEEFESICKSGLAASATSQVLVEKSLLGWKEYELEVMRDLADNVVIICSIENIDPMG 333
            AYN EE   I   G+AAS  S+V++E+S+LGWKEYELEVMRD  DN VIICSIEN+DPMG
Sbjct: 182  AYNMEELIEIASKGIAASMQSEVMLEESILGWKEYELEVMRDSKDNCVIICSIENLDPMG 241

Query: 334  VHTGDSITVAPAQTLTDREYQRLRDYSIAIIREIGVECGGSNVQFAVNPVDGEVMIIEMN 393
            VHTGDS+TVAPAQTLTD EYQ++RD S+AI+REIGVE GGSNVQFAVNP +G++++IEMN
Sbjct: 242  VHTGDSVTVAPAQTLTDDEYQKMRDASLAIMREIGVETGGSNVQFAVNPENGDLVVIEMN 301

Query: 394  PRVSRSSALASKATGFPIAKMAAKLSVGYTLDQIPNDITRKTPASFEPSIDYVVTKIPRF 453
            PRVSRSSALASKATGFPIAK+AAKL+VGYTLD+IPNDITR+T ASFEP+IDY V KIPRF
Sbjct: 302  PRVSRSSALASKATGFPIAKIAAKLAVGYTLDEIPNDITRETMASFEPAIDYCVIKIPRF 361

Query: 454  AFEKFPGSQPLLTTQMKSVGESMALGRTFQESFQKALRSLECGFSGWGCAKIKELDWDWD 513
             FEKFPG++  L+T MKSVGE+MA+GRTF+E+ QK LRSLE G  G G  + ++ + + D
Sbjct: 362  TFEKFPGAEDYLSTAMKSVGETMAIGRTFKEALQKGLRSLETGHVGLG-KRFEKCEIERD 420

Query: 514  QLKYSLRVPNPDRIHAIYAAMKKGMKIDEIYELSMVDKWFLTQLKELVDVEQYLMSGTLS 573
            +L   LR PN  R+ A+  AM+ GM ++E+++ + +D WFL Q ++++D+E+ L+   + 
Sbjct: 421  ELLQLLRKPNSQRLFALRNAMRCGMSLEEVHDATWIDPWFLGQFRDVLDMEERLIEFGIK 480

Query: 574  EITKED-------LYEVKKRGFSDKQIAFATKTTEEEVRTKRISLGVVPSYKRVDTCAAE 626
            E  +E        L + K+ G+SD Q+A   KT+E+ VR  R+ LG+ P+Y  VDTCAAE
Sbjct: 481  EGIEESTPELPAILRKAKEYGYSDAQLATMWKTSEDAVRNLRLKLGIRPTYYLVDTCAAE 540

Query: 627  FEAHTPYMYSSYDVECESAPNNKKKVLILGGGPNRIGQGIEFDYCCCHTSFALQDAGYET 686
            FEA+TPY YS+Y+   E  P   +K++ILGGGPNRIGQGIEFDYCCCH+SF L++ G ++
Sbjct: 541  FEAYTPYYYSTYEGGDEIEPAEGRKIVILGGGPNRIGQGIEFDYCCCHSSFMLREMGIKS 600

Query: 687  IMLNSNPETVSTDYDTSDRLYFEPLTIEDVLNVIDLEKPDGIIVQFGGQTPLKLALPIKH 746
            IM+NSNPETVSTDYDTSDRLYFEPLT EDV+N+ID EKPDG++VQFGGQTPL LA+    
Sbjct: 601  IMVNSNPETVSTDYDTSDRLYFEPLTFEDVMNIIDFEKPDGVVVQFGGQTPLNLAI---- 656

Query: 747  YLDKHMPMSLSGAGPVRIWGTSPDSIDAAEDRERFNAILDELKIEQPKGGIAKSEADALA 806
                     L  AG V + GTSPD+ID AEDRERF  +L++L ++QP  G A S  +A  
Sbjct: 657  --------RLMNAG-VPLIGTSPDAIDRAEDRERFKQLLNKLHLKQPPNGTAMSMVEARE 707

Query: 807  IAKEVGYPVVVRPSYVLGGRAMEIVYDDSRLITYLENAVQVDPERPVLVDKYLSDAIEID 866
            IA+++G+P+V+RPSYVLGGR M+IVY       Y   +  V PE P L+DK+L  A+E+D
Sbjct: 708  IAEKLGFPLVLRPSYVLGGRGMDIVYSMEDFERYFRESALVSPEHPTLIDKFLEYAVEVD 767

Query: 867  VDTLTDSYGNVVIGGIMEHIEQAGVHSGDSACMLPTQTIPASCLQTIRTWTTKLAKKLNV 926
            VD L D    V IGG+MEHIE+AG+HSGDSA +LP  ++    ++ I   TT +A +L V
Sbjct: 768  VDALCDG-EQVYIGGVMEHIEEAGIHSGDSASVLPPYSLSPEMVREIERQTTAMAIELGV 826

Query: 927  CGLMNCQYAITTSGDVFLLEANPRASRTVPFVSKAIGHPLAKYAALVMSGKSLKDLNFEK 986
             GLMN QYAI    +V+++E NPRASRTVPFVSKA   P+AK A  VM G+ +KDL    
Sbjct: 827  VGLMNVQYAI-KDDEVYVIEVNPRASRTVPFVSKATAVPMAKLATRVMMGEKIKDLKPWS 885

Query: 987  EVIPKHVSVKEAVFPFEKFQGCDVILGPEMRSTGEVMSISSEFSSAFAMAQIAAGQKLPL 1046
                 H+S+KE+VFPF +F   DV+LGPEMRSTGEVM I   F  A+  +Q+AAGQKLP 
Sbjct: 886  MRKKGHISIKESVFPFNRFPNVDVLLGPEMRSTGEVMGIDESFGLAYMKSQLAAGQKLPK 945

Query: 1047 SGTVFLSLNDMTKPHLEKIAVSFLELGFKIVATSGTAHFLELKGIPVERVLKLHEG-RPH 1105
             G VF+S+ND  K  +      F ++GF I+AT GTA +LE KG+ V+RV K+HEG RP+
Sbjct: 946  GGNVFVSVNDWDKNKVLLPVRDFQDMGFSILATGGTADYLEEKGVKVQRVYKVHEGQRPN 1005

Query: 1106 AADMVANGQIHLMLITSSGDALDQKDGRQLRQMALAYKVPVITTVAGALATAEGIKSLKS 1165
              D++ NG I L+L T SG      D + +RQ  L Y +P  TT++GA A A+ I  L+ 
Sbjct: 1006 VVDLIKNGDIDLVLNTPSGKK-TVGDSKMIRQATLLYNIPYTTTISGARAVAQAIYELRE 1064

Query: 1166 SAIKMTALQDFF 1177
            + + + ++Q+++
Sbjct: 1065 TGLVVKSIQEYY 1076


Lambda     K      H
   0.317    0.133    0.383 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 3124
Number of extensions: 130
Number of successful extensions: 20
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1187
Length of database: 1077
Length adjustment: 46
Effective length of query: 1141
Effective length of database: 1031
Effective search space:  1176371
Effective search space used:  1176371
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 58 (26.9 bits)
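
The summary figures in this report can be re-derived from the numbers it prints. The sketch below assumes that the effective lengths are the raw lengths minus the reported length adjustment, and that the percentages on the Identities line are truncated to whole numbers (truncation is consistent with all three values reported here). The function names are illustrative and are not part of BLAST or GapMind.

```python
def effective_search_space(query_len: int, db_len: int, length_adjustment: int) -> int:
    """Product of the length-adjusted query and database lengths."""
    return (query_len - length_adjustment) * (db_len - length_adjustment)


def percent_of_alignment(count: int, alignment_len: int) -> int:
    """Whole-number percentage of alignment columns (truncated, an assumption)."""
    return 100 * count // alignment_len


# Values copied from the BLAST report above.
space = effective_search_space(query_len=1187, db_len=1077, length_adjustment=46)
identity = percent_of_alignment(635, 1092)
positives = percent_of_alignment(814, 1092)
gaps = percent_of_alignment(26, 1092)

print(space, identity, positives, gaps)
```

With the values above, the results match the report: 1141 x 1031 = 1176371 for the effective search space, and 58%, 74%, and 2% for identities, positives, and gaps.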

Align candidate WP_018123537.1 B149_RS0102265 (carbamoyl-phosphate synthase large subunit)
to HMM TIGR01369 (carB: carbamoyl-phosphate synthase, large subunit (EC 6.3.5.5))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/TIGR01369.hmm
# target sequence database:        /tmp/gapView.3544.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01369  [M=1052]
Accession:   TIGR01369
Description: CPSaseII_lrg: carbamoyl-phosphate synthase, large subunit
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                 Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                 -----------
          0 1511.2   0.0          0 1511.0   0.0    1.0  1  lcl|NCBI__GCF_000375485.1:WP_018123537.1  B149_RS0102265 carbamoyl-phospha


Domain annotation for each sequence (and alignments):
>> lcl|NCBI__GCF_000375485.1:WP_018123537.1  B149_RS0102265 carbamoyl-phosphate synthase large subunit
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 ! 1511.0   0.0         0         0       1    1052 []       2    1059 ..       2    1059 .. 0.98

  Alignments for each domain:
  == domain 1  score: 1511.0 bits;  conditional E-value: 0
                                 TIGR01369    1 pkredikkvlviGsGpivigqAaEFDYsGsqalkalkeegievvLvnsniAtvmtdeeladkvYieP 67  
                                                pkr+d+kk+++iGsGpivigqA+EFDYsG+qalkalkeeg+evvLvnsn+A +mtd+elad++YieP
  lcl|NCBI__GCF_000375485.1:WP_018123537.1    2 PKRSDLKKIMLIGSGPIVIGQACEFDYSGTQALKALKEEGYEVVLVNSNPASIMTDPELADRTYIEP 68  
                                                6899*************************************************************** PP

                                 TIGR01369   68 ltveavekiiekErpDailltlGGqtaLnlaveleekGvLekygvkllGtkveaikkaedRekFkea 134 
                                                +++e+v++iiekErpDa+l+tlGGqt+Ln a+ + e+GvL+k+gv+l+G++++ai+kae+Re+F++a
  lcl|NCBI__GCF_000375485.1:WP_018123537.1   69 IEPETVARIIEKERPDALLPTLGGQTGLNTALAVAEMGVLDKFGVELIGADLDAIQKAESRELFRKA 135 
                                                ******************************************************************* PP

                                 TIGR01369  135 lkeineevakseivesveealeaaeeigyPvivRaaftlgGtGsgiaeneeelkelvekalkaspik 201 
                                                +++i++ v+ s i++++++  e  e++ +P+i+R+aftlgGtG+g+a+n+eel e+++k+++as+ +
  lcl|NCBI__GCF_000375485.1:WP_018123537.1  136 MENIGLTVPLSGIARNQDDIREWGEKLSFPIIIRPAFTLGGTGGGVAYNMEELIEIASKGIAASMQS 202 
                                                ******************************************************************* PP

                                 TIGR01369  202 qvlvekslagwkEiEyEvvRDskdnciivcniEnlDplGvHtGdsivvaPsqtLtdkeyqllRdasl 268 
                                                +v++e+s+ gwkE+E+Ev+RDskdnc+i+c+iEnlDp+GvHtGds++vaP+qtLtd+eyq++Rdasl
  lcl|NCBI__GCF_000375485.1:WP_018123537.1  203 EVMLEESILGWKEYELEVMRDSKDNCVIICSIENLDPMGVHTGDSVTVAPAQTLTDDEYQKMRDASL 269 
                                                ******************************************************************* PP

                                 TIGR01369  269 kiirelgvege.cnvqfaldPeskryvviEvnpRvsRssALAskAtGyPiAkvaaklavGysLdelk 334 
                                                +i+re+gve++ +nvqfa++Pe+ ++vviE+npRvsRssALAskAtG+PiAk+aaklavGy+Lde++
  lcl|NCBI__GCF_000375485.1:WP_018123537.1  270 AIMREIGVETGgSNVQFAVNPENGDLVVIEMNPRVSRSSALASKATGFPIAKIAAKLAVGYTLDEIP 336 
                                                *********988******************************************************* PP

                                 TIGR01369  335 ndvtketvAsfEPslDYvvvkiPrwdldkfekvdrklgtqmksvGEvmaigrtfeealqkalrslee 401 
                                                nd+t+et+AsfEP++DY+v+kiPr+ ++kf ++++ l+t mksvGE maigrtf+ealqk+lrsle+
  lcl|NCBI__GCF_000375485.1:WP_018123537.1  337 NDITRETMASFEPAIDYCVIKIPRFTFEKFPGAEDYLSTAMKSVGETMAIGRTFKEALQKGLRSLET 403 
                                                ******************************************************************* PP

                                 TIGR01369  402 kllglklk.ekeaesdeeleealkkpndrRlfaiaealrrgvsveevyeltkidrffleklkklvel 467 
                                                + +gl ++ ek +++++el + l+kpn +Rlfa+ +a+r g+s+eev+++t id +fl ++++++++
  lcl|NCBI__GCF_000375485.1:WP_018123537.1  404 GHVGLGKRfEKCEIERDELLQLLRKPNSQRLFALRNAMRCGMSLEEVHDATWIDPWFLGQFRDVLDM 470 
                                                ****655425677888999************************************************ PP

                                 TIGR01369  468 ekeleeeklk....elkke...llkkakklGfsdeqiaklvkvseaevrklrkelgivpvvkrvDtv 527 
                                                e++l e  +k    e + e    l+kak++G+sd+q+a+++k+se++vr+lr +lgi p++  vDt+
  lcl|NCBI__GCF_000375485.1:WP_018123537.1  471 EERLIEFGIKegieESTPElpaILRKAKEYGYSDAQLATMWKTSEDAVRNLRLKLGIRPTYYLVDTC 537 
                                                ***998855522223333322279******************************************* PP

                                 TIGR01369  528 aaEfeaktpYlYstyeeekddvevtekkkvlvlGsGpiRigqgvEFDycavhavlalreagyktili 594 
                                                aaEfea tpY+Ystye+  d++e  e +k+++lG+Gp+Rigqg+EFDyc+ h++  lre+g+k+i++
  lcl|NCBI__GCF_000375485.1:WP_018123537.1  538 AAEFEAYTPYYYSTYEGG-DEIEPAEGRKIVILGGGPNRIGQGIEFDYCCCHSSFMLREMGIKSIMV 603 
                                                ******************.************************************************ PP

                                 TIGR01369  595 nynPEtvstDydiadrLyFeeltvedvldiiekekvegvivqlgGqtalnlakeleeagvkilGtsa 661 
                                                n+nPEtvstDyd++drLyFe+lt+edv++ii+ ek++gv+vq+gGqt+lnla +l +agv+++Gts+
  lcl|NCBI__GCF_000375485.1:WP_018123537.1  604 NSNPETVSTDYDTSDRLYFEPLTFEDVMNIIDFEKPDGVVVQFGGQTPLNLAIRLMNAGVPLIGTSP 670 
                                                ******************************************************************* PP

                                 TIGR01369  662 esidraEdRekFsklldelgikqpkgkeatsveeakeiakeigyPvlvRpsyvlgGrameiveneee 728 
                                                ++idraEdRe+F++ll++l++kqp + +a s+ ea+eia+++g+P+++RpsyvlgGr+m+iv+++e+
  lcl|NCBI__GCF_000375485.1:WP_018123537.1  671 DAIDRAEDRERFKQLLNKLHLKQPPNGTAMSMVEAREIAEKLGFPLVLRPSYVLGGRGMDIVYSMED 737 
                                                ******************************************************************* PP

                                 TIGR01369  729 leryleeavevskekPvlidkyledavEvdvDavadgeevliagileHiEeaGvHsGDstlvlppqk 795 
                                                +ery++e + vs+e+P lidk+le avEvdvDa++dge+v+i g++eHiEeaG+HsGDs+ vlpp +
  lcl|NCBI__GCF_000375485.1:WP_018123537.1  738 FERYFRESALVSPEHPTLIDKFLEYAVEVDVDALCDGEQVYIGGVMEHIEEAGIHSGDSASVLPPYS 804 
                                                ******************************************************************* PP

                                 TIGR01369  796 lseevkkkikeivkkiakelkvkGllniqfvvkdeevyviEvnvRasRtvPfvskalgvplvklavk 862 
                                                ls e++++i+++++++a el v+Gl+n+q+++kd+evyviEvn+RasRtvPfvska+ vp++kla++
  lcl|NCBI__GCF_000375485.1:WP_018123537.1  805 LSPEMVREIERQTTAMAIELGVVGLMNVQYAIKDDEVYVIEVNPRASRTVPFVSKATAVPMAKLATR 871 
                                                ******************************************************************* PP

                                 TIGR01369  863 vllgkkleelekgvkkekksklvavkaavfsfsklagvdvvlgpemkstGEvmgigrdleeallkal 929 
                                                v++g+k+++l+    ++ k+ ++++k++vf+f+++ +vdv+lgpem+stGEvmgi++++  a++k++
  lcl|NCBI__GCF_000375485.1:WP_018123537.1  872 VMMGEKIKDLKP--WSMRKKGHISIKESVFPFNRFPNVDVLLGPEMRSTGEVMGIDESFGLAYMKSQ 936 
                                                *********987..66777789********************************************* PP

                                 TIGR01369  930 laskakikkkgsvllsvkdkdkeellelakklaekglkvyategtakvleeagikaevvlkvseea. 995 
                                                la+++k++k g+v++sv+d dk+++l  ++ ++++g++++at gta++lee+g+k++ v+kv+e + 
  lcl|NCBI__GCF_000375485.1:WP_018123537.1  937 LAAGQKLPKGGNVFVSVNDWDKNKVLLPVRDFQDMGFSILATGGTADYLEEKGVKVQRVYKVHEGQr 1003
                                                *************************************************************999765 PP

                                 TIGR01369  996 ekilellkeeeielvinltskkkkaaekgykirreaveykvplvteletaealleal 1052
                                                +++++l+k+++i+lv+n++s +kk++ ++++ir++++ y++p+ t++++a+a+++a+
  lcl|NCBI__GCF_000375485.1:WP_018123537.1 1004 PNVVDLIKNGDIDLVLNTPS-GKKTVGDSKMIRQATLLYNIPYTTTISGARAVAQAI 1059
                                                99***************998.77788999***********************99985 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (1052 nodes)
Target sequences:                          1  (1077 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.07u 0.02s 00:00:00.09 Elapsed: 00:00:00.09
# Mc/sec: 12.13
//
[ok]
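
The domain table row above gives the coordinates needed to check how much of the model and of the candidate the hit spans. The following is a minimal sketch that splits the human-readable row on whitespace; the field positions are an assumption that holds for the row shown here, and `parse_domain_row` and `coverage` are illustrative helpers rather than HMMER functions. For routine parsing, the tabular file written by hmmsearch's `--domtblout` option is a sturdier input.

```python
def parse_domain_row(row: str) -> dict:
    """Pull selected fields out of one hmmsearch domain-annotation row."""
    fields = row.split()
    return {
        "score": float(fields[2]),
        "hmm_from": int(fields[6]),
        "hmm_to": int(fields[7]),
        "ali_from": int(fields[9]),
        "ali_to": int(fields[10]),
        "acc": float(fields[15]),
    }


def coverage(start: int, end: int, total_len: int) -> float:
    """Fraction of a sequence or model spanned by an inclusive interval."""
    return (end - start + 1) / total_len


# Row copied from the domain annotation table above.
row = ("   1 ! 1511.0   0.0         0         0       1    1052 []"
       "       2    1059 ..       2    1059 .. 0.98")
dom = parse_domain_row(row)

hmm_cov = coverage(dom["hmm_from"], dom["hmm_to"], 1052)  # model length M=1052
seq_cov = coverage(dom["ali_from"], dom["ali_to"], 1077)  # candidate length

print(f"HMM coverage: {hmm_cov:.3f}, candidate coverage: {seq_cov:.3f}")
```

For this hit the alignment runs from model position 1 to 1052, the full length of TIGR01369, and spans residues 2 to 1059 of the 1077-residue candidate.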

This GapMind analysis is from Apr 10 2024. The underlying query database was built on Apr 09 2024.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
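
As an illustration of the 50% coverage floor only, the sketch below computes the fraction of the characterized protein spanned by the alignment shown on this page. It assumes that coverage means the aligned span of the characterized protein divided by its full length; that definition, and the function names, are assumptions for illustration and not GapMind's code. The high- and medium-confidence criteria are not reproduced here.

```python
LOW_CONFIDENCE_MIN_COVERAGE = 0.5  # "at least 50% coverage", from the text above


def aligned_fraction(first: int, last: int, protein_len: int) -> float:
    """Fraction of a protein covered by an inclusive aligned interval."""
    return (last - first + 1) / protein_len


def meets_low_confidence_floor(first: int, last: int, protein_len: int) -> bool:
    """True if the hit covers at least half of the characterized protein."""
    return aligned_fraction(first, last, protein_len) >= LOW_CONFIDENCE_MIN_COVERAGE


# Alignment on this page: residues 94..1177 of the 1187-residue Q42601.
frac = aligned_fraction(94, 1177, 1187)
passes = meets_low_confidence_floor(94, 1177, 1187)

print(f"{frac:.3f}", passes)
```

The alignment above spans 1084 of 1187 residues of the characterized protein, well above this floor.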

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory