GapMind for Amino acid biosynthesis

 

Alignments for a candidate for carB in Thermovibrio ammonificans HB-1

Align Carbamoyl-phosphate synthase large chain, chloroplastic; Carbamoyl-phosphate synthetase ammonia chain; Protein VENOSA 3; EC 6.3.5.5 (characterized)
to candidate WP_013538311.1 THEAM_RS07915 carbamoyl-phosphate synthase large subunit

Query= SwissProt::Q42601
         (1187 letters)



>NCBI__GCF_000185805.1:WP_013538311.1
          Length = 1073

 Score = 1263 bits (3267), Expect = 0.0
 Identities = 651/1089 (59%), Positives = 827/1089 (75%), Gaps = 26/1089 (2%)

Query: 94   KRTDLKKIMILGAGPIVIGQACEFDYSGTQACKALREEGYEVILINSNPATIMTDPETAN 153
            KRTDL+KI+I+G+GPIVIGQA EFDYSGTQACKAL+EEGY+V+L+NSNPATIMTDP+ A+
Sbjct: 3    KRTDLRKILIIGSGPIVIGQAAEFDYSGTQACKALKEEGYQVVLVNSNPATIMTDPDIAD 62

Query: 154  RTYIAPMTPELVEQVIEKERPDALLPTMGGQTALNLAVALAESGALEKYGVELIGAKLGA 213
            RTYI P+T E++E++IEKERPDALLPT+GGQTALNLAV L E+G LEKYGVELIGAK+ A
Sbjct: 63   RTYIEPLTVEVLEKIIEKERPDALLPTVGGQTALNLAVKLHEAGILEKYGVELIGAKVEA 122

Query: 214  IKKAEDRELFKDAMKNIGLKTPPSGIGTTLDECFDIAEKIGEFPLIIRPAFTLGGTGGGI 273
            IKKAEDRELFK+AM  IGL+ P SG   TL+E  ++ +++G  P IIRPAFTLGG GGG+
Sbjct: 123  IKKAEDRELFKEAMLKIGLEVPKSGTAHTLEEALEVVKEVG-LPAIIRPAFTLGGEGGGV 181

Query: 274  AYNKEEFESICKSGLAASATSQVLVEKSLLGWKEYELEVMRDLADNVVIICSIENIDPMG 333
            AYN EEF+ I K GLAAS  S++L+E+S+LGWKEYELEVMRDL DNVVIICSIEN+DPMG
Sbjct: 182  AYNIEEFKEIAKKGLAASPVSEILIEESVLGWKEYELEVMRDLNDNVVIICSIENVDPMG 241

Query: 334  VHTGDSITVAPAQTLTDREYQRLRDYSIAIIREIGVECGGSNVQFAVNPVDGEVMIIEMN 393
            VHTGDSITVAPAQTLTD+EYQ LRD +IAIIREIGVE GGSNVQFAVNP +G V++IEMN
Sbjct: 242  VHTGDSITVAPAQTLTDKEYQVLRDAAIAIIREIGVETGGSNVQFAVNPENGRVIVIEMN 301

Query: 394  PRVSRSSALASKATGFPIAKMAAKLSVGYTLDQIPNDITRKTPASFEPSIDYVVTKIPRF 453
            PRVSRSSALASKATGFPIAK+AAKL+VGYTLD++PNDIT+KTPASFEP+IDY V K PR+
Sbjct: 302  PRVSRSSALASKATGFPIAKIAAKLAVGYTLDELPNDITKKTPASFEPAIDYCVVKFPRW 361

Query: 454  AFEKFPGSQPLLTTQMKSVGESMALGRTFQESFQKALRSLECGFSGWGCAKIKELDWDWD 513
            AFEKFP +   LTT+MKSVGE MA+GRTF+E+  KA+RSLE G  G     ++ L     
Sbjct: 362  AFEKFPEADSTLTTRMKSVGEVMAIGRTFKEALLKAVRSLEIGRYGLTMKGVERL--SDS 419

Query: 514  QLKYSLRVPNPDRIHAIYAAMKKGMKIDEIYELSMVDKWFLTQLKELVDVEQYLMSGTLS 573
            +L+  + VPN DRI  I  A ++G  ++++YELS +D+WFL  +++LV++E  L   T+ 
Sbjct: 420  ELESRIAVPNADRIWYIAEAFRRGWSLEKLYELSRIDRWFLHNIRQLVELEGELKKHTVD 479

Query: 574  EITKEDLYEVKKRGFSDKQIAFATKTTEEEVRTKRISLGVVPSYKRVDTCAAEFEAHTPY 633
             +  E L   KK GFSD++IA    TTE+ VR KR SL  V  YK VDTCA EFEA+TPY
Sbjct: 480  TVPGELLKWAKKWGFSDREIASLLGTTEKAVREKRKSLAPV-LYKTVDTCAGEFEAYTPY 538

Query: 634  MYSSYD-VECESAPNNKKKVLILGGGPNRIGQGIEFDYCCCHTSFALQDAGYETIMLNSN 692
             YS+YD  ECE+ P+ K+KV + G GPNRIGQG+EFDYCC H  +AL++ GYE  M+N N
Sbjct: 539  YYSTYDGRECEANPSKKEKVTVFGSGPNRIGQGVEFDYCCVHAVWALRELGYEAHMVNCN 598

Query: 693  PETVSTDYDTSDRLYFEPLTIEDVLNVIDLEKPDGIIVQFGGQTPLKLALPIKHYLDKHM 752
            PETVSTDYDTSD+L+FEPLT+ED LNV++ EKP G++VQFGGQTPLKL++P++       
Sbjct: 599  PETVSTDYDTSDKLFFEPLTLEDALNVVEKEKPLGVVVQFGGQTPLKLSVPLER------ 652

Query: 753  PMSLSGAGPVRIWGTSPDSIDAAEDRERFNAILDELKIEQPKGGIAKSEADALAIAKEVG 812
                     V+I GTS +SID AEDRERF  +L+ L ++QP  GIA+S  +A  IA+E+G
Sbjct: 653  -------EGVKILGTSSESIDIAEDRERFRELLNRLGLKQPPSGIARSLEEAEKIAQEIG 705

Query: 813  YPVVVRPSYVLGGRAMEIVYDDSRLITYLENAVQVDPERPVLVDKYLSDAIEIDVDTLTD 872
            +PV++RPSYVLGGRAM IVY    L  Y+  AV+V  E+PVL+DK+L DA+E DVD + D
Sbjct: 706  FPVLMRPSYVLGGRAMRIVYSMEELRQYMAEAVEVSEEKPVLIDKFLEDAVEFDVDAVAD 765

Query: 873  SYGNVVIGGIMEHIEQAGVHSGDSACMLPTQTIPASCLQTIRTWTTKLAKKLNVCGLMNC 932
                VVIGG+MEHIE+AG+HSGDSAC+LPT ++    ++ I+  T K+A +LNV GL+N 
Sbjct: 766  G-EEVVIGGVMEHIEEAGIHSGDSACVLPTFSVSRDIVEKIKEITRKIALELNVKGLINI 824

Query: 933  QYAITTSGDVFLLEANPRASRTVPFVSKAIGHPLAKYAALVMSGKSLKDLNFEKEVIPKH 992
            Q+A+   G+++++E NPRASRTVPFVSKA G PLAK A  V  GK L++L   KEV P++
Sbjct: 825  QFAV-KDGEIYIIEVNPRASRTVPFVSKATGIPLAKVATKVAMGKKLRELGV-KEVEPEY 882

Query: 993  VSVKEAVFPFEKFQGCDVILGPEMRSTGEVMSISSEFSSAFAMAQIAAGQKLPLS---GT 1049
             SVKEAVFPF +F   D +LGPEM+STGEVM I  +   AF  AQ+AAG +LPL    G 
Sbjct: 883  YSVKEAVFPFNRFPEVDPVLGPEMKSTGEVMGIDPDVGIAFYKAQLAAGSRLPLDPSCGK 942

Query: 1050 VFLSLNDMTKPHLEKIAVSFLELGFKIVATSGTAHFLELKGIPVERVLKLHEG-RPHAAD 1108
            VF+S+ D  KP +  +A    ++GFKIV+T GT  FL  KGIPVE V KL EG RP+  D
Sbjct: 943  VFISVKDKDKPKIYSVAKQLADMGFKIVSTEGTYRFLREKGIPVELVYKLQEGRRPNIGD 1002

Query: 1109 MVANGQIHLMLITSSGDALDQKDGRQLRQMALAYKVPVITTVAGALATAEGIKSLKSSAI 1168
            ++ NG++ L++ T +G    ++D   +R++A+ Y +P  TTV GA      I +++  ++
Sbjct: 1003 LIKNGEVCLIINTPTG-TRSKRDAYSIRRLAVNYGIPYYTTVRGAEMAVRAISAMRKGSL 1061

Query: 1169 KMTALQDFF 1177
             +  LQD++
Sbjct: 1062 NVKPLQDYY 1070


Lambda     K      H
   0.317    0.133    0.383 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 3174
Number of extensions: 134
Number of successful extensions: 22
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 1187
Length of database: 1073
Length adjustment: 46
Effective length of query: 1141
Effective length of database: 1027
Effective search space:  1171807
Effective search space used:  1171807
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 58 (26.9 bits)
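
The summary figures in the report above can be cross-checked with a few lines of arithmetic. The sketch below is illustrative and not part of GapMind or BLAST; the function names are hypothetical. It assumes the standard Karlin-Altschul conversion from raw score to bit score, S' = (lambda * S - ln K) / ln 2, with the gapped lambda and K, and the usual definition of effective search space as the product of the length-adjusted query and database lengths. All inputs are copied from the report.

```python
import math

# Values copied from the BLAST report above.
RAW_SCORE = 3267          # "Score = 1263 bits (3267)"
GAPPED_LAMBDA = 0.267     # gapped Lambda
GAPPED_K = 0.0410         # gapped K
QUERY_LEN = 1187
SUBJECT_LEN = 1073
LENGTH_ADJUSTMENT = 46
ALIGN_LEN = 1089          # denominator in Identities / Positives / Gaps


def bit_score(raw_score, lam, k):
    """Karlin-Altschul normalization of a raw alignment score to bits."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def effective_search_space(query_len, db_len, adjustment, n_seqs=1):
    """Product of the length-adjusted query and database lengths."""
    return (query_len - adjustment) * (db_len - n_seqs * adjustment)


def whole_percent(count, total):
    """Percentage rounded down, matching the whole-number figures in the report."""
    return 100 * count // total


def coverage(start, end, length):
    """Fraction of a sequence spanned by an alignment (1-based, inclusive)."""
    return (end - start + 1) / length


print(round(bit_score(RAW_SCORE, GAPPED_LAMBDA, GAPPED_K)))                  # 1263
print(effective_search_space(QUERY_LEN, SUBJECT_LEN, LENGTH_ADJUSTMENT))     # 1171807
print(whole_percent(651, ALIGN_LEN), whole_percent(827, ALIGN_LEN))          # 59 75
print(f"{coverage(94, 1177, QUERY_LEN):.1%} of the query is aligned")
print(f"{coverage(3, 1070, SUBJECT_LEN):.1%} of the candidate is aligned")
```

The first three results reproduce the reported bit score, effective search space, and identity/positive percentages. The two coverage figures are derived from the first and last alignment coordinates (query 94 to 1177, subject 3 to 1070) and do not appear in the report itself.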

Align candidate WP_013538311.1 THEAM_RS07915 (carbamoyl-phosphate synthase large subunit)
to HMM TIGR01369 (carB: carbamoyl-phosphate synthase, large subunit (EC 6.3.5.5))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/TIGR01369.hmm
# target sequence database:        /tmp/gapView.6777.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR01369  [M=1052]
Accession:   TIGR01369
Description: CPSaseII_lrg: carbamoyl-phosphate synthase, large subunit
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                 Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                 -----------
          0 1607.0   0.0          0 1606.8   0.0    1.0  1  lcl|NCBI__GCF_000185805.1:WP_013538311.1  THEAM_RS07915 carbamoyl-phosphat


Domain annotation for each sequence (and alignments):
>> lcl|NCBI__GCF_000185805.1:WP_013538311.1  THEAM_RS07915 carbamoyl-phosphate synthase large subunit
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 ! 1606.8   0.0         0         0       1    1051 [.       2    1052 ..       2    1053 .. 0.99

  Alignments for each domain:
  == domain 1  score: 1606.8 bits;  conditional E-value: 0
                                 TIGR01369    1 pkredikkvlviGsGpivigqAaEFDYsGsqalkalkeegievvLvnsniAtvmtdeeladkvYieP 67  
                                                pkr+d++k+l+iGsGpivigqAaEFDYsG+qa+kalkeeg++vvLvnsn+At+mtd+++ad++YieP
  lcl|NCBI__GCF_000185805.1:WP_013538311.1    2 PKRTDLRKILIIGSGPIVIGQAAEFDYSGTQACKALKEEGYQVVLVNSNPATIMTDPDIADRTYIEP 68  
                                                689**************************************************************** PP

                                 TIGR01369   68 ltveavekiiekErpDailltlGGqtaLnlaveleekGvLekygvkllGtkveaikkaedRekFkea 134 
                                                ltve++ekiiekErpDa+l+t+GGqtaLnlav+l+e+G+Lekygv+l+G+kveaikkaedRe+Fkea
  lcl|NCBI__GCF_000185805.1:WP_013538311.1   69 LTVEVLEKIIEKERPDALLPTVGGQTALNLAVKLHEAGILEKYGVELIGAKVEAIKKAEDRELFKEA 135 
                                                ******************************************************************* PP

                                 TIGR01369  135 lkeineevakseivesveealeaaeeigyPvivRaaftlgGtGsgiaeneeelkelvekalkaspik 201 
                                                + +i++ev+ks ++++ eeale+++e+g+P i+R+aftlgG+G+g+a+n ee+ke+++k+l+asp++
  lcl|NCBI__GCF_000185805.1:WP_013538311.1  136 MLKIGLEVPKSGTAHTLEEALEVVKEVGLPAIIRPAFTLGGEGGGVAYNIEEFKEIAKKGLAASPVS 202 
                                                ******************************************************************* PP

                                 TIGR01369  202 qvlvekslagwkEiEyEvvRDskdnciivcniEnlDplGvHtGdsivvaPsqtLtdkeyqllRdasl 268 
                                                ++l+e+s+ gwkE+E+Ev+RD +dn++i+c+iEn+Dp+GvHtGdsi+vaP+qtLtdkeyq lRda++
  lcl|NCBI__GCF_000185805.1:WP_013538311.1  203 EILIEESVLGWKEYELEVMRDLNDNVVIICSIENVDPMGVHTGDSITVAPAQTLTDKEYQVLRDAAI 269 
                                                ******************************************************************* PP

                                 TIGR01369  269 kiirelgvege.cnvqfaldPeskryvviEvnpRvsRssALAskAtGyPiAkvaaklavGysLdelk 334 
                                                +iire+gve++ +nvqfa++Pe+ r++viE+npRvsRssALAskAtG+PiAk+aaklavGy+Ldel+
  lcl|NCBI__GCF_000185805.1:WP_013538311.1  270 AIIREIGVETGgSNVQFAVNPENGRVIVIEMNPRVSRSSALASKATGFPIAKIAAKLAVGYTLDELP 336 
                                                *********988******************************************************* PP

                                 TIGR01369  335 ndvtketvAsfEPslDYvvvkiPrwdldkfekvdrklgtqmksvGEvmaigrtfeealqkalrslee 401 
                                                nd+tk+t+AsfEP++DY+vvk+Prw+++kf ++d++l+t+mksvGEvmaigrtf+eal+ka+rsle+
  lcl|NCBI__GCF_000185805.1:WP_013538311.1  337 NDITKKTPASFEPAIDYCVVKFPRWAFEKFPEADSTLTTRMKSVGEVMAIGRTFKEALLKAVRSLEI 403 
                                                ******************************************************************* PP

                                 TIGR01369  402 kllglklkekeaesdeeleealkkpndrRlfaiaealrrgvsveevyeltkidrffleklkklvele 468 
                                                + +gl++k  e  sd+ele+++  pn++R+++iaea+rrg s+e++yel++idr+fl+++++lvele
  lcl|NCBI__GCF_000185805.1:WP_013538311.1  404 GRYGLTMKGVERLSDSELESRIAVPNADRIWYIAEAFRRGWSLEKLYELSRIDRWFLHNIRQLVELE 470 
                                                ******************************************************************* PP

                                 TIGR01369  469 keleeeklkelkkellkkakklGfsdeqiaklvkvseaevrklrkelgivpvvkrvDtvaaEfeakt 535 
                                                 el++++++ ++ ellk akk Gfsd++ia+l++++e++vr+ rk+l    ++k+vDt+a+Efea t
  lcl|NCBI__GCF_000185805.1:WP_013538311.1  471 GELKKHTVDTVPGELLKWAKKWGFSDREIASLLGTTEKAVREKRKSLA-PVLYKTVDTCAGEFEAYT 536 
                                                ********************************************9875.4579************** PP

                                 TIGR01369  536 pYlYstyeeekddvevtekkkvlvlGsGpiRigqgvEFDycavhavlalreagyktilinynPEtvs 602 
                                                pY+Ysty + + +++ ++k+kv v GsGp+RigqgvEFDyc+vhav+alre gy++ ++n+nPEtvs
  lcl|NCBI__GCF_000185805.1:WP_013538311.1  537 PYYYSTYDGRECEANPSKKEKVTVFGSGPNRIGQGVEFDYCCVHAVWALRELGYEAHMVNCNPETVS 603 
                                                **********999999999************************************************ PP

                                 TIGR01369  603 tDydiadrLyFeeltvedvldiiekekvegvivqlgGqtalnlakeleeagvkilGtsaesidraEd 669 
                                                tDyd++d+L+Fe+lt+ed l+++ekek+ gv+vq+gGqt+l+l+  le++gvkilGts esid+aEd
  lcl|NCBI__GCF_000185805.1:WP_013538311.1  604 TDYDTSDKLFFEPLTLEDALNVVEKEKPLGVVVQFGGQTPLKLSVPLEREGVKILGTSSESIDIAED 670 
                                                ******************************************************************* PP

                                 TIGR01369  670 RekFsklldelgikqpkgkeatsveeakeiakeigyPvlvRpsyvlgGrameiveneeeleryleea 736 
                                                Re+F +ll++lg+kqp   +a+s+eea++ia+eig+Pvl+RpsyvlgGram+iv+++eel++y+ ea
  lcl|NCBI__GCF_000185805.1:WP_013538311.1  671 RERFRELLNRLGLKQPPSGIARSLEEAEKIAQEIGFPVLMRPSYVLGGRAMRIVYSMEELRQYMAEA 737 
                                                ******************************************************************* PP

                                 TIGR01369  737 vevskekPvlidkyledavEvdvDavadgeevliagileHiEeaGvHsGDstlvlppqklseevkkk 803 
                                                vevs+ekPvlidk+ledavE+dvDavadgeev+i g++eHiEeaG+HsGDs++vlp+ ++s+++++k
  lcl|NCBI__GCF_000185805.1:WP_013538311.1  738 VEVSEEKPVLIDKFLEDAVEFDVDAVADGEEVVIGGVMEHIEEAGIHSGDSACVLPTFSVSRDIVEK 804 
                                                ******************************************************************* PP

                                 TIGR01369  804 ikeivkkiakelkvkGllniqfvvkdeevyviEvnvRasRtvPfvskalgvplvklavkvllgkkle 870 
                                                ikei++kia el+vkGl+niqf+vkd+e+y+iEvn+RasRtvPfvska+g+pl+k+a+kv +gkkl+
  lcl|NCBI__GCF_000185805.1:WP_013538311.1  805 IKEITRKIALELNVKGLINIQFAVKDGEIYIIEVNPRASRTVPFVSKATGIPLAKVATKVAMGKKLR 871 
                                                ******************************************************************* PP

                                 TIGR01369  871 elekgvkkekksklvavkaavfsfsklagvdvvlgpemkstGEvmgigrdleeallkallaskakik 937 
                                                el+    ke ++++++vk+avf+f+++ +vd+vlgpemkstGEvmgi+ d+  a++ka+la++++++
  lcl|NCBI__GCF_000185805.1:WP_013538311.1  872 ELGV---KEVEPEYYSVKEAVFPFNRFPEVDPVLGPEMKSTGEVMGIDPDVGIAFYKAQLAAGSRLP 935 
                                                *876...9999****************************************************8887 PP

                                 TIGR01369  938 kk...gsvllsvkdkdkeellelakklaekglkvyategtakvleeagikaevvlkvseea.ekile 1000
                                                 +   g+v++svkdkdk ++ ++ak+la++g+k+++tegt ++l+e+gi +e+v+k +e + ++i +
  lcl|NCBI__GCF_000185805.1:WP_013538311.1  936 LDpscGKVFISVKDKDKPKIYSVAKQLADMGFKIVSTEGTYRFLREKGIPVELVYKLQEGRrPNIGD 1002
                                                54222889************************************************999876999** PP

                                 TIGR01369 1001 llkeeeielvinltskkkkaaekgykirreaveykvplvteletaeallea 1051
                                                l+k++e+ l+in+++ ++++++++y+irr av+y++p+ t++++ae++++a
  lcl|NCBI__GCF_000185805.1:WP_013538311.1 1003 LIKNGEVCLIINTPT-GTRSKRDAYSIRRLAVNYGIPYYTTVRGAEMAVRA 1052
                                                ************998.778999************************99987 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (1052 nodes)
Target sequences:                          1  (1073 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.08u 0.03s 00:00:00.11 Elapsed: 00:00:00.11
# Mc/sec: 9.84
//
[ok]
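
How much of the HMM and of the candidate protein the single domain hit spans can be read from the domain annotation row above. The following is a minimal sketch, not GapMind's own code: `parse_domain_row` and `span_fraction` are hypothetical helpers, and the column positions assume the whitespace-separated layout of the hmmsearch domain table shown in this report.

```python
# Domain row and lengths copied from the hmmsearch output above.
DOMAIN_ROW = ("   1 ! 1606.8   0.0         0         0       1    1051 [."
              "       2    1052 ..       2    1053 .. 0.99")
MODEL_LEN = 1052      # "Query: TIGR01369 [M=1052]"
SEQUENCE_LEN = 1073   # "Target sequences: 1 (1073 residues searched)"


def parse_domain_row(row):
    """Split one domain-table row into the fields used below."""
    f = row.split()
    return {
        "score": float(f[2]),
        "hmm_from": int(f[6]),
        "hmm_to": int(f[7]),
        "ali_from": int(f[9]),
        "ali_to": int(f[10]),
        "acc": float(f[15]),
    }


def span_fraction(start, end, length):
    """Fraction of a model or sequence covered by a 1-based inclusive span."""
    return (end - start + 1) / length


dom = parse_domain_row(DOMAIN_ROW)
model_cov = span_fraction(dom["hmm_from"], dom["hmm_to"], MODEL_LEN)
seq_cov = span_fraction(dom["ali_from"], dom["ali_to"], SEQUENCE_LEN)
print(f"score {dom['score']} bits, model coverage {model_cov:.1%}, "
      f"sequence coverage {seq_cov:.1%}")
```

For this candidate the hit spans 1051 of the 1052 model positions and 1051 of the 1073 residues, so the match to TIGR01369 is essentially full length on both sides.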

This GapMind analysis is from Apr 10 2024. The underlying query database was built on Apr 09 2024.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory