GapMind for Amino acid biosynthesis

 

Alignments for a candidate for ramA in Desulfacinum hydrothermale DSM 13146

Align ATP-dependent reduction of co(II)balamin (RamA-like) (EC:2.1.1.13) (characterized)
to candidate WP_084058852.1 B9A12_RS14650 DUF4445 domain-containing protein

Query= reanno::Phaeo:GFF1501
         (698 letters)



>NCBI__GCF_900176285.1:WP_084058852.1
          Length = 651

 Score =  304 bits (778), Expect = 1e-86
 Identities = 204/677 (30%), Positives = 334/677 (49%), Gaps = 70/677 (10%)

Query: 23  VVFTPSGKRGRFPVGTPVLTAARQLGVDLDSVCGGRGICSKCQITPSYG--EFSKHGVTV 80
           V F P  +      G  +L AA Q G+ ++S+CGG G+C KC++    G  E S  GV  
Sbjct: 11  VTFQPENRVVEASPGDTLLDAAAQAGIYINSLCGGEGVCGKCRLKVLSGQVEMSSQGVG- 69

Query: 81  ADDALTEWNKVEQRYKDKRGLIDGRRLGCQAQVQG-DVVIDVPPESQVHRQVVRK----- 134
                         + D++ L  G  L CQ+ ++  DV + +PPE++   + +       
Sbjct: 70  --------------FLDRKELDAGFVLACQSTLKDQDVEVWIPPEARQEEEQILMVDNIV 115

Query: 135 ---RAEARDITMNPSTRLYY--------VEVEEPDMHKPTGDMERLIEAL-----DAQWD 178
                  ++    P+   YY        +++ EP +     D+ER+  AL     D +W+
Sbjct: 116 HYAEPSPQEAGTVPAPVPYYKPLCDKVFLKLPEPTIQDNLSDLERIYRALARKHPDVKWE 175

Query: 179 LKGVKTDLHILSVLQPALRKGGWKVTVAVHLGDEN-HPPKIMHIWPGFYEGSIYGLAVDL 237
                +D   L  L   LRK  W+VT  VH  D   H  + +   PG      YG+A+D+
Sbjct: 176 -----SDFACLKDLAHLLRKNNWEVTALVHCLDAQCHHVRALE--PGDTSKRTYGVAIDV 228

Query: 238 GSTTIAAHLCDLKTGDVVASSGIMNPQIRFGEDLMSRVSYSMMNKGGDQEMTRAVREGMN 297
           G+TTI A L DLKTG V+      N Q R+GED++SR+ ++   +GG   +  AV   +N
Sbjct: 229 GTTTIVAQLVDLKTGKVIGVEASHNQQARYGEDVISRMIFAC-GRGGVDPLKNAVVTTIN 287

Query: 298 ALFTQIAAEAEIDKALIVDAVFVCNPVMHHLFLGIDPFELGQAPFALATSNALALRAVEL 357
           +L   + A A I    IV  V   N  M HL +G++P  +   P+    +     RA E+
Sbjct: 288 SLIHSLVAGAGIQPTDIVSFVAAGNTTMTHLLVGLEPCTIRVEPYIPTATRIPWARAAEV 347

Query: 358 DLNIHPAARVYLLPCIAGHVGADAAAVALSEAPDKSEDLVLVVDVGTNAEILLGNKDKVL 417
            L  HP A ++ +PC++ +VG D  A  L+   + S  L  ++D+GTN EI++GN + ++
Sbjct: 348 GLTGHPDALLHCMPCVSSYVGGDITAGVLACGMNDSSQLSALIDIGTNGEIVVGNNEWLV 407

Query: 418 ACSSPTGPAFEGAQISSGQRAAPGAIERVEINPETKEPRFRVIGSDIWSDEDGFAAAVAT 477
            CS+  GPAFEG     G RA  GA+++V I+ +  E   + IG                
Sbjct: 408 CCSASAGPAFEGGGTKCGMRATKGAVQKVRIHGDRVE--IQTIGGG-------------- 451

Query: 478 TGITGICGSGIIEAIAEMRMAGLLDASGLIGSAEQTGTTRCIQDGRTNAYLLWDGSVEGG 537
               GICGSG+I+ +AE+   G++D +G   + +       + DG     +  +   E G
Sbjct: 452 -KARGICGSGLIDCMAELVAEGIIDQNGKFIALDHPRVR--VTDGVPEFVVAQESESETG 508

Query: 538 PTITVTNPDIRAIQMAKAALYSGARLLMDKFGID--TVDRVVLAGAFGAHISAKHAMVLG 595
             + +T  DI  +  +KAA+ +  ++L++  G+    +DR+ +AG FGAH+  + ++ +G
Sbjct: 509 EAVVITEDDIGNLMKSKAAVLAAMKILLEGLGLQFFDLDRLYVAGGFGAHLDIEKSIRIG 568

Query: 596 MIPDCPLDKVTSAGNAAGTGARIALLNTEARSEIEATVQQIEKIETAVEPRFQEHFVNAS 655
           ++PD P +K+   GN++  GAR ALL+T A  +  A  +Q+   E +V   F   FV A 
Sbjct: 569 LLPDVPKEKILFIGNSSVAGARQALLSTHAYRKANAIARQMTYFELSVHAGFMNEFVAAL 628

Query: 656 AIPNSAEP-FPILSSIV 671
            +P++ E  FP +  ++
Sbjct: 629 FLPHTDESLFPSVRQVL 645


Lambda     K      H
   0.318    0.135    0.396 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 982
Number of extensions: 54
Number of successful extensions: 6
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 698
Length of database: 651
Length adjustment: 39
Effective length of query: 659
Effective length of database: 612
Effective search space:   403308
Effective search space used:   403308
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 54 (25.4 bits)
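
The bit score and E-value reported above can be reproduced from the raw score and the statistical parameters in this block. The sketch below assumes the standard Karlin-Altschul relations and the gapped Lambda and K values (0.267 and 0.0410). The function names are illustrative and are not part of BLAST or GapMind.

```python
import math

def effective_search_space(query_len, db_len, length_adjustment, num_seqs=1):
    """Shorten both lengths by the length adjustment, then multiply."""
    eff_query = query_len - length_adjustment
    eff_db = db_len - num_seqs * length_adjustment
    return eff_query * eff_db

def bit_score(raw_score, lam, k):
    """Normalize a raw alignment score to bits."""
    return (lam * raw_score - math.log(k)) / math.log(2)

def expect_value(bits, search_space):
    """Expected number of chance hits scoring at least this many bits."""
    return search_space * 2.0 ** (-bits)

# Values copied from the report above.
space = effective_search_space(query_len=698, db_len=651, length_adjustment=39)
bits = bit_score(raw_score=778, lam=0.267, k=0.0410)
evalue = expect_value(bits, space)

print(space)            # effective search space
print(round(bits))      # bit score, rounded as in the report
print(f"{evalue:.0e}")  # E-value in the report's notation
```

Under these assumptions the computed values agree with the reported effective search space (403308), score (304 bits) and Expect (1e-86).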

Align candidate WP_084058852.1 B9A12_RS14650 (DUF4445 domain-containing protein)
to HMM PF14574 (RACo_C_ter)

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/PF14574.10.hmm
# target sequence database:        /tmp/gapView.15132.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       RACo_C_ter  [M=261]
Accession:   PF14574.10
Description: C-terminal domain of RACo the ASKHA domain
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                 Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                 -----------
   2.4e-105  337.4   1.1   8.1e-105  335.6   0.2    2.0  2  lcl|NCBI__GCF_900176285.1:WP_084058852.1  B9A12_RS14650 DUF4445 domain-con


Domain annotation for each sequence (and alignments):
>> lcl|NCBI__GCF_900176285.1:WP_084058852.1  B9A12_RS14650 DUF4445 domain-containing protein
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 ?    0.1   0.0     0.019     0.019     146     174 ..     285     313 ..     275     318 .. 0.84
   2 !  335.6   0.2  8.1e-105  8.1e-105       2     260 ..     386     641 ..     385     642 .. 0.98

  Alignments for each domain:
  == domain 1  score: 0.1 bits;  conditional E-value: 0.019
                                RACo_C_ter 146 AiyagvktLleevglevedidkvylaGaf 174
                                                i++ ++ L+  +g++ +di +++ aG +
  lcl|NCBI__GCF_900176285.1:WP_084058852.1 285 TINSLIHSLVAGAGIQPTDIVSFVAAGNT 313
                                               5778888899999*************975 PP

  == domain 2  score: 335.6 bits;  conditional E-value: 8.1e-105
                                RACo_C_ter   2 islliDiGTNaEivlgnkdwllaasaaaGPAlEGgeikcGmrAapgAierveidpetlevelkvignek 70 
                                               +s liDiGTN+Eiv+gn++wl+++sa+aGPA+EGg++kcGmrA++gA+++v+i+ +   ve+++ig+ k
  lcl|NCBI__GCF_900176285.1:WP_084058852.1 386 LSALIDIGTNGEIVVGNNEWLVCCSASAGPAFEGGGTKCGMRATKGAVQKVRIHGDR--VEIQTIGGGK 452
                                               6789**************************************************999..9********* PP

                                RACo_C_ter  71 pkGicGsGiidliaelleagiidkkgklnkelkserireeeeteeyvlvlaeesetekdivitekDide 139
                                               ++GicGsG+id++ael+ +giid++gk+   l+++r+r +++ +e+v+++++eset++ +vite Di +
  lcl|NCBI__GCF_900176285.1:WP_084058852.1 453 ARGICGSGLIDCMAELVAEGIIDQNGKF-IALDHPRVRVTDGVPEFVVAQESESETGEAVVITEDDIGN 520
                                               ****************************.5579************************************ PP

                                RACo_C_ter 140 lirakaAiyagvktLleevglevedidkvylaGafGsyidlekAitiGllPdlelekvkqvGNtslagA 208
                                               l+++kaA+ a++k+Lle +gl++ d+d++y+aG+fG+++d+ek+i+iGllPd+++ek+ ++GN+s+agA
  lcl|NCBI__GCF_900176285.1:WP_084058852.1 521 LMKSKAAVLAAMKILLEGLGLQFFDLDRLYVAGGFGAHLDIEKSIRIGLLPDVPKEKILFIGNSSVAGA 589
                                               ********************************************************************* PP

                                RACo_C_ter 209 raallsreareeleeiarkityielavekkFmeefvaalflphtdlelfpsv 260
                                               r+alls++a++++++iar++ty+el+v++ Fm+efvaalflphtd +lfpsv
  lcl|NCBI__GCF_900176285.1:WP_084058852.1 590 RQALLSTHAYRKANAIARQMTYFELSVHAGFMNEFVAALFLPHTDESLFPSV 641
                                               **************************************************98 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (261 nodes)
Target sequences:                          1  (651 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.00u 0.00s 00:00:00.00 Elapsed: 00:00:00.00
# Mc/sec: 17.91
//
[ok]
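
The per-domain table in the hmmsearch output above can be read programmatically. The following is a minimal sketch, not part of HMMER or GapMind, that splits the two domain rows on whitespace and computes how much of the 261-node model the best domain covers. The `DomainHit` class and both helper functions are hypothetical names introduced for illustration.

```python
from dataclasses import dataclass

@dataclass
class DomainHit:
    index: int
    significant: bool  # "!" marks a domain above the inclusion threshold, "?" a doubtful one
    score: float
    i_evalue: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int

def parse_domain_row(row: str) -> DomainHit:
    """Parse one row of the hmmsearch per-domain table."""
    f = row.split()
    return DomainHit(
        index=int(f[0]),
        significant=(f[1] == "!"),
        score=float(f[2]),
        i_evalue=float(f[5]),
        hmm_from=int(f[6]),
        hmm_to=int(f[7]),
        ali_from=int(f[9]),
        ali_to=int(f[10]),
    )

def model_coverage(hit: DomainHit, model_length: int) -> float:
    """Fraction of the HMM's nodes spanned by the domain alignment."""
    return (hit.hmm_to - hit.hmm_from + 1) / model_length

# Rows copied from the domain annotation table above.
ROWS = [
    "   1 ?    0.1   0.0     0.019     0.019     146     174 ..     285     313 ..     275     318 .. 0.84",
    "   2 !  335.6   0.2  8.1e-105  8.1e-105       2     260 ..     386     641 ..     385     642 .. 0.98",
]

hits = [parse_domain_row(r) for r in ROWS]
best = max(hits, key=lambda h: h.score)
best_coverage = model_coverage(best, model_length=261)
print(best.index, f"{best_coverage:.3f}")
```

Here domain 2 spans model positions 2 to 260, that is 259 of the 261 nodes of PF14574.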

This GapMind analysis is from Apr 10 2024. The underlying query database was built on Apr 09 2024.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
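
As an illustration of the identity and coverage quantities these rules refer to, the sketch below recomputes them for the alignment reported at the top of this page. It assumes that coverage means the aligned span divided by the full sequence length. The page does not say which sequence GapMind measures coverage against, so both are shown. The function names are hypothetical.

```python
def percent_identity(identical: int, alignment_length: int) -> float:
    """Identical positions as a percentage of alignment columns."""
    return 100.0 * identical / alignment_length

def coverage(start: int, end: int, sequence_length: int) -> float:
    """Fraction of a sequence spanned by the aligned region (1-based, inclusive)."""
    return (end - start + 1) / sequence_length

# Numbers copied from the BLAST report above.
identity = percent_identity(identical=204, alignment_length=677)
query_cov = coverage(start=23, end=671, sequence_length=698)    # characterized protein
subject_cov = coverage(start=11, end=645, sequence_length=651)  # candidate WP_084058852.1

# The only threshold stated in the surviving text is the 50% coverage floor.
meets_coverage_floor = query_cov >= 0.50

print(f"{identity:.1f}% identity")
print(f"{query_cov:.1%} query coverage, {subject_cov:.1%} subject coverage")
print(meets_coverage_floor)
```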

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory