GapMind for Amino acid biosynthesis

 

Alignments for a candidate for ramA in Desulfatibacillum aliphaticivorans DSM 15576

Align ATP-dependent reduction of co(II)balamin (RamA-like) (EC:2.1.1.13) (characterized)
to candidate WP_028314329.1 G491_RS0108760 DUF4445 domain-containing protein

Query= reanno::Phaeo:GFF1501
         (698 letters)



>NCBI__GCF_000429905.1:WP_028314329.1
          Length = 610

 Score =  305 bits (781), Expect = 4e-87
 Identities = 207/641 (32%), Positives = 317/641 (49%), Gaps = 53/641 (8%)

Query: 23  VVFTPSGKRGRFPVGTPVLTAARQLGVDLDSVCGGRGICSKCQITPSYGEFSKHGVTVAD 82
           V F P G+R   P  T +  AA+  GV L + CGG+G C KC++    G     G    D
Sbjct: 11  VEFQPLGRRIEAPFETTIAQAAQSAGVPLAADCGGKGKCGKCRVHILAG-----GAAPPD 65

Query: 83  DALTEWNKVEQRYKDKRGLIDGRRLGCQAQVQGDVVIDVPPESQVHRQVVRKRAEARDIT 142
            +       E +  ++       RL C  ++ GDV I VP  S VH Q ++     R   
Sbjct: 66  SS-------ELKVLEQSSAAPQERLACMTRILGDVKIHVPKASMVHEQSLQLEGRMRSPD 118

Query: 143 MNPSTRLYYVEVEEPDMHKPTGDMERLIEALDAQWDLKGVKTDLHILSVLQPALRKGGWK 202
            +   +  +V +  P++     D  RL EA+DA WD  G       +  L  +LR+   +
Sbjct: 119 GDRLVQSRFVNLPLPNLQDQRSDSRRLSEAMDA-WDENGWTLSPEFVRRLSGSLRESNGE 177

Query: 203 VTVAVHLGDENHPPKIMHIWPGFYEG--SIYGLAVDLGSTTIAAHLCDLKTGDVVASSGI 260
           +TV +  G             G   G  +  G A DLG+TTIA  L DL+TG+++ S G 
Sbjct: 178 LTVFLQDGAPI----------GLIAGKKTPIGAAFDLGTTTIAGRLVDLETGEILCSEGC 227

Query: 261 MNPQIRFGEDLMSRVSYSMMNKGGDQEMTRAVREGMNALFTQIAAEAEIDKALIVDAVFV 320
           MNPQI +GED++SR+ Y++ N  G   ++ A ++ +N L + +   A ++   + +    
Sbjct: 228 MNPQISYGEDVISRLDYAIHNPDGPGRLSAAAKDAINDLLSALCKNAGVEPERVSNISVA 287

Query: 321 CNPVMHHLFLGIDPFELGQAPFALATSNALALRAVELDLNIHPAARVYLLPCIAGHVGAD 380
           CN  M HL L +    L ++P+A   SN L L      +   P A+VY+ PCI G VG D
Sbjct: 288 CNTAMSHLLLKLPASPLARSPYAAGFSNPLELTGKAFGIKDAPTAKVYVFPCIEGFVGGD 347

Query: 381 AAAVALSEAPDKSEDLVLVVDVGTNAEILL---GNKDKVLACSSPTGPAFEGAQISSGQR 437
             A+ L+   D++++  L VD+GTN EI+L   G    +   S  +GPA EGA +  G R
Sbjct: 348 HTAMILACGLDQADETCLGVDIGTNTEIVLTRPGADGGMFVTSCASGPALEGAHVRDGMR 407

Query: 438 AAPGAIERVEINPETKEPRFRVIGSDIWSDEDGFAAAVATTGITGICGSGIIEAIAEMRM 497
           A+PGAI +V I            G +I++        +      GICGSG+++A+AEM  
Sbjct: 408 ASPGAIHKVRITEN---------GPEIFT--------INNEPPVGICGSGLVDALAEMVR 450

Query: 498 AGLLDASGLIGSAEQTGTTRCIQDGRTNAYLL-WDGSVEGGPTITVTNPDIRAIQMAKAA 556
           AG+LD+ G   + E  G  +     R   YLL      +    I +T  DI  +Q+AK A
Sbjct: 451 AGVLDSRGHF-TYEAKGVHK---SPRGKYYLLATKEKDQAKQDIIITQKDISELQLAKGA 506

Query: 557 LYSGARLLMDKFGIDT--VDRVVLAGAFGAHISAKHAMVLGMIPDCPLD-KVTSAGNAAG 613
           +++G + L+ + GI    + RV +AGAFG+H++ + A+ + ++P+   D +   AGNAA 
Sbjct: 507 IHAGVQKLLQQMGISAKEISRVYMAGAFGSHLNMESALAIRLLPEDLADAEFVQAGNAAA 566

Query: 614 TGARIALLNTEARSEIEATVQQIEKIETAVEPRFQEHFVNA 654
            GA +ALL+   RS  E   +    +E A +  F    V A
Sbjct: 567 DGACLALLSRRERSRAEKIARNAVHVEMANDSSFSSVLVKA 607


Lambda     K      H
   0.318    0.135    0.396 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 906
Number of extensions: 42
Number of successful extensions: 7
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 698
Length of database: 610
Length adjustment: 38
Effective length of query: 660
Effective length of database: 572
Effective search space:   377520
Effective search space used:   377520
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 54 (25.4 bits)
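
As a rough cross-check of the statistics above, the effective search space and the E-value can be recomputed from the reported lengths and bit score. This is a minimal sketch, not part of GapMind: the function names are hypothetical, and it assumes the standard relation E = (effective search space) x 2^(-bit score). Because the bit score is rounded to 305 in the report, the result agrees with the reported 4e-87 only to within rounding.

```python
def effective_search_space(query_len, db_len, length_adjustment, n_db_seqs=1):
    """Effective search space from the lengths in the BLAST footer.

    The length adjustment is subtracted once from the query and once per
    database sequence (here the database holds a single sequence).
    """
    eff_query = query_len - length_adjustment
    eff_db = db_len - n_db_seqs * length_adjustment
    return eff_query * eff_db


def evalue_from_bit_score(bit_score, search_space):
    """Approximate E-value from a bit score: E = search_space * 2**(-bits)."""
    return search_space * 2.0 ** (-bit_score)


# Values copied from the report above.
space = effective_search_space(query_len=698, db_len=610, length_adjustment=38)
evalue = evalue_from_bit_score(305, space)

print(f"effective search space: {space}")
print(f"approximate E-value:    {evalue:.1e}")
```

The recomputed search space matches the "Effective search space" line exactly, and the E-value lands in the same order of magnitude as the reported one.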

Align candidate WP_028314329.1 G491_RS0108760 (DUF4445 domain-containing protein)
to HMM PF14574 (RACo_C_ter)

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/PF14574.10.hmm
# target sequence database:        /tmp/gapView.13658.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       RACo_C_ter  [M=261]
Accession:   PF14574.10
Description: C-terminal domain of RACo the ASKHA domain
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                 Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                 -----------
      3e-82  261.6   0.6    8.9e-81  256.8   0.6    2.1  2  lcl|NCBI__GCF_000429905.1:WP_028314329.1  G491_RS0108760 DUF4445 domain-co


Domain annotation for each sequence (and alignments):
>> lcl|NCBI__GCF_000429905.1:WP_028314329.1  G491_RS0108760 DUF4445 domain-containing protein
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !    2.8   0.0    0.0028    0.0028     142     171 ..     258     287 ..     216     293 .. 0.94
   2 !  256.8   0.6   8.9e-81   8.9e-81       1     248 [.     362     609 ..     362     610 .] 0.96

  Alignments for each domain:
  == domain 1  score: 2.8 bits;  conditional E-value: 0.0028
                                RACo_C_ter 142 rakaAiyagvktLleevglevedidkvyla 171
                                                ak Ai+  ++ L++++g+e e+++++ +a
  lcl|NCBI__GCF_000429905.1:WP_028314329.1 258 AAKDAINDLLSALCKNAGVEPERVSNISVA 287
                                               6999***********************998 PP

  == domain 2  score: 256.8 bits;  conditional E-value: 8.9e-81
                                RACo_C_ter   1 eislliDiGTNaEivl...gnkdwllaasaaaGPAlEGgeikcGmrAapgAierveidpetlevelkvi 66 
                                               e++l +DiGTN Eivl   g +  ++++s+a+GPAlEG++++ GmrA+pgAi++v+i+++   +e+ +i
  lcl|NCBI__GCF_000429905.1:WP_028314329.1 362 ETCLGVDIGTNTEIVLtrpGADGGMFVTSCASGPALEGAHVRDGMRASPGAIHKVRITENG--PEIFTI 428
                                               6899************44446789************************************9..****** PP

                                RACo_C_ter  67 gnekpkGicGsGiidliaelleagiidkkgklnkelkserireeeeteeyvlvlaeesetekdivitek 135
                                               +ne+p+GicGsG++d++ae+++ag++d++g+++ e   + ++++ + + y+l+++e+ + ++di+it+k
  lcl|NCBI__GCF_000429905.1:WP_028314329.1 429 NNEPPVGICGSGLVDALAEMVRAGVLDSRGHFTYE--AKGVHKSPRGKYYLLATKEKDQAKQDIIITQK 495
                                               ********************************665..78999999************************ PP

                                RACo_C_ter 136 DidelirakaAiyagvktLleevglevedidkvylaGafGsyidlekAitiGllPd.lelekvkqvGNt 203
                                               Di+el+ ak+Ai+agv+ Ll+++g+++++i++vy+aGafGs++++e+A+ i llP+ l+ ++++q+GN+
  lcl|NCBI__GCF_000429905.1:WP_028314329.1 496 DISELQLAKGAIHAGVQKLLQQMGISAKEISRVYMAGAFGSHLNMESALAIRLLPEdLADAEFVQAGNA 564
                                               *******************************************************7588999******* PP

                                RACo_C_ter 204 slagAraallsreareeleeiarkityielavekkFmeefvaalf 248
                                               +  gA  allsr++r+++e+iar+  ++e+a++++F +  v+a+ 
  lcl|NCBI__GCF_000429905.1:WP_028314329.1 565 AADGACLALLSRRERSRAEKIARNAVHVEMANDSSFSSVLVKAMR 609
                                               *****************************************9986 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (261 nodes)
Target sequences:                          1  (610 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.01u 0.01s 00:00:00.02 Elapsed: 00:00:00.00
# Mc/sec: 16.56
//
[ok]
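
The domain table above reports where each hit falls on the HMM (hmmfrom, hmm to) and the model length (M=261). The fraction of the model covered by each domain can be sketched as below. The parser is a hypothetical helper written for this report's whitespace-separated layout, not a HMMER API, and it assumes the column order shown in the table header.

```python
MODEL_LENGTH = 261  # from "Query: RACo_C_ter [M=261]"

# The two domain rows, copied from the hmmsearch domain table above.
DOMAIN_ROWS = """\
   1 !    2.8   0.0    0.0028    0.0028     142     171 ..     258     287 ..     216     293 .. 0.94
   2 !  256.8   0.6   8.9e-81   8.9e-81       1     248 [.     362     609 ..     362     610 .] 0.96
"""


def parse_domain_row(row):
    """Split one domain row into the fields used here (assumed column order)."""
    f = row.split()
    return {
        "domain": int(f[0]),
        "score": float(f[2]),
        "i_evalue": float(f[5]),
        "hmm_from": int(f[6]),
        "hmm_to": int(f[7]),
        "ali_from": int(f[9]),
        "ali_to": int(f[10]),
    }


def model_coverage(domain, model_length=MODEL_LENGTH):
    """Fraction of HMM positions spanned by this domain's alignment."""
    return (domain["hmm_to"] - domain["hmm_from"] + 1) / model_length


domains = [parse_domain_row(r) for r in DOMAIN_ROWS.splitlines() if r.strip()]
for d in domains:
    print(f"domain {d['domain']}: score {d['score']}, "
          f"model coverage {model_coverage(d):.1%}")
```

For this candidate, domain 2 spans model positions 1 to 248 of 261, while domain 1 is a short, weak match to positions 142 to 171.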

This GapMind analysis is from Apr 10 2024. The underlying query database was built on Apr 09 2024.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
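
To illustrate the identity and coverage quantities these rules refer to, the sketch below recomputes them for the BLAST alignment shown earlier on this page. It only checks the 50% coverage threshold stated above and does not attempt to implement the high- or medium-confidence criteria. The helper name is hypothetical, and measuring coverage against the characterized (query) protein is an assumption.

```python
def aligned_fraction(start, end, total_length):
    """Fraction of a sequence spanned by an alignment (1-based, inclusive)."""
    return (end - start + 1) / total_length


# Values copied from the BLAST report for WP_028314329.1.
identical, alignment_length = 207, 641
identity = identical / alignment_length

# Query (characterized protein): residues 23-654 of 698.
query_coverage = aligned_fraction(23, 654, 698)
# Subject (candidate): residues 11-607 of 610.
subject_coverage = aligned_fraction(11, 607, 610)

# Threshold from the text above: at least 50% coverage.
meets_coverage_threshold = query_coverage >= 0.5

print(f"identity:         {identity:.1%}")
print(f"query coverage:   {query_coverage:.1%}")
print(f"subject coverage: {subject_coverage:.1%}")
print(f"at least 50% coverage: {meets_coverage_threshold}")
```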

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory