GapMind for Amino acid biosynthesis

 

Alignments for a candidate for ramA in Sinorhizobium meliloti 1021

Align ATP-dependent reduction of co(II)balamin (RamA-like) (EC:2.1.1.13) (characterized)
to candidate SMc04347 SMc04347 hypothetical protein

Query= reanno::Phaeo:GFF1501
         (698 letters)



>FitnessBrowser__Smeli:SMc04347
          Length = 683

 Score =  761 bits (1966), Expect = 0.0
 Identities = 389/682 (57%), Positives = 491/682 (71%), Gaps = 7/682 (1%)

Query: 16  DPASHPLVVFTPSGKRGRFPVGTPVLTAARQLGVDLDSVCGGRGICSKCQITPSYGEFSK 75
           D  + PLV+F PSGKRGRFPVGTP+L AAR LGV ++SVCGGR  C +CQ++   G F+K
Sbjct: 8   DEKNDPLVLFMPSGKRGRFPVGTPILDAARSLGVYVESVCGGRATCGRCQVSVQEGNFAK 67

Query: 76  HGVTVADDALTEWNKVEQRYKDKRGLIDGRRLGCQAQVQGDVVIDVPPESQVHRQVVRKR 135
           H +  + D ++     EQRY   R L DGRRL C +Q+ GD+VIDVP ++ ++ QVVRK 
Sbjct: 68  HKIVSSSDHISPIGPKEQRYASVRELPDGRRLSCSSQILGDLVIDVPQDTVINAQVVRKA 127

Query: 136 AEARDITMNPSTRLYYVEVEEPDMHKPTGDMERLIEALDAQWDLKGVKTDLHILSVLQPA 195
           A  R I  N + +L YVE++EPDMHKP GD++R+   L+  W  K +    H++  +Q  
Sbjct: 128 ASDRVIERNAAVQLCYVEIDEPDMHKPLGDLDRMKAVLEKDWGWKDLLIAPHLIPQVQGI 187

Query: 196 LRKGGWKVTVAVHLGDENHPPKIMHIWPGFYEGSIYGLAVDLGSTTIAAHLCDLKTGDVV 255
           LRKG W VT A+H   ++  P I+ +WPG  +   YG+A D+GSTTIA HL  L +G +V
Sbjct: 188 LRKGNWAVTAAIHRDMDSSRPFIVALWPGL-KNEAYGVACDIGSTTIAMHLVSLLSGRIV 246

Query: 256 ASSGIMNPQIRFGEDLMSRVSYSMMNKGGDQEMTRAVREGMNALFTQIAAEAEIDKALIV 315
           ASSG  NPQIRFGEDLMSRVSY MMN  G + MT+AVRE +N L  ++ AE E+D+  I+
Sbjct: 247 ASSGTSNPQIRFGEDLMSRVSYVMMNPDGREAMTKAVREALNGLIGKVCAEGEVDRHDIL 306

Query: 316 DAVFVCNPVMHHLFLGIDPFELGQAPFALATSNALALRAVELDLNIHPAARVYLLPCIAG 375
           D V V NP+MHHLFLGIDP ELGQAPFALA S AL   A E+D+ ++  AR+Y+LPCIAG
Sbjct: 307 DMVVVANPIMHHLFLGIDPTELGQAPFALAVSGALQYWAHEIDIEVNRGARLYMLPCIAG 366

Query: 376 HVGADAAAVALSEAPDKSEDLVLVVDVGTNAEILLGNKDKVLACSSPTGPAFEGAQISSG 435
           HVGADAA   LSE P + + ++L+VDVGTNAEI+LGN+++V+A SSPTGPAFEGA+ISSG
Sbjct: 367 HVGADAAGATLSEGPHRQDKMMLLVDVGTNAEIVLGNRERVVAASSPTGPAFEGAEISSG 426

Query: 436 QRAAPGAIERVEINPETKEPRFRVIGSDIWSDEDGFAAAVATTGITGICGSGIIEAIAEM 495
           QRAAPGAIERV I+PET EPRFRVIG D WS+E+GFA A A  G+TGICGS IIE +AEM
Sbjct: 427 QRAAPGAIERVRIDPETLEPRFRVIGVDKWSNEEGFAEAAAAVGVTGICGSAIIEVVAEM 486

Query: 496 RMAGLLDASGLIGSAEQTGTTRCIQDGRTNAYLLWDGSVEGGPTITVTNPDIRAIQMAKA 555
            + G++   G++  A    + R I +GRT +YLL     EG   ITVT  DIRAIQ+AK+
Sbjct: 487 YLTGIISQDGVVDGAMVAKSPRIIPNGRTFSYLLH----EGEQRITVTQNDIRAIQLAKS 542

Query: 556 ALYSGARLLMDKFGIDTVDRVVLAGAFGAHISAKHAMVLGMIPDCPLDKVTSAGNAAGTG 615
           ALY+G +LLM+K G+D VD +  AGAFG+ I  K+AMVLG+IPDC L +V + GNAAGTG
Sbjct: 543 ALYAGIKLLMEKQGVDHVDTIRFAGAFGSFIDPKYAMVLGLIPDCDLTEVKAVGNAAGTG 602

Query: 616 ARIALLNTEARSEIEATVQQIEKIETAVEPRFQEHFVNASAIPNSAEPFPILSSIVTLPE 675
           A +ALLN   R EIE TV++IEKIETA+E +FQEHFVNA A+PN  + FP L+ +VTLP 
Sbjct: 603 ALMALLNRGHRREIEQTVRKIEKIETALESKFQEHFVNAMAMPNKVDAFPKLAEVVTLPA 662

Query: 676 ANFNTGGGDGNEVGGRRRRRRR 697
               T   DG E  GRRRRR R
Sbjct: 663 RKSLT--DDGGEGSGRRRRRSR 682
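The summary line of the alignment above can be cross-checked with a short script. This is a minimal sketch and not part of GapMind: the helper name `parse_summary` and the coverage calculation are assumptions for illustration. The coordinates are copied from the first and last Query/Sbjct lines of the alignment.

```python
import re

# Summary line copied from the BLAST report above.
SUMMARY = "Identities = 389/682 (57%), Positives = 491/682 (71%), Gaps = 7/682 (1%)"

def parse_summary(line):
    """Return {name: (count, alignment_columns)} from a BLAST summary line.

    Hypothetical helper for illustration only.
    """
    return {name: (int(num), int(den))
            for name, num, den in re.findall(r"(\w+) = (\d+)/(\d+)", line)}

counts = parse_summary(SUMMARY)

# Truncating integer percentages match the values printed in this report
# (for example, 491/682 is 71.99% but is shown as 71%).
percents = {name: num * 100 // den for name, (num, den) in counts.items()}

# Alignment coordinates: query 16..697 of 698 letters, subject 8..682 of 683.
query_cov = (697 - 16 + 1) / 698
sbjct_cov = (682 - 8 + 1) / 683

print(percents)
print(f"query coverage {query_cov:.3f}, subject coverage {sbjct_cov:.3f}")
```

The 682 aligned query residues equal the 682 alignment columns, so all 7 gaps fall in the subject sequence (675 residues plus 7 gap positions).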


Lambda     K      H
   0.318    0.135    0.396 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1123
Number of extensions: 34
Number of successful extensions: 3
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 698
Length of database: 683
Length adjustment: 39
Effective length of query: 659
Effective length of database: 644
Effective search space:   424396
Effective search space used:   424396
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 54 (25.4 bits)
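The statistics block above can be checked against the standard Karlin-Altschul relations. The sketch below assumes the usual definitions (effective length = length minus the length adjustment, bit score = (Lambda * S - ln K) / ln 2, E = search space * 2^-bits) and uses the gapped Lambda and K from the report; the variable names are illustrative.

```python
import math

# Values copied from the BLAST report above.
query_len = 698
db_len = 683             # one database sequence
length_adjustment = 39
lam, K = 0.267, 0.0410   # gapped Lambda and K
raw_score = 1966

# Effective lengths and search space.
eff_query = query_len - length_adjustment
eff_db = db_len - length_adjustment   # a single sequence, so one adjustment
search_space = eff_query * eff_db

# Bit score and expected number of chance hits at this score.
bit_score = (lam * raw_score - math.log(K)) / math.log(2)
e_value = search_space * 2.0 ** (-bit_score)

print(eff_query, eff_db, search_space)
print(f"bit score ~ {bit_score:.1f}, E-value ~ {e_value:.1e}")
```

The computed search space reproduces the reported 424396. The bit score comes out within about one bit of the reported 761; the difference is presumably due to the rounded Lambda and K shown in the report. The E-value is far below anything BLAST prints, consistent with "Expect = 0.0".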

Align candidate SMc04347 SMc04347 (hypothetical protein)
to HMM PF14574 (RACo_C_ter)

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/PF14574.10.hmm
# target sequence database:        /tmp/gapView.12705.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       RACo_C_ter  [M=261]
Accession:   PF14574.10
Description: C-terminal domain of RACo the ASKHA domain
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                           Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                           -----------
      5e-95  303.5   0.8    7.4e-95  303.0   0.8    1.2  1  lcl|FitnessBrowser__Smeli:SMc04347  SMc04347 hypothetical protein


Domain annotation for each sequence (and alignments):
>> lcl|FitnessBrowser__Smeli:SMc04347  SMc04347 hypothetical protein
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  303.0   0.8   7.4e-95   7.4e-95       1     260 [.     386     654 ..     386     655 .. 0.98

  Alignments for each domain:
  == domain 1  score: 303.0 bits;  conditional E-value: 7.4e-95
                          RACo_C_ter   1 eislliDiGTNaEivlgnkdwllaasaaaGPAlEGgeikcGmrAapgAierveidpetlevelkvignek..... 70 
                                         +++ll+D+GTNaEivlgn+++++aas+++GPA+EG+ei++G+rAapgAierv+idpetle++++vig +k     
  lcl|FitnessBrowser__Smeli:SMc04347 386 KMMLLVDVGTNAEIVLGNRERVVAASSPTGPAFEGAEISSGQRAAPGAIERVRIDPETLEPRFRVIGVDKwsnee 460
                                         589****************************************************************999999** PP

                          RACo_C_ter  71 ..........pkGicGsGiidliaelleagiidkkgklnkel..kserireeeeteeyvlvlaeesetekdivit 133
                                                   ++GicGs+ii+++ae++++gii+++g ++  +  ks+ri  + +t +y+l++ e     ++i++t
  lcl|FitnessBrowser__Smeli:SMc04347 461 gfaeaaaavgVTGICGSAIIEVVAEMYLTGIISQDGVVDGAMvaKSPRIIPNGRTFSYLLHEGE-----QRITVT 530
                                         **************************************99988899**************9987.....69**** PP

                          RACo_C_ter 134 ekDidelirakaAiyagvktLleevglevedidkvylaGafGsyidlekAitiGllPdlelekvkqvGNtslagA 208
                                         ++Di++++ ak+A+yag+k+L+e+ g  v+++d++ +aGafGs+id+++A+++Gl+Pd++l +vk+vGN++++gA
  lcl|FitnessBrowser__Smeli:SMc04347 531 QNDIRAIQLAKSALYAGIKLLMEKQG--VDHVDTIRFAGAFGSFIDPKYAMVLGLIPDCDLTEVKAVGNAAGTGA 603
                                         **************************..9********************************************** PP

                          RACo_C_ter 209 raallsreareeleeiarkityielavekkFmeefvaalflphtdlelfpsv 260
                                          +all+r +r+e+e+ +rki++ie+a e+kF+e+fv+a+++p+ ++++fp++
  lcl|FitnessBrowser__Smeli:SMc04347 604 LMALLNRGHRREIEQTVRKIEKIETALESKFQEHFVNAMAMPN-KVDAFPKL 654
                                         *******************************************.77899976 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (261 nodes)
Target sequences:                          1  (683 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.01u 0.00s 00:00:00.01 Elapsed: 00:00:00.01
# Mc/sec: 17.11
//
[ok]
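The domain table row in the hmmsearch output above can be turned into coverage figures with a few lines of code. This is a sketch under stated assumptions: `parse_domain_row` is a hypothetical helper that relies on the whitespace-separated column order shown in the table header, and it is not how GapMind itself processes HMMER results.

```python
# Domain row copied from the hmmsearch output above.
DOMAIN_ROW = ("   1 !  303.0   0.8   7.4e-95   7.4e-95       1     260 [."
              "     386     654 ..     386     655 .. 0.98")
MODEL_LENGTH = 261    # from "RACo_C_ter  [M=261]"
TARGET_LENGTH = 683   # from "683 residues searched"

def parse_domain_row(row):
    """Split one per-domain row into named fields (hypothetical helper)."""
    f = row.split()
    return {
        "score": float(f[2]),
        "i_evalue": float(f[5]),
        "hmm_from": int(f[6]), "hmm_to": int(f[7]),
        "ali_from": int(f[9]), "ali_to": int(f[10]),
        "env_from": int(f[12]), "env_to": int(f[13]),
        "acc": float(f[15]),
    }

dom = parse_domain_row(DOMAIN_ROW)

# Fraction of the model covered by the alignment, and aligned span on the protein.
hmm_coverage = (dom["hmm_to"] - dom["hmm_from"] + 1) / MODEL_LENGTH
ali_length = dom["ali_to"] - dom["ali_from"] + 1

print(f"model coverage {hmm_coverage:.3f}; "
      f"aligned residues {ali_length} of {TARGET_LENGTH}")
```

Here the alignment spans model positions 1 to 260 of 261, so nearly the whole RACo_C_ter model is matched by residues 386 to 654 of SMc04347.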

This GapMind analysis is from Apr 09 2024. The underlying query database was built on Apr 09 2024.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory