GapMind for Amino acid biosynthesis

 

Alignments for a candidate for ramA in Sinorhizobium fredii NGR234

Align ATP-dependent reduction of cob(II)alamin (RamA-like) (EC:2.1.1.13) (characterized)
to candidate YP_002826366.1 NGR_c18490 electron transfer protein

Query= reanno::Phaeo:GFF1501
         (698 letters)



>NCBI__GCF_000018545.1:YP_002826366.1
          Length = 691

 Score =  771 bits (1991), Expect = 0.0
 Identities = 398/690 (57%), Positives = 495/690 (71%), Gaps = 7/690 (1%)

Query: 8   EISTETATDPASHPLVVFTPSGKRGRFPVGTPVLTAARQLGVDLDSVCGGRGICSKCQIT 67
           EI T         PLV+F PSGKRGRFP+GTP+L AAR LGV ++SVCGGR  C +CQ++
Sbjct: 8   EIETNMTEAAQKKPLVLFMPSGKRGRFPIGTPILDAARSLGVYVESVCGGRATCGRCQVS 67

Query: 68  PSYGEFSKHGVTVADDALTEWNKVEQRYKDKRGLIDGRRLGCQAQVQGDVVIDVPPESQV 127
              G F+KH +  +++ ++     EQRY   R L DGRRL C AQ+ GD+VIDVP ++ +
Sbjct: 68  VQEGNFAKHKIVSSNEHISPVGPKEQRYASVRELPDGRRLSCSAQILGDLVIDVPQDTVI 127

Query: 128 HRQVVRKRAEARDITMNPSTRLYYVEVEEPDMHKPTGDMERLIEALDAQWDLKGVKTDLH 187
           + QVVRK A  R I  N + +L YVEVEEPDMHKP GD++RL   L+  W  K +    H
Sbjct: 128 NAQVVRKAATDRVIERNAAVQLCYVEVEEPDMHKPLGDLDRLKAVLEKDWGWKDLLIAPH 187

Query: 188 ILSVLQPALRKGGWKVTVAVHLGDENHPPKIMHIWPGFYEGSIYGLAVDLGSTTIAAHLC 247
           ++  LQ  LRKG W VT A+H   ++  P I+ + PG  +   YG+A D+GSTTIA HL 
Sbjct: 188 LIPQLQGILRKGNWAVTAAIHRDMDSSRPFIVGLSPGL-KNEAYGVACDIGSTTIAMHLV 246

Query: 248 DLKTGDVVASSGIMNPQIRFGEDLMSRVSYSMMNKGGDQEMTRAVREGMNALFTQIAAEA 307
            L +G +VASSG  NPQIRFGEDLMSRVSY MMN  G + MT+AVRE +N L  ++ AE 
Sbjct: 247 SLLSGRIVASSGASNPQIRFGEDLMSRVSYVMMNPDGREAMTKAVREAVNGLIGKVCAEG 306

Query: 308 EIDKALIVDAVFVCNPVMHHLFLGIDPFELGQAPFALATSNALALRAVELDLNIHPAARV 367
           EID+  I+D V V NP+MHHLFLGIDP ELGQAPFALA S AL   A E+D+ ++  AR+
Sbjct: 307 EIDRHDILDMVVVGNPIMHHLFLGIDPTELGQAPFALAVSGALQYWAHEIDIEVNRGARL 366

Query: 368 YLLPCIAGHVGADAAAVALSEAPDKSEDLVLVVDVGTNAEILLGNKDKVLACSSPTGPAF 427
           Y+LPCIAGHVGADAA   LSE P + + ++L+VDVGTNAEI+LGNK++V+A SSPTGPAF
Sbjct: 367 YMLPCIAGHVGADAAGATLSEGPHRQDKMMLLVDVGTNAEIVLGNKERVVAASSPTGPAF 426

Query: 428 EGAQISSGQRAAPGAIERVEINPETKEPRFRVIGSDIWSDEDGFAAAVATTGITGICGSG 487
           EGA+ISSGQRAAPGAIERV I+PET EPRFRVIG D WSDE+GFA A A TG+TGICGS 
Sbjct: 427 EGAEISSGQRAAPGAIERVRIDPETLEPRFRVIGVDKWSDEEGFAEAAAATGVTGICGSA 486

Query: 488 IIEAIAEMRMAGLLDASGLIGSAEQTGTTRCIQDGRTNAYLLWDGSVEGGPTITVTNPDI 547
           IIE +AEM + G++   G++  A    + R + +GRT +YLL DG     P ITVT  DI
Sbjct: 487 IIEVVAEMYLTGIISQDGVVDGAMAARSPRIVPNGRTFSYLLHDGE----PKITVTQNDI 542

Query: 548 RAIQMAKAALYSGARLLMDKFGIDTVDRVVLAGAFGAHISAKHAMVLGMIPDCPLDKVTS 607
           RAIQ+AKAALY+G +LLM+K G++ VD +  AGAFG+ I  K+AMVLG+IPDC L +V +
Sbjct: 543 RAIQLAKAALYAGIKLLMEKQGVEHVDTIRFAGAFGSFIDPKYAMVLGLIPDCDLAEVKA 602

Query: 608 AGNAAGTGARIALLNTEARSEIEATVQQIEKIETAVEPRFQEHFVNASAIPNSAEPFPIL 667
            GNAAGTGA +ALLN   R EIE TV++IEKIETA+E +FQEHFVNA A+PN  + FP L
Sbjct: 603 VGNAAGTGALMALLNRGHRREIEETVRKIEKIETALESKFQEHFVNAMAMPNKVDAFPKL 662

Query: 668 SSIVTLPEANFNTGGGDGNEVGGRRRRRRR 697
           + +VTLPE        DG E GGRRRRR R
Sbjct: 663 AEVVTLPER--KVPADDGGEGGGRRRRRSR 690


Lambda     K      H
   0.318    0.135    0.396 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 1149
Number of extensions: 41
Number of successful extensions: 3
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 698
Length of database: 691
Length adjustment: 39
Effective length of query: 659
Effective length of database: 652
Effective search space:   429668
Effective search space used:   429668
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 54 (25.4 bits)
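
The figures in this BLAST footer can be cross-checked by hand. The sketch below assumes the standard Karlin-Altschul relations (bit score = (lambda * raw score - ln K) / ln 2, and expect = effective search space * 2^(-bit score)) and uses the gapped Lambda and K printed above. The function names are illustrative only; they are not part of BLAST or GapMind.

```python
import math

# Values copied from the BLAST report above.
RAW_SCORE = 1991          # "Score = 771 bits (1991)"
GAPPED_LAMBDA = 0.267     # gapped Lambda
GAPPED_K = 0.0410         # gapped K
QUERY_LEN = 698
DB_LEN = 691
LENGTH_ADJUSTMENT = 39
NUM_SEQUENCES = 1


def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to bits (Karlin-Altschul)."""
    return (lam * raw_score - math.log(k)) / math.log(2)


def effective_search_space(query_len, db_len, adjustment, num_seqs=1):
    """Effective query length times effective database length."""
    return (query_len - adjustment) * (db_len - num_seqs * adjustment)


def expect_value(bits, search_space):
    """Expected number of chance hits scoring at least `bits`."""
    return search_space * 2.0 ** (-bits)


def percent(count, aligned_length):
    """Whole-number percentage, rounded down."""
    return 100 * count // aligned_length


bits = bit_score(RAW_SCORE, GAPPED_LAMBDA, GAPPED_K)
space = effective_search_space(QUERY_LEN, DB_LEN, LENGTH_ADJUSTMENT, NUM_SEQUENCES)
print(f"bit score ~ {bits:.1f}")
print(f"effective search space = {space}")
print(f"expect = {expect_value(bits, space):.3g}")
print(f"identities = {percent(398, 690)}%, positives = {percent(495, 690)}%")
```

With the rounded Lambda and K shown in the report, the recomputed bit score falls between 771 and 772, in line with the reported 771 bits, and the effective search space matches the reported 429668. The expect value is vanishingly small, which is consistent with "Expect = 0.0". The reported identity and positive percentages are consistent with rounding down.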

Align candidate YP_002826366.1 NGR_c18490 (electron transfer protein)
to HMM PF14574 (RACo_C_ter)

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.aa/PF14574.10.hmm
# target sequence database:        /tmp/gapView.29327.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       RACo_C_ter  [M=261]
Accession:   PF14574.10
Description: C-terminal domain of RACo the ASKHA domain
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                 Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                 -----------
    2.3e-95  304.6   0.7    3.8e-95  303.9   0.7    1.3  1  lcl|NCBI__GCF_000018545.1:YP_002826366.1  NGR_c18490 electron transfer pro


Domain annotation for each sequence (and alignments):
>> lcl|NCBI__GCF_000018545.1:YP_002826366.1  NGR_c18490 electron transfer protein
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  303.9   0.7   3.8e-95   3.8e-95       1     260 [.     394     662 ..     394     663 .. 0.97

  Alignments for each domain:
  == domain 1  score: 303.9 bits;  conditional E-value: 3.8e-95
                                RACo_C_ter   1 eislliDiGTNaEivlgnkdwllaasaaaGPAlEGgeikcGmrAapgAierveidpetlevelkvigne 69 
                                               +++ll+D+GTNaEivlgnk++++aas+++GPA+EG+ei++G+rAapgAierv+idpetle++++vig +
  lcl|NCBI__GCF_000018545.1:YP_002826366.1 394 KMMLLVDVGTNAEIVLGNKERVVAASSPTGPAFEGAEISSGQRAAPGAIERVRIDPETLEPRFRVIGVD 462
                                               589****************************************************************99 PP

                                RACo_C_ter  70 k...............pkGicGsGiidliaelleagiidkkgklnkel..kserireeeeteeyvlvla 121
                                               k               ++GicGs+ii+++ae++++gii+++g ++  +  +s+ri  + +t +y+l++ 
  lcl|NCBI__GCF_000018545.1:YP_002826366.1 463 KwsdeegfaeaaaatgVTGICGSAIIEVVAEMYLTGIISQDGVVDGAMaaRSPRIVPNGRTFSYLLHDG 531
                                               999*****************************************99877799**************998 PP

                                RACo_C_ter 122 eesetekdivitekDidelirakaAiyagvktLleevglevedidkvylaGafGsyidlekAitiGllP 190
                                               e      +i++t++Di++++ akaA+yag+k+L+e+ g  ve++d++ +aGafGs+id+++A+++Gl+P
  lcl|NCBI__GCF_000018545.1:YP_002826366.1 532 E-----PKITVTQNDIRAIQLAKAALYAGIKLLMEKQG--VEHVDTIRFAGAFGSFIDPKYAMVLGLIP 593
                                               7.....58******************************..9**************************** PP

                                RACo_C_ter 191 dlelekvkqvGNtslagAraallsreareeleeiarkityielavekkFmeefvaalflphtdlelfps 259
                                               d++l++vk+vGN++++gA +all+r +r+e+ee +rki++ie+a e+kF+e+fv+a+++p+ ++++fp+
  lcl|NCBI__GCF_000018545.1:YP_002826366.1 594 DCDLAEVKAVGNAAGTGALMALLNRGHRREIEETVRKIEKIETALESKFQEHFVNAMAMPN-KVDAFPK 661
                                               *************************************************************.7789997 PP

                                RACo_C_ter 260 v 260
                                               +
  lcl|NCBI__GCF_000018545.1:YP_002826366.1 662 L 662
                                               6 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (261 nodes)
Target sequences:                          1  (691 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.01u 0.00s 00:00:00.01 Elapsed: 00:00:00.00
# Mc/sec: 18.32
//
[ok]
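
The domain table row above can be read programmatically to see how much of the HMM the alignment covers. The following is a minimal sketch that assumes the whitespace-separated column order shown in the hmmsearch domain table header (score, bias, c-Evalue, i-Evalue, hmmfrom, hmm to, alifrom, ali to, envfrom, env to, acc). `DomainHit`, `parse_domain_row`, and `model_coverage` are hypothetical helper names, not HMMER or GapMind functions.

```python
from typing import NamedTuple


class DomainHit(NamedTuple):
    score: float
    i_evalue: float
    hmm_from: int
    hmm_to: int
    ali_from: int
    ali_to: int
    env_from: int
    env_to: int
    acc: float


def parse_domain_row(row: str) -> DomainHit:
    """Parse one row of the per-domain table in hmmsearch text output."""
    f = row.split()
    # f[0] is the domain number, f[1] the '!'/'?' flag; f[8], f[11], f[14]
    # are the boundary markers such as '[.' and '..'.
    return DomainHit(
        score=float(f[2]),
        i_evalue=float(f[5]),
        hmm_from=int(f[6]),
        hmm_to=int(f[7]),
        ali_from=int(f[9]),
        ali_to=int(f[10]),
        env_from=int(f[12]),
        env_to=int(f[13]),
        acc=float(f[15]),
    )


def model_coverage(hit: DomainHit, model_length: int) -> float:
    """Fraction of HMM match states spanned by the alignment."""
    return (hit.hmm_to - hit.hmm_from + 1) / model_length


# Row and model length (M=261) copied from the report above.
ROW = "   1 !  303.9   0.7   3.8e-95   3.8e-95       1     260 [.     394     662 ..     394     663 .. 0.97"
MODEL_LENGTH = 261

hit = parse_domain_row(ROW)
print(f"model coverage = {model_coverage(hit, MODEL_LENGTH):.3f}")
print(f"aligned residues in candidate = {hit.ali_to - hit.ali_from + 1}")
```

For this hit the alignment spans positions 1 to 260 of the 261-node model, so nearly the whole RACo_C_ter domain is covered by residues 394 to 662 of the candidate.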

This GapMind analysis is from Apr 10 2024. The underlying query database was built on Apr 09 2024.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."
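
As an illustration of the coverage test in the sentence above, here is a small sketch that computes alignment coverage from start and end coordinates and compares it with the 50% floor. It assumes coverage is measured on the characterized (query) protein; the text here does not say which sequence is used, so treat that as an assumption. The helper names are hypothetical, and the high- and medium-confidence rules are not modeled.

```python
def aligned_coverage(start: int, end: int, length: int) -> float:
    """Fraction of a sequence spanned by an alignment (1-based, inclusive)."""
    return (end - start + 1) / length


def meets_low_confidence_coverage(start: int, end: int, length: int,
                                  min_coverage: float = 0.5) -> bool:
    """True if the alignment covers at least `min_coverage` of the sequence."""
    return aligned_coverage(start, end, length) >= min_coverage


# Coordinates from the BLAST alignment on this page:
# query residues 8 to 697 of a 698-residue characterized protein.
coverage = aligned_coverage(8, 697, 698)
print(f"coverage = {coverage:.3f}")
print(meets_low_confidence_coverage(8, 697, 698))
```

The alignment on this page covers 690 of the 698 query residues, well above the 50% floor.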

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).

For more information, see:

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory