Align ATP-dependent reduction of co(II)balamin (RamA-like) (EC:2.1.1.13) (characterized)
to candidate WP_084058852.1 B9A12_RS14650 DUF4445 domain-containing protein
Query= reanno::Phaeo:GFF1501 (698 letters)

>NCBI__GCF_900176285.1:WP_084058852.1
Length = 651

Score = 304 bits (778), Expect = 1e-86
Identities = 204/677 (30%), Positives = 334/677 (49%), Gaps = 70/677 (10%)

Query: 23 VVFTPSGKRGRFPVGTPVLTAARQLGVDLDSVCGGRGICSKCQITPSYG--EFSKHGVTV 80
V F P + G +L AA Q G+ ++S+CGG G+C KC++ G E S GV
Sbjct: 11 VTFQPENRVVEASPGDTLLDAAAQAGIYINSLCGGEGVCGKCRLKVLSGQVEMSSQGVG- 69

Query: 81 ADDALTEWNKVEQRYKDKRGLIDGRRLGCQAQVQG-DVVIDVPPESQVHRQVVRK----- 134
+ D++ L G L CQ+ ++ DV + +PPE++ + +
Sbjct: 70 --------------FLDRKELDAGFVLACQSTLKDQDVEVWIPPEARQEEEQILMVDNIV 115

Query: 135 ---RAEARDITMNPSTRLYY--------VEVEEPDMHKPTGDMERLIEAL-----DAQWD 178
++ P+ YY +++ EP + D+ER+ AL D +W+
Sbjct: 116 HYAEPSPQEAGTVPAPVPYYKPLCDKVFLKLPEPTIQDNLSDLERIYRALARKHPDVKWE 175

Query: 179 LKGVKTDLHILSVLQPALRKGGWKVTVAVHLGDEN-HPPKIMHIWPGFYEGSIYGLAVDL 237
+D L L LRK W+VT VH D H + + PG YG+A+D+
Sbjct: 176 -----SDFACLKDLAHLLRKNNWEVTALVHCLDAQCHHVRALE--PGDTSKRTYGVAIDV 228

Query: 238 GSTTIAAHLCDLKTGDVVASSGIMNPQIRFGEDLMSRVSYSMMNKGGDQEMTRAVREGMN 297
G+TTI A L DLKTG V+ N Q R+GED++SR+ ++ +GG + AV +N
Sbjct: 229 GTTTIVAQLVDLKTGKVIGVEASHNQQARYGEDVISRMIFAC-GRGGVDPLKNAVVTTIN 287

Query: 298 ALFTQIAAEAEIDKALIVDAVFVCNPVMHHLFLGIDPFELGQAPFALATSNALALRAVEL 357
+L + A A I IV V N M HL +G++P + P+ + RA E+
Sbjct: 288 SLIHSLVAGAGIQPTDIVSFVAAGNTTMTHLLVGLEPCTIRVEPYIPTATRIPWARAAEV 347

Query: 358 DLNIHPAARVYLLPCIAGHVGADAAAVALSEAPDKSEDLVLVVDVGTNAEILLGNKDKVL 417
L HP A ++ +PC++ +VG D A L+ + S L ++D+GTN EI++GN + ++
Sbjct: 348 GLTGHPDALLHCMPCVSSYVGGDITAGVLACGMNDSSQLSALIDIGTNGEIVVGNNEWLV 407

Query: 418 ACSSPTGPAFEGAQISSGQRAAPGAIERVEINPETKEPRFRVIGSDIWSDEDGFAAAVAT 477
CS+ GPAFEG G RA GA+++V I+ + E + IG
Sbjct: 408 CCSASAGPAFEGGGTKCGMRATKGAVQKVRIHGDRVE--IQTIGGG-------------- 451

Query: 478 TGITGICGSGIIEAIAEMRMAGLLDASGLIGSAEQTGTTRCIQDGRTNAYLLWDGSVEGG 537
GICGSG+I+ +AE+ G++D +G + + + DG + + E G
Sbjct: 452 -KARGICGSGLIDCMAELVAEGIIDQNGKFIALDHPRVR--VTDGVPEFVVAQESESETG 508

Query: 538 PTITVTNPDIRAIQMAKAALYSGARLLMDKFGID--TVDRVVLAGAFGAHISAKHAMVLG 595
+ +T DI + +KAA+ + ++L++ G+ +DR+ +AG FGAH+ + ++ +G
Sbjct: 509 EAVVITEDDIGNLMKSKAAVLAAMKILLEGLGLQFFDLDRLYVAGGFGAHLDIEKSIRIG 568

Query: 596 MIPDCPLDKVTSAGNAAGTGARIALLNTEARSEIEATVQQIEKIETAVEPRFQEHFVNAS 655
++PD P +K+ GN++ GAR ALL+T A + A +Q+ E +V F FV A
Sbjct: 569 LLPDVPKEKILFIGNSSVAGARQALLSTHAYRKANAIARQMTYFELSVHAGFMNEFVAAL 628

Query: 656 AIPNSAEP-FPILSSIV 671
+P++ E FP + ++
Sbjct: 629 FLPHTDESLFPSVRQVL 645

Lambda K H
0.318 0.135 0.396

Gapped
Lambda K H
0.267 0.0410 0.140

Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 982
Number of extensions: 54
Number of successful extensions: 6
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 698
Length of database: 651
Length adjustment: 39
Effective length of query: 659
Effective length of database: 612
Effective search space: 403308
Effective search space used: 403308
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.3 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.7 bits)
S2: 54 (25.4 bits)
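The bit score and E-value reported above can be checked from the raw score and the statistical parameters printed with the alignment. The following is a minimal sketch that assumes the standard Karlin-Altschul relations used by BLAST (bit score = (lambda * S - ln K) / ln 2, and E = effective search space * 2^(-bit score)). The numbers are copied from the output above; the function names are illustrative and are not part of BLAST or GapMind.

```python
import math

def bit_score(raw_score, lam, k):
    """Convert a raw alignment score to bits (Karlin-Altschul normalization)."""
    return (lam * raw_score - math.log(k)) / math.log(2)

def expect_value(bits, search_space):
    """Expected number of chance hits with at least this bit score."""
    return search_space * 2.0 ** (-bits)

# Values taken from the alignment report above.
RAW_SCORE = 778            # "Score = 304 bits (778)"
GAPPED_LAMBDA = 0.267      # gapped Lambda
GAPPED_K = 0.0410          # gapped K
QUERY_LENGTH = 698
DATABASE_LENGTH = 651
LENGTH_ADJUSTMENT = 39

# Effective search space = effective query length * effective database length.
search_space = (QUERY_LENGTH - LENGTH_ADJUSTMENT) * (DATABASE_LENGTH - LENGTH_ADJUSTMENT)

bits = bit_score(RAW_SCORE, GAPPED_LAMBDA, GAPPED_K)
evalue = expect_value(bits, search_space)

print(f"effective search space: {search_space}")
print(f"bit score: {bits:.1f}")
print(f"E-value: {evalue:.1e}")
```

Because lambda and K are printed with only three significant digits, the recomputed values should agree with the report only to within rounding.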
Align candidate WP_084058852.1 B9A12_RS14650 (DUF4445 domain-containing protein)
to HMM PF14574 (RACo_C_ter)
# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file: ../tmp/path.aa/PF14574.10.hmm
# target sequence database: /tmp/gapView.15132.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query: RACo_C_ter [M=261]
Accession: PF14574.10
Description: C-terminal domain of RACo the ASKHA domain

Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                  Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                  -----------
   2.4e-105  337.4   1.1   8.1e-105  335.6   0.2    2.0  2  lcl|NCBI__GCF_900176285.1:WP_084058852.1  B9A12_RS14650 DUF4445 domain-con

Domain annotation for each sequence (and alignments):
>> lcl|NCBI__GCF_900176285.1:WP_084058852.1 B9A12_RS14650 DUF4445 domain-containing protein
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 ?    0.1   0.0     0.019     0.019     146     174 ..     285     313 ..     275     318 .. 0.84
   2 !  335.6   0.2  8.1e-105  8.1e-105       2     260 ..     386     641 ..     385     642 .. 0.98

Alignments for each domain:

== domain 1 score: 0.1 bits; conditional E-value: 0.019
RACo_C_ter 146 AiyagvktLleevglevedidkvylaGaf 174
i++ ++ L+ +g++ +di +++ aG +
lcl|NCBI__GCF_900176285.1:WP_084058852.1 285 TINSLIHSLVAGAGIQPTDIVSFVAAGNT 313
5778888899999*************975 PP

== domain 2 score: 335.6 bits; conditional E-value: 8.1e-105
RACo_C_ter 2 islliDiGTNaEivlgnkdwllaasaaaGPAlEGgeikcGmrAapgAierveidpetlevelkvignek 70
+s liDiGTN+Eiv+gn++wl+++sa+aGPA+EGg++kcGmrA++gA+++v+i+ + ve+++ig+ k
lcl|NCBI__GCF_900176285.1:WP_084058852.1 386 LSALIDIGTNGEIVVGNNEWLVCCSASAGPAFEGGGTKCGMRATKGAVQKVRIHGDR--VEIQTIGGGK 452
6789**************************************************999..9********* PP

RACo_C_ter 71 pkGicGsGiidliaelleagiidkkgklnkelkserireeeeteeyvlvlaeesetekdivitekDide 139
++GicGsG+id++ael+ +giid++gk+ l+++r+r +++ +e+v+++++eset++ +vite Di +
lcl|NCBI__GCF_900176285.1:WP_084058852.1 453 ARGICGSGLIDCMAELVAEGIIDQNGKF-IALDHPRVRVTDGVPEFVVAQESESETGEAVVITEDDIGN 520
****************************.5579************************************ PP

RACo_C_ter 140 lirakaAiyagvktLleevglevedidkvylaGafGsyidlekAitiGllPdlelekvkqvGNtslagA 208
l+++kaA+ a++k+Lle +gl++ d+d++y+aG+fG+++d+ek+i+iGllPd+++ek+ ++GN+s+agA
lcl|NCBI__GCF_900176285.1:WP_084058852.1 521 LMKSKAAVLAAMKILLEGLGLQFFDLDRLYVAGGFGAHLDIEKSIRIGLLPDVPKEKILFIGNSSVAGA 589
********************************************************************* PP

RACo_C_ter 209 raallsreareeleeiarkityielavekkFmeefvaalflphtdlelfpsv 260
r+alls++a++++++iar++ty+el+v++ Fm+efvaalflphtd +lfpsv
lcl|NCBI__GCF_900176285.1:WP_084058852.1 590 RQALLSTHAYRKANAIARQMTYFELSVHAGFMNEFVAALFLPHTDESLFPSV 641
**************************************************98 PP

Internal pipeline statistics summary:
-------------------------------------
Query model(s): 1 (261 nodes)
Target sequences: 1 (651 residues searched)
Passed MSV filter: 1 (1); expected 0.0 (0.02)
Passed bias filter: 1 (1); expected 0.0 (0.02)
Passed Vit filter: 1 (1); expected 0.0 (0.001)
Passed Fwd filter: 1 (1); expected 0.0 (1e-05)
Initial search space (Z): 1 [actual number of targets]
Domain search space (domZ): 1 [number of targets reported over threshold]
# CPU time: 0.00u 0.00s 00:00:00.00 Elapsed: 00:00:00.00
# Mc/sec: 17.91
//
[ok]
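One way to read the domain table above is to ask how much of the 261-state model each domain hit spans. The sketch below computes that fraction from the hmmfrom and hmm to columns. The coordinates and scores are copied from the table; the helper name and the dictionary layout are illustrative assumptions, and this is not necessarily how GapMind itself measures HMM coverage.

```python
def model_coverage(hmm_from, hmm_to, model_length):
    """Fraction of the HMM spanned by one domain hit (coordinates are inclusive)."""
    return (hmm_to - hmm_from + 1) / model_length

MODEL_LENGTH = 261  # "Query: RACo_C_ter [M=261]"

# Rows of the domain table above: domain number, bit score, hmmfrom, hmm to.
domains = [
    {"index": 1, "score": 0.1, "hmm_from": 146, "hmm_to": 174},
    {"index": 2, "score": 335.6, "hmm_from": 2, "hmm_to": 260},
]

# The best domain is the one with the highest bit score.
best = max(domains, key=lambda d: d["score"])
best_coverage = model_coverage(best["hmm_from"], best["hmm_to"], MODEL_LENGTH)

for d in domains:
    cov = model_coverage(d["hmm_from"], d["hmm_to"], MODEL_LENGTH)
    print(f"domain {d['index']}: score {d['score']}, model coverage {cov:.1%}")
```

Domain 2 spans nearly the whole model, while the weak domain 1 covers only a short internal stretch, which is consistent with its "?" flag in the table.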
This GapMind analysis is from Apr 10 2024. The underlying query database was built on Apr 09 2024.
Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMer with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.
A candidate for a step is "high confidence" if either:
Otherwise, a candidate is "medium confidence" if either:
Other blast hits with at least 50% coverage are "low confidence."
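As a worked example of the 50% coverage floor, the sketch below computes percent identity and coverage for the BLAST alignment shown earlier on this page (Identities = 204/677, query positions 23 to 671 of 698, subject positions 11 to 645 of 651). Which sequence the coverage is measured against is an assumption here, so both are computed. The function names are hypothetical, and the code only checks the low-confidence floor; it does not attempt to apply the high- or medium-confidence rules.

```python
LOW_CONFIDENCE_MIN_COVERAGE = 0.50  # "at least 50% coverage" from the text above

def percent_identity(identities, alignment_length):
    """Identical positions as a fraction of alignment columns (gaps included)."""
    return identities / alignment_length

def coverage(start, end, sequence_length):
    """Fraction of a sequence covered by the aligned region (inclusive coordinates)."""
    return (end - start + 1) / sequence_length

# Numbers copied from the BLAST alignment earlier on this page.
identity = percent_identity(204, 677)
query_coverage = coverage(23, 671, 698)    # curated (characterized) protein
subject_coverage = coverage(11, 645, 651)  # candidate WP_084058852.1

# Assumption: coverage is judged on the curated query sequence.
meets_low_confidence_floor = query_coverage >= LOW_CONFIDENCE_MIN_COVERAGE

print(f"identity: {identity:.1%}")
print(f"query coverage: {query_coverage:.1%}")
print(f"subject coverage: {subject_coverage:.1%}")
print(f"meets 50% coverage floor: {meets_low_confidence_floor}")
```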
Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:
GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
For more information, see:
If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.
by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory