GapMind for catabolism of small carbon sources

 

Alignments for a candidate for deoA in Cupriavidus basilensis 4G11

Align Putative thymidine phosphorylase 1; EC 2.4.2.4; TdRPase 1 (uncharacterized)
to candidate RR42_RS11670 RR42_RS11670 thymidine phosphorylase

Query= curated2:Q0KA59
         (520 letters)



>FitnessBrowser__Cup4G11:RR42_RS11670
          Length = 510

 Score =  660 bits (1702), Expect = 0.0
 Identities = 338/493 (68%), Positives = 402/493 (81%), Gaps = 2/493 (0%)

Query: 29  LVFRPLEIDTWQEHVIYMHPDCPVCRAEGFSAQARVRVQIGD--RSLIATLTLLGAPLLN 86
           L F+PLEIDT+QEHVIYMH DC VC AEGFSAQ RV+V  G   +SLIATL ++G  LL 
Sbjct: 12  LTFKPLEIDTYQEHVIYMHRDCVVCHAEGFSAQTRVQVSTGAGAQSLIATLNVVGDALLA 71

Query: 87  TGEASLSLSAARTLSARAGDVVHVTHAPALESVRALRAKIYGCHLDSVQLDGIIGDISAG 146
             +  LS  A++ L   AGD++ VTHAP LES+RA+R+KI+G  LD+ QL  I+GDISAG
Sbjct: 72  PSQVGLSSGASQQLGVAAGDIIAVTHAPGLESLRAVRSKIHGNPLDAPQLCAIMGDISAG 131

Query: 147 RYADVHIAAFLTACADGRMSLRETVDLTRAMVRSGQRLNWDREVVADKHCVGGLPGNRTT 206
           RY+DVHIAAFL+ACA GRM+ +ETVDLT AM+ +G RL+WDR VVADKHCVGGLPGNRT+
Sbjct: 132 RYSDVHIAAFLSACAGGRMTTQETVDLTCAMLDTGDRLDWDRPVVADKHCVGGLPGNRTS 191

Query: 207 PVVVAIAAAAGLLLPKTSSRAITSPAGTADTMEALTRVTLDSTELRRVVEQVGAALVWGG 266
           P+VVAI AAAGLLLPKTSSRAITSPAGTADTME LTRVTL + E+RRVVE+VGAALVWGG
Sbjct: 192 PIVVAICAAAGLLLPKTSSRAITSPAGTADTMEVLTRVTLSAAEMRRVVERVGAALVWGG 251

Query: 267 ALSLSPADDVLIRVERALDIDSDAQLVASILSKKIAAGSTHVLIDVPVGPTAKIREDSDL 326
           +L+LSPADDVLIRVERAL+IDSDAQLVAS+LSKK+AAGSTHVLIDVP+GPTAK+R D+DL
Sbjct: 252 SLTLSPADDVLIRVERALEIDSDAQLVASVLSKKLAAGSTHVLIDVPLGPTAKVRTDADL 311

Query: 327 ARLDLAMTKVADAFGLKLRILRTDGSQPVGRGVGPALEALDVLAVLQCQPTAPADLRERS 386
           ARL L + +VA AFG+ + ++ TDGSQPVGRG+GPALEA DVLAVLQ   +APADLR R+
Sbjct: 312 ARLRLLLEEVARAFGMHVLVVHTDGSQPVGRGIGPALEARDVLAVLQGAESAPADLRGRA 371

Query: 387 LLLAGELLEFCGAIPPGQGRLLAGSLLDSGAAWARFQAICEAQGGLRTPGQAVFRRDVVA 446
           LLL+  L+EFCGA+P GQG  LA  LL  GAAWA+FQAICEAQGGLR PG A  RR+++A
Sbjct: 372 LLLSASLMEFCGAVPAGQGLALATRLLADGAAWAKFQAICEAQGGLRQPGSAPLRREILA 431

Query: 447 ARSGIVTSVDNRHVARTAKLAGAPRRQVAGLELHVRAGDEVVAGAPLCTLHAQASGELEY 506
              GIVTS+DNR ++R AKLAGAP R+ AG+++HVR  D V AG PL T+HA A GEL Y
Sbjct: 432 PADGIVTSIDNRLLSRAAKLAGAPNRKAAGIDMHVRLNDAVRAGQPLFTIHALAQGELAY 491

Query: 507 AFSYALAHDPFRI 519
           + ++   H    I
Sbjct: 492 SQNFLTTHPAINI 504


Lambda     K      H
   0.320    0.135    0.391 

Gapped
Lambda     K      H
   0.267   0.0410    0.140 


Matrix: BLOSUM62
Gap Penalties: Existence: 11, Extension: 1
Number of Sequences: 1
Number of Hits to DB: 848
Number of extensions: 27
Number of successful extensions: 2
Number of sequences better than 1.0e-02: 1
Number of HSP's gapped: 1
Number of HSP's successfully gapped: 1
Length of query: 520
Length of database: 510
Length adjustment: 35
Effective length of query: 485
Effective length of database: 475
Effective search space:   230375
Effective search space used:   230375
Neighboring words threshold: 11
Window for multiple hits: 40
X1: 16 ( 7.4 bits)
X2: 38 (14.6 bits)
X3: 64 (24.7 bits)
S1: 41 (21.8 bits)
S2: 52 (24.6 bits)
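
The summary figures in the BLAST report above can be re-derived from its own numbers. The sketch below is only an illustration (the function names are hypothetical and not part of GapMind or BLAST). It assumes the usual Karlin-Altschul relations: bit score = (lambda * raw score - ln K) / ln 2, and E = effective search space * 2^(-bit score), using the gapped Lambda and K printed in the report.

```python
import math

# Values copied from the BLAST report above.
identities, aligned_columns = 338, 493
query_start, query_end, query_length = 29, 519, 520
database_length = 510          # single-sequence database
length_adjustment = 35
raw_score = 1702
gapped_lambda, gapped_k = 0.267, 0.0410


def percent_identity(matches, columns):
    """Identical positions as a percentage of alignment columns."""
    return 100.0 * matches / columns


def coverage(start, end, length):
    """Fraction of a sequence spanned by the alignment (1-based, inclusive)."""
    return (end - start + 1) / length


def effective_search_space(qlen, dblen, adjustment):
    """Product of the length-adjusted query and database lengths.

    Assumes a database holding one sequence, as in this report.
    """
    return (qlen - adjustment) * (dblen - adjustment)


def bit_score(raw, lam, k):
    """Convert a raw score to bits with the Karlin-Altschul parameters."""
    return (lam * raw - math.log(k)) / math.log(2)


def expect(bits, search_space):
    """Expected number of chance hits scoring at least `bits`."""
    return search_space * 2.0 ** (-bits)


space = effective_search_space(query_length, database_length, length_adjustment)
bits = bit_score(raw_score, gapped_lambda, gapped_k)
print(f"identity: {percent_identity(identities, aligned_columns):.1f}%")
print(f"query coverage: {coverage(query_start, query_end, query_length):.1%}")
print(f"effective search space: {space}")
print(f"bit score: {bits:.1f}")
print(f"E-value: {expect(bits, space):.3g}")
```

The search space works out to 485 * 475 = 230375, matching the report. The E-value computed this way is a very small positive number, consistent with the report's "Expect = 0.0".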

Align candidate RR42_RS11670 RR42_RS11670 (thymidine phosphorylase)
to HMM TIGR02645 (putative thymidine phosphorylase (EC 2.4.2.4))

# hmmsearch :: search profile(s) against a sequence database
# HMMER 3.3.1 (Jul 2020); http://hmmer.org/
# Copyright (C) 2020 Howard Hughes Medical Institute.
# Freely distributed under the BSD open source license.
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -
# query HMM file:                  ../tmp/path.carbon/TIGR02645.hmm
# target sequence database:        /tmp/gapView.2636.genome.faa
# - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - - -

Query:       TIGR02645  [M=493]
Accession:   TIGR02645
Description: ARCH_P_rylase: putative thymidine phosphorylase
Scores for complete sequences (score includes all domains):
   --- full sequence ---   --- best 1 domain ---    -#dom-
    E-value  score  bias    E-value  score  bias    exp  N  Sequence                                 Description
    ------- ------ -----    ------- ------ -----   ---- --  --------                                 -----------
   1.4e-197  642.9   0.4   1.6e-197  642.7   0.4    1.0  1  lcl|FitnessBrowser__Cup4G11:RR42_RS11670  RR42_RS11670 thymidine phosphory


Domain annotation for each sequence (and alignments):
>> lcl|FitnessBrowser__Cup4G11:RR42_RS11670  RR42_RS11670 thymidine phosphorylase
   #    score  bias  c-Evalue  i-Evalue hmmfrom  hmm to    alifrom  ali to    envfrom  env to     acc
 ---   ------ ----- --------- --------- ------- -------    ------- -------    ------- -------    ----
   1 !  642.7   0.4  1.6e-197  1.6e-197       1     491 [.      12     504 ..      12     506 .. 0.97

  Alignments for each domain:
  == domain 1  score: 642.7 bits;  conditional E-value: 1.6e-197
                                 TIGR02645   1 lkvrvlnidtgqekvlinskd...lkeekltpqdrvevrl..gkkslia..ivvssddlvesgevglse 62 
                                               l++++l+idt qe+v+++++d   +++e++++q rv+v    g +slia  +v  +d l++ ++vgls 
  lcl|FitnessBrowser__Cup4G11:RR42_RS11670  12 LTFKPLEIDTYQEHVIYMHRDcvvCHAEGFSAQTRVQVSTgaGAQSLIAtlNV-VGDALLAPSQVGLSS 79 
                                               689******************9999*************9722579****5333.4567*********** PP

                                 TIGR02645  63 evveeleekegdlvtvtpaekpeslrairkklrgkklkkeeikaivsdivdeklsdveisafltalain 131
                                                + ++l +  gd+++vt+a+ +eslra+r k++g++l+  ++ ai+ di   ++sdv+i+afl+a+a +
  lcl|FitnessBrowser__Cup4G11:RR42_RS11670  80 GASQQLGVAAGDIIAVTHAPGLESLRAVRSKIHGNPLDAPQLCAIMGDISAGRYSDVHIAAFLSACAGG 148
                                               ********************************************************************* PP

                                 TIGR02645 132 gldvdeiealtiamvetGetlewdrevivDkhsiGGvPGnktsllvvpivaaaGLliPktssraitsaa 200
                                               +++++e+++lt am +tG+ l+wdr+v++Dkh++GG+PGn+ts++vv+i aaaGLl+Pktssraits+a
  lcl|FitnessBrowser__Cup4G11:RR42_RS11670 149 RMTTQETVDLTCAMLDTGDRLDWDRPVVADKHCVGGLPGNRTSPIVVAICAAAGLLLPKTSSRAITSPA 217
                                               ********************************************************************* PP

                                 TIGR02645 201 GtaDvvevltrvelsveelkrivekvggclvWGGalnlaPaDDvlikverpLslDpeeqllasilskki 269
                                               GtaD++evltrv+ls+ e++r+ve+vg++lvWGG l l+PaDDvli+ver+L++D+++ql+as+lskk+
  lcl|FitnessBrowser__Cup4G11:RR42_RS11670 218 GTADTMEVLTRVTLSAAEMRRVVERVGAALVWGGSLTLSPADDVLIRVERALEIDSDAQLVASVLSKKL 286
                                               ********************************************************************* PP

                                 TIGR02645 270 aiGstkvliDiPvGpgakvksvkeaerLakdlielgkrlgvtvevvityGsqPiGraiGPaLeakeala 338
                                               a+Gst+vliD+P Gp+akv++  ++ rL+ +l+e+++ +g++v vv t+GsqP+Gr+iGPaLea+++la
  lcl|FitnessBrowser__Cup4G11:RR42_RS11670 287 AAGSTHVLIDVPLGPTAKVRTDADLARLRLLLEEVARAFGMHVLVVHTDGSQPVGRGIGPALEARDVLA 355
                                               ********************************************************************* PP

                                 TIGR02645 339 vLesskeaPtsLvekslaLaaiLLemggaaergaGkelarelLdsGkaleklkeiieaqGgdniksedi 407
                                               vL+  ++aP +L+ ++l L+a L+e++ga++ g+G +la+ lL+ G+a+ k++ i+eaqGg    ++++
  lcl|FitnessBrowser__Cup4G11:RR42_RS11670 356 VLQGAESAPADLRGRALLLSASLMEFCGAVPAGQGLALATRLLADGAAWAKFQAICEAQGGL---RQPG 421
                                               **************************************************************...8888 PP

                                 TIGR02645 408 evGklkadikaetdGyvteidnkaltriareaGaPedkgaGvklhvkvgdkvkkGdplytiyaeseekl 476
                                                +  l+++i a+ dG+vt+idn+ l r a++aGaP++k aG+ +hv+++d v++G+pl+ti+a ++ +l
  lcl|FitnessBrowser__Cup4G11:RR42_RS11670 422 SA-PLRREILAPADGIVTSIDNRLLSRAAKLAGAPNRKAAGIDMHVRLNDAVRAGQPLFTIHALAQGEL 489
                                               87.89**************************************************************** PP

                                 TIGR02645 477 dkaialaralepikv 491
                                                ++ +   ++++i++
  lcl|FitnessBrowser__Cup4G11:RR42_RS11670 490 AYSQNFLTTHPAINI 504
                                               **9999999998887 PP



Internal pipeline statistics summary:
-------------------------------------
Query model(s):                            1  (493 nodes)
Target sequences:                          1  (510 residues searched)
Passed MSV filter:                         1  (1); expected 0.0 (0.02)
Passed bias filter:                        1  (1); expected 0.0 (0.02)
Passed Vit filter:                         1  (1); expected 0.0 (0.001)
Passed Fwd filter:                         1  (1); expected 0.0 (1e-05)
Initial search space (Z):                  1  [actual number of targets]
Domain search space  (domZ):               1  [number of targets reported over threshold]
# CPU time: 0.02u 0.01s 00:00:00.03 Elapsed: 00:00:00.02
# Mc/sec: 9.39
//
[ok]
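
As a rough illustration of how the hmmsearch domain row above can be read, the sketch below splits that row on whitespace and computes how much of the model and of the protein the alignment spans. The field positions are an assumption based on the column header printed in this output, and the function and key names are hypothetical; this is not GapMind's own parser.

```python
# Domain row and lengths copied from the hmmsearch output above.
DOMAIN_ROW = (
    "   1 !  642.7   0.4  1.6e-197  1.6e-197       1     491 [.      12"
    "     504 ..      12     506 .. 0.97"
)
MODEL_LENGTH = 493      # "[M=493]" on the Query line
SEQUENCE_LENGTH = 510   # "510 residues searched"


def parse_domain_row(row):
    """Split one per-domain row into named fields.

    Assumed layout (from the header line): number, flag, score, bias,
    c-Evalue, i-Evalue, hmmfrom, hmmto, marker, alifrom, alito, marker,
    envfrom, envto, marker, acc.
    """
    f = row.split()
    return {
        "score": float(f[2]),
        "i_evalue": float(f[5]),
        "hmm_from": int(f[6]),
        "hmm_to": int(f[7]),
        "ali_from": int(f[9]),
        "ali_to": int(f[10]),
        "acc": float(f[15]),
    }


def span_fraction(start, end, length):
    """Fraction of `length` covered by the 1-based inclusive range."""
    return (end - start + 1) / length


dom = parse_domain_row(DOMAIN_ROW)
hmm_cov = span_fraction(dom["hmm_from"], dom["hmm_to"], MODEL_LENGTH)
seq_cov = span_fraction(dom["ali_from"], dom["ali_to"], SEQUENCE_LENGTH)
print(f"score {dom['score']} bits, i-Evalue {dom['i_evalue']:g}")
print(f"model coverage {hmm_cov:.1%}, sequence coverage {seq_cov:.1%}")
```

Here the alignment covers model positions 1 to 491 of 493 and residues 12 to 504 of the 510-residue candidate.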

This GapMind analysis is from Sep 17 2021. The underlying query database was built on Sep 17 2021.

About GapMind

Each pathway is defined by a set of rules based on individual steps or genes. Candidates for each step are identified by using ublast (a fast alternative to protein BLAST) against a database of manually-curated proteins (most of which are experimentally characterized) or by using HMMER with enzyme models (usually from TIGRFam). Ublast hits may be split across two different proteins.

A candidate for a step is "high confidence" if either:

where "other" refers to the best ublast hit to a sequence that is not annotated as performing this step (and is not "ignored").

Otherwise, a candidate is "medium confidence" if either:

Other blast hits with at least 50% coverage are "low confidence."

Steps with no high- or medium-confidence candidates may be considered "gaps." For the typical bacterium that can make all 20 amino acids, there are 1-2 gaps in amino acid biosynthesis pathways. For diverse bacteria and archaea that can utilize a carbon source, there is a complete high-confidence catabolic pathway (including a transporter) just 38% of the time, and there is a complete medium-confidence pathway 63% of the time. Gaps may be due to:

GapMind relies on the predicted proteins in the genome and does not search the six-frame translation. In most cases, you can search the six-frame translation by clicking on links to Curated BLAST for each step definition (in the per-step page).
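
For readers unfamiliar with the term, a six-frame translation translates a DNA sequence in three reading frames on each strand, so that proteins missed by gene prediction can still be found. The following is a minimal sketch of the idea, not the code used by GapMind or Curated BLAST. It assumes the standard genetic code, writes unrecognized codons as "X", and uses hypothetical function names.

```python
# Standard genetic code, listed in the conventional TCAG order (an assumption
# of this sketch; stop codons are written as "*").
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Return the reverse complement of an uppercase DNA string."""
    return dna.translate(COMPLEMENT)[::-1]


def translate(dna):
    """Translate complete codons from position 0; unknown codons become X."""
    codons = (dna[i:i + 3] for i in range(0, len(dna) - 2, 3))
    return "".join(CODON_TABLE.get(codon, "X") for codon in codons)


def six_frame_translation(dna):
    """Map frame labels (+1, +2, +3, -1, -2, -3) to translated peptides."""
    dna = dna.upper()
    frames = {}
    for strand, seq in (("+", dna), ("-", reverse_complement(dna))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(seq[offset:])
    return frames


for label, peptide in six_frame_translation("ATGGCCTAA").items():
    print(label, peptide)
```

A protein search against all six translated frames can then pick up coding regions that are absent from the predicted protein set.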

If you notice any errors or omissions in the step descriptions, or any questionable results, please let us know.

by Morgan Price, Arkin group, Lawrence Berkeley National Laboratory