Protein Structure and Function Prediction
The structure prediction Meta Server offers a gateway to
many high-quality fold recognition servers and provides an
infrastructure and main interface to several highly
reliable consensus methods. The Meta Server represents the
first step in our fold prediction strategy.
The Live Bench Project is a continuous benchmarking
program. Every week, sequences of newly released PDB
proteins are submitted to participating fold
recognition servers. The results are collected and
continuously evaluated using automated model assessment
programs. A summary of the results is produced after
several months of data collection. To participate, servers
must delay updating their structural template libraries by
one week.
The Gene Relational Database facilitates the comparison of protein families using BASIC.
BASIC is a novel, sensitive approach for recognition of
distant similarity between proteins based on consensus
alignments of meta profiles. Specifically, BASIC
compares sequence profiles combined with predicted
secondary structure by utilizing several scoring systems
and alignment algorithms. In our benchmarking tests,
BASIC outperforms many individual servers, including
fold recognition servers, and it can compete with meta
predictors that base their strength on the structural
comparison of models. In addition, BASIC has
high-throughput capability and enables detection of very
distant relationships even if the tertiary structure of the
reference protein is not known.
GRDB operates on a grid of computers placed in different institutions.
TOM is used for task management.
The AutoMotif Server (AMS) predicts functional motifs in
proteins based only on sequence information. A list of
possible functional sites for a given query protein is
constructed using its sequence and the database of proteins
annotated for a certain type of biochemical process by
the Swiss-Prot database. The whole query protein sequence is
dissected into overlapping short segments. All segments are
then projected into one abstract space of sequence
fragments by different representations (10 different
embeddings). Those representations are compared with the
database of representations of known functional motifs
using the support vector machine (SVM) approach. The
efficiency of the classification for each type of active
site and the predictive power of the method are estimated
using leave-one-out tests and presented here.
Registered users can access all sites annotated by
the Swiss-Prot database (version 4.2), add new proteins with
annotated segments (positive instances), or change
attributes of already included proteins. All data,
biological information, theoretical classification models,
and the automatic functional predictor are updated after each
major upgrade of the Swiss-Prot DB.
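The dissection of a query sequence into overlapping short segments, as described for the AutoMotif Server above, can be sketched as a simple sliding window. This is only an illustration: the function name, the segment length of 9 residues, and the step of 1 are assumptions, since the text does not state the values AMS actually uses, and the embedding and SVM steps are not shown.

```python
def overlapping_segments(sequence, length=9, step=1):
    """Split a protein sequence into overlapping fixed-length segments.

    Returns a list of (start_index, segment) pairs. The default length
    and step are illustrative assumptions, not values taken from AMS.
    """
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    # A sequence shorter than `length` yields no segments.
    return [
        (start, sequence[start:start + length])
        for start in range(0, len(sequence) - length + 1, step)
    ]


# Example: an 8-residue toy sequence cut into 5-residue windows.
for start, segment in overlapping_segments("MKTAYIAK", length=5):
    print(start, segment)
```

Each segment would then be converted to a numerical representation before classification; keeping the start index makes it possible to map a predicted motif back to its position in the query protein.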
Small Molecules Tools
Cancer Drug Server
This service facilitates the analysis of correlation between the
activity of many chemical substances tested in the NCI60 anti-cancer
trials and the expression of genes deduced from cDNA microarray results
obtained for 60 cancer cell lines. Because both types of experiments
were conducted on the same group of cells, the results can be directly
compared with each other. On the one hand, it may seem unlikely to find a
gene significantly affecting the action of a drug from a list of almost
10 000 investigated human genes just by collecting 60 measurements.
On the other hand, if each measurement could provide just 2 possible
values, the total number of possible results would be 2^60 = 1024^6,
a number with 19 digits, much larger than the number of nucleotides
in gene sequences stored in current databases. Thus, 60 experiments
are sufficient to create a substance-specific or gene-specific activity
profile. The results can point to interesting genes or groups of genes
interacting in some way with the compound of interest.
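The counting argument and the idea of comparing profiles across the same cell lines can be checked with a short sketch. The text does not say which correlation measure the server uses, so the Pearson coefficient below is an assumption, and the function name and toy values are purely illustrative.

```python
import math

# Two possible values per measurement over 60 cell lines.
n_profiles = 2 ** 60
assert n_profiles == 1024 ** 6
print(n_profiles, "has", len(str(n_profiles)), "digits")


def pearson(xs, ys):
    """Pearson correlation between two equally long profiles.

    Here xs could be the activity of one compound and ys the expression
    of one gene, both measured over the same cell lines (an assumed
    setup for illustration only).
    """
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("profiles must have equal length of at least 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("a constant profile has no defined correlation")
    return cov / math.sqrt(var_x * var_y)


# Toy profiles over four hypothetical cell lines (not real data).
activity = [0.1, 0.4, 0.35, 0.8]
expression = [1.0, 2.1, 1.9, 3.7]
print(round(pearson(activity, expression), 3))
```

Ranking all genes by the absolute value of such a coefficient against one compound's activity profile is one simple way to shortlist candidate genes for closer inspection.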
Ligand.Info is a system consisting of a Small-Molecule Database and
a Java-based tool for virtual high-throughput screening of
new potential drugs. The Ligand.Info Meta-Database contains
various publicly available sets of small molecules such as
Harvard's ChemBank, Hetero Atoms from Protein Data Bank,
KEGG Ligand Database, The Open NCI Database, and others.
The total size of the database is 1 million entries. The
Ligand.Info server is based on the idea that small
molecules with similar structure have similar biological
properties. The developed system enables a fast and
sensitive search for similar compounds using structural
indices. The tool can interactively cluster sets of
molecules on the user side and automatically download
similar molecules from the Meta-Database. All downloaded
molecules are automatically displayed using the structure