Enzymes are extremely powerful natural catalysts, able to perform almost any type of chemical reaction while remaining mild and highly specific. In fact, the delicate functioning of enzymes forms the basis of every living creature. The catalytic potential of enzymes is increasingly appreciated by industry, as many industrial processes rely on these sophisticated catalysts. However, the range of reactions catalyzed by enzymes is restricted, as enzymes have only evolved to catalyze reactions that are physiologically relevant. Furthermore, enzymes have adapted to the direct (cellular) environment in which they have to function (e.g. operating at ambient temperature, resilient towards proteolysis, with a catalytic turnover rate matched to metabolic enzyme partners). This excludes the existence of enzymes that do not fit within the boundaries set by nature. It is a great challenge to go beyond these natural boundaries and develop methodologies to design ‘unnatural’, tailor-made enzymes. Ideally, it should become possible to (re)design enzymes to convert pre-defined substrates. Such designer enzymes could theoretically exhibit unsurpassed catalytic properties and will obviously be of significant interest for industrial biotechnology. The OXYGREEN project aims at the design and construction of novel oxygenating enzymes (designer oxygenases) for the production of compounds that can be used in medicine, food and agriculture, and at the development of novel, powerful and generic enzyme redesign tools for this purpose. The enzymes and whole-cell biocatalysts to be developed should catalyze the specific incorporation of oxygen to afford the synthesis of bioactive compounds in a selective and clean way, with minimal side products and without the use of toxic materials. For this, generic platform technologies (novel high-throughput methodology and methods for engineering dedicated host cells) will be developed that allow effective structure-inspired directed evolution of enzymes.
The four principal objectives of the ELM consortium are to (1) design, (2) develop, (3) maintain and (4) apply a novel infrastructure resource devoted to the prediction of functional motifs in protein sequences. ELM (short for Eukaryotic Linear Motif) will be both “virtual” – provided electronically – and “distributed” – provided by a network of sites. Effective prediction of short motifs will require the implementation of novel context-dependent filtering software. The ELM resource will be made available to researchers as WWW servers and as a package for local installation.
The four principal objectives correspond approximately to overlapping phases of the ELM project:
Design: The initial design requirements are to integrate: (I) a relational database; (II) data input requirements; (III) new application software; (IV) private consortium web servers; and (V) public web servers. The partners will collectively contribute both the inferred biological needs and the underlying technical specifications. A document will be prepared that describes the internal ELM architecture. Subsequent revisions to the document will be ratified by all ELM partners. A web-based input form will ensure that data input meets the internal specification.
Develop: An extensive development phase is required to create the software needed to query ELM effectively and to generate useful predictions. Various context filters will be developed as separate modules. The easiest filter modules will be completed first, and the more complex filters later in the project. As the modules are completed, they will be integrated into the ELM resource as serial filters. For optimal performance, the fastest-executing filters will be accessed first, so that only the surviving motif candidates are passed on to the slower filters.
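The serial filtering strategy described above can be sketched as follows. This is a minimal illustration only; the filter predicates, cost values and candidate representation are hypothetical, not part of the ELM specification:

```python
# Sketch of a serial motif-filter pipeline: filters run in order of
# increasing cost, so the expensive filters only ever see candidates
# that survived the cheaper, faster stages.

def context_filter(predicate, cost):
    """Wrap a filter predicate with a relative execution cost."""
    return {"keep": predicate, "cost": cost}

def run_pipeline(candidates, filters):
    # Cheapest filters first; each stage passes only survivors onward.
    for f in sorted(filters, key=lambda f: f["cost"]):
        candidates = [c for c in candidates if f["keep"](c)]
    return candidates

# Hypothetical candidate motifs, as (sequence, start position) tuples.
filters = [
    context_filter(lambda c: len(c[0]) >= 4, cost=1),   # fast length check
    context_filter(lambda c: "P" in c[0], cost=5),      # composition check
    context_filter(lambda c: c[1] < 100, cost=20),      # slow "context" check
]
candidates = [("SPQR", 10), ("AP", 3), ("QQQQ", 200), ("PPLP", 50)]
print(run_pipeline(candidates, filters))  # -> [('SPQR', 10), ('PPLP', 50)]
```

Ordering by cost does not change which candidates survive (the filters are independent predicates); it only minimises the work done by the slowest stages.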
Maintain: The ELM servers will be continually maintained and extended as the project matures. Data will be continually added into the ELM resource and older data will be revised as new biological findings are published in the literature. While many motifs are already known, during the project there will be a steady stream of new motif publications. In the mature phase of ELM, releases will be scheduled at 6 month intervals.
Apply: As the ELM resource matures, it will become increasingly powerful and useful to experimentalists. Predicted motifs will suggest unexpected functional interactions or help to confirm suspected but poorly characterised ones. The consortium partners, and their close collaborators in the host institutes, will investigate predicted motifs relevant to their research interests. Verification (and to an extent exclusion) of predicted linear motifs will lead to enhanced understanding of multifunctional multidomain proteins, many of which assemble (via linear motifs) into huge complexes whose aggregate functions are hard to investigate with current experimental approaches.
The new partner will develop an additional ab initio filter to estimate the conformational preferences of parts of proteins. The main objective of the task proposed by the new partner is to provide a reliable tool for the detection of protease target sites. This new objective represents an expansion of the ongoing work, complementary to the objectives outlined in WP2 and WP3.
Our particle-based method allows us to synthesise high-complexity peptide arrays by combinatorial synthesis at an unrivalled price. We plan to develop this new technology further, up to the level of robust prototype machines, and to couple it to bioinformatics and readout tools. Together, our procedures should boost the field of proteomics in a similar way as lithographic technologies did for genomics. Central to our novel method are activated chemical building blocks that are “frozen” within solid amino acid particles. This allows us to use a colour laser printer to send them to defined addresses on a 2D support, where the particles are simply melted to induce a spatially defined coupling reaction of the now-freed amino acid derivatives. Through repeated printing and melting cycles, this simple trick yields high-complexity peptide arrays. Based on existing pre-prototypes, we will develop a user-friendly peptide laser printer that delivers our 20 different amino acid toners to spatially defined addresses on a support at high resolution (WP1), and a scanner that reads out the large formats delivered by the peptide laser printer with exceptional speed and sensitivity (WP2). Scaling up the production of amino acid toners and array supports, further bottlenecks in the output of peptide arrays, is tackled in WP3. This should allow us to increase the output of individual peptide spots from currently 0.5 million to >10 million peptides per month. Finally, to foster a market for high-complexity peptide arrays, we will work out paradigmatic application examples in WP4. These aim to screen directly for antibiotic or apoptosis-inducing D-peptides, and for the comprehensive readout of the different antibodies that patrol the serum of autoimmune patients. Based on user-friendly prototype machines, on first paradigmatic application examples for high-complexity peptide arrays, and shielded by a strong patent, the participating SMEs will commercialise this new technology.
This project concerns the design of cryptographic schemes that remain secure even when implemented on insecure devices. The motivation for this work comes from the observation that most real-life attacks on cryptographic devices do not break their mathematical foundations, but exploit vulnerabilities in their implementations. This concerns both cryptographic software executed on PCs (which can be attacked by viruses) and implementations in hardware (which can be subject to side-channel attacks). Traditionally, fixing this problem was left to practitioners, since it was commonly believed that theory could not be of any help here. However, exciting new results in cryptography suggest that this view was too pessimistic: there exist methods for designing cryptographic protocols in such a way that they are secure even if the hardware on which they are executed cannot be fully trusted. The goal of this project is to investigate these methods further, unify them in a solid mathematical theory (many of them were developed independently), and propose new ideas in this area. The project will be mostly theoretical (although some practical experiments may be performed). Our main interest lies in the theory of private circuits, the bounded-retrieval model, physically observable cryptography, and human-assisted cryptography. We view these theories merely as a point of departure, since the field is largely unexplored and we expect to witness completely new ideas soon.
The objective of the BioSapiens Network of Excellence is to provide a large-scale effort to annotate the human genome, using both informatics tools and input from experimentalists. The Network will create a European Virtual Institute for Genome Annotation, bringing together many of the best laboratories in Europe. This institute will help to improve bioinformatics research in Europe and encourage cooperation between laboratories.
The BioSapiens network also aims to integrate experimentalists and bioinformaticians through a directed programme of genome analysis focused on specific biological problems. The annotations generated by the Institute will be placed in the public domain and made easily accessible on the web. This will be achieved initially through a distributed annotation system (DAS), which will evolve to take advantage of new developments in the GRID.
The Institute will establish a permanent European School of Bioinformatics, to train bioinformaticians and to encourage best practice in the exploitation of genome annotation data by biologists. The courses and meetings will be open to all scientists throughout Europe, and available at all levels, from basic courses for experimentalists to more advanced training for experts. The BioSapiens NoE will increase European competitiveness through new discoveries, increased integration, expert training and improved tools and services, and enhance Europe’s role in the academic and industrial exploitation of genomics.
Genetically Modified Microbes (GMM) are a biotechnological alternative for addressing various environmental problems, such as the remediation of polluted sites, where microbes with recombinant catabolic pathways are envisaged as the solution for the removal of toxic organic compounds. Moreover, the exploration and exploitation of synergistic interactions between plants and microbes for phytoremediation is also a target for solving contamination problems. Critical to the safe application of recombinant microbes in the environment, and to addressing public concerns, is adequate information on the safety-related properties of the microbes in question. Current whole-genome sequencing efforts on relevant microbes provide a unique opportunity to extract completely new safety-related information, to conduct experiments to generate important new data, and to create new tools for increasing the degree of predictability of the behaviour of strains designed for applications in the open environment or in industrial bioreactors.
One of the microorganisms with current applications in Biotechnology is Pseudomonas putida, a paradigm of a metabolically versatile microorganism which recycles organic wastes in aerobic compartments of the environment, and thereby plays a key role in the maintenance of environmental quality. The strain KT2440 is the most extensively characterised and best understood strain of P. putida. KT2440 is a nonpathogenic bacterium certified in 1981 by the Recombinant DNA Advisory Committee (RAC) of the United States National Institutes of Health as the host strain of the first Host-Vector Biosafety (HV1) system for gene cloning in Gram-negative soil bacteria. Since then, KT2440 has been used world-wide as the host of choice for environmental applications involving expression of cloned genes. This strain is one of the few nonpathogenic microbes whose whole genome is being sequenced, by a P. putida genome project currently in progress in Germany. The sequence data generated in the genome project is being made public at appropriate intervals (a 10-fold genome equivalent of raw sequence data is already available) and will constitute an invaluable resource for this project. Therefore, this microorganism, its recombinant derivatives and the body of knowledge accumulated over the last 20 years on its genetics, physiology and biochemistry make it an ideal and friendly microbe for safe biotechnological applications in the environment.
The major aim of this project is to lay the basis for reducing our contamination problems in a rational, environmentally friendly and safe manner, by developing P. putida strains useful for designing environmental treatment systems in harmony with the biosphere.
Deciphering the information in genome sequences in terms of the biological function of genes and proteins is a major challenge of the post-genomic era. Currently, the bulk of function assignments for newly sequenced genomes are performed using bioinformatics tools that infer the function of a gene on the basis of sequence similarity with other genes of known function. It is now well recognised that these primary, sequence similarity-based function annotation procedures are frequently inaccurate and error prone. Continuing to use them without clearly defining the limits of their applicability would lead to an unmanageable propagation of errors that could jeopardise progress in Biology. On the other hand, various novel bodies of data and resources are becoming available. These provide information on context-based aspects of the biological function of genes, namely on physical and functional interactions between genes and proteins, and on whole networks and processes. In parallel, structural genomics efforts worldwide are providing much better coverage of the structural motifs adopted by proteins and of their interactions. The availability of these additional and novel data offers an unprecedented opportunity for the development of methods for incorporating higher-level functional features into the annotation pipeline.
The GeneFun project aims at addressing these two important issues. The issue of annotation errors will be addressed by developing criteria for evaluating the reliability of the annotations currently available in databases. These criteria will be used to assign reliability scores to these annotations and will be incorporated into standard annotation pipelines for future use. The issue of incorporating higher-level features into functional annotations will be addressed by combining sequence and structure information in order to identify non-linear functional features (e.g. interaction sites), and by integrating available and newly developed methods for inferring function from higher-level and context-based information (protein domain architecture, protein-protein interaction, genomic context such as gene order, etc.).
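The idea of attaching a reliability score to a transferred annotation can be illustrated with a toy sketch. The evidence features and weights below are arbitrary placeholders chosen for illustration, not the criteria the GeneFun project will develop:

```python
# Toy sketch of annotation reliability scoring (NOT the GeneFun method):
# combine simple evidence features about a sequence-similarity-based
# annotation transfer into a single score in [0, 1].

def reliability_score(seq_identity, alignment_coverage, source_is_experimental):
    """seq_identity and alignment_coverage are fractions in [0, 1];
    source_is_experimental is True if the source annotation was
    experimentally verified. Weights are illustrative placeholders."""
    score = 0.5 * seq_identity + 0.3 * alignment_coverage
    if source_is_experimental:
        score += 0.2  # bonus for experimentally supported source annotations
    return min(score, 1.0)

# A close, well-covered match to an experimentally characterised gene
# scores high; a weak, partial match to a computationally annotated
# gene scores low.
print(reliability_score(0.9, 0.8, True))    # ~0.89
print(reliability_score(0.3, 0.4, False))   # ~0.27
```

In a real pipeline, such a score would be stored alongside the annotation so that downstream users can filter out low-confidence transfers rather than propagating them.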
To achieve these aims, several European groups with strong track records in developing novel methods and analyses in comparative genomics, structural- and systems-oriented bioinformatics, and information technology have teamed up with an experimental group from Canada, which is well known for its outstanding achievements in the field of structural and functional proteomics. The expected outputs of the GeneFun project are: improved procedures for inferring function on the basis of sequence similarity, a set of procedures for predicting non-linear functional features from sequence and 3D structure in a more automated way, and benchmarked procedures for predicting context-based functional features. Major efforts will be devoted to devising protocols that optimally combine the results from several methods. In particular, Web-based servers for the individual and combined procedures will be developed and made available to the scientific community. The community will be introduced to these new tools through open workshops and training sessions.