About the Project
The Data Integration project aims to develop software tools and
databases to facilitate seamless integration of HIV data
produced by CAPRISA laboratories located at the University of Cape Town
(UCT), the Medical Research Council (MRC) in Durban, and the National
Institute for Communicable Diseases (NICD) in Johannesburg.
The project has two components:
(i) The Sequence Assembly and Annotation Pipeline
(restricted access)
The pipeline is a web-based application that accepts either
chromatogram files or FASTA-format files and performs semi-automated
sequence assembly and quality control. Finished sequences are
annotated with protein-coding sequence, reference strain coordinates,
subtype, and gene name. The pipeline is password-protected.
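For readers unfamiliar with the FASTA format mentioned above, the following is a minimal Python sketch of reading FASTA records and computing one simple quality measure. It is illustrative only: the pipeline's actual quality-control checks are not described here, and the function names, the example records, and the "fraction of ambiguous bases" measure are all assumptions.

```python
def read_fasta(lines):
    """Yield (header, sequence) pairs from an iterable of FASTA lines."""
    header, chunks = None, []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def ambiguous_fraction(sequence):
    """Fraction of characters that are not A, C, G or T.

    Hypothetical quality measure, chosen only for illustration.
    """
    if not sequence:
        return 0.0
    unambiguous = sum(sequence.upper().count(base) for base in "ACGT")
    return 1.0 - unambiguous / len(sequence)


# Hypothetical example input, not real project data.
example = """>seq1 hypothetical example record
ACGTACGTNN
ACGT
>seq2
ACGT
""".splitlines()

records = list(read_fasta(example))
for header, sequence in records:
    print(header, len(sequence), round(ambiguous_fraction(sequence), 3))
```

A sequence split over several lines is joined into one string per record, which is the usual convention for multi-line FASTA.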
(ii) The Molecular Integration Database (MID)
(restricted access)
The MID contains molecular, immunological, and clinical data.
Sequences are submitted via the sequence assembly pipeline.
Immunological and clinical data are submitted by the various labs
and are imported into the MID with specialized import applications.
Data in the MID can be queried by logging into a password-protected,
web-based query interface built on BioMart (http://www.biomart.org).
The user first selects a dataset and then filters the data using
specific criteria. The output data types and format can also be
specified.
The MID will allow scientists to integrate and extract data to answer
the specific research questions set by the Acute Infection Study.
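The query steps described above (choose a dataset, apply filters, choose the output columns and format) correspond to the way BioMart queries are commonly expressed as XML. The following is a minimal Python sketch that builds such a query document. The dataset, filter, and attribute names are hypothetical placeholders, not the MID's actual names, and the sketch does not show how the restricted MID interface is accessed.

```python
import xml.etree.ElementTree as ET


def build_biomart_query(dataset, filters, attributes, formatter="TSV"):
    """Build a BioMart-style XML query string.

    dataset    -- name of the dataset to query
    filters    -- dict mapping filter name to the value to match
    attributes -- list of output column names
    formatter  -- output format, for example "TSV" or "CSV"
    """
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": formatter,
        "header": "1",       # include a header row in the output
        "uniqueRows": "1",   # drop duplicate rows
    })
    ds = ET.SubElement(query, "Dataset",
                       {"name": dataset, "interface": "default"})
    # Filters restrict which rows are returned.
    for name, value in filters.items():
        ET.SubElement(ds, "Filter", {"name": name, "value": str(value)})
    # Attributes select which columns are returned.
    for name in attributes:
        ET.SubElement(ds, "Attribute", {"name": name})
    return ET.tostring(query, encoding="unicode")


# All names below are hypothetical and used only for illustration.
xml_query = build_biomart_query(
    dataset="mid_sequences",
    filters={"subtype": "C", "gene_name": "env"},
    attributes=["sequence_id", "subtype", "gene_name"],
)
print(xml_query)
```

In a typical BioMart installation, a document like this is submitted to the server's query service, which returns the selected columns in the requested format.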
People
SANBI
Prof Winston Hide - PI
Prof Heikki Lehvaslaiho - Project Manager
Allan Kamau - Database Developer/Administrator
Alan Powell - Developer
Anelda Boardman - Bioinformatician
UCT
Prof Cathal Seoighe
Contact Details
Prof Heikki Lehvaslaiho
The South African National Bioinformatics Institute
University of the Western Cape
Modderdam Road/Private Bag X17
Bellville 7535
South Africa
Tel: +27 (0)21 959-3465
Fax: +27 (0)21 959-2512