South Green bioinformatics platform: Update 2015

Abstract: In 2015, the South Green Bioinformatics Platform is a network of 35 bioinformaticians from five biology research institutes working with two High-Performance Computing (HPC) data centres to develop and use new tools for NGS/Omic analytics of tropical and Mediterranean crops, in projects studying the relationship between genetic diversity, agronomic performance and response to selection. South Green is affiliated with the South regional centre of the French Institute of Bioinformatics (the French node of the European research infrastructure, ELIXIR). This community and the HPC data centres are all located in Montpellier, which facilitates close collaboration and significant pooling of resources to best meet the needs of the biologists in our research units. Since 2004, we have developed web-based applications with both generic and in-house components, for databases, analysis workflows and web interfaces, in order to: manage genetic and phenotypic information (e.g. TropGeneDB), analyse molecular markers and genetic diversity (e.g. SNiPlay), assemble transcriptomes (e.g. ESTtik), map RNA-Seq data (e.g. ARCAD), annotate and compare genomes (e.g. GNPAnnot), and reconstruct the evolutionary history of gene families by phylogenomics (e.g. GreenPhyl). We also participate in the analysis of numerous crop species, which requires computing and storage facilities as well as interoperable information systems. These species include rice (e.g. OryGenesDB), wheat, sorghum, sugarcane, banana (Banana Genome Hub), palms, yam, coffee (CGH), rubber, cacao (CocoaGenDB), cotton, apple, grapevine, olive, eucalyptus and cassava. To face the data deluge, we must increase our analytics capabilities. We document our operations at both the administrator/developer and user/scientist levels, to provide high-quality services and reproducible research. We pool our efforts in working groups on key themes such as GBS, at both the developer (extreme pair programming) and user (interdisciplinary knowledge exchange) levels.
We provide training sessions each year. Finally, we have implemented several instances of the Galaxy workflow manager and encapsulated our tools. These instances serve as a catalyst for massive NGS analyses, but storage capacity still needs to be increased and data management plans improved. (Author's abstract)
