ROSO: Software for Finding Optimized Oligonucleotide Probes for Microarrays
Abstract
With the growing volume of sequencing data, microarrays have emerged as a powerful tool for analyzing and understanding genome function. Making optimal use of microarrays requires a bioinformatic analysis of the genes involved. We have therefore developed ROSO (“Logiciel de Recherche et Optimisation de Sondes Oligonucléotidiques”, i.e. software for searching and optimizing oligonucleotide probes), a program that designs optimal oligonucleotide probes for microarrays.
ROSO lets users choose the size and type of probes, the target and ion concentrations, the hybridization temperature range, the rejection thresholds for secondary structure formation (hairpin and homoduplex), and the location of probes on genes.
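As an illustration only, the user-selectable settings listed above could be grouped into a single validated parameter object. This is a minimal sketch, not ROSO's actual interface: the class name, field names, units, and default values are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProbeDesignParams:
    """Hypothetical container for the user settings described in the text.

    All names and default values are illustrative assumptions, not ROSO's.
    """
    probe_length: int = 50               # probe size, in nucleotides
    probe_type: str = "DNA"              # probe type (assumed label)
    target_conc_molar: float = 1e-6      # target concentration, mol/L
    ion_conc_molar: float = 0.05         # monovalent ion concentration, mol/L
    tm_min_c: float = 60.0               # lower bound of hybridization range, deg C
    tm_max_c: float = 80.0               # upper bound of hybridization range, deg C
    hairpin_threshold: float = 0.0       # rejection threshold for hairpins (assumed scale)
    homoduplex_threshold: float = 0.0    # rejection threshold for homoduplexes (assumed scale)
    anchor_end: str = "3prime"           # gene end used as reference: "3prime" or "5prime"
    max_dist_from_end: int = 500         # n nucleotides from the chosen end

    def __post_init__(self):
        # Reject obviously inconsistent settings before any computation starts.
        if self.probe_length <= 0:
            raise ValueError("probe_length must be positive")
        if self.tm_min_c > self.tm_max_c:
            raise ValueError("tm_min_c must not exceed tm_max_c")
        if self.anchor_end not in ("3prime", "5prime"):
            raise ValueError("anchor_end must be '3prime' or '5prime'")
        if self.target_conc_molar <= 0 or self.ion_conc_molar <= 0:
            raise ValueError("concentrations must be positive")
```

Validating the settings once, at construction time, keeps later selection steps free of repeated sanity checks.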
Probe optimization is based on four selection criteria. The first is the specificity of the oligonucleotide, both with respect to the full set of genes under study and with respect to any additional genes with which the user wants to avoid cross-hybridization. Specificity is computed with the Blast program, whose parameters were tuned to detect a minimum homology of 70% over a length of 20 nucleotides. At the end of this analysis, ROSO defines regions with a homology ranging from 60% (specific region) to 100% (strictly homologous region). The second criterion is the formation of secondary structures (hairpin and homoduplex): ROSO designs probes free of such structures within the regions of minimal homology. The third criterion is the melting temperature (Tm). ROSO computes the Tm with the nearest-neighbor thermodynamic model and keeps the probes whose Tm values differ as little as possible from one another. Finally, for each gene the software selects the best possible probe with regard to its location (within a range of n nucleotides from the 3’ or 5’ end) and other stability criteria (GC content, GC clamp, etc.). ROSO also lets users calculate the Tm of control probes containing mismatches.
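To make the third criterion concrete, here is a minimal sketch of a nearest-neighbor Tm calculation for a perfectly matched DNA duplex. The abstract does not say which parameter set or salt correction ROSO uses; this sketch assumes the unified SantaLucia (1998) parameters and that paper's sodium correction. The function names, the default concentrations, and the handling of the strand concentration term are likewise assumptions for illustration.

```python
import math

# Assumed parameter set: SantaLucia (1998) unified nearest-neighbor values.
# Each entry is (delta H in kcal/mol, delta S in cal/(mol*K)) for a 5'->3' stack.
NN = {
    "AA": (-7.9, -22.2), "AT": (-7.2, -20.4), "TA": (-7.2, -21.3),
    "CA": (-8.5, -22.7), "GT": (-8.4, -22.4), "CT": (-7.8, -21.0),
    "GA": (-8.2, -22.2), "CG": (-10.6, -27.2), "GC": (-9.8, -24.4),
    "GG": (-8.0, -19.9),
}
# Initiation terms, applied once per duplex end depending on the terminal base.
INIT = {"G": (0.1, -2.8), "C": (0.1, -2.8), "A": (2.3, 4.1), "T": (2.3, 4.1)}
COMPLEMENT = str.maketrans("ACGT", "TGCA")
GAS_CONSTANT = 1.987  # cal/(mol*K)


def revcomp(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def nn_tm(seq, strand_conc=1e-6, na_conc=0.05):
    """Estimate Tm (deg C) of a perfect-match duplex; raises KeyError on non-ACGT."""
    seq = seq.upper()
    dh = ds = 0.0
    for end in (seq[0], seq[-1]):              # initiation, one term per end
        h, s = INIT[end]
        dh += h
        ds += s
    for i in range(len(seq) - 1):              # sum over adjacent stacks
        pair = seq[i:i + 2]
        h, s = NN.get(pair) or NN[revcomp(pair)]
        dh += h
        ds += s
    self_comp = seq == revcomp(seq)
    if self_comp:
        ds += -1.4                             # symmetry correction
    ds += 0.368 * (len(seq) - 1) * math.log(na_conc)   # sodium correction
    x = 1 if self_comp else 4
    return 1000.0 * dh / (ds + GAS_CONSTANT * math.log(strand_conc / x)) - 273.15
```

With such a function, keeping the probes whose Tm values are closest together amounts to computing `nn_tm` for every candidate and retaining, per gene, the candidate nearest to a common target temperature.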
Our optimization process is novel because it takes several criteria into account simultaneously (Tm, secondary structures, location of the probe on the gene, and homology) through successive queries. For problematic genes, the user can obtain different probes depending on the relative importance given to each criterion.
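One way to picture selection through successive queries is a chain of filters applied in a user-chosen priority order, where a filter that would eliminate every remaining candidate is skipped. The sketch below is an assumption about how such a scheme could work, not a description of ROSO's implementation; the `Candidate` fields, filter names, and thresholds are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    """Hypothetical summary of one candidate probe for a gene."""
    name: str
    max_homology: float            # highest homology (%) against other genes
    has_secondary_structure: bool  # hairpin or homoduplex detected
    tm: float                      # estimated Tm, deg C
    dist_from_end: int             # distance (nt) from the chosen gene end


# One predicate per criterion; `p` is a dict of thresholds.
FILTERS = {
    "homology": lambda c, p: c.max_homology <= p["max_homology"],
    "structure": lambda c, p: not c.has_secondary_structure,
    "tm": lambda c, p: abs(c.tm - p["target_tm"]) <= p["tm_tolerance"],
    "location": lambda c, p: c.dist_from_end <= p["max_dist_from_end"],
}


def select_probe(candidates, params, priority):
    """Apply the filters in priority order and return the best survivor.

    A filter that would leave no candidate is skipped, so criteria listed
    first take precedence for problematic genes. Returns None if there are
    no candidates at all.
    """
    pool = list(candidates)
    if not pool:
        return None
    for criterion in priority:
        kept = [c for c in pool if FILTERS[criterion](c, params)]
        if kept:
            pool = kept
    # Tie-break among survivors: closest Tm to the target.
    return min(pool, key=lambda c: abs(c.tm - params["target_tm"]))
```

For a gene where no candidate satisfies every criterion, changing the order in `priority` changes which probe is returned, which mirrors the idea that the result depends on the relative importance given to each criterion.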
Several kinds of validation were performed. First, simulated data were used to compare ROSO with the reference software Oligo6® and Mfold. ROSO's estimates of Tm and secondary structures were found to be equivalent to or better than those of Oligo6® and Mfold for probe sizes between 15 and 70 nucleotides. Second, ROSO was used to design two sets of 541 and 609 probes for specific bacterial microarrays corresponding to Buchnera aphidicola and Ralstonia solanacearum. Human and murine probe sets were also designed.
This work is carried out in collaboration with UMR 5558 (UCB Lyon) and with the support of the Genopôle Rhône-Alpes. A first version of the ROSO web site has been developed and will soon be available on the PBIL server: http://pbil.univ-lyon1.fr.