Compression algorithm for pre-simulated Monte Carlo p-value functions: Application to the ontological analysis of microarray studies
Summary, in English
Monte Carlo simulation is frequently used to compute p-values for test statistics whose null distributions are unknown. These computations can be exceedingly time-consuming, and in such cases pre-computed simulations can be used to increase speed. This approach is attractive in principle but complicated in practice, because the pre-computed data can be prohibitively large. We developed an algorithm for computing size-reduced representations of Monte Carlo p-value functions. We show that, in typical settings, this algorithm reduces the size of the pre-computed data by several orders of magnitude while provably bounding the approximation error at an explicitly controllable level. The algorithm is data-independent, fully non-parametric, and easy to implement. We demonstrate its practical utility by applying it to the threshold-free ontological analysis of microarray data. The presented algorithm simplifies the use of pre-computed Monte Carlo p-value functions in software, including specialized bioinformatics applications.
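The summary does not spell out the algorithm itself, so the following Python sketch is only an illustration of the general idea, not the published method. It assumes the common add-one Monte Carlo estimator p(t) = (1 + #{s_i >= t}) / (N + 1) and a hypothetical compression rule that keeps order statistics on a roughly geometric grid of ranks, so that a looked-up p-value is never below the full-simulation value and exceeds it by at most a factor (1 + eps). The function names and the parameter `eps` are assumptions made for this example.

```python
import bisect
import math
import random


def compress_pvalue_function(null_stats, eps):
    """Keep a geometric grid of order statistics (illustrative rule, assumed).

    Returns (counts, neg_thresholds, n): counts[j] is a rank k, and
    -neg_thresholds[j] is the k-th largest simulated statistic.
    """
    desc = sorted(null_stats, reverse=True)
    n = len(desc)
    counts, neg_thresholds = [], []
    k = 0
    while k < n:
        # The next kept rank is at most (1 + eps) * (previous rank + 1).
        k = min(n, math.floor((1.0 + eps) * (k + 1)))
        counts.append(k)
        neg_thresholds.append(-desc[k - 1])  # negated so the list is ascending
    return counts, neg_thresholds, n


def approx_pvalue(table, t):
    """Conservative p-value for statistic t from the reduced table."""
    counts, neg_thresholds, n = table
    # First kept threshold strictly below t: fewer than counts[j] simulated
    # values can be >= t, so counts[j] bounds (1 + exceedance count) from above.
    j = bisect.bisect_right(neg_thresholds, -t)
    upper = counts[j] if j < len(counts) else n + 1
    return upper / (n + 1)


def exact_pvalue(sorted_asc, t):
    """Add-one Monte Carlo p-value computed from all simulated values."""
    n = len(sorted_asc)
    return (1 + n - bisect.bisect_left(sorted_asc, t)) / (n + 1)


if __name__ == "__main__":
    random.seed(0)
    sims = [random.gauss(0.0, 1.0) for _ in range(20000)]
    table = compress_pvalue_function(sims, eps=0.05)
    print("stored thresholds:", len(table[0]), "of", len(sims))
    print("approx:", approx_pvalue(table, 2.5))
    print("exact: ", exact_pvalue(sorted(sims), 2.5))
```

Under this rule the stored table grows roughly logarithmically with the number of simulations, which is one simple way to trade a controlled relative error for a much smaller pre-computed data set.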
Publishing year
2008
Language
English
Pages
768-772
Publication/Series
Pattern Recognition Letters
Volume
29
Issue
6
Document type
Journal article
Publisher
Elsevier
Topic
- Medical Genetics
Keywords
- ontological analysis
- microarrays
- biomedical pattern recognition
- bioinformatics
- data compression
Status
Published
ISBN/ISSN/Other
- ISSN: 0167-8655