ToxSci Advance Access published online on March 30, 2009
Toxicological Sciences, doi:10.1093/toxsci/kfp061
Published by Oxford University Press 2009.
Towards a public toxicogenomics capability for supporting predictive toxicology: Survey of current resources and chemical indexing of experiments in GEO and ArrayExpress

1 U.S. EPA/Office of Research and Development (ORD)/National Health & Environmental Effects Research Laboratory (NHEERL), williams.clarlynda{at}epa.gov
2 Lockheed Martin (Contractor to U.S. EPA), United States, wolf.marti{at}epa.gov
3 U.S. EPA/Office of Research and Development (ORD)/National Center for Computational Toxicology (NCCT), richard.ann{at}epa.gov
Institutional Address: US Environmental Protection Agency, 109 T.W. Alexander Drive, Research Triangle Park, NC 27711
Corresponding Author: Ann Richard, Mail Drop D343-03, 109 TW Alexander Dr., US Environmental Protection Agency, Research Triangle Park, NC 27711, Phone: 919-541-3934, Fax: 919-685-3263
Received January 18, 2009; revision received March 20, 2009; accepted March 23, 2009
Abstract
A publicly available toxicogenomics capability for supporting predictive toxicology and meta-analysis depends on the availability of gene expression data for chemical treatment scenarios, the ability to locate and aggregate such information by chemical, and broad data coverage within the chemical, genomics and toxicological information domains. This capability also depends on common genomics standards, protocol description, and functional linkages of diverse public Internet data resources. We present a survey of public genomics resources from these vantage points and conclude that, despite progress in many areas, the current state of the majority of public microarray databases is inadequate for supporting these objectives, particularly with regard to chemical indexing. To begin to address these inadequacies, we focus chemical annotation efforts on experimental content contained in the two primary public genomic resources: ArrayExpress and Gene Expression Omnibus (GEO). Automated scripts and extensive manual review were employed to transform free text experiment descriptions into a standardized, chemically indexed inventory of experiments in both resources. These files, which include top-level summary annotations, allow for identification of current chemical-associated experimental content, as well as chemical exposure-related (or "Treatment") content of greatest potential value to toxicogenomics investigation. With these chemical index files, it is possible for the first time to assess the breadth and overlap of chemical study space represented in these databases, and to begin to assess the sufficiency of data with shared protocols for chemical similarity inferences. Chemical indexing of public genomics databases is an important first step towards integrating chemical, toxicological and genomics data into predictive toxicology.
Key Words: microarray; chemical; toxicogenomics; toxicity; prediction.