Integration of microarray data for a comparative study of classifiers and identification of marker genes

Daniel Berrar, B Sturgeon, I Bradbury, Stephen Downes, Werner Dubitzky

    Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

    Abstract

    Novel diagnostic tools promise the development of patient-tailored cancer treatment. However, one major step towards individualized therapy is to use a combination of various data sources, e.g. transcriptomic, proteomic, and clinical data. We have integrated clinical data and lung cancer microarray data that were generated on two different oligonucleotide platforms. We were interested in whether the prediction of survival outcome benefits from the integration of clinical and transcriptomic data. In addition, we attempted to identify those genes whose expression profiles correlate with survival outcome. We applied five machine learning techniques to predict survival risk groups, and we compared the models with respect to their performance and general user acceptance. Based on quantitative and qualitative evaluation criteria, we chose decision trees as the most relevant technique for this type of analysis. Our in silico analysis corroborates the role of numerous marker genes already described in lung adenocarcinomas. In addition, our study reveals a set of highly interesting genes whose expression profiles correlate with genetic risk groups of unexpected survival outcomes.
    Language: English
    Title of host publication: Unknown Host Publication
    Pages: 147-162
    Number of pages: 386
    Publication status: Published - 2005
    Event: METHODS OF MICROARRAY DATA ANALYSIS IV - Durham, NC
    Duration: 1 Jan 2005 → …

    Fingerprint

    Survival
    Transcriptome
    Genes
    Decision Trees
    Information Storage and Retrieval
    Oligonucleotides
    Computer Simulation
    Proteomics
    Lung Neoplasms
    Therapeutics
    Neoplasms
    Machine Learning
    Adenocarcinoma of lung

    Cite this

    Berrar, D., Sturgeon, B., Bradbury, I., Downes, S., & Dubitzky, W. (2005). Integration of microarray data for a comparative study of classifiers and identification of marker genes. In Unknown Host Publication (pp. 147-162)
    @inproceedings{df3eb460211a4c72b547f41174367456,
    title = "Integration of microarray data for a comparative study of classifiers and identification of marker genes",
    author = "Daniel Berrar and B Sturgeon and I Bradbury and Stephen Downes and Werner Dubitzky",
    note = "4th International Conference for the Critical Assessment of Microarray Data Analysis (CAMDA 2003), Durham, NC, NOV 12-14, 2003",
    year = "2005",
    language = "English",
    pages = "147--162",
    booktitle = "Unknown Host Publication",

    }
