Abstract
The Latent Block Model (LBM) is a prominent model-based co-clustering method, returning parametric representations of each block-cluster and allowing the use of well-grounded model selection methods. Although the LBM has been adapted to accommodate various feature types, it cannot be applied to datasets consisting of multiple distinct sets of features, termed views, for a common set of observations. The multi-view LBM is introduced herein, extending the LBM method to multi-view data, where each view marginally follows an LBM. For any pair of views, the dependence between them is captured by a row-cluster membership matrix. A likelihood-based approach is formulated for parameter estimation, harnessing a stochastic EM algorithm merged with a Gibbs sampler, while an ICL criterion is formulated to determine the number of row- and column-clusters in each view. To justify the application of the multi-view approach, hypothesis tests are formulated to evaluate the independence of row-clusters across views, with the testing procedure integrated into the estimation framework. A penalty scheme is also introduced to induce sparsity in row-clusterings. The algorithm's performance is validated using synthetic and real-world datasets, accompanied by recommendations for optimal parameter selection. Finally, the multi-view co-clustering method is applied to a complex genomics dataset, and is shown to provide new insights for high-dimensional multi-view problems.
| Original language | English |
|---|---|
| Article number | 108188 |
| Pages (from-to) | 1-22 |
| Number of pages | 22 |
| Journal | Computational Statistics & Data Analysis |
| Volume | 210 |
| Early online date | 10 Apr 2025 |
| DOIs | |
| Publication status | Published (in print/issue) - 31 Oct 2025 |
Bibliographical note
Publisher Copyright: © 2025 The Authors
Funding
This work was funded in part by the HEA, DFHERIS and the Shared Island Fund, and by the Research Ireland grant 21/RC/10295_P2.
| Funders | Funder number |
|---|---|
| Higher Education Authority | |
| Research Ireland | 21/RC/10295_P2 |
UN SDGs
This output contributes to the following UN Sustainable Development Goals (SDGs)
- SDG 3 Good Health and Well-being
- SDG 10 Reduced Inequalities
Keywords
- Co-clustering
- Latent Block Model
- Multi-View Data
- High-dimensional Data
- Gene Expression
Fingerprint
Dive into the research topics of 'Co-Clustering Multi-View Data Using the Latent Block Model'. Together they form a unique fingerprint.