Integrating semantically heterogeneous aggregate views of distributed databases

Research output: Contribution to journal (Article)

6 Citations (Scopus)

Abstract

In statistical databases and data warehousing applications it is commonly the case that aggregate views are maintained as an underlying mechanism for summarising information. Where the databases or applications are distributed, or arise from independent data collections or system developments, there may be incompatibility, heterogeneity, and data inconsistency. These challenges need to be overcome if federations of aggregated databases are to be successfully incorporated into systems for database management, querying, retrieval, and knowledge discovery. In this paper we address the issue of integrating aggregate views that have semantically heterogeneous classification schemes. In previous work we have developed a methodology that is efficient but that cannot easily handle data inconsistencies. Our previous approach is therefore not particularly well suited to very large databases or federations of large numbers of databases. We now address these scalability issues by introducing a methodology for heterogeneous aggregate view integration that constructs a dynamic shared ontology to which each of the aggregate views can be explicitly related. A maximum likelihood technique, implemented using the EM (Expectation Maximisation) algorithm, is used to inherently handle data inconsistencies in the computation of integrated aggregates that are described in terms of the dynamic shared ontology.
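As a rough illustration of the kind of computation the abstract describes, the following minimal sketch applies EM to aggregate counts reported over different classification schemes. It is not the authors' implementation. It assumes that each view's classes can be written as sets of categories in a common finest-grained scheme (standing in for the shared ontology) and that all views sample the same underlying population. The function name `em_integrate` and the data layout are hypothetical.

```python
def em_integrate(views, n_categories, max_iter=200, tol=1e-10):
    """Estimate fine-grained category probabilities from coarse aggregate views.

    views: list of views; each view is a list of (categories, count) pairs,
           where `categories` is a tuple of fine-category indices that the
           view's class covers (hypothetical layout, assumed for this sketch).
    """
    total = sum(count for view in views for _, count in view)
    probs = [1.0 / n_categories] * n_categories  # uniform starting point

    for _ in range(max_iter):
        # E-step: share each aggregate count among the fine categories it
        # covers, in proportion to the current probability estimates.
        expected = [0.0] * n_categories
        for view in views:
            for cats, count in view:
                mass = sum(probs[c] for c in cats)
                if mass == 0.0:
                    continue  # no probability mass left to distribute
                for c in cats:
                    expected[c] += count * probs[c] / mass

        # M-step: renormalise the expected counts into probabilities.
        new_probs = [e / total for e in expected]
        change = max(abs(a - b) for a, b in zip(new_probs, probs))
        probs = new_probs
        if change < tol:
            break
    return probs


# View A classifies by three fine categories; view B merges categories 0 and 1.
view_a = [((0,), 10), ((1,), 20), ((2,), 30)]
view_b = [((0, 1), 30), ((2,), 30)]
estimates = em_integrate([view_a, view_b], n_categories=3)
```

In this toy case the two views agree with each other, so the estimates settle near 1/6, 1/3 and 1/2. When views disagree, the same iteration still returns a single set of proportions that balances the evidence from all views, which is the sense in which a likelihood-based approach can absorb inconsistencies.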
Original language: English
Pages (from-to): 73-94
Journal: Distributed and Parallel Databases
Volume: 24
Issue number: 1-3
Publication status: Published - Dec 2008

Keywords

  • Distributed databases
  • Aggregate views
  • Heterogeneous data
  • Dynamic shared ontologies
