SOPHIA-TCBR: A knowledge discovery framework for textual case-based reasoning

David Patterson, Niall Rooney, Mykola Galushka, Vladimir Dobrynin, Elena Smirnova

    Research output: Contribution to journal, Article, peer-reviewed

    22 Citations (Scopus)


    In this paper, we present a novel textual case-based reasoning system called SOPHIA-TCBR, which provides a means of clustering semantically related textual cases. Individual clusters are formed through the discovery of narrow themes, which then act as attractors for related cases. During this process, SOPHIA-TCBR automatically discovers appropriate case and similarity knowledge. It is then able to organize the cases within each cluster by forming a minimum spanning tree based on their semantic similarity. SOPHIA’s capability as a case-based text classifier is benchmarked against the well-known and widely used k-Means approach. Results show that SOPHIA either equals or outperforms k-Means on two different case bases, and as such is an attractive approach for case-based classification. We demonstrate the quality of the knowledge discovery process by showing the high level of topic similarity between adjacent cases within the minimum spanning tree. We show that the formation of the minimum spanning tree makes it possible to identify a kernel region within the cluster, which has a higher level of similarity between cases than the cluster in its entirety, and that this corresponds directly to a higher level of topic homogeneity. We demonstrate that topic homogeneity increases as the average semantic similarity between cases in the kernel increases. Finally, having empirically demonstrated the quality of the knowledge discovery process in SOPHIA, we show how it can be competently applied to case-based retrieval.
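The abstract describes organizing the cases in a cluster as a minimum spanning tree built from their semantic similarity. The following is a minimal sketch of that general idea only, not the authors' implementation: the abstract does not state which similarity measure or tree-building algorithm SOPHIA-TCBR uses, so cosine similarity over raw word counts and Prim's algorithm are assumed here as stand-ins, and the function names and example cases are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity between two bag-of-words Counters (assumed measure)."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def minimum_spanning_tree(cases):
    """Build a spanning tree over textual cases with Prim's algorithm.

    Edge weight is distance = 1 - cosine similarity, so the tree links
    each case to the tree through its most similar neighbour.
    Returns a list of (i, j, distance) edges, where i and j index `cases`.
    """
    vectors = [Counter(text.lower().split()) for text in cases]
    n = len(vectors)
    if n == 0:
        return []
    in_tree = {0}  # start from the first case
    edges = []
    while len(in_tree) < n:
        best = None
        # Scan every edge that leaves the current tree and keep the shortest.
        for i in sorted(in_tree):
            for j in range(n):
                if j in in_tree:
                    continue
                d = 1.0 - cosine_similarity(vectors[i], vectors[j])
                if best is None or d < best[2]:
                    best = (i, j, d)
        edges.append(best)
        in_tree.add(best[1])
    return edges


# Hypothetical example cases, for illustration only.
example_cases = [
    "printer paper jam",
    "printer paper tray jam",
    "network cable unplugged",
    "network cable broken",
]
example_edges = minimum_spanning_tree(example_cases)
```

In this toy example the two printer cases end up adjacent in the tree, as do the two network cases, which loosely mirrors the abstract's observation that adjacent cases in the tree tend to share a topic. The kernel-region analysis mentioned in the abstract is not reproduced here.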
    Original language: English
    Pages (from-to): 404-414
    Journal: Knowledge-Based Systems
    Issue number: 5
    Publication status: Published (in print/issue) - Jul 2008

