Combining Rules for Text Categorization Using Dempster's Rule of Combination

Y Bi, TJ Anderson, SI McClean

Research output: Chapter in Book/Report/Conference proceeding > Conference contribution

3 Citations (Scopus)

Abstract

In this paper, we investigate the combination of rules for text categorization using Dempster's rule of combination. We first propose a boosting-like technique for generating multiple sets of rules based on rough set theory, and then describe how to use Dempster's rule of combination to combine the classification decisions produced by multiple sets of rules. We apply these methods, individually and in combination, to 10 of the 20 newsgroups in the 20-newsgroups benchmark data collection. Our experimental results show that the best combination of the multiple sets of rules achieves 80.47% classification accuracy on the 10 groups of the benchmark data, which is 3.24% better than that of the best single set of rules.
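For readers unfamiliar with the combination step named in the abstract, the following is a minimal sketch of Dempster's rule of combination for two mass functions. It is not the authors' implementation and does not reproduce their rough-set rule generation. The function name `combine`, the class labels, and the mass values are all hypothetical and chosen only for illustration. Mass placed on the whole frame stands for a classifier's residual uncertainty.

```python
from itertools import product

def combine(m1, m2):
    """Dempster's rule for two mass functions keyed by frozenset of class labels."""
    combined = {}
    conflict = 0.0
    for (a, x), (b, y) in product(m1.items(), m2.items()):
        common = a & b
        if common:
            # Agreeing evidence: the product of masses goes to the intersection.
            combined[common] = combined.get(common, 0.0) + x * y
        else:
            # Disjoint focal elements contribute to the conflict mass K.
            conflict += x * y
    if conflict >= 1.0 - 1e-12:
        raise ValueError("sources are in total conflict; the rule is undefined")
    # Normalize by 1 - K so the combined masses sum to 1.
    return {k: v / (1.0 - conflict) for k, v in combined.items()}

# Hypothetical frame of discernment and invented masses (not from the paper).
FRAME = frozenset({"hockey", "baseball", "autos"})
HOCKEY = frozenset({"hockey"})
BASEBALL = frozenset({"baseball"})

rule_set_1 = {HOCKEY: 0.6, FRAME: 0.4}
rule_set_2 = {HOCKEY: 0.5, BASEBALL: 0.3, FRAME: 0.2}

fused = combine(rule_set_1, rule_set_2)
decision = max((k for k in fused if len(k) == 1), key=fused.get)
```

In this toy case the conflict mass is 0.6 × 0.3 = 0.18, so the combined belief in "hockey" is 0.62 / 0.82, and the single class with the largest combined mass is taken as the decision.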
Language: English
Title of host publication: Unknown Host Publication
Pages: 457-463
Number of pages: 7
Volume: 3177/2
DOI: https://doi.org/10.1007/b99975
Publication status: Published - Aug 2004
Event: Intelligent Data Engineering and Automated Learning - IDEAL 2004, Exeter, UK
Duration: 1 Aug 2004 → …


Fingerprint

Rough set theory

Cite this

Bi, Y ; Anderson, TJ ; McClean, SI. / Combining Rules for Text Categorization Using Dempster's Rule of Combination. Unknown Host Publication. Vol. 3177/2 2004. pp. 457-463
@inproceedings{9d58d0b44c914393b25794f6ccad3aa1,
title = "Combining Rules for Text Categorization Using Dempster's Rule of Combination",
abstract = "In this paper, we investigate the combination of rules for text categorization using Dempster's rule of combination. We first propose a boosting-like technique for generating multiple sets of rules based on rough set theory, and then describe how to use Dempster's rule of combination to combine the classification decisions produced by multiple sets of rules. We apply these methods, individually and in combination, to 10 of the 20 newsgroups in the 20-newsgroups benchmark data collection. Our experimental results show that the best combination of the multiple sets of rules achieves 80.47{\%} classification accuracy on the 10 groups of the benchmark data, which is 3.24{\%} better than that of the best single set of rules.",
author = "Y Bi and TJ Anderson and SI McClean",
year = "2004",
month = "8",
doi = "10.1007/b99975",
language = "English",
isbn = "3-540-22881-0",
volume = "3177/2",
pages = "457--463",
booktitle = "Unknown Host Publication",

}

Bi, Y, Anderson, TJ & McClean, SI 2004, Combining Rules for Text Categorization Using Dempster's Rule of Combination. in Unknown Host Publication. vol. 3177/2, pp. 457-463, Intelligent Data Engineering and Automated Learning - IDEAL 2004, 1/08/04. https://doi.org/10.1007/b99975


TY  - GEN
T1  - Combining Rules for Text Categorization Using Dempster's Rule of Combination
AU  - Bi, Y
AU  - Anderson, TJ
AU  - McClean, SI
PY  - 2004/8
Y1  - 2004/8
AB  - In this paper, we investigate the combination of rules for text categorization using Dempster's rule of combination. We first propose a boosting-like technique for generating multiple sets of rules based on rough set theory, and then describe how to use Dempster's rule of combination to combine the classification decisions produced by multiple sets of rules. We apply these methods, individually and in combination, to 10 of the 20 newsgroups in the 20-newsgroups benchmark data collection. Our experimental results show that the best combination of the multiple sets of rules achieves 80.47% classification accuracy on the 10 groups of the benchmark data, which is 3.24% better than that of the best single set of rules.
U2  - 10.1007/b99975
DO  - 10.1007/b99975
M3  - Conference contribution
SN  - 3-540-22881-0
VL  - 3177/2
SP  - 457
EP  - 463
BT  - Unknown Host Publication
ER  -