Abstract
The huge number of text documents has made the manual organisation of text data a tedious task. Automatic text classification makes large document collections easier to handle by organising them automatically into predefined classes. The effectiveness and efficiency of automatic text classification depend largely on how text documents are represented. A text document is usually viewed as a bag of terms (or words) and represented as a vector using the vector space model, in which terms are assumed to be unordered and independent, and term frequencies (or weights) are used in the representation. Graphs are another text representation scheme, one that considers the structure of terms in the text document, which is important for natural language. Weighting terms on the basis of a graph representation increases the performance of text classification. In this paper, we present a novel approach to graph-based supervised term weighting that captures information relevant to the classification task by using node centrality in co-occurrence graphs built from the labelled training documents. Our experimental evaluation of the proposed term weighting scheme on four benchmark datasets shows that the scheme consistently outperforms state-of-the-art term weighting methods for text classification.
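The idea of scoring terms by node centrality in per-class co-occurrence graphs can be illustrated with a minimal sketch. It is not the scheme proposed in the paper, whose details the abstract does not give. The sliding window of size 2, the use of degree centrality, the toy training data, and all function names below are assumptions chosen for illustration only.

```python
from collections import defaultdict


def cooccurrence_graph(docs, window=2):
    """Build an undirected co-occurrence graph from tokenised documents.

    Terms are nodes; an edge links two distinct terms that occur within
    `window` positions of each other (window=2 means adjacent terms).
    """
    adj = defaultdict(set)
    for tokens in docs:
        for i, term in enumerate(tokens):
            adj[term]  # make sure isolated terms still appear as nodes
            for other in tokens[i + 1 : i + window]:
                if other != term:
                    adj[term].add(other)
                    adj[other].add(term)
    return adj


def degree_centrality(adj):
    """Degree centrality: node degree divided by (number of nodes - 1)."""
    n = len(adj)
    if n <= 1:
        return {term: 0.0 for term in adj}
    return {term: len(neighbours) / (n - 1) for term, neighbours in adj.items()}


def class_centralities(labelled_docs, window=2):
    """Compute one centrality score per term and per class label.

    This is where supervision enters: one graph is built per class
    from the labelled training documents of that class.
    """
    by_class = defaultdict(list)
    for tokens, label in labelled_docs:
        by_class[label].append(tokens)
    return {
        label: degree_centrality(cooccurrence_graph(docs, window))
        for label, docs in by_class.items()
    }


# Hypothetical toy training set (not from the paper's benchmark datasets).
train = [
    (["stock", "market", "falls"], "finance"),
    (["market", "prices", "rise"], "finance"),
    (["team", "wins", "match"], "sport"),
]

weights = class_centralities(train)
print(weights["finance"]["market"])  # 0.75: linked to 3 of the 4 other terms
print(weights["sport"]["wins"])      # 1.0: linked to both other terms
```

In this sketch a term that co-occurs with many distinct terms in one class receives a high score for that class. A practical scheme would still have to combine the per-class scores into a single document-level weight, which is left open here.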
| Original language | English |
|---|---|
| Pages | 1261-1268 |
| Publication status | Published (in print/issue) - 2 Feb 2017 |
| Event | 2016 IEEE 16th International Conference on Data Mining Workshops, Barcelona, Spain. Duration: 12 Dec 2016 → 15 Dec 2016 |
Conference
| Conference | 2016 IEEE 16th International Conference on Data Mining Workshops |
|---|---|
| Country/Territory | Spain |
| City | Barcelona |
| Period | 12/12/16 → 15/12/16 |
Student theses
- Graph-theoretic approaches to text classification
  Shanavas, N. (Author), Lin, Z. (Supervisor), Hawe, G. (Supervisor) & Wang, H. (Supervisor), Aug 2020
  Student thesis: Doctoral Thesis