Text Categorization via Ellipsoid Separation

Andriy Kharechko, John Shawe-Taylor, Ralf Herbrich, Thore Graepel

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

We present a new batch learning algorithm for text classification in the vector space of document representations. The algorithm uses ellipsoid separation in the feature space, which leads to a semidefinite program. Features are extracted with an approximation of latent semantic feature extraction based on Gram-Schmidt orthogonalization. Preliminary results demonstrate some potential for the presented approach.
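To illustrate the Gram-Schmidt step mentioned in the abstract, here is a minimal sketch of plain (modified) Gram-Schmidt orthogonalization applied to bag-of-words vectors. It is not the authors' procedure: the toy term counts, the processing order of the documents, the tolerance, and the function names `gram_schmidt` and `project` are all assumptions made for illustration.

```python
import math


def gram_schmidt(vectors, tol=1e-10):
    """Build an orthonormal basis for the span of `vectors` (modified Gram-Schmidt).

    Vectors whose residual norm falls below `tol` are treated as linearly
    dependent on the basis so far and are skipped.
    """
    basis = []
    for v in vectors:
        residual = [float(x) for x in v]
        # Remove the component of the vector along each basis direction found so far.
        for q in basis:
            coeff = sum(r * qi for r, qi in zip(residual, q))
            residual = [r - coeff * qi for r, qi in zip(residual, q)]
        norm = math.sqrt(sum(r * r for r in residual))
        if norm > tol:
            basis.append([r / norm for r in residual])
    return basis


def project(doc, basis):
    """Represent a document by its coordinates in the orthonormal basis."""
    return [sum(d * qi for d, qi in zip(doc, q)) for q in basis]


# Hypothetical term-count vectors over a four-word vocabulary.
docs = [
    [2, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 3, 1],
    [4, 2, 0, 0],  # a multiple of the first document, so it adds no new direction
]

basis = gram_schmidt(docs)
features = [project(d, basis) for d in docs]
```

Each document is then described by as many coordinates as there are basis vectors, which is fewer than the vocabulary size whenever the documents are linearly dependent. A classifier such as the ellipsoid separation described above would operate on these reduced features.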
Original language: Undefined
Title of host publication: Learning Methods for Text Understanding and Mining (26/01/04 - 29/01/04)
Publication status: Published (in print/issue) - 2004

Bibliographical note

Event Dates: 26 - 29 January 2004

Keywords

  • Text categorization
  • pattern separation
  • semidefinite programming
  • ellipsoid
  • latent semantic indexing
  • feature extraction
  • bag-of-words text representation
  • Gram-Schmidt orthogonalization
