Reusing Stanford POS Tagger for Tagging Urdu Sentences

Adnan Naseem, Muazzama Anwar, Salman Ahmed, Avais Jan, Ahmad Kamran Malik

Research output: Contribution to conference › Paper › peer-review

2 Citations (Scopus)
253 Downloads (Pure)

Abstract

Many Natural Language Processing applications for a given
language rely on POS tagging as a necessary component.
Developing a new language-specific POS tagger is tedious for
unstructured data because of the variation in the type and
complexity of the text, and this variety also reduces tagging
precision. This research investigates reusing a popular
language-specific part-of-speech tagger: for example, the
Stanford part-of-speech tagger can be employed to tag
non-English sentences once they have been translated. For
generalizability, any translator can be used to translate the
sentences; here the well-known “Google translator” is used for
sentence translation across languages. For evaluation, Urdu
tweets about a prominent political issue, the “Panama leaks”,
were extracted from twitter.com. Accuracy is measured using the
kappa statistic together with a confusion matrix. The precision
of tagging Urdu sentences by reusing the Stanford part-of-speech
tagger is 96.05 percent. The proposed approach can be applied
generally to tagging sentences in many other languages.
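
The evaluation step mentioned in the abstract (a confusion matrix summarized by the kappa statistic) can be illustrated with a minimal sketch. It assumes Cohen's kappa, with rows as reference tags and columns as predicted tags; the abstract does not say which kappa variant or tag set was used, and the function name `accuracy_and_kappa` and the toy counts below are hypothetical, not data from the paper.

```python
from typing import Sequence, Tuple


def accuracy_and_kappa(matrix: Sequence[Sequence[int]]) -> Tuple[float, float]:
    """Return (observed agreement, Cohen's kappa) for a square confusion matrix.

    matrix[i][j] counts tokens whose reference tag is i and predicted tag is j.
    """
    size = len(matrix)
    if any(len(row) != size for row in matrix):
        raise ValueError("confusion matrix must be square")

    total = sum(sum(row) for row in matrix)
    if total == 0:
        raise ValueError("confusion matrix is empty")

    # Observed agreement: share of tokens on the diagonal.
    observed = sum(matrix[i][i] for i in range(size)) / total

    # Chance agreement: product of matching row and column marginals.
    row_totals = [sum(row) for row in matrix]
    col_totals = [sum(matrix[i][j] for i in range(size)) for j in range(size)]
    expected = sum(r * c for r, c in zip(row_totals, col_totals)) / (total * total)

    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")

    kappa = (observed - expected) / (1.0 - expected)
    return observed, kappa


# Hypothetical counts for three illustrative tags (not results from the paper).
toy_matrix = [
    [45, 3, 2],
    [4, 30, 1],
    [1, 2, 12],
]
observed, kappa = accuracy_and_kappa(toy_matrix)
print(f"observed agreement = {observed:.4f}, kappa = {kappa:.4f}")
```

Kappa is lower than the raw agreement because it discounts the matches that would occur by chance given how often each tag appears.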
Original language: English
Number of pages: 6
DOIs
Publication status: Published (in print/issue) - 8 Feb 2018
Event: 13th International Conference on Emerging Technologies 2017 - Higher Education Commission Office, Islamabad, Pakistan
Duration: 27 Dec 2017 to 28 Dec 2017

Conference

Conference: 13th International Conference on Emerging Technologies 2017
Abbreviated title: ICET 2017
Country/Territory: Pakistan
City: Islamabad
Period: 27/12/17 to 28/12/17

Keywords

  • Stanford-Part-of-speech Tagger
  • Google-Translator
  • Multi-lingual labeling
