The pursuit of extracting actionable knowledge from text documents has moved from academia into industry in recent years. Such methods now appear in technologies such as Google’s Duplex, Apple’s Siri and Amazon’s Alexa. The aim of this thesis is to develop entity classification techniques that are mature enough for a common commercial environment while also improving performance. An in-depth literature review of sequence tagging is included, with a specific focus on entity classification. Chapter 2 reviews the last three decades of research and identifies and scopes out research opportunities. Chapter 3 establishes a new state of the art in Named Entity Recognition by building on a pre-existing neural architecture. Chapter 4 proposes an alternative neural topology based on sequence-to-sequence learning, which improves the results further. Chapter 5 continues these experiments by exploring the sequence learning process, encoding multiple deep input representations in one shot with a bias skewed towards a single sequence tagging task. The aim of that work is to demonstrate how to improve the performance of one task by utilizing auxiliary inputs instead of input features. The final study, Chapter 7, investigates and proposes hybrid neural architectures based on the premise that a majority of industrial systems favour the use of controllable knowledge and rule-based programs. Results are presented which suggest that combining static techniques with neural topologies can boost performance. Finally, the recorded results are reflected upon and recommendations for future research are proposed based on the findings of this thesis.
|Date of Award|Apr 2019|
|Supervisor|H. Wang (Supervisor) & Zhiwei Lin (Supervisor)|
- Artificial intelligence
- Natural language processing