Measuring Tree Similarity for Natural Language Processing Based Information Retrieval

Zhiwei Lin, Hui Wang, Sally McClean

Research output: Chapter in Book/Report/Conference proceeding › Chapter

9 Citations (Scopus)

Abstract

Natural language processing based information retrieval (NIR) aims to go beyond conventional bag-of-words based information retrieval (KIR) by considering syntactic and even semantic information in documents. NIR is a conceptually appealing approach to IR, but it is hard because it requires measuring the distance or similarity between structures. We aim to move beyond the state of the art in measuring structure similarity for NIR. In this paper, a novel tree similarity measure, dtwAcs, is proposed, based on a novel interpretation of trees as multidimensional sequences. We calculate the distance between trees by computing the distance between multidimensional sequences, which is done by integrating all common subsequences (ACS) into the dynamic time warping (DTW) method. Experimental results show that dtwAcs outperforms the state of the art.
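
The idea outlined in the abstract can be illustrated with a minimal sketch. It is not the authors' dtwAcs definition, which this record does not give. The sketch assumes that a tree is read as the left-to-right sequence of its root-to-leaf label paths, and it swaps in the all-subsequences kernel count as a stand-in for the paper's all common subsequences measure. All function names and the nested-tuple tree encoding are hypothetical.

```python
import math


def root_to_leaf_paths(tree):
    """Flatten a (label, children) tree into its root-to-leaf label paths.

    Assumption: this is one possible reading of "trees as multidimensional
    sequences"; the paper may define the mapping differently.
    """
    label, children = tree
    if not children:
        return [[label]]
    return [[label] + p for child in children for p in root_to_leaf_paths(child)]


def subsequence_kernel(a, b):
    """Count pairs of index subsequences of a and b that spell the same
    sequence, including the empty one (all-subsequences kernel).

    Stand-in for the paper's ACS measure, not necessarily identical to it.
    """
    n, m = len(a), len(b)
    c = [[1] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            c[i][j] = c[i - 1][j] + c[i][j - 1]
            if a[i - 1] != b[j - 1]:
                c[i][j] -= c[i - 1][j - 1]
    return c[n][m]


def path_distance(a, b):
    """Local distance in [0, 1]: one minus the cosine-normalised kernel."""
    norm = math.sqrt(subsequence_kernel(a, a) * subsequence_kernel(b, b))
    return 1.0 - subsequence_kernel(a, b) / norm


def dtw(xs, ys, dist):
    """Classic dynamic time warping over two non-empty sequences."""
    inf = float("inf")
    d = [[inf] * (len(ys) + 1) for _ in range(len(xs) + 1)]
    d[0][0] = 0.0
    for i in range(1, len(xs) + 1):
        for j in range(1, len(ys) + 1):
            step = min(d[i - 1][j], d[i][j - 1], d[i - 1][j - 1])
            d[i][j] = dist(xs[i - 1], ys[j - 1]) + step
    return d[len(xs)][len(ys)]


def tree_distance(t1, t2):
    """DTW over the two trees' path sequences with the kernel-based local cost."""
    return dtw(root_to_leaf_paths(t1), root_to_leaf_paths(t2), path_distance)


# Two toy parse trees that differ in a single leaf.
t1 = ("S", [("NP", [("dog", [])]), ("VP", [("barks", [])])])
t2 = ("S", [("NP", [("cat", [])]), ("VP", [("barks", [])])])
print(tree_distance(t1, t1), tree_distance(t1, t2))
```

In this sketch, identical trees are at distance zero, and the distance grows as fewer path subsequences are shared. The local cost is passed to `dtw` as a function, so a different subsequence-based measure could be substituted without changing the warping step.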
Language: English
Title of host publication: Natural Language Processing and Information Systems
Pages: 13-23
Volume: 6177/2
DOI: 10.1007/978-3-642-13881-2_2
Publication status: Published - 20 Jun 2010

Fingerprint

Information retrieval
Syntactics
Processing
Semantics

Cite this

Lin, Zhiwei ; Wang, Hui ; McClean, Sally. / Measuring Tree Similarity for Natural Language Processing Based Information Retrieval. Natural Language Processing and Information Systems. Vol. 6177/2 2010. pp. 13-23
@inbook{2e163320b8ff441688c4970b7c5a1142,
title = "Measuring Tree Similarity for Natural Language Processing Based Information Retrieval",
abstract = "Natural language processing based information retrieval (NIR) aims to go beyond conventional bag-of-words based information retrieval (KIR) by considering syntactic and even semantic information in documents. NIR is a conceptually appealing approach to IR, but it is hard because it requires measuring the distance or similarity between structures. We aim to move beyond the state of the art in measuring structure similarity for NIR. In this paper, a novel tree similarity measure, dtwAcs, is proposed, based on a novel interpretation of trees as multidimensional sequences. We calculate the distance between trees by computing the distance between multidimensional sequences, which is done by integrating all common subsequences (ACS) into the dynamic time warping (DTW) method. Experimental results show that dtwAcs outperforms the state of the art.",
author = "Zhiwei Lin and Hui Wang and Sally McClean",
year = "2010",
month = "6",
day = "20",
doi = "10.1007/978-3-642-13881-2_2",
language = "English",
isbn = "978-3-642-13880-5",
volume = "6177/2",
pages = "13--23",
booktitle = "Natural Language Processing and Information Systems",
}

TY - CHAP

T1 - Measuring Tree Similarity for Natural Language Processing Based Information Retrieval

AU - Lin, Zhiwei

AU - Wang, Hui

AU - McClean, Sally

PY - 2010/6/20

Y1 - 2010/6/20

AB - Natural language processing based information retrieval (NIR) aims to go beyond conventional bag-of-words based information retrieval (KIR) by considering syntactic and even semantic information in documents. NIR is a conceptually appealing approach to IR, but it is hard because it requires measuring the distance or similarity between structures. We aim to move beyond the state of the art in measuring structure similarity for NIR. In this paper, a novel tree similarity measure, dtwAcs, is proposed, based on a novel interpretation of trees as multidimensional sequences. We calculate the distance between trees by computing the distance between multidimensional sequences, which is done by integrating all common subsequences (ACS) into the dynamic time warping (DTW) method. Experimental results show that dtwAcs outperforms the state of the art.

U2 - 10.1007/978-3-642-13881-2_2

DO - 10.1007/978-3-642-13881-2_2

M3 - Chapter

SN - 978-3-642-13880-5

VL - 6177/2

SP - 13

EP - 23

BT - Natural Language Processing and Information Systems

ER -