Transformation of XML Data Sources for Sequential Path Mining

Ruth McNerlan, Yaxin Bi, Gouge Zhao, Bing Hang

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

Abstract

In recent years, XML has become one of the most promising ways to define semi-structured data. Data mining techniques devised for detecting interesting patterns in semi-structured data have also grown in popularity, but applying such techniques to XML data can be problematic because of its hierarchical structure. It has therefore become necessary to transform XML into flattened path data so that data mining can be carried out efficiently. However, problems may arise when the XML tree needs to be reconstructed from the traversal path. Many transformation techniques currently exist for XML data, and many of them take advantage of its tree-like hierarchical structure, but most of these approaches do not allow the XML tree to be reconstructed from the traversal path. In this paper we propose a new approach to the transformation of XML data into path data. The new approach employs a five-step transformation process along with a new ‘Postorder Sequencing’ method of traversing the XML tree. The proposed method can, on the one hand, be seen as an efficient and effective way of transforming XML data into collections of paths and, on the other hand, enables XML trees to be regenerated from the traversal paths.
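
The abstract does not spell out the five transformation steps or the details of ‘Postorder Sequencing’, so the following is only an illustrative sketch of the general idea, not the authors' method. It assumes that each node is emitted as a slash-separated root-to-node tag path, numbered in postorder (children before their parent), and that attributes and text content are ignored. The function names `postorder_paths` and `rebuild` are hypothetical.

```python
import xml.etree.ElementTree as ET


def postorder_paths(root):
    """Flatten a tree into (position, path) pairs in postorder.

    Assumption: a "path" is the slash-joined list of tags from the root
    down to the node; attributes and text are deliberately ignored.
    """
    out = []

    def visit(node, prefix):
        path = prefix + [node.tag]
        for child in node:
            visit(child, path)
        # The node is emitted only after all of its children.
        out.append((len(out) + 1, "/".join(path)))

    visit(root, [])
    return out


def rebuild(paths):
    """Rebuild the element tree from the postorder (position, path) pairs."""
    pending = []  # stack of (depth, element) subtrees still waiting for a parent
    for _, path in paths:
        parts = path.split("/")
        depth = len(parts)
        node = ET.Element(parts[-1])
        children = []
        # In postorder, the children of this node are exactly the
        # subtrees of depth + 1 sitting on top of the stack.
        while pending and pending[-1][0] == depth + 1:
            children.append(pending.pop()[1])
        node.extend(children[::-1])  # restore document order
        pending.append((depth, node))
    return pending[0][1]


root = ET.fromstring("<a><b><c/></b><d/></a>")
paths = postorder_paths(root)
restored = rebuild(paths)
print(paths)
print(ET.tostring(restored) == ET.tostring(root))
```

Because every path records its own depth, a single pass with a stack is enough to reattach each subtree to its parent, which is the property the abstract highlights: the flattened paths remain usable for sequential mining while the tree stays recoverable.
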
Language: English
Title of host publication: Unknown Host Publication
Number of pages: 10
Publication status: E-pub ahead of print - 19 Oct 2017
Event: International workshop on graph data management and analysis (GDMA 2017) - Beijing, China
Duration: 19 Oct 2017 → …

Fingerprint

XML
Data mining

Keywords

  • XML
  • Transformation
  • XPath
  • Sequential Data Mining

Cite this

McNerlan, R., Bi, Y., Zhao, G., & Hang, B. (2017). Transformation of XML Data Sources for Sequential Path Mining. In Unknown Host Publication
@inproceedings{d6d1a49b9b75429abc9bd9e7c70c1359,
title = "Transformation of XML Data Sources for Sequential Path Mining",
abstract = "In recent years, XML has become one of the most promising ways to define semi-structured data. Data mining techniques devised for detecting interesting patterns in semi-structured data have also grown in popularity, but applying such techniques to XML data can be problematic because of its hierarchical structure. It has therefore become necessary to transform XML into flattened path data so that data mining can be carried out efficiently. However, problems may arise when the XML tree needs to be reconstructed from the traversal path. Many transformation techniques currently exist for XML data, and many of them take advantage of its tree-like hierarchical structure, but most of these approaches do not allow the XML tree to be reconstructed from the traversal path. In this paper we propose a new approach to the transformation of XML data into path data. The new approach employs a five-step transformation process along with a new ‘Postorder Sequencing’ method of traversing the XML tree. The proposed method can, on the one hand, be seen as an efficient and effective way of transforming XML data into collections of paths and, on the other hand, enables XML trees to be regenerated from the traversal paths.",
keywords = "XML, Transformation, XPath, Sequential Data Mining",
author = "Ruth McNerlan and Yaxin Bi and Gouge Zhao and Bing Hang",
year = "2017",
month = "10",
day = "19",
language = "English",
isbn = "978-3-319-69780-2",
booktitle = "Unknown Host Publication",

}
