Enhancing the interactive visualisation of a data preparation tool from in-memory fitting to Big Data sets

Gorka Epelde, Roberto Álvarez, Andoni Beristain, Mónica Arrúe, Itsasne Arangoa, Debbie Rankin

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

Deriving reliable insights or making evidence-based decisions starts with assessing and meeting a minimum level of data quality. Preferably this is done by those who publish the data, or alternatively by those who prepare data for analysis and develop specific analytics. Much of the (open) data shared by governments and other institutions, or crowdsourced, is in tabular format, and the amount and size of such data are increasing rapidly. This paper presents the challenges faced and the solutions adopted while evolving the web-based graphical user interface (GUI) of a tabular data preparation tool from datasets that fit in memory to Big Data sets. Traditional standalone processing and rendering solutions are no longer usable in a Big Data context. We report on the approach adopted to asynchronously pre-compute the visualisations required by the tool, as well as the visualisation aggregation strategies applied. Implementing this approach has allowed us to overcome web browsers’ client-side data handling limitations and to avoid information overload when the granular information charts of our existing in-memory data preparation tool are used with Big Data sets. The developed solution provides the user with an acceptable GUI interaction time.
Original language: English
Title of host publication: 3rd Workshop on Quality of Open Data (QOD 2020)
Publication status: Accepted/In press - 24 May 2020

Keywords

  • big data visualisation
  • data preparation
  • data quality
  • exploratory data analysis
  • visual information cluttering
  • data reduction
  • asynchronous pre-processing
