Parameter Optimization on Spark for Particulate Matter Estimation

Zhenyu Yu, Zhibao Wang, Lu Bai, Liangfu Chen, Jinhua Tao

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution › peer-review

Abstract

With the rapid growth in the number of remote sensing satellites, the volume of remote sensing data has been increasing continuously, making big data platforms necessary for the rapid practical application of remote sensing inversion algorithms. This paper proposes an atmospheric remote sensing inversion processing method based on Spark. Spark is a popular large-scale data processing framework, and its in-memory iterative computation model makes it well suited to atmospheric remote sensing inversion. We use the Spark computing framework to calculate the average particulate matter value over China for the past 10 years, and the computation runs much faster than with the traditional single-node method. We also explore how Spark configuration parameters affect task performance. Different regression models in XGBoost are used to evaluate the parameters obtained by the parameter optimization algorithm, in order to find the optimal Spark configuration parameters that meet the requirements.
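
The averaging step mentioned in the abstract can be illustrated with a minimal sketch. It is plain Python rather than Spark code, and it is not the authors' implementation: the record layout, the names `to_pair`, `merge` and `mean_by_key`, and all values are hypothetical. It shows the general pattern of combining per-key (sum, count) pairs, which is the kind of associative merge that a distributed framework such as Spark can apply in parallel across partitions.

```python
# Hypothetical records: (grid_cell_id, year, pm_value). All values are made up.
records = [
    ("cell_a", 2011, 40.0),
    ("cell_a", 2012, 50.0),
    ("cell_b", 2011, 10.0),
    ("cell_b", 2012, 30.0),
    ("cell_b", 2013, 20.0),
]


def to_pair(record):
    """Map step: turn one observation into (key, (sum, count))."""
    cell, _year, value = record
    return cell, (value, 1)


def merge(a, b):
    """Reduce step: combine two partial (sum, count) pairs.

    The operation is associative and commutative, so partial results
    can be merged in any order.
    """
    return a[0] + b[0], a[1] + b[1]


def mean_by_key(observations):
    """Aggregate (sum, count) per key, then divide to get the mean."""
    partials = {}
    for key, pair in map(to_pair, observations):
        partials[key] = merge(partials[key], pair) if key in partials else pair
    return {key: total / count for key, (total, count) in partials.items()}


means = mean_by_key(records)
print(means)
```

Carrying (sum, count) pairs instead of running means keeps the merge exact regardless of how the records are partitioned, which is why this pattern is commonly used for distributed averages.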
Original language: English
Title of host publication: 2021 Workshop on Algorithm and Big Data
Publisher: Association for Computing Machinery
Pages: 9-13
Number of pages: 5
Publication status: Published - 12 Mar 2021

Keywords

  • Particulate matter estimation
  • Spark
  • Parameter optimization
  • Performance prediction
