A Smooth-Delayed Phase-Type Mixture Model for Human-Driven Process Duration Modeling

Research output: Contribution to journal › Article › peer-review

Abstract

Activities in business processes depend primarily on human behavior for completion. Because of human agency, the behavior underlying an individual activity may occur in multiple phases and can vary in execution. As a result, the duration of such activities may exhibit complex, multimodal characteristics. Phase-type distributions are useful for analyzing the underlying behavioral structure, which may consist of multiple sub-activities. Delayed starts are also common in such activities, possibly because of a minimum task completion time or prerequisite tasks. The distribution of durations, or of certain components, then does not start at zero but at a minimum value, below which the probability is zero. Fitting such distributions with phase-type models often requires a large number of phases, exceeding the actual number of sub-activities. This reduces the interpretability of the parameters and may also cause optimization difficulties due to overparameterization. In this paper, we propose a smooth-delayed phase-type mixture model that introduces delay parameters to address the difficulty of fitting this kind of distribution. Because durations shorter than the delay should have zero probability, this hard truncation renders the delay parameter non-estimable under the Expectation–Maximization (EM) framework. To overcome this, we design a soft-truncation mechanism that improves model convergence. We further develop an inference framework that combines the EM algorithm, Bayesian inference, and Sequential Least Squares Programming for comprehensive and efficient parameter estimation. The method is validated on one synthetic dataset and two real-world datasets. Results demonstrate that the proposed approach achieves performance comparable to purely data-driven methods while remaining interpretable enough to reveal the potential structure underlying human-driven activities.
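To make the contrast between hard and soft truncation concrete, the following is a minimal sketch and not the paper's method. The abstract does not specify the soft-truncation mechanism, so the softplus smoothing used here, the Erlang distribution as the phase-type component, and all names (`erlang_pdf`, `softplus`, `soft_delayed_pdf`, `mixture_pdf`, `width`) are assumptions chosen for illustration. The soft variant is also not exactly normalized; it only shows why a smooth version stays positive and differentiable in the delay.

```python
import math


def erlang_pdf(x, k, rate):
    """Erlang(k, rate) density: a phase-type distribution with k
    sequential exponential phases sharing the same rate."""
    if x < 0:
        return 0.0
    return rate**k * x ** (k - 1) * math.exp(-rate * x) / math.factorial(k - 1)


def hard_delayed_pdf(t, delay, k, rate):
    """Hard truncation: exactly zero density for any t below the delay."""
    return erlang_pdf(t - delay, k, rate)


def softplus(z, width):
    """Numerically stable width * log(1 + exp(z / width)).
    Close to z when z >> 0 and close to 0 when z << 0."""
    u = z / width
    return width * (max(u, 0.0) + math.log1p(math.exp(-abs(u))))


def soft_delayed_pdf(t, delay, k, rate, width=0.05):
    """Illustrative soft truncation (an assumption, not the paper's
    mechanism): the shifted argument t - delay is passed through softplus,
    so the value is small but positive below the delay and varies
    smoothly with the delay. Not renormalized."""
    return erlang_pdf(softplus(t - delay, width), k, rate)


def mixture_pdf(t, components):
    """Weighted sum of soft-delayed components.
    components: iterable of (weight, delay, k, rate) tuples."""
    return sum(w * soft_delayed_pdf(t, d, k, r) for w, d, k, r in components)
```

With a hard delay, a single observation below the delay drives the likelihood to zero, so the log-likelihood carries no usable gradient information for the delay. In the smoothed sketch the value below the delay is tiny but nonzero, and far above the delay the two versions agree to numerical precision. The hypothetical `width` argument controls how quickly the smooth version approaches the hard one.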
Original language: English
Article number: 575
Pages (from-to): 1-28
Number of pages: 28
Journal: Algorithms
Volume: 18
Issue number: 9
Early online date: 11 Sept 2025
Publication status: Published (in print/issue) - 30 Sept 2025

Bibliographical note

Publisher Copyright:
© 2025 by the authors.

Data Access Statement

The two data sets presented in this study are openly available as follows: (1) Hospital Billing-Event Log: Mannhardt, Felix (2017): Hospital Billing-Event Log. Version 1. 4TU.ResearchData. dataset. https://doi.org/10.4121/uuid:76c46b83-c930-4798-a1c9-4be94dfeb741 [32]. (2) BPI Challenge 2020: Domestic Declarations: van Dongen, Boudewijn (2020): BPI Challenge 2020: Domestic Declarations. Version 1. 4TU.ResearchData. dataset. https://doi.org/10.4121/uuid:3f422315-ed9d-4882-891f-e180b5b4feb5 [35].

Funding

This research was funded by Invest NI through the Advanced Research and Engineering Centre and was part-financed by the European Regional Development Fund under the Investment for Growth and Jobs Programme 2014–2020.

Keywords

  • phase-type distribution
  • mixture model
  • Bayesian inference
  • human-driven process
  • process duration modeling
