MeetMulti-X: A Benchmark Analysis of Scaling and Prompting Large Language Models on Automatic Minuting

Research output: Contribution to journal › Article › peer-review

Abstract

The task of automatic minuting, i.e., capturing all the key points from transcripts of multi-party meetings, presents considerable challenges owing to the spontaneous and complex nature of discussions. As organisations increasingly depend on meetings for decision-making, the need for efficient and optimised minuting has intensified, underscoring the shortcomings of manual note-taking due to cognitive overload and diverted participant engagement. This study systematically analyses the impact of scaling Large Language Models (LLMs) on automatic minuting, emphasising key factors including pretrained dataset size, model size, context length, and prompt length. The benchmark evaluation includes both quantitative and qualitative analyses with 19 open-source models (from 77M to 70B parameters) and 4 closed-source models (over 1T parameters) across 4 meeting corpora and prompts. Our findings indicate that (1) models with fewer than 8B parameters offer a favourable trade-off between performance and efficiency, achieving results comparable to their larger counterparts; (2) scaling pretrained data size improves performance up to a threshold, beyond which gains diminish; (3) context length exhibits a non-linear effect, with optimal performance around 8K–16K tokens; and (4) longer prompts consistently degrade output quality, highlighting the need for concise and well-structured prompting. To the best of our knowledge, this is the first work exploring the scaling of LLMs on automatic minuting. Code is available at https://anonymous.4open.science/r/MeetMultiX-9A36
Original language: English
Article number: 130428
Journal: Expert Systems with Applications
Early online date: 29 Nov 2025
DOIs
Publication status: Published online - 29 Nov 2025

Bibliographical note

© 2025 The Author(s). Published by Elsevier Ltd.

Funding

Funders — Funder number
Engineering and Physical Sciences Research Council (EPSRC) - full college member: EP/T022175/1
Ulster University Vice Chancellor's PhD research scholarship

Keywords

• Automatic minuting
• Meeting summarisation
• Scaling laws
• Large language models
