Abstract
In recent years, the integration of Large Language Models (LLMs) such as ChatGPT into various tasks has changed how training data for machine learning models is generated. This paper presents an approach that combines human expertise with AI collaboration to expand small datasets, particularly where data scarcity limits the performance of models that require fine-tuning. The study documents the methodologies used to scale data generation by combining human input with advanced AI, and focuses on prompt engineering to optimize outputs. The objective is to generate comprehensive datasets from a limited number of regulatory articles governing the practice of engineering professions, while ensuring accuracy and contextual relevance. The methodology involved processing articles individually, then transitioning to batch processing, and iterating with continuous feedback. The results underscore the importance of human-AI synergy in achieving high-quality outputs: the human element ensures accuracy, and the AI accelerates data generation. The findings demonstrate that prompt engineering plays a critical role in guiding the AI to generate reliable data. Finally, the research emphasizes the potential of this approach to improve the efficiency and scalability of data generation for model fine-tuning, and offers insights into the effective use of human-AI collaboration in broader contexts.
Original language | English |
---|---|
Title of host publication | 2024 International Conference on Decision Aid Sciences and Applications (DASA) |
Publisher | IEEE |
Pages | 1-6 |
Number of pages | 6 |
ISBN (Electronic) | 979-8-3503-6910-6 |
ISBN (Print) | 979-8-3503-6911-3 |
Publication status | Published online - 17 Jan 2025 |
Event | 2024 International Conference on Decision Aid Sciences and Applications (DASA) - Manama, Bahrain. Duration: 11 Dec 2024 → 12 Dec 2024 |
Conference
Conference | 2024 International Conference on Decision Aid Sciences and Applications (DASA) |
---|---|
Country/Territory | Bahrain |
City | Manama |
Period | 11/12/24 → 12/12/24 |
Keywords
- Feedback loop
- Accuracy
- Scalability
- Collaboration
- Machine learning
- Data collection
- Chatbots
- Data models
- Prompt engineering
- Artificial intelligence
- Human-AI Collaboration
- Fine-Tuning Models
- Small Data Expansion
- Large Language Models (LLMs)
- Architecture, Engineering & Construction (AEC) industry
- Building Regulations
- Kingdom of Bahrain