Abstract
As textbooks evolve into digital platforms, they open a world of opportunities for Artificial Intelligence in Education (AIED) research. This paper delves into the novel use of textbooks as a source of high-quality labeled data for automatic keyword extraction, demonstrating an affordable and efficient alternative to traditional methods. By utilizing the wealth of structured information provided in textbooks, we propose a methodology for annotating corpora across diverse domains, circumventing the costly and time-consuming process of manual data annotation. Our research presents a deep learning model based on Bidirectional Encoder Representations from Transformers (BERT) fine-tuned on this newly labeled dataset. This model is applied to keyword extraction tasks, with the model’s performance surpassing established baselines. We further analyze the transformation of BERT’s embedding space before and after the fine-tuning phase, illuminating how the model adapts to specific domain goals. Our findings substantiate textbooks as a resource-rich, untapped well of high-quality labeled data, underpinning their significant role in the AIED research landscape.
| Original language | English |
|---|---|
| Pages (from-to) | 66-77 |
| Number of pages | 12 |
| Publication | CEUR Workshop Proceedings |
| Volume | 3444 |
| Status | Published - 2023 |
| Event | 5th International Workshop on Intelligent Textbooks, iTextbooks 2023 - Tokyo, Japan. Duration: 3 Jul 2023 → … |
| Title | Harnessing Textbooks for High-Quality Labeled Data: An Approach to Automatic Keyword Extraction |