Fine-tuning multilingual transformers for automatic assessment of cross-language concept map semantic similarity / Nadindra Dwi Ariyanta

Ariyanta, Nadindra Dwi (2025) Fine-tuning multilingual transformers for automatic assessment of cross-language concept map semantic similarity / Nadindra Dwi Ariyanta. Masters thesis, Universitas Negeri Malang.

Full text not available from this repository.

Abstract

This study evaluates the performance of multilingual transformer models in automated cross-lingual concept map assessment, addressing inherent challenges in increasingly diverse global educational environments. While concept maps are crucial tools for student comprehension, their automated assessment is complex, particularly because pre-trained models such as Multilingual Bidirectional Encoder Representations from Transformers (mBERT), Cross-Lingual Language Model-RoBERTa (XLM-R), Multilingual Bidirectional and Auto-Regressive Transformers (mBART), and Multilingual Text-to-Text Transfer Transformer (MT5) are often not optimally suited to this specific semantic similarity task, especially in low-resource languages. This research therefore evaluates the performance of these models in their pre-trained state and analyzes the impact of fine-tuning on their ability to automatically assess concept map quality. The methodology involved collecting Indonesian and English concept map data, followed by data preprocessing (cleaning, case folding, tokenization) and fine-tuning the models on the STS-B (Semantic Textual Similarity Benchmark) and GLUE (General Language Understanding Evaluation) datasets. Semantic similarity was measured using cosine similarity, and model performance was evaluated with Accuracy, Precision, F1-Score, RMSE, and MAE. Results indicate that fine-tuning significantly improved performance for Indonesian (e.g., fine-tuned XLM-R and mBART achieved 89% accuracy and a 94% F1-score), although overfitting was identified. Performance improvement in English was less pronounced, which the study attributes to the language's inherent linguistic complexity. The study's implications underscore the importance of targeted fine-tuning to maximize model effectiveness, while also highlighting the need for more efficient fine-tuning strategies that mitigate overfitting and enhance generalization on complex datasets.
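This record contains no code, but the pipeline the abstract describes (embedding concept-map propositions with a multilingual transformer and scoring pairs by cosine similarity) can be sketched as follows. This is a minimal illustration assuming the Hugging Face transformers library and the base XLM-R checkpoint; the mean-pooling strategy and the example sentence pair are assumptions for demonstration, not the author's published setup.

    # Illustrative sketch only: model choice, pooling, and examples are assumptions.
    import torch
    import torch.nn.functional as F
    from transformers import AutoTokenizer, AutoModel

    MODEL_NAME = "xlm-roberta-base"  # one of the evaluated model families (XLM-R)
    tokenizer = AutoTokenizer.from_pretrained(MODEL_NAME)
    model = AutoModel.from_pretrained(MODEL_NAME)
    model.eval()

    def embed(sentences):
        # Mean-pool the last hidden states into one vector per sentence,
        # masking out padding tokens.
        batch = tokenizer(sentences, padding=True, truncation=True, return_tensors="pt")
        with torch.no_grad():
            hidden = model(**batch).last_hidden_state  # (batch, seq, dim)
        mask = batch["attention_mask"].unsqueeze(-1)   # (batch, seq, 1)
        return (hidden * mask).sum(dim=1) / mask.sum(dim=1)

    # A hypothetical cross-language proposition pair (Indonesian / English).
    emb = embed(["Fotosintesis menghasilkan oksigen",
                 "Photosynthesis produces oxygen"])
    score = F.cosine_similarity(emb[0:1], emb[1:2]).item()
    print(f"cosine similarity: {score:.3f}")

The evaluation metrics named in the abstract (Accuracy, Precision, F1-Score, RMSE, MAE) could then be computed against human reference ratings along these lines; the gold scores and the 0.5 decision threshold below are invented for illustration.

    # Hypothetical evaluation against assumed human similarity ratings.
    import numpy as np
    from sklearn.metrics import (accuracy_score, f1_score, precision_score,
                                 mean_absolute_error, mean_squared_error)

    gold = np.array([0.90, 0.10, 0.75, 0.40])  # assumed human ratings in [0, 1]
    pred = np.array([0.85, 0.20, 0.70, 0.55])  # assumed model cosine similarities

    rmse = mean_squared_error(gold, pred) ** 0.5
    mae = mean_absolute_error(gold, pred)

    # Binarise at an assumed 0.5 threshold for classification-style metrics.
    g, p = (gold >= 0.5).astype(int), (pred >= 0.5).astype(int)
    print(f"RMSE={rmse:.3f} MAE={mae:.3f} "
          f"Acc={accuracy_score(g, p):.2f} P={precision_score(g, p):.2f} "
          f"F1={f1_score(g, p):.2f}")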

Item Type: Thesis (Masters)
Divisions: Fakultas Teknik (FT) > Departemen Teknik Elektro (TE) > S2 Teknik Elektro
Depositing User: library UM
Date Deposited: 21 Aug 2025 04:29
Last Modified: 09 Sep 2025 03:00
URI: http://repository.um.ac.id/id/eprint/390734
