Fine-Tuning BERT-Based Pre-Trained Models for Arabic Dependency Parsing | Faculty Members' Websites

Sharefah bint Ahmed bin Mohammed Al-Ghamdi

Assistant Professor

Head of the Information Technology Department

College of Computer and Information Sciences

6T107

Fine-Tuning BERT-Based Pre-Trained Models for Arabic Dependency Parsing

With the advent of pre-trained language models, many natural language processing tasks in various languages have achieved great success. Although some research has been conducted on fine-tuning BERT-based models for syntactic parsing, and several Arabic pre-trained models have been developed, no attention has been paid to Arabic dependency parsing. In this study, we attempt to fill this gap by comparing nine Arabic models, fine-tuning strategies, and encoding methods for dependency parsing. We evaluated them on three treebanks to identify the best options and methods for fine-tuning Arabic BERT-based models to capture syntactic dependencies in the data. Our exploratory results show that the AraBERTv2 model provides the best scores on all treebanks and confirm that fine-tuning to the higher layers of pre-trained models is required. However, adding further neural network layers to those models lowers accuracy. Additionally, we found that the encoding technique that gives the highest scores differs across treebanks. An analysis of the errors on the test examples highlights four issues that have an important effect on the results: parse tree post-processing, contextualized embeddings, erroneous tokenization, and erroneous annotation. This study points to a direction for future research toward enhanced Arabic BERT-based syntactic parsing.

Publisher

Applied Sciences

More publications

MirathQA: A dataset for evaluating large language models on Hanbali Islamic inheritance reasoning tasks

Islamic inheritance (Muwārīth/مواريث) refers to the distribution of a deceased person's estate among qualified heirs in accordance with Sharia laws derived from the Qur’an and Sunnah.

2026

A Novel Approach for Root Selection in the Dependency Parsing

Although syntactic analysis using the sequence labeling method is promising, it can be problematic when the labels sequence does not contain a root label. This can result in errors in the final…

By Sharefah Ahmed Al-Ghamdi, Hend Al-Khalifa, Abdulmalik AlSalman

2024

Fine-Tuning BERT-Based Pre-Trained Models for Arabic Dependency Parsing

With the advent of pre-trained language models, many natural language processing tasks in various languages have achieved great success.

By Sharefah Al-Ghamdi, Hend Al-Khalifa, Abdulmalik Al-Salman

2023

Published in:

Applied Sciences