Hana, Shohwatul (2025) Prediction of increasing number of citations in Scopus documents based on random forest algorithm / Shohwatul Hana. Diploma thesis, Universitas Negeri Malang.
Full text not available from this repository.

Abstract
Citation count is a key indicator of the credibility and impact of a scientific work. This study aims to predict the increase in citation counts for Scopus-indexed documents using the Random Forest algorithm. The prediction model is built on factors such as the number of authors, open-access status, and institutional affiliation. The research dataset consists of articles published between January 2021 and July 2024, with citation data collected in July and September 2024. The data undergo preprocessing, clustering with K-Means, and model testing with cross-validation. Initial model evaluation shows that Random Forest achieves high accuracy (98.48%) but struggles to classify documents with significant citation growth. To address this class imbalance, the Synthetic Minority Oversampling Technique (SMOTE) is applied, significantly improving the model's performance in predicting highly cited documents. After implementing SMOTE, the F1-score for the high-citation class increased from 8.46% to 82.01%, with overall accuracy dropping to 79.87%. Key factors contributing to citation growth include document type (Conference Papers and Articles), open-access status, and international collaboration. This study demonstrates that Random Forest can be an effective tool for identifying articles with a high potential for increased citations, especially after applying data balancing techniques.
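The balancing step described in the abstract can be illustrated with a minimal sketch of the idea behind SMOTE: each synthetic minority sample is an interpolation between an existing minority sample and one of its nearest minority neighbours. This is not the thesis's implementation. The function name `smote_sketch`, the two-feature toy points, the class sizes, and the parameters `k` and `seed` are all assumptions made for illustration only.

```python
import random
from math import dist


def smote_sketch(minority, n_new, k=3, seed=0):
    """Create n_new synthetic points by interpolating between a randomly
    chosen minority sample and one of its k nearest minority neighbours.
    Illustrative only; not the procedure used in the thesis."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Candidate neighbours: every other minority sample, nearest first.
        others = [p for p in minority if p is not base]
        neighbours = sorted(others, key=lambda p: dist(base, p))[:k]
        neighbour = rng.choice(neighbours)
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append(
            tuple(b + gap * (n - b) for b, n in zip(base, neighbour))
        )
    return synthetic


# Hypothetical toy data: 4 "high citation growth" samples versus an assumed
# majority class of 20 samples (features and counts are made up).
minority = [(1.0, 1.0), (1.2, 0.9), (0.8, 1.1), (1.1, 1.3)]
majority_count = 20

new_points = smote_sketch(minority, n_new=majority_count - len(minority))
balanced_minority = minority + new_points
```

After this step the minority class has as many samples as the assumed majority class, so a classifier trained on the balanced set is no longer rewarded for predicting only the majority class. That is consistent with the trade-off reported in the abstract, where the F1-score of the high-citation class rises while overall accuracy falls.
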
| Item Type: | Thesis (Diploma) |
|---|---|
| Divisions: | Fakultas Teknik (FT) > Departemen Teknik Elektro (TE) > S1 Teknik Informatika |
| Depositing User: | library UM |
| Date Deposited: | 28 May 2025 04:29 |
| Last Modified: | 09 Sep 2025 03:00 |
| URI: | http://repository.um.ac.id/id/eprint/400282 |
