Please use this identifier to cite or link to this item:
https://repositori.uma.ac.id/handle/123456789/27660
Title: | Meningkatkan Deteksi Email Phising Melalui Pendekatan SVM yang Dioptimalkan NLP |
Other Titles: | Enhancing Phishing Email Detection through NLP-Optimized SVM Approach |
Authors: | Tanjung, Rino Nurcahyo Fauzi |
Advisor: | Rahman, Sayuti |
Keywords: | phishing email detection; NLP; Enhancing Phishing Email Detection |
Issue Date: | May-2025 |
Publisher: | Universitas Medan Area |
Series/Report no.: | NPM;208160029 |
Abstract: | Phishing emails are a significant cybersecurity threat, with risks including personal data leakage, financial fraud, and malware distribution. This study aims to improve phishing email detection with a machine learning approach by optimizing feature extraction and selecting the most effective classification algorithm. It uses the Phishing Email Detection Dataset, which contains email texts labeled as safe or phishing. Natural Language Processing (NLP) techniques, specifically Term Frequency-Inverse Document Frequency (TF-IDF), are applied to transform the text into numeric vectors and improve feature representation. The main classification model is a Support Vector Machine (SVM) with a polynomial kernel, which is compared with Random Forest, Naïve Bayes, Logistic Regression, and K-Nearest Neighbors (KNN). Evaluation uses accuracy, precision, recall, and F1-score. In the experiments, the SVM with a polynomial kernel and TF-IDF features achieved the highest accuracy, 97.85%. NLP preprocessing such as tokenization and stopword removal also contributed to classification accuracy. The study shows that combining NLP and SVM effectively improves phishing email detection. Future research could explore integrating deep learning models and advanced NLP techniques to improve the accuracy and efficiency of phishing detection systems. |
Description: | 14 pages |
URI: | https://repositori.uma.ac.id/handle/123456789/27660 |
Appears in Collections: | SP - Informatic Engineering |
Files in This Item:
File | Description | Size | Format |
---|---|---|---|
208160029 - Rino Nurcahyo Fauzi Tanjung - Fulltext.pdf | Fulltext | 1.07 MB | Adobe PDF |
Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.
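The abstract describes tokenization, stopword removal, and TF-IDF as the steps that turn email text into numeric vectors. The following is a minimal sketch of that feature-extraction stage only, not the thesis's implementation: the stopword list, the sample emails, the function names `tokenize` and `tfidf_vectors`, and the smoothed IDF formula are all assumptions chosen for illustration, and the SVM classifier itself is not shown.

```python
import math
import re
from collections import Counter

# Hypothetical, tiny stopword list for illustration only.
STOPWORDS = {"a", "an", "the", "to", "your", "is", "and", "of", "for", "at"}


def tokenize(text):
    """Lowercase the text, keep runs of letters, and drop stopwords."""
    return [t for t in re.findall(r"[a-z]+", text.lower()) if t not in STOPWORDS]


def tfidf_vectors(docs):
    """Return (vocabulary, vectors) with one TF-IDF vector per document."""
    tokenized = [tokenize(d) for d in docs]
    vocab = sorted({t for toks in tokenized for t in toks})
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(t for toks in tokenized for t in set(toks))
    # Smoothed IDF, one common variant (assumed; the thesis may use another).
    idf = {t: math.log((1 + n_docs) / (1 + df[t])) + 1 for t in vocab}
    vectors = []
    for toks in tokenized:
        counts = Counter(toks)
        total = len(toks) or 1  # avoid division by zero for empty documents
        # Term frequency (relative count) multiplied by the term's IDF.
        vectors.append([counts[t] / total * idf[t] for t in vocab])
    return vocab, vectors


# Made-up sample emails, not taken from the dataset used in the study.
emails = [
    "Verify your account to avoid suspension",
    "Meeting notes for the project are attached",
    "Urgent: verify your password now",
]
vocab, vectors = tfidf_vectors(emails)
```

Vectors of this kind would then be passed to a classifier. With scikit-learn, for instance, a polynomial-kernel SVM can be created with `SVC(kernel="poly")`, though the degree and other hyperparameters used in the study are not stated in this record.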