Rule-Based Quantity Identification in the Text-to-SQL Problem (Identifikasi Kuantitas Berbasis Rule Pada Masalah Text-to-SQL)
Abstract
This study develops a rule-based system that detects and extracts quantity information from Indonesian sentences as one step of the Text-to-SQL process. The system has three main components: a user interface, a knowledge base of IF-THEN rules, and a forward-chaining inference engine implemented with the Experta framework. The dataset comprises 1,000 sentences from several domains, including academic, inventory, accounting, and booking, labeled as 815 digit, 164 word, and 21 unknown instances. In evaluation, the system achieves an Exact Match Ratio of 0.76 and a Jaccard Similarity of 0.75. For multilabel classification, it reaches a Micro Precision of 0.98, a Micro Recall of 0.66, and a Micro F1-score of 0.79, while the macro (per-label) average is 0.52. The reported Hamming Loss of 0.00 indicates a very low rate of per-label errors. This work is intended as a foundation for hybrid models that combine rule-based methods with machine learning to improve accuracy, flexibility, and overall system performance.
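The IF-THEN knowledge base and forward-chaining engine described above can be illustrated with a minimal sketch. It is written in plain Python rather than the Experta framework the study uses, and it is not the authors' rule set: the rule functions, the small `NUMBER_WORDS` lexicon, and the `extract_quantities` helper are all hypothetical names chosen for illustration. Facts are tuples, each rule derives new facts from the current fact set, and the engine repeats until nothing new can be derived.

```python
import re

# Hypothetical mini-lexicon of Indonesian number words (not the paper's list).
NUMBER_WORDS = {"satu": 1, "dua": 2, "tiga": 3, "empat": 4, "lima": 5,
                "enam": 6, "tujuh": 7, "delapan": 8, "sembilan": 9, "sepuluh": 10}

def rule_tokenize(facts):
    # IF a sentence fact exists THEN assert one token fact per word.
    new = set()
    for fact in facts:
        if fact[0] == "sentence":
            for i, tok in enumerate(re.findall(r"\w+", fact[1].lower())):
                new.add(("token", i, tok))
    return new

def rule_digit(facts):
    # IF a token is written in ASCII digits THEN assert a "digit" quantity.
    return {("quantity", f[1], "digit", int(f[2]))
            for f in facts
            if f[0] == "token" and re.fullmatch(r"[0-9]+", f[2])}

def rule_word(facts):
    # IF a token is a known number word THEN assert a "word" quantity.
    return {("quantity", f[1], "word", NUMBER_WORDS[f[2]])
            for f in facts
            if f[0] == "token" and f[2] in NUMBER_WORDS}

RULES = [rule_tokenize, rule_digit, rule_word]

def forward_chain(initial_facts):
    # Fire every rule on the current facts until no new fact is derived.
    facts = set(initial_facts)
    while True:
        derived = set().union(*(rule(facts) for rule in RULES)) - facts
        if not derived:
            return facts
        facts |= derived

def extract_quantities(sentence):
    # Run the engine, then fall back to "unknown" if no quantity was derived.
    facts = forward_chain({("sentence", sentence)})
    found = sorted((f[1], f[2], f[3]) for f in facts if f[0] == "quantity")
    return [(kind, value) for _, kind, value in found] or [("unknown", None)]

print(extract_quantities("Tampilkan 5 mahasiswa dengan dua mata kuliah"))
print(extract_quantities("Tampilkan semua mahasiswa"))
```

The "unknown" label is applied as a default after the fixed point is reached rather than as a rule, because a rule of the form "IF no quantity exists" could fire before the digit and word rules have run. In a framework such as Experta the same ordering concern is typically handled with rule priorities.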
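The evaluation metrics named in the abstract can be made concrete with a short sketch. The source does not give its formulas, so the code below assumes the standard multilabel definitions: the sample-averaged Jaccard index, micro-averaged precision, recall, and F1 over pooled label decisions, and Hamming loss as the fraction of wrong label decisions. The function name `multilabel_metrics`, the label space, and the toy data are hypothetical and are not taken from the paper's dataset.

```python
def multilabel_metrics(y_true, y_pred, labels):
    """Compute common multilabel metrics from lists of label sets."""
    n = len(y_true)
    pairs = list(zip(y_true, y_pred))

    # Exact Match Ratio: share of samples whose predicted set is exactly right.
    exact_match = sum(t == p for t, p in pairs) / n

    # Jaccard similarity per sample (two empty sets count as a perfect match).
    jaccard = sum(len(t & p) / len(t | p) if (t | p) else 1.0
                  for t, p in pairs) / n

    # Micro averaging pools true/false positives and negatives over all samples.
    tp = sum(len(t & p) for t, p in pairs)
    fp = sum(len(p - t) for t, p in pairs)
    fn = sum(len(t - p) for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    f1 = (2 * precision * recall / (precision + recall)
          if precision + recall else 0.0)

    # Hamming loss: wrong label decisions over all (sample, label) slots.
    hamming = (fp + fn) / (n * len(labels))

    return {"exact_match": exact_match, "jaccard": jaccard,
            "micro_precision": precision, "micro_recall": recall,
            "micro_f1": f1, "hamming_loss": hamming}

# Toy example with an assumed label space of extracted quantity values.
labels = ["2", "3", "5", "10", "unknown"]
y_true = [{"5"}, {"2", "10"}, {"unknown"}, {"3"}]
y_pred = [{"5"}, {"2"}, {"unknown"}, set()]

for name, value in multilabel_metrics(y_true, y_pred, labels).items():
    print(f"{name}: {value:.2f}")
```

In this toy data every predicted label is correct but some true labels are missed, which gives high precision with lower recall. Hamming loss also shrinks as the label space grows, since most (sample, label) slots are trivially correct negatives.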
DOI: https://doi.org/10.35334/jbit.v5i1.6819
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License.
Address (JBIT, Jurnal Borneo Informatika dan Teknik Komputer): Gedung Dekanat, Fakultas Teknik, Universitas Borneo Tarakan, Jl. Amal Lama No. 1, Tarakan, Kalimantan Utara 77123, Indonesia.