Enhanced Web Payload Classification Using WAMM: An AI-Based Framework for Dataset Refinement and Model Evaluation

Authors: Heba Osama, Omar Elebiary, Youssef Qassim, Mohamed Amgad, Ahmed Maghawry, Ahmed Saafan, Haitham Ghalwash | Published: 2025-12-29

2025.12.29

Authors: Heba Osama, Omar Elebiary, Youssef Qassim, Mohamed Amgad, Ahmed Maghawry, Ahmed Saafan, Haitham Ghalwash
Published: 2025-12-29

Source: https://arxiv.org/abs/2512.23610

PDF: https://arxiv.org/pdf/2512.23610

AIにより推定されたラベル

データ前処理機械学習技術 SQLインジェクション攻撃検出

※ こちらのラベルはAIによって自動的に追加されました。そのため、正確でないことがあります。
詳細は文献データベースについてをご覧ください。

Abstract

Web applications increasingly face evasive and polymorphic attack payloads, yet traditional web application firewalls (WAFs) based on static rule sets such as the OWASP Core Rule Set (CRS) often miss obfuscated or zero-day patterns without extensive manual tuning. This work introduces WAMM, an AI-driven multiclass web attack detection framework designed to reveal the limitations of rule-based systems by reclassifying HTTP requests into OWASP-aligned categories for a specific technology stack. WAMM applies a multi-phase enhancement pipeline to the SR-BH 2020 dataset that includes large-scale deduplication, LLM-guided relabeling, realistic attack data augmentation, and LLM-based filtering, producing three refined datasets. Four machine and deep learning models are evaluated using a unified feature space built from statistical and text-based representations. Results show that using an augmented and LLM-filtered dataset on the same technology stack, XGBoost reaches 99.59