Phishing attacks are becoming increasingly sophisticated, underscoring the
need for detection systems that strike a balance between high accuracy and
computational efficiency. This paper presents a comparative evaluation of
traditional Machine Learning (ML), Deep Learning (DL), and quantized
small-parameter Large Language Models (LLMs) for phishing detection. Through
experiments on a curated dataset, we show that while LLMs currently
underperform compared to ML and DL methods in terms of raw accuracy, they
exhibit strong potential for identifying subtle, context-based phishing cues.
We also investigate the impact of zero-shot and few-shot prompting strategies,
revealing that LLM-rephrased emails can significantly degrade the performance
of both ML and LLM-based detectors. Our benchmarking highlights that models
like DeepSeek R1 Distill Qwen 14B (Q8_0) achieve competitive accuracy, above
80%, using only 17GB of VRAM, supporting their viability for cost-efficient
deployment. We further assess the models' adversarial robustness and
cost-performance tradeoffs, and demonstrate how lightweight LLMs can provide
concise, interpretable explanations to support real-time decision-making. These
findings position optimized LLMs as promising components in phishing defence
systems and offer a path forward for integrating explainable, efficient AI into
modern cybersecurity frameworks.