Principles for Evaluation of AI/ML Model Performance and Robustness

arxiv

Source

https://arxiv.org/abs/2107.02868

PDF

https://arxiv.org/pdf/2107.02868

Paper Information

Authors: Olivia Brown; Andrew Curtis; Justin Goodwin
Published: 2021-07-07

Labels Estimated by AI

Model Performance Evaluation Robustness Robustness Evaluation

These labels were automatically added by AI and may be inaccurate.

Abstract

The Department of Defense (DoD) has significantly increased its investment in the design, evaluation, and deployment of Artificial Intelligence and Machine Learning (AI/ML) capabilities to address national security needs. While there are numerous AI/ML successes in the academic and commercial sectors, many of these systems have also been shown to be brittle and nonrobust. In a complex and ever-changing national security environment, it is vital that the DoD establish a sound and methodical process to evaluate the performance and robustness of AI/ML models before these new capabilities are deployed to the field. This paper reviews the AI/ML development process, highlights common best practices for AI/ML model evaluation, and makes recommendations to DoD evaluators to ensure the deployment of robust AI/ML capabilities for national security needs.