Negative impact “Unfair biased and discriminatory output”

This page lists the security targets affected by the negative impact “Unfair biased and discriminatory output” in the external influence aspect of the AI Security Map, together with the attacks and factors that cause it and the corresponding defensive methods and countermeasures.

Security target

  • Consumer

Attack or cause

  • Integrity violation
  • Degradation of controllability
  • Degradation of output fairness

Defensive method or countermeasure

  • Defensive method for integrity
  • AI alignment
  • Countermeasures for output fairness
  • Detection of bias in AI output
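
As one illustration of what "Detection of bias in AI output" can look like in practice, the sketch below computes a demographic parity gap: the difference between the highest and lowest rate of positive outcomes across groups. This page does not prescribe a metric, so the choice of demographic parity, the function names `positive_rates` and `demographic_parity_gap`, and the sample data are all assumptions made for illustration.

```python
from collections import defaultdict


def positive_rates(groups, outcomes):
    """Return the share of positive outcomes (1) for each group label."""
    totals = defaultdict(int)
    positives = defaultdict(int)
    for group, outcome in zip(groups, outcomes):
        totals[group] += 1
        positives[group] += outcome
    return {g: positives[g] / totals[g] for g in totals}


def demographic_parity_gap(groups, outcomes):
    """Difference between the highest and lowest per-group positive rate.

    0.0 means every group receives positive outcomes at the same rate;
    larger values indicate a larger disparity.
    """
    rates = positive_rates(groups, outcomes)
    return max(rates.values()) - min(rates.values())


# Hypothetical data: one group label and one binary model decision per record.
groups = ["a", "a", "a", "a", "b", "b", "b", "b"]
outcomes = [1, 1, 1, 0, 1, 0, 0, 0]

gap = demographic_parity_gap(groups, outcomes)
print(f"demographic parity gap: {gap:.2f}")
```

A gap near zero does not by itself show that the output is fair. Demographic parity is only one of several fairness criteria, and what counts as an acceptable threshold depends on the application.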

References