Wind turbine manufacturers collect terabytes of data from their fleets every
day, yet limited data access and sharing prevents the full potential of these
data from being exploited. We present a distributed machine learning approach
that preserves data privacy by leaving the data on the wind turbines while
still enabling fleet-wide learning on those local data. We show
that through federated fleet-wide learning, turbines with little or no
representative training data can benefit from more accurate normal behavior
models. Customizing the global federated model to individual turbines yields
the highest fault detection accuracy in cases where the monitored target
variable is distributed heterogeneously across the fleet. We demonstrate this
for bearing temperatures, a target variable whose normal behavior can vary
widely depending on the turbine. We show that no turbine loses model
performance by participating in the federated learning process, which makes
federated learning the superior strategy in our case studies. Distributed
learning increases the training time of the normal behavior models by about a
factor of ten, owing to increased communication overhead and slower model
convergence.
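
The fleet-wide learning scheme summarized above can be illustrated with a minimal sketch. It assumes a federated-averaging style aggregation and a plain linear regression as the normal behavior model; neither choice is stated in this abstract, and the function names (local_update, federated_average, train_federated) are hypothetical. Each turbine updates the model on its own data, and only model weights, never raw measurements, are combined.

```python
import numpy as np

def local_update(weights, X, y, lr=0.1, epochs=5):
    """Run a few gradient-descent steps on one turbine's private data.

    Assumption: a linear model with mean-squared-error loss stands in
    for the normal behavior model.
    """
    w = weights.copy()
    for _ in range(epochs):
        grad = 2.0 * X.T @ (X @ w - y) / len(y)
        w -= lr * grad
    return w

def federated_average(client_weights, client_sizes):
    """Combine local weights, weighted by each turbine's sample count."""
    sizes = np.asarray(client_sizes, dtype=float)
    return np.average(np.stack(client_weights), axis=0, weights=sizes)

def train_federated(clients, n_features, rounds=50):
    """clients is a list of (X, y) pairs that never leave their turbine."""
    w_global = np.zeros(n_features)
    for _ in range(rounds):
        # Every turbine starts from the current global model.
        updates = [local_update(w_global, X, y) for X, y in clients]
        # Only the updated weights are shared and aggregated.
        w_global = federated_average(updates, [len(y) for _, y in clients])
    return w_global
```

Under the same assumptions, customizing the global model to one turbine could be as simple as calling local_update(w_global, X_i, y_i) once more on that turbine's data after federated training has finished.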