Deep neural networks are vulnerable to adversarial attacks and are hard to
interpret because of their black-box nature. The recently proposed invertible
network can accurately reconstruct the inputs to a layer from its outputs,
and thus has the potential to unravel the black-box model. An invertible
network classifier can be viewed as a two-stage model: (1) an invertible
transformation from the input space to the feature space; (2) a linear
classifier in the feature space. We can determine the decision boundary of
the linear classifier in the feature space; since the transformation is
invertible, we can then map the decision boundary from the feature space
back to the input space.
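As a minimal sketch of this two-stage view (a toy setup with hypothetical names, not the paper's actual implementation), consider an invertible affine map as the feature transform and a linear classifier on top of it; any point on the feature-space hyperplane maps to an exact point on the input-space decision boundary:

```python
import numpy as np

# Toy two-stage classifier: invertible transform z = A x + c, then a
# linear decision w^T z + b in feature space. All names are illustrative.
A = np.array([[2.0, 0.3], [-0.1, 1.5]])  # invertible by construction
c = np.array([0.5, -0.3])
w = np.array([1.0, -2.0])
b = 0.7

def forward(x):
    """Invertible transform: input space -> feature space."""
    return A @ x + c

def inverse(z):
    """Exact inverse: feature space -> input space."""
    return np.linalg.solve(A, z - c)

def classify(x):
    """Linear classifier applied in feature space."""
    return np.sign(w @ forward(x) + b)

# A feature-space boundary point (w^T z0 + b = 0) pulled back through the
# inverse transform lies exactly on the input-space decision boundary.
z0 = -b * w / (w @ w)        # closest boundary point to the origin
x0 = inverse(z0)
print(np.isclose(w @ forward(x0) + b, 0.0))
```

Because the transform is exactly invertible (unlike a generic encoder), no approximation is introduced when pulling the boundary back to the input space.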
Furthermore, we propose to determine the projection of a data point onto the
decision boundary, and define the explanation as the difference between the
data point and its projection. Finally, we propose to locally approximate a
neural network with its first-order Taylor expansion, and define feature
importance using the resulting local linear model. We provide an
implementation of our method at
\url{https://github.com/juntang-zhuang/explain_invertible}.
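The two proposed explanations can be sketched concretely in the same toy affine setting (our own illustrative construction, not the repository's code): project the feature vector onto the hyperplane $w^\top z + b = 0$, map the projection back to the input space, and read per-feature importance off a first-order Taylor (local linear) approximation of the network:

```python
import numpy as np

# Toy setup: invertible transform z = A x + c, linear classifier w^T z + b.
A = np.array([[2.0, 0.3], [-0.1, 1.5]])
c = np.array([0.5, -0.3])
w = np.array([1.0, -2.0])
b = 0.7

def forward(x):  return A @ x + c             # input -> feature space
def inverse(z):  return np.linalg.solve(A, z - c)
def logit(x):    return w @ forward(x) + b    # pre-sign classifier output

x = np.array([1.0, 2.0])
z = forward(x)

# (1) Project z onto the feature-space hyperplane w^T z + b = 0, pull the
# projection back to the input space; the explanation is x - x_proj.
z_proj = z - (w @ z + b) / (w @ w) * w
x_proj = inverse(z_proj)
explanation = x - x_proj

# (2) First-order Taylor expansion: logit(x + d) ~ logit(x) + g^T d, with
# the gradient g estimated by central finite differences; per-feature
# importance is g * (x - x_proj) under this local linear model.
eps = 1e-6
g = np.array([(logit(x + eps * e) - logit(x - eps * e)) / (2 * eps)
              for e in np.eye(2)])
importance = g * explanation
```

In this toy case the importances sum to `logit(x)`, since the projected point sits exactly on the boundary where the logit vanishes.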