Research Summary
Understanding the Limitations of Model Explanations
Description
The goal of this research is to understand how adversaries can exploit the algorithms used to explain complex machine learning models in order to mislead end users. For instance, can adversaries trick these explanation algorithms into masking a model's racial and gender biases?