Publications
  • Article
  • Advances in Neural Information Processing Systems (NeurIPS)

Counterfactual Explanations Can Be Manipulated

By: Dylan Slack, Sophie Hilgard, Himabindu Lakkaraju and Sameer Singh
  • Format: Print

Abstract

Counterfactual explanations are useful both for generating recourse and for auditing fairness between groups. We seek to understand whether adversaries can manipulate counterfactual explanations in an algorithmic recourse setting: if counterfactual explanations indicate that both men and women must earn $100 more on average to receive a loan, can we be sure that lower-cost recourse does not exist for the men? By construction, we show that adversaries can design models for which counterfactual explanations generate similar-cost recourses across groups. However, the same methods yield much lower-cost recourses for specific subgroups in the data when the original instances are slightly perturbed, effectively hiding recourse disparities in these models. We demonstrate these vulnerabilities in a variety of counterfactual explanation techniques. On loan and violent crime prediction data sets, we train models for which counterfactual explanations find up to 20x lower-cost recourse for specific subgroups. These results raise serious concerns about the dependability of current counterfactual explanation techniques in the presence of adversarial actors, and we hope they inspire further investigation into robust and reliable counterfactual explanations.
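The abstract turns on two mechanics: a search that finds a counterfactual (a nearby input that flips the model's decision) and a cost of recourse (how far that counterfactual sits from the original instance). Below is a minimal sketch of a standard gradient-based (Wachter-style) counterfactual search, assuming a differentiable PyTorch classifier; the toy model, the L1 cost, and the hyperparameters lam, steps, and lr are illustrative assumptions, and the sketch is the explanation machinery the paper attacks, not the paper's adversarial construction.

import torch

def find_counterfactual(model, x, target=1.0, lam=0.1, steps=500, lr=0.01):
    # Search for x_cf near x such that the model assigns the desired outcome.
    x_cf = x.clone().detach().requires_grad_(True)
    opt = torch.optim.Adam([x_cf], lr=lr)
    for _ in range(steps):
        opt.zero_grad()
        pred = torch.sigmoid(model(x_cf)).squeeze()
        # The prediction term pushes toward the favorable outcome; the L1
        # term keeps the counterfactual close to the original instance.
        loss = (pred - target) ** 2 + lam * torch.norm(x_cf - x, p=1)
        loss.backward()
        opt.step()
    return x_cf.detach()

def recourse_cost(x, x_cf):
    # Cost of recourse: the effort needed to move from x to its counterfactual.
    return torch.norm(x_cf - x, p=1).item()

torch.manual_seed(0)
# Stand-in for a trained loan classifier over 5 features (an assumption here).
model = torch.nn.Sequential(torch.nn.Linear(5, 16), torch.nn.ReLU(),
                            torch.nn.Linear(16, 1))
x = torch.randn(5)
delta = 0.01 * torch.randn(5)  # a slight perturbation of the original instance

cost_at_x = recourse_cost(x, find_counterfactual(model, x))
cost_nearby = recourse_cost(x + delta, find_counterfactual(model, x + delta))
print(f"recourse cost at x:         {cost_at_x:.3f}")
print(f"recourse cost at x + delta: {cost_nearby:.3f}")

For an honest model the two printed costs should be nearly identical; the adversarially trained models the paper constructs are precisely those for which they diverge, so that much cheaper recourse for a subgroup surfaces only under small perturbations of the inputs.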

Keywords

Machine Learning Models; Counterfactual Explanations

Citation

Slack, Dylan, Sophie Hilgard, Himabindu Lakkaraju, and Sameer Singh. "Counterfactual Explanations Can Be Manipulated." Advances in Neural Information Processing Systems (NeurIPS) 34 (2021).

About The Author

Himabindu Lakkaraju

Technology and Operations Management

More from the Authors

  • June 2023
  • Transactions on Machine Learning Research (TMLR)

  When Does Uncertainty Matter? Understanding the Impact of Predictive Uncertainty in ML Assisted Decision Making

  By: Sean McGrath, Parth Mehta, Alexandra Zytek, Isaac Lage and Himabindu Lakkaraju

  • 2023
  • Proceedings of the International Conference on Learning Representations (ICLR)

  Probabilistically Robust Recourse: Navigating the Trade-offs between Costs and Robustness in Algorithmic Recourse

  By: Martin Pawelczyk, Teresa Datta, Johannes van-den-Heuvel, Gjergji Kasneci and Himabindu Lakkaraju

  • April 2023
  • Proceedings of the International Conference on Artificial Intelligence and Statistics (AISTATS)

  On the Privacy Risks of Algorithmic Recourse

  By: Martin Pawelczyk, Himabindu Lakkaraju and Seth Neel