Skip to Main Content
HBS Home
  • About
  • Academic Programs
  • Alumni
  • Faculty & Research
  • Baker Library
  • Giving
  • Harvard Business Review
  • Initiatives
  • News
  • Recruit
  • Map / Directions
Faculty & Research
  • Faculty
  • Research
  • Featured Topics
  • Academic Units
  • …→
  • Harvard Business School→
  • Faculty & Research→
Publications
Publications
  • 2015
  • Article
  • Journal of Computational and Graphical Statistics

Scalable Detection of Anomalous Patterns With Connectivity Constraints

By: Skyler Speakman, Edward McFowland III and Daniel B. Neill
  • Format:Print
ShareBar

Abstract

We present GraphScan, a novel method for detecting arbitrarily shaped connected clusters in graph or network data. Given a graph structure, data observed at each node, and a score function defining the anomalousness of a set of nodes, GraphScan can efficiently and exactly identify the most anomalous (highest-scoring) connected subgraph. Kulldorff’s spatial scan, which searches over circles consisting of a center location and its k − 1 nearest neighbors, has been extended to include connectivity constraints by FlexScan. However, FlexScan performs an exhaustive search over connected subsets and is computationally infeasible for k > 30. Alternatively, the upper level set (ULS) scan scales well to large graphs but is not guaranteed to find the highest-scoring subset. We demonstrate that GraphScan is able to scale to graphs an order of magnitude larger than FlexScan, while guaranteeing that the highest-scoring subgraph will be identified. We evaluate GraphScan, Kulldorff’s spatial scan (searching over circles) and ULS in two different settings of public health surveillance. The first examines detection power using simulated disease outbreaks injected into real-world Emergency Department data. GraphScan improved detection power by identifying connected, irregularly shaped spatial clusters while requiring less than 4.3 sec of computation time per day of data. The second scenario uses contaminant plumes spreading through a water distribution system to evaluate the spatial accuracy of the methods. GraphScan improved spatial accuracy using data generated from noisy, binary sensors in the network while requiring less than 0.22 sec of computation time per hour of data.

Keywords

Biosurveillance; Event Detection; Graph Mining; Scan Statistics; Spatial Scan Statistic

Citation

Speakman, Skyler, Edward McFowland III, and Daniel B. Neill. "Scalable Detection of Anomalous Patterns With Connectivity Constraints." Journal of Computational and Graphical Statistics 24, no. 4 (2015): 1014–1033.
  • Find it at Harvard
  • Purchase

About The Author

Edward McFowland III

Technology and Operations Management
→More Publications

More from the Authors

    • 2023
    • Faculty Research

    Navigating the Jagged Technological Frontier: Field Experimental Evidence of the Effects of AI on Knowledge Worker Productivity and Quality

    By: Fabrizio Dell'Acqua, Edward McFowland III, Ethan Mollick, Hila Lifshitz-Assaf, Katherine C. Kellogg, Saran Rajendran, Lisa Krayer, François Candelon and Karim R. Lakhani
    • July 2023
    • Management Science

    So, Who Likes You? Evidence from a Randomized Field Experiment

    By: Ravi Bapna, Edward McFowland III, Probal Mojumder, Jui Ramaprasad and Akhmed Umyarov
    • 2023
    • Faculty Research

    Insufficiently Justified Disparate Impact: A New Criterion for Subgroup Fairness

    By: Neil Menghani, Edward McFowland III and Daniel B. Neill
More from the Authors
  • Navigating the Jagged Technological Frontier: Field Experimental Evidence of the Effects of AI on Knowledge Worker Productivity and Quality By: Fabrizio Dell'Acqua, Edward McFowland III, Ethan Mollick, Hila Lifshitz-Assaf, Katherine C. Kellogg, Saran Rajendran, Lisa Krayer, François Candelon and Karim R. Lakhani
  • So, Who Likes You? Evidence from a Randomized Field Experiment By: Ravi Bapna, Edward McFowland III, Probal Mojumder, Jui Ramaprasad and Akhmed Umyarov
  • Insufficiently Justified Disparate Impact: A New Criterion for Subgroup Fairness By: Neil Menghani, Edward McFowland III and Daniel B. Neill
ǁ
Campus Map
Harvard Business School
Soldiers Field
Boston, MA 02163
→Map & Directions
→More Contact Information
  • Make a Gift
  • Site Map
  • Jobs
  • Harvard University
  • Trademarks
  • Policies
  • Accessibility
  • Digital Accessibility
Copyright © President & Fellows of Harvard College