A Scalable Local Algorithm for Distributed Multivariate Regression

Authors

  • Kanishka Bhaduri

    Corresponding author
    1. Department of Computer Science and Electrical Engineering, University of Maryland, Baltimore County, 1000 Hilltop Circle, Baltimore, Maryland, 21250, USA
    2. Mission Critical Technologies Inc, NASA Ames Research Center, Moffett Field CA 94035, USA
  • Hillol Kargupta

    1. Department of Computer Science and Electrical Engineering, University of Maryland, Baltimore County, 1000 Hilltop Circle, Baltimore, Maryland, 21250, USA
    2. Agnik, LLC., Columbia, MD, USA

  • A shorter version of this paper was published in SIAM Data Mining Conference 2008.

Abstract

This paper offers a local distributed algorithm for multivariate regression in large peer-to-peer environments. The algorithm can be used for distributed inferencing, data compaction, data modeling, and classification tasks in many emerging peer-to-peer applications for bioinformatics, astronomy, social networking, sensor networks, and web mining. Computing a global regression model from the data available at the different peer nodes using a traditional centralized regression algorithm can be very costly and impractical because of the large number of data sources, the asynchronous nature of peer-to-peer networks, and the dynamic nature of the data and the network. This paper proposes a two-step approach to deal with this problem. First, it offers an efficient local distributed algorithm that monitors the “quality” of the current regression model. Second, if the model is outdated, it uses this algorithm as a feedback mechanism for rebuilding the model. The local nature of the monitoring algorithm guarantees low monitoring cost. Experimental results presented in this paper strongly support the theoretical claims. © 2008 Wiley Periodicals, Inc. Statistical Analysis and Data Mining 1: 000-000, 2008
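The two-step idea in the abstract (monitor the model's quality, rebuild only when it is outdated) can be illustrated with a small sketch. This is not the authors' algorithm: it is a centralized simulation that loops over all peers, whereas the paper's monitor is local. It also assumes that "quality" means the average squared prediction error compared against a threshold, which the abstract does not specify. All names here (`model_is_outdated`, `rebuild_model`, `epsilon`) are hypothetical.

```python
# Illustrative sketch only: a centralized simulation of "monitor, then rebuild".
# Assumptions (not from the paper): a linear model without intercept, quality
# measured as mean squared error, and a hypothetical threshold `epsilon`.

def predict(weights, x):
    """Linear prediction: dot product of weights and features."""
    return sum(w * xi for w, xi in zip(weights, x))

def local_error_stats(weights, data):
    """Return (sum of squared errors, sample count) for one peer's data."""
    sse = sum((y - predict(weights, x)) ** 2 for x, y in data)
    return sse, len(data)

def model_is_outdated(weights, peers, epsilon):
    """Step 1 (monitoring): is the average squared error above epsilon?"""
    total_sse, total_n = 0.0, 0
    for data in peers:
        sse, n = local_error_stats(weights, data)
        total_sse += sse
        total_n += n
    return total_n > 0 and total_sse / total_n > epsilon

def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting.
    Assumes the matrix is invertible."""
    n = len(b)
    m = [row[:] + [bi] for row, bi in zip(a, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - s) / m[i][i]
    return x

def rebuild_model(peers, dim):
    """Step 2 (rebuilding): sum each peer's X^T X and X^T y, then solve
    the normal equations for the least-squares weights."""
    xtx = [[0.0] * dim for _ in range(dim)]
    xty = [0.0] * dim
    for data in peers:
        for x, y in data:
            for i in range(dim):
                xty[i] += x[i] * y
                for j in range(dim):
                    xtx[i][j] += x[i] * x[j]
    return solve(xtx, xty)

# Two peers whose data follow y = 2*x1 + 3*x2; the current model is stale.
peers = [
    [([1.0, 0.0], 2.0), ([0.0, 1.0], 3.0)],
    [([1.0, 1.0], 5.0), ([2.0, 1.0], 7.0)],
]
weights = [1.0, 1.0]
if model_is_outdated(weights, peers, epsilon=0.1):
    weights = rebuild_model(peers, dim=2)
```

In this sketch the rebuild step only needs summed statistics from each peer rather than the raw data, which is one plausible reason aggregation-based rebuilding suits a distributed setting. How the paper actually propagates and combines information between peers is not described in the abstract.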
