Get access

Disclosure-protected Inference Using Generalised Linear Models



Vast amounts of data that could be used in the development and evaluation of policy for the benefit of society are collected by statistical agencies. It is therefore no surprise that there is very strong demand from analysts, within business, government, universities and other organisations, to access such data. When allowing access to micro-data, a statistical agency is obliged, often legally, to ensure that it is unlikely to result in the disclosure of information about a particular person or organisation. Managing the risk of disclosure is referred to as statistical disclosure control (SDC). This paper describes an approach to SDC for output from analysis using generalised linear models, including estimates of regression parameters and their variances, diagnostic statistics and plots. The Australian Bureau of Statistics has implemented the approach in a remote analysis system, which returns analysis output from remotely submitted queries. A framework for measuring disclosure risk associated with a remote server is proposed. The disclosure risk and utility of approach are measured in two real-life case studies and in simulation.