A remote analysis system addresses the challenge of enabling the use of confidential or private data while maintaining standards of confidentiality and privacy. Traditional approaches typically involve reducing the risk of disclosure by modifying or confidentialising data before releasing it to users. In contrast, a remote analysis system enables users to submit statistical queries and receive output without direct access to the data. A remote analysis system may involve confidentialisation of the underlying data itself or the system outputs, or both.
In this paper we discuss the implementation of a remote analysis system enabling survival analysis. In this system the underlying data are not confidentialised, although for some analyses a random sample of the data is used, and the system outputs are modified to protect confidentiality and privacy. We describe confidentiality objectives for the system outputs, and describe measures for achieving them. To illustrate the effect of the methods, we provide a comprehensive example comparing confidentialised output with traditional output for a range of common survival analyses.
We believe that the confidentialised output of the remote analysis system described in this paper remains useful for survival analysis in some situations, provided the user understands the confidentialisation process and its potential impact. A user who requires more detailed information, such as outlier values, event times and/or standard errors, would need to apply for access to the underlying data.
O'Keefe, Christine M.; Sparks, Ross Stewart; McAullay, Damien; and Loong, Bronwyn
"Confidentialising Survival Analysis Output in a Remote Data Access System,"
Journal of Privacy and Confidentiality:
1, Article 6.
Available at: http://repository.cmu.edu/jpc/vol4/iss1/6