TY - JOUR
AU - Sheffet, Or
PY - 2019/03/30
Y2 - 2024/05/27
TI - Differentially Private Ordinary Least Squares
JF - Journal of Privacy and Confidentiality
JA - JPC
VL - 9
IS - 1
SE - Articles
DO - 10.29012/jpc.654
UR - https://journalprivacyconfidentiality.org/index.php/jpc/article/view/654
SP -
AB - <p>Linear regression is one of the most prevalent techniques in machine learning; however, it is also commonly used for its <em>explanatory</em> capabilities rather than for label prediction. Ordinary Least Squares (OLS) is often used in statistics to establish a correlation between an attribute (e.g., gender) and a label (e.g., income) in the presence of other (potentially correlated) features. OLS assumes a particular model that randomly generates the data, and derives <em>t</em>-values, which represent the likelihood of each real value being the true correlation. Using <em>t</em>-values, OLS can release a <em>confidence interval</em>, which is an interval on the reals that is likely to contain the true correlation; when this interval does not contain the origin, we can <em>reject the null hypothesis</em>, as it is likely that the true correlation is non-zero.<br>Our work aims at achieving similar guarantees on data under differentially private estimators. First, we show that for well-spread data, the Gaussian Johnson-Lindenstrauss Transform (JLT) gives a very good approximation of <em>t</em>-values; second, when JLT approximates Ridge regression (linear regression with <em>l<sub>2</sub></em>-regularization), we derive, under certain conditions, confidence intervals using the projected data; lastly, we derive, under different conditions, confidence intervals for the "Analyze Gauss" algorithm of Dwork et al. (STOC 2014).</p>
ER -