Privacy-Preserving Statistical Learning and Testing



Abstract: Identity Testing and Distributional Property Learning are two fundamental problems in statistical inference. However, in many settings, data may contain sensitive information about individuals. It is critical that our methods protect this sensitive information while not precluding the overall goals of statistical analysis.

In this talk, I will discuss Differentially Private Identity Testing and Distributional Property Learning. We derive nearly tight sample complexity bounds for both problems. Our upper bounds come from privatizing non-private estimators. For the lower bounds, we establish a general coupling method, which we believe can be used to obtain strong lower bounds for other statistical problems under privacy constraints.
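To give a flavor of the "privatize a non-private estimator" approach, here is a minimal sketch (not the estimators from the talk) that releases a differentially private sample mean via the standard Laplace mechanism; the function name and parameter choices are illustrative assumptions.

```python
import numpy as np

def privatize_mean(samples, epsilon, lower=0.0, upper=1.0):
    """Release an epsilon-differentially-private mean via the Laplace mechanism.

    Each sample is clipped to [lower, upper], so changing one record
    changes the mean by at most (upper - lower) / n -- the sensitivity.
    Adding Laplace noise with scale sensitivity / epsilon yields
    epsilon-differential privacy for the released statistic.
    """
    samples = np.clip(np.asarray(samples, dtype=float), lower, upper)
    n = len(samples)
    sensitivity = (upper - lower) / n
    noise = np.random.laplace(loc=0.0, scale=sensitivity / epsilon)
    return samples.mean() + noise

# Example: a private mean of 10,000 uniform samples stays close to 0.5,
# since the noise scale shrinks as 1/n.
rng = np.random.default_rng(0)
data = rng.uniform(0, 1, size=10_000)
private_mean = privatize_mean(data, epsilon=1.0)
print(private_mean)
```

The same recipe (bound the sensitivity of the statistic, then add calibrated noise) underlies many privatized estimators, though the talk's bounds require more careful constructions.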