《Differential+Privacy》由会员分享,可在线阅读,更多相关《Differential+Privacy(14页珍藏版)》请在金锄头文库上搜索。
1、2014.5.12,Differential Privacy,A preliminary story,- A hospital has a database of patient records, each record containing a binary value indicating whether or not the patient has some form of cancer. - We want to know the total number of patients with cancers? Easy! A summation over these binary val
2、ues,Dierential privacy address the question of, given the total number of patients with cancer, whether or not an adversary can learn if a particular individual has cancer.,- But how about if we know anyone must on the list? Or anyone must be the end of the list? If Jack has cancer? S(3)-S(2),A prel
3、iminary story,Differential Privacy “Let us do the sum function in peace” -What we want is a protocol that has a probability distribution over outputs such that if person I changed their input from xi to any other allowed xi, the relative probabilities of any output do not change by much -So, for ins
4、tance, can pretend your input was any other allowed value you want.,Introduction,Differential privacyaims to provide means to maximize the accuracy of queries from statistical databases while minimizing the chances of identifying its records. The denition was rst proposed in Cynthia Dworks ICALP pap
5、er. 1 C. Dwork. Dierential privacy. In ICALP, pages 112, 2006. 2 C. Dwork, F. McSherry, K. Nissim, and A. Smith. Calibrating noise to sensitivity in private data analysis. In Proceedings of the Third conference on Theory of Cryptography, TCC06, pages 265284, 2006. 3 C. Dwork , Dierential privacy: A
6、survey of results, Theory and Applications of Models of Computation, (2008), pp. 119. Previous eorts -be “broken” in the sense there are well known attacks k-anonymity -dierential privacy has rigorous denition successful,Definitions,Let : be a randomized algorithm. Let 1 , 2 be two datasets that dif
7、fer in at most one entry (we call these database neighbors),D1,D2,Database neighbors,Deifinition 1. Let 0. Define to be private if for all neighboring databases 1 , 2 , and for all(measurable) subsets , we have Pr( 2 ) Pr( 1 ) exp() Where the probability is taken over the coin tosses of ,Deifinition
8、 1. Let 0. Define to be private if for all neighboring databases 1 , 2 , and for all(measurable) subsets , we have Pr( 2 ) Pr( 1 ) exp() Where the probability is taken over the coin tosses of Observation 2. Because we can switch 1 , 2 interchangeably, Definition 1 implies that exp Pr 2 Pr 1 exp Sinc
9、e exp 1+ for small , then we have roughly 1 Pr 2 Pr 1 1+, satisfies ,Intuition,A game between Alice and Bob is permutation invariant, its space D is finite( =) Alice picks an arbitrary , Let =( 1 , 2 , 1 ), , = 1 , 1 , = Alice gives Bob the tuple ( , =(), Bob must guess the value of Bobs best guess
10、for is , = But if satisfies Pr , = Pr , = Bob will have very low confidence in his estimate of , and will not be able to win much better than random guessing,Statistical guarantees,Pure semantic privacy Can not learn information about an individual by the output of some algorithm. Unfortunately, ext
11、ernal information makes such a privacy denition impossible . Dierential privacy aims for more relaxed denitions of privacy than pure semantic privacy. It states that an adversary with access to the output of an algorithm will learn roughly the same information whether or not a single users data was
12、included or not.,Laplace mechanism,First, let : , and let 1 be the usual 1 norm. Define (), the global sensitivity of , for all neighboring database 1 , 2 as = 1 ( 2 ) 1 Theorem (Laplace Mechanism) let f be defined as before and 0. Define randomized algorithm as = + ( () ) Laplace distribution has m
13、ean 0, and has density ; = 1 2 ( ) And () = 1 , Then satisfies ,Laplace mechanism,Proof, satisfies ,Laplace distribution (), its density function exp( /) Suppose 0,1 , we want to learn = , the total number 1s in database Adding noise to according to Laplace distribution 1 , = +,where ( 1 ) This is ,
14、 because: For any two database x and x which differ in a single entry, the sums and differ by one. ( =) Pr( =) = () () () ,Example,Composition,Theorem 1. (Sequential composibility) is ,Theorem 2. (Parallel composibility) is , Local sensitivity / Smooth sensitivity Nissim-Raskhodnikova-Smith 07 Objec
15、tive perturbation Chaudhuri-Monteleoni-Sarwate 08 Sample and Aggregate NRS 07 Exponential Mechanism McSherry-Talwar 07 What can you say about publishing a sanitized database? B-Ligett-Roth 08,Furthermore,Summary,Define a new notion: Ensures of transcripts with added noise Differential privacy is a solution concept, or a mechanism design It gives us a meaningful way to bound how much an individual might be harmed through loss of “privacy”,