Achieving Anonymity Via Clustering
Gagan Aggarwal1
Google Inc.
Mountain View, CA 94043
gagan@cs.stanford.edu
Tomás Feder2
Comp. Sc. Dept.
Stanford University
Stanford, CA 94305
tomas@cs.stanford.edu
Krishnaram Kenthapadi2
Comp. Sc. Dept.
Stanford University
Stanford, CA 94305
kngk@cs.stanford.edu
Samir Khuller3
Comp. Sc. Dept.
University of Maryland
College Park, MD 20742
samir@cs.umd.edu
Rina Panigrahy2,4
Comp. Sc. Dept.
Stanford University
Stanford, CA 94305
rinap@cs.stanford.edu
Dilys Thomas2
Comp. Sc. Dept.
Stanford University
Stanford, CA 94305
dilys@cs.stanford.edu
An Zhu1
Google Inc.
Mountain View, CA 94043
anzhu@cs.stanford.edu
ABSTRACT
Publishing data for analysis from a table containing personal
records, while maintaining individual privacy, is a problem
of increasing importance today. The traditional approach to
de-identifying records is to remove identifying fields such as
social security number, name, etc. However, recent research
has shown that a large fraction of the US population can be
identified using non-key attributes (called quasi-identifiers)
such as date of birth, gender, and zip code [15]. Sweeney [16]
proposed the k-anonymity model for privacy where non-key
attributes that leak information are suppressed or generalized
so that, for every record in the modified table, there are
at least k−1 other records having exactly the same values for
quasi-identifiers. We propose a new method for anonymizing
data records, where quasi-identifiers of data records are
first clustered and then cluster centers are published. To
ensure privacy of the data records, we impose the constraint
1 This work was done when the authors were Computer Science
PhD students at Stanford University.
2 Supported in part by NSF Grant ITR-0331640. This
work was also supported in part by TRUST (The Team
for Research in Ubiquitous Secure Technology), which receives
support from the National Science Foundation (NSF
award number CCF-0424422) and the following organizations:
Cisco, ESCHER, HP, IBM, Intel, Microsoft, ORNL,
Qualcomm, Pirelli, Sun and Symantec.
3 Supported by NSF Award CCF-0430650.
4 Supported in part by Stanford Graduate Fellowship.
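The k-anonymity condition stated in the abstract (every record shares its quasi-identifier values with at least k−1 other records) can be checked mechanically. The following is a minimal sketch in Python, not code from the paper: the function name `is_k_anonymous`, the column names, and the sample table are all hypothetical and chosen only for illustration. It assumes the table has already been suppressed or generalized and only verifies the condition.

```python
from collections import Counter


def is_k_anonymous(records, quasi_identifiers, k):
    """Return True if every combination of quasi-identifier values that
    occurs in `records` is shared by at least k records.

    Equivalent to the abstract's wording: for every record there are at
    least k-1 other records with exactly the same quasi-identifier values.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    # Count how many records fall into each quasi-identifier group.
    counts = Counter(
        tuple(record[attr] for attr in quasi_identifiers)
        for record in records
    )
    return all(count >= k for count in counts.values())


# Hypothetical, already-generalized table (all values invented for illustration).
table = [
    {"birth_year": "197*", "gender": "F", "zip": "943**", "diagnosis": "flu"},
    {"birth_year": "197*", "gender": "F", "zip": "943**", "diagnosis": "asthma"},
    {"birth_year": "198*", "gender": "M", "zip": "207**", "diagnosis": "flu"},
    {"birth_year": "198*", "gender": "M", "zip": "207**", "diagnosis": "diabetes"},
]
QUASI_IDENTIFIERS = ["birth_year", "gender", "zip"]

print(is_k_anonymous(table, QUASI_IDENTIFIERS, 2))  # True: each group has 2 records
print(is_k_anonymous(table, QUASI_IDENTIFIERS, 3))  # False: no group has 3 records
```

Only the quasi-identifier columns enter the grouping key; the sensitive column (`diagnosis` in this made-up table) is deliberately ignored, since the definition constrains quasi-identifiers alone.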
Permission to make digital or hard copies of all or part of this work for
personal or classroom use is granted without fee provided that copies are
not made or distributed for profit or commercial advantage and that copies
bear this notice and the full
...
...