Richard Stallman's Political Notes

Identifying information

Friday, December 15, 2017 at 01:00

A zip code, a birth date, and a sex are enough to uniquely identify most people in the US. This defeats most data anonymization schemes. Many other collections of data also permit deanonymization.
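As a rough illustration of why those three fields defeat anonymization, the sketch below measures what fraction of records in a dataset are the only ones with their particular combination of ZIP code, birth date, and sex. The function name `unique_fraction`, the field names, and the idea of representing records as dictionaries are assumptions made for this example; they do not come from any particular dataset or study.

```python
from collections import Counter

def unique_fraction(records, keys=("zip", "birth_date", "sex")):
    """Return the fraction of records whose combination of `keys` occurs exactly once.

    `records` is a list of dicts; the field names are hypothetical.
    A record that is alone in its combination can be re-identified by
    anyone who knows those three facts about the person.
    """
    if not records:
        return 0.0
    # Count how many records share each (zip, birth_date, sex) combination.
    combos = Counter(tuple(r[k] for k in keys) for r in records)
    # A record is uniquely identifiable if nobody else shares its combination.
    unique = sum(1 for r in records if combos[tuple(r[k] for k in keys)] == 1)
    return unique / len(records)
```

Any record counted here as unique stays identifiable even after the name is stripped, which is the sense in which removing names alone does not anonymize the data.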

I think there is a way to do crowdsourcing of recommendations without enabling any person to be identified. The trick is not to save a long list of things that one contributor liked.

Suppose each person anonymously contributes many separate triples of things that the person liked. "I liked A, B, and C." "I liked A, B, and D." "I liked A, C, and D." (The software wouldn't have to submit all the n!/(3! * (n-3)!) combinations of three of the n things you liked.) This may be enough to make somewhat useful recommendations; but if the system does not know when various triples are from the same person, its data are not enough to identify any person.
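The triple scheme can be sketched roughly as follows. This is only one possible reading of the idea, not a tested design: the function names `sample_triples` and `recommend`, the choice to submit a random subset of triples, and the co-occurrence counting used for recommendations are all assumptions made for illustration. The sketch also assumes each triple is submitted separately, with nothing attached that links it to the other triples from the same person.

```python
import random
from collections import Counter
from itertools import combinations
from math import comb

def sample_triples(liked, k, rng=None):
    """Pick up to k distinct unordered triples from the things one person liked.

    Only a random subset is chosen, since submitting all
    n!/(3! * (n-3)!) combinations is not required.
    """
    rng = rng or random.Random()
    items = sorted(set(liked))
    if len(items) < 3:
        return []
    # Enumerating every triple is fine for a sketch; it grows as n^3.
    all_triples = list(combinations(items, 3))
    return rng.sample(all_triples, min(k, comb(len(items), 3)))

def recommend(pool, item, top=3):
    """Suggest things that most often share a triple with `item`.

    `pool` holds triples from all contributors, with no contributor IDs,
    so the function cannot tell which triples came from the same person.
    """
    counts = Counter()
    for triple in pool:
        if item in triple:
            counts.update(x for x in triple if x != item)
    return [name for name, _ in counts.most_common(top)]
```

For someone who liked four things there are `comb(4, 3) == 4` possible triples, and `sample_triples(["A", "B", "C", "D"], 2)` would submit only two of them. Whether the unlinkability holds in practice depends on the submission channel as well: if triples arrive together, or carry timestamps or network addresses, the system could still group them by person.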

I would rather have privacy than personalized recommendations. I reject Netflix entirely for several reasons, but preserving my privacy would be enough reason by itself.