No, you don’t need personal data for personalization
What? So all those articles proclaiming that personalization isn’t possible unless you hand over the last ounce of your personal data to make the world a better place were just shams? UNDOUBTEDLY.
Before I go into the details, here is the proof from a recent classification use case.
The results use no personal data. The trick is to use carefully engineered features, as in the RecSys 15 paper, and about 100 GB of clickstream data, with minor tweaks to accommodate the use case. A simple random forest classifier was used with default parameters and no model stacking, which has the obvious advantage of model interpretability, also an important element of GDPR.
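To make the feature engineering idea concrete, here is a minimal sketch of turning raw clicks into session-level features. It is not the actual pipeline behind the results: the event layout (`session_id`, timestamp, `item_id`, category), the sample events and the feature names are all assumptions made up for illustration. The point is only that every feature describes what happened in a session, not who the visitor is.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical event layout: (session_id, ISO timestamp, item_id, category).
# No name, email, address or account identifier appears anywhere.
events = [
    ("s1", "2018-05-01T10:00:00", "itemA", "shoes"),
    ("s1", "2018-05-01T10:02:30", "itemB", "shoes"),
    ("s1", "2018-05-01T10:03:00", "itemA", "shoes"),
    ("s2", "2018-05-01T11:00:00", "itemC", "bags"),
]


def session_features(events):
    """Aggregate raw clicks into one feature row per session."""
    by_session = defaultdict(list)
    for session_id, ts, item_id, category in events:
        by_session[session_id].append((datetime.fromisoformat(ts), item_id, category))

    rows = {}
    for session_id, clicks in by_session.items():
        clicks.sort(key=lambda c: c[0])  # order clicks by time
        times = [c[0] for c in clicks]
        items = [c[1] for c in clicks]
        rows[session_id] = {
            "n_clicks": len(clicks),
            "n_unique_items": len(set(items)),
            "n_categories": len({c[2] for c in clicks}),
            "duration_s": (times[-1] - times[0]).total_seconds(),
            # Share of clicks that revisit an item already seen in the session.
            "repeat_view_ratio": 1 - len(set(items)) / len(clicks),
            "start_hour": times[0].hour,
        }
    return rows


features = session_features(events)
```

Rows like these could then be fed to a random forest with default parameters (for example scikit-learn's `RandomForestClassifier`), and the feature importances stay readable because each column has a plain behavioural meaning.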
The GDPR comes into force on 25 May 2018. This has huge ramifications for machine learning models deployed in production that use personal data: companies simply have to stop using those models if they don’t comply with its provisions, or else face fines of up to 20 million euros or 4% of global turnover, whichever is higher.
As the results show, personalization is possible without personal information, albeit by spending a few more brain cycles on feature engineering and understanding the domain. It was possible all along, but now with GDPR coming into force there are no excuses anymore.
The age-old wisdom of crowds, aka aggregation and pseudonymisation, still works.
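As a rough illustration of those two ideas, here is a small sketch that swaps raw session identifiers for keyed hashes and then aggregates clicks into crowd-level item popularity. The key, the identifiers and the click data are hypothetical, and a keyed hash (HMAC) is just one common way to pseudonymise, not necessarily what was used for the results above.

```python
import hashlib
import hmac
from collections import Counter

# Hypothetical secret; in practice it would be stored separately from the data.
SECRET_KEY = b"replace-with-a-secret-kept-outside-the-dataset"


def pseudonymise(identifier: str) -> str:
    """Replace an identifier with a keyed SHA-256 digest (HMAC)."""
    return hmac.new(SECRET_KEY, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


# Hypothetical clicks: (raw session identifier, item_id).
clicks = [
    ("session-1", "itemA"),
    ("session-2", "itemA"),
    ("session-2", "itemB"),
    ("session-2", "itemB"),  # repeated click, counted once per session below
]

# Pseudonymisation: the raw identifier never reaches the feature store.
pseudonymised = [(pseudonymise(sid), item) for sid, item in clicks]

# Aggregation: number of distinct sessions that clicked each item.
item_popularity = Counter(item for _, item in set(pseudonymised))
```

The same input always maps to the same digest, so sessions can still be grouped, while the popularity counts describe the crowd and not any single visitor.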
Hit me up if you have any questions, and remember: if you are not paying for the product, you are the product.