The California Consumer Privacy Act (CCPA), which comes into force at the beginning of next year, is, like the European Union's GDPR, a data law focusing on protecting the privacy of consumers. While legally it will only come into force in California, the rest of the country is watching it keenly. Other states are considering passing similar laws, and Congress is already considering a federal data law. At the same time, the FTC is negotiating a multibillion-dollar fine with Facebook for the company's alleged privacy and data lapses. The landscape is changing irrevocably, and it's clear that data practices are going to have to change with it.
No more wild west of data
The CCPA proposes to help consumers protect their privacy by giving them the right to know what personally identifiable information (PII) data companies collect about them and to whom these enterprises are selling or disclosing that data. Consumers also have the right to access their personal data and to refuse its sale, although they cannot refuse its disclosure when no financial gain is involved. Finally, companies need to ensure that they do not discriminate against consumers, in terms of services offered or price charged, when those consumers exercise their CCPA rights. Thus the CCPA, like GDPR, endeavors to give consumers control over their PII data, and enforces the consumer's right to privacy through legislation.
However, unlike GDPR, the CCPA does not apply to all companies that operate in California. A company is subject to the CCPA only if it meets at least one of three conditions: it has a gross revenue (in total, not solely through operations in California) of more than $25 million, it holds PII data on at least 50,000 people, or it makes at least half of its annual revenue from selling PII data. The two laws also differ in whom they protect: GDPR affects all companies that store or use data about people located in the EU, while the CCPA only protects California residents.
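The three applicability conditions above can be expressed as a simple rule check. The following is a minimal sketch, not legal advice: the class name `CompanyProfile`, its field names, and the function `ccpa_applies` are all hypothetical, and the thresholds are taken directly from the figures quoted in this article.

```python
from dataclasses import dataclass

# Thresholds as quoted in the article (assumed to be the only inputs needed).
REVENUE_THRESHOLD_USD = 25_000_000
PII_SUBJECTS_THRESHOLD = 50_000
PII_SALES_SHARE_THRESHOLD = 0.5


@dataclass
class CompanyProfile:
    """Hypothetical summary of the figures needed for the check."""
    gross_revenue_usd: float   # total gross revenue, not only from California
    pii_subject_count: int     # number of people whose PII data is held
    pii_sales_share: float     # fraction of income earned by selling PII (0.0 to 1.0)


def ccpa_applies(company: CompanyProfile) -> bool:
    """Return True if at least one of the three conditions is met."""
    return (
        company.gross_revenue_usd > REVENUE_THRESHOLD_USD
        or company.pii_subject_count >= PII_SUBJECTS_THRESHOLD
        or company.pii_sales_share >= PII_SALES_SHARE_THRESHOLD
    )


if __name__ == "__main__":
    small_broker = CompanyProfile(2_000_000, 10_000, 0.8)
    print(ccpa_applies(small_broker))  # meets only the PII-sales condition
```

Note that the conditions are joined with "or": a small company that sells no data at all can still be in scope purely because of the number of people whose data it holds.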
The obligations of CCPA
The CCPA makes stringent demands on the companies that are subject to it. These demands include designing a framework that makes it easy for consumers to request their data, which, as a minimum, needs to include a toll-free telephone number. They also include a link on the website homepage to another page where consumers can demand that the company does not sell their PII data, stricter rules for under-16s, and especially under-13s, and a description of California residents' rights explicitly stated in the company's privacy policy.
While these demands look reasonably straightforward, in practice it may be challenging to organize the data in a way that fulfills the CCPA obligations unless your data engineering team is focused on and trained in how to do so. Getting a toll-free number and adding links to the website homepage aren't the concern. The issue is organizing the PII data so well that if Jane Doe demands to see her PII data, or to have it deleted, the company can respond speedily to either request.
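One way to picture what "organizing the PII data" means is an index that records, for every person, where their records live across the company's data stores. The sketch below is purely illustrative and keeps everything in memory: the class `PiiIndex`, the store names, and the record layout are all assumptions, and a real system would sit on top of actual databases.

```python
from collections import defaultdict


class PiiIndex:
    """Hypothetical in-memory index from a person to all records about them."""

    def __init__(self):
        self._stores = {}                     # store name -> {record_id: record}
        self._by_subject = defaultdict(set)   # subject_id -> {(store, record_id)}

    def add(self, store, record_id, subject_id, record):
        """Save a record and remember which person it belongs to."""
        self._stores.setdefault(store, {})[record_id] = record
        self._by_subject[subject_id].add((store, record_id))

    def access(self, subject_id):
        """Answer an access request: every record held about this person."""
        locations = sorted(self._by_subject.get(subject_id, ()))
        return [(store, rid, self._stores[store][rid]) for store, rid in locations]

    def delete(self, subject_id):
        """Answer a deletion request; returns the number of records removed."""
        locations = self._by_subject.pop(subject_id, set())
        for store, rid in locations:
            del self._stores[store][rid]
        return len(locations)


if __name__ == "__main__":
    index = PiiIndex()
    index.add("crm", "c1", "jane-doe", {"name": "Jane Doe"})
    index.add("trips", "t9", "jane-doe", {"route": "home to work"})
    print(index.access("jane-doe"))
    print(index.delete("jane-doe"))
```

The design choice worth noting is that the person-to-record mapping is maintained at write time, so neither request requires searching every dataset after the fact.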
So if you are a connected car company with gross revenues above $25 million, or with more than 50,000 cars and therefore clients, then even if you sell no PII data, you are still subject to CCPA. What is the PII data here? The BIT initiative has identified 46 data types that can be considered to be PII. Successfully incorporating this BIT standard means taking care not only of these PII elements but also of any other data that is linked to one of them and could therefore be used to identify a person. We strongly recommend that all companies use the BIT initiative as the basis for identifying PII data, and then pseudonymize this data in such a way that none of the elements can be connected to other data, which would by association become PII data.
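As an illustration of the pseudonymization step, the sketch below replaces the values of designated PII fields with a keyed hash (HMAC-SHA256 from the Python standard library). This is one common technique, not a method prescribed by the CCPA or the BIT initiative. The field names in `PII_FIELDS` are hypothetical examples, and keyed hashing only hides the direct identifiers: it does not by itself stop the remaining fields from being linked to other data.

```python
import hashlib
import hmac

# Hypothetical set of fields treated as PII; a real list would come from
# the company's own data inventory.
PII_FIELDS = frozenset({"name", "email", "vin"})


def pseudonymize(record, secret_key, pii_fields=PII_FIELDS):
    """Return a copy of `record` with PII field values replaced by keyed hashes.

    The same value and key always give the same pseudonym, so records can
    still be joined internally, but the original value cannot be read back
    without the key holder's lookup.
    """
    result = {}
    for field, value in record.items():
        if field in pii_fields:
            # Include the field name so equal values in different fields
            # do not share a pseudonym.
            message = f"{field}:{value}".encode("utf-8")
            result[field] = hmac.new(secret_key, message, hashlib.sha256).hexdigest()
        else:
            result[field] = value
    return result


if __name__ == "__main__":
    key = b"replace-with-a-secret-from-a-key-store"  # placeholder, not a real key
    print(pseudonymize({"name": "Jane Doe", "odometer_km": 1200}, key))
```

The secret key has to be stored separately from the pseudonymized data; anyone holding both can recompute the pseudonym for a known name.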
In the connected car example, the name or any other personally identifying details of the driver, or even the owner, are PII data. The journeys taken by the car are not PII data unless the driver is identifiable, in which case the whole trip and all trips by all drivers are PII data. Understanding whether the driver is identifiable is in itself challenging. The data points per se may not be PII, but what if you live in a rural part of the country, share a ZIP code with relatively few other people, and routinely make the same journey (say, to work) in your vehicle? The risk of reidentification in this scenario is relatively high.
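The rural commuter scenario can be checked mechanically by counting how many people share each combination of seemingly harmless attributes, an idea usually called k-anonymity. The sketch below is a rough illustration under stated assumptions: each record is assumed to describe one driver, the field names `zip_code` and `usual_route` and the sample values are made up, and the threshold `k` is an arbitrary choice rather than anything the CCPA specifies.

```python
from collections import Counter


def risky_groups(records, quasi_identifiers, k=5):
    """Return attribute combinations shared by fewer than `k` records.

    A combination held by very few people is a reidentification risk even
    when no single field is PII on its own.
    """
    counts = Counter(
        tuple(record[field] for field in quasi_identifiers) for record in records
    )
    return {combo: n for combo, n in counts.items() if n < k}


if __name__ == "__main__":
    # Made-up sample: three urban drivers share a pattern, one rural driver does not.
    drivers = [
        {"zip_code": "00001", "usual_route": "downtown loop"},
        {"zip_code": "00001", "usual_route": "downtown loop"},
        {"zip_code": "00001", "usual_route": "downtown loop"},
        {"zip_code": "00002", "usual_route": "farm road to mill"},
    ]
    print(risky_groups(drivers, ["zip_code", "usual_route"], k=2))
```

Any combination this check flags should be treated as PII data, or generalized (for example, by using a coarser region than the ZIP code) before the dataset is shared.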
Selling this data, say to supermarkets so that the driver can then receive targeted ads from a supermarket that he regularly drives past, is then also a sale of PII data. Perhaps the driver doesn't want his wife to know the route he takes home, for whatever reason, and the targeted ads appearing on his social media might make her suspicious that his route is different from the one she thinks he takes. If this still sounds a little far-fetched, that is only because, as a society, we haven't yet adjusted to the reality of targeted advertising.
The key is understanding the data
Thus the first task facing any data science, strategy or engineering team in adjusting to CCPA is to understand the data they have in their business: which datasets are likely to be added, which of the data is PII, and how linking non-PII data to other datasets in the market might make that data identifiable, and therefore PII. Businesses must then create a strategy for storing this data in a way that protects the privacy of their customers and adheres to the regulations of the CCPA.
In terms of CCPA penalties for non-compliance, the Act provides for fines of up to $7,500 for each intentional violation and $2,500 for each unintentional violation. If your data science team has successfully mined the data and knows what it contains and will contain in the future, it becomes much easier to justify any violations and thus minimize any potential fines. Avoiding CCPA violations isn't just about penalties, though; it is also about preserving the company's reputation. If the California authorities take a company to court, and it turns out that the business didn't really know what data it was storing, because it had never fully understood the information it held or did not have the checks and balances in place to stop privacy breaches from happening, this can be reputationally damaging.
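Because the fines are charged per violation, the exposure grows linearly with the number of violations. The following sketch simply applies the two maximum amounts quoted above. The function name is hypothetical, and the result is an upper bound under the article's figures, not a prediction of what a court would impose.

```python
# Maximum fine per violation, as quoted in the article.
MAX_FINE_INTENTIONAL_USD = 7_500
MAX_FINE_UNINTENTIONAL_USD = 2_500


def max_fine_exposure(intentional, unintentional):
    """Upper bound on total fines for the given numbers of violations."""
    if intentional < 0 or unintentional < 0:
        raise ValueError("violation counts must not be negative")
    return (
        intentional * MAX_FINE_INTENTIONAL_USD
        + unintentional * MAX_FINE_UNINTENTIONAL_USD
    )


if __name__ == "__main__":
    # 2 intentional and 10 unintentional violations: 15,000 + 25,000
    print(max_fine_exposure(2, 10))  # 40000
```

The same ten violations cost up to $25,000 if unintentional but up to $75,000 if intentional, which is why being able to show what the company knew about its data matters.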
By fully understanding what their data contains, and which information is considered PII and which isn't, most companies should find that, as long as they understand what CCPA demands, compliance is relatively straightforward.