Big data and privacy was one of the topics discussed at the Canadian IT Law Association conference this week. Some of the issues worth pondering include:
- Privacy principles say you should collect only what you need, and keep it only as long as needed. Big data says collect and retain as much as possible in case it is useful.
- Accuracy is a basic privacy principle – but with big data, accuracy is being replaced by probability.
- A fundamental privacy notion is informed consent for the use of one’s personal information. How do you have informed consent and control for big data uses when you don’t know what it might be used for or combined with?
- Probability means that the inferences drawn may not always be accurate. How do we deal with that if we as individuals are faced with erroneous inferences about us?
- If the inferences are based on information that may itself be questionable, the results may be questionable too. (The old garbage in, garbage out concept.) It has been proposed that for big data and AI, we might want to add to Asimov’s 3 laws of robotics that an AI won’t discriminate, and that it will disclose its algorithm.
- If AI reaches conclusions that lead to discriminatory results, is that going to be dealt with by privacy regulators, or human rights regulators, or some combination?
- Should some of this be dealt with by ethical layers on top of privacy principles? Perhaps no-go zones for things felt to be improper, such as capturing audio and video without notice, charging to remove or amend information, or re-identifying anonymized information.
Cross-posted to Slaw
The internet of things and big data are separate but related hot topics. As is often the case with new technology, the definitions are fluid, the potential is unclear, and they pose legal challenges. All of these will develop over time.
Take privacy, for example. The basic concept of big data is that huge amounts of data are collected and mined for useful information. That flies in the face of privacy principles that say no more personal info should be collected than the task at hand needs, and that it shouldn’t be kept for longer than the task at hand requires. Both the internet of things and big data can also lead to personal info being created, while privacy laws generally focus on personal info being collected.
Another legal issue is ownership of information, and who gets to control and use it. If no one owns a selfie taken by a monkey, then who owns information created by your car?
If anyone is interested in taking a deeper dive into these legal issues, I’ve written a bit about it here and here, and here are some recent articles others have written:
The ‘Internet of Things’ – 10 Data Protection and Privacy Challenges
Big Data, Big Privacy Issues
The Internet of Things Comes with the Legal Things
Today’s Slaw post:
I think we are going to see over the next while some interesting technical developments with some equally interesting legal issues to ponder around big data and wearable computing.
One of the things I like about being an IT lawyer is that I get to see interesting new technology and businesses, and with any luck do their legal work.
Earlier this morning I was at a business that has some cool technology around social media and big data. It can turn what now takes months to do manually, if it can be done at all, into a 5-minute project.
Later today I’m going to try out a Google Glass owned by someone who is exploring possible business models based on its use.
And on a related note, it was announced yesterday that the Pebble watch will be available at Best Buy in August – but unfortunately not in Canada.
Today’s Slaw post:
Two basic privacy principles are that no more personal info should be collected than necessary, and it should not be kept any longer than necessary. That flies in the face of repeated attempts by governments and law enforcement to collect and retain data, or to require others to retain it.
One example is attempts to pass laws to require ISPs and telecommunications companies to retain data on customers for a fixed period of time just in case it might be helpful to police. Denmark has had such a data retention law in place for many years. The Danish Ministry of Justice has just concluded, however, that five years of extensive Internet surveillance have proven to be of almost no use to the police. (I’m relying on a news story – the actual report is in Danish.)
“Session logging has caused serious practical problems,” the ministry’s staffers write in the report. “The implementation of session logging proved to be unusable to the police; this became clear the first time they tried to use [the data] as part of a criminal investigation.”
So the downsides of retaining personal info are the cost to the service provider of doing it (which is ultimately paid by consumers), the increased risk of it being misused or leaked, and the general privacy invasiveness. And the upside is …?
Today’s Slaw post:
Big data is a hot trending tech issue. Wikipedia defines big data as “a term applied to data sets whose size is beyond the ability of commonly used software tools to capture, manage, and process the data within a tolerable elapsed time. Big data sizes are a constantly moving target currently ranging from a few dozen terabytes to many petabytes of data in a single data set.”
The initial issue with big data is the ability to actually work with massive data sets – how to store, search, and manipulate them. But the tools to do that are becoming more sophisticated, and attention is turning to how to take advantage of big data. This McKinsey report entitled Big data: The next frontier for innovation, competition, and productivity is a good summary of the possibilities. There is potential for increased profit margins for retailers, reduced costs for healthcare, product improvements and more.
This all sounds good. Consider for a moment though that big data means massive databases that include huge amounts of customer information. And the information that governments have on us is massive as well. It will be tempting to amass as much data (including personal information) as possible, as the more data there is, the more can be learned from it. That flies in the face of privacy principles that say you should collect only the smallest amount of personal information you need for the immediate purpose, and should not keep it for longer than you need it for that purpose.
It is possible to anonymize personal information to avoid the issue, but that is done on a sliding scale. With only a little anonymization, it is easy to recombine the data with other information and figure out who the individuals are; with a lot of anonymization, the data becomes less valuable.
Big data uses that determine generic things like trends and product features are one thing – but big data can also be used for targeting individuals for things like advertising and medical treatment. Individuals may welcome or be horrified by that, depending on the use and personal viewpoints.
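For readers who want to see that sliding scale in concrete terms, here is a rough sketch using k-anonymity, one common way to measure how well individuals blend into a data set. It is only an illustration: the records, the choice of fields, and the coarsening rule are all made up, and k-anonymity is just one measure among several. A result of k = 1 means at least one person has a unique combination of fields and could be picked out by matching against another data source.

```python
from collections import Counter

# Hypothetical records: (postal code, birth year, gender). There are no names,
# yet the combination of these fields can still single a person out.
records = [
    ("N6A 1B2", 1971, "F"),
    ("N6A 1B2", 1971, "F"),
    ("N6A 3K7", 1984, "M"),
    ("N6A 3K7", 1985, "M"),
    ("N6B 2L4", 1990, "F"),
    ("N6B 9Z1", 1992, "F"),
]


def smallest_group(rows):
    """Return k: the size of the smallest group of identical rows."""
    return min(Counter(rows).values())


def generalize(row):
    """Coarsen a row (illustrative rule): keep only the first three postal
    code characters and round the birth year down to its decade."""
    postal, year, gender = row
    return (postal[:3], year // 10 * 10, gender)


# A little anonymization: names removed, everything else kept in full.
k_light = smallest_group(records)

# A lot of anonymization: the same rows, coarsened.
k_heavy = smallest_group([generalize(r) for r in records])

print("k with full detail:", k_light)
print("k after coarsening:", k_heavy)
```

With the full detail, some rows are unique, so those people are easy to re-identify. After coarsening, every row is shared by at least two people, but an analyst can no longer tell 1984 from 1985 or one street from the next, which is exactly the loss of value described above.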
Another concern is the creeping (and creepy) trend towards industry and government big brother type uses.
It has been pointed out that big data needs to be complemented by “big judgment”. As this Harvard Business Review article entitled Good Data Won’t Guarantee Good Decisions points out, “At this very moment, there’s an odds-on chance that someone in your organization is making a poor decision on the basis of information that was enormously expensive to collect.” That sentiment may very well apply to poor decisions on the privacy aspects of big data as well.