Abstract
Introduction
Big Data has several key characteristics, such as the multiplicity of its sources, the limited structure of the data and the speed with which data is generated and processed (Mayer-Schönberger and Cukier, 2013). But the most prominent reason why Big Data is called big is obviously the sheer size that current datasets can have. Data controllers like large corporations and government institutions are only allowed to collect and process information (whether it is Big Data or small data) when there is a legal basis for this. The use of data may be limited by, for instance, copyright laws or personal data protection laws. In this contribution, we will focus on the use of personal data in the context of Big Data.
Very often the legal basis for collecting and processing personal data consists of informed consent provided by the data subject. For instance, when ordering a book online, an individual has to provide name, shipment address and payment details in order to complete the purchase. Such a transaction implies consent to the use of these personal data, usually by consenting to the general terms and conditions of the website. According to many of the terms and conditions, such consent may imply more than merely consent to shipping the order. It may, for instance, include consent to customize content, customize advertising, contact customers by email, share behavioral data internally, share personal data internally and sell personal data to other parties (Custers et al., 2014).
Complicated consent procedures may constitute barriers for consent and consequently for the availability of large datasets. Many websites, such as social network sites like Facebook, Twitter and LinkedIn, use quick and easy procedures for providing consent in order to generate Big Data. By asking their users for broad consent, they create opportunities to collect large datasets for all kinds of business opportunities. The business models are usually of the type in which users do not have to pay for their accounts. To keep the consent procedures quick and easy, consent is usually asked for when registering at the website. In most cases, consent can be provided by ticking a checkbox or clicking on a button stating ‘I agree to the terms and conditions’ or something similar. Renewal of consent is very rarely asked for. As a result, providing consent once often implies providing consent forever. Given the rapid changes in Big Data and data analysis, however, consent in terms of user expectations may easily become outdated. Consent may be considered outdated when it no longer matches the preferences of a user, for instance, because the user has changed his or her mind or because the data processing practices have changed significantly.
This contribution suggests introducing expiry dates for consent, not to settle questions, but to put them on the table as a start for further discussion on this topic. We start with some conditions for consent and some of the issues of providing informed consent in the era of Big Data. Next, we discuss the idea underlying informed consent that it should always be possible to change your mind regarding (the extent of) the consent. Particularly when the technological possibilities of how personal data can be processed change over time, the conditions for consent may change and influence the consent decision. Hence, periodic renewal of consent as a default could be considered and we propose to introduce expiry dates for consent. We conclude this contribution with a short assessment of the pros and cons of such expiry dates and acknowledge that expiry dates for consent, though useful in some situations, will not solve many of the issues of consent in the era of Big Data.
Informed consent
When discussing consent, it is usually assumed that consent is only valid when it is
Criteria for
Criteria for
Issues with consent
There are many issues with informed consent and many of these issues have been discussed extensively in the literature. In the context of Big Data, there is growing skepticism regarding the effectiveness of informed consent for personal data processing (Acquisti, 2009; Adjerid et al., 2013; Böhme and Köpsell, 2010; Pollach, 2007; Solove, 2013). With regard to models for informed consent based on informational self-determination, Solove (2013) argues that such models fail to offer adequate protection for people, as they have too many hurdles: (1) people do not read privacy policies; (2) if they do read them, they do not understand them; (3) if people read and understand them, they often lack enough background knowledge to make an informed decision; (4) if people read them, understand them and can make an informed decision, they are not always offered the choice that reflects their preferences.
When consent is asked for, the information provided is often very extensive. It may take a lot of time to read the information policies and to make a decision based on this information. McDonald and Cranor (2008: 560) have estimated that if data subjects actually read all the privacy policies presented to them, it would take them 244 hours annually. There is a lot of research indicating that people do not read privacy policies (Arcand et al., 2007; Beldad, 2011; Bolchini et al., 2004; Graf et al., 2010; Jensen and Potts, 2004; Lichtenstein et al., 2003; Milne and Culnan, 2004; Pan and Zinkhan, 2006; Sheehan, 2005).
The information provided may be difficult to understand. In many situations, the text is highly legalistic in nature or may contain technical details beyond the comprehension of the average user. While an abbreviated, plain-language policy would be quick and easy to read, it is the hidden details that carry significance (Toubiana and Nissenbaum, 2011). Related to this issue is the asymmetry in power distribution. Those who collect and process the data have technological expertise that the average user usually lacks (Acquisti and Grossklags, 2005; LaRose and Rifon, 2007; Metzger, 2004).
Although models based on informational self-determination certainly have many virtues, they do not always reflect how people use the Internet, social media, etc. Internet users are confronted with (too) many requests for consent (Schermer et al., 2014). Browsing and surfing would take a lot of time if every Internet user really thought through every consent request. Due to the large number of consent requests, users often do not really consider the questions asked, do not read the information provided and do not seem to think through the consequences of providing (or refusing) consent. It seems that data subjects simply consent whenever confronted with a consent request (Custers et al., 2013). This is obviously problematic, as such consent no longer has any meaning.
Implications of withholding consent
As a result, Internet users seem to become increasingly disengaged in the consent processes. The consent decision does not really alter the morality of data subjects (Hurd, 1996; Kleinig, 2010). Users often blindly accept consent boxes when they resemble other dialogue boxes (Böhme and Köpsell, 2010). They indicate they often feel that they do not have a choice when dealing with consent decisions, since consent is often framed as a take-it-or-leave-it offer: in case a user refuses consent, access to a website or Internet service is often plainly denied or severely hampered. People indicate they are concerned about their privacy, but at the same time they routinely disclose personal information because of convenience, discounts and other incentives, or a lack of understanding of the consequences (Dutton and Blank, 2013; Regan, 2002).
Withholding consent implies restricted access or no access at all to particular services, but it does not guarantee that your privacy is better protected. The use of Big Data increasingly enables predicting characteristics of people who withheld consent on the basis of the information available from people who did consent. When large numbers of people consent to the use of their personal data, it is possible to predict missing values of other people (Custers, 2012). This may be pretty accurate: Kosinski et al. (2013) showed how a range of highly sensitive personal characteristics, including sexual orientation, ethnicity, religious and political views, personality traits, intelligence, happiness, use of addictive substances and parental separation, can be predicted very accurately on the basis of Facebook likes. Obviously, predicting missing values is also possible for people who provided partial consent or for people who (whether on purpose or not) provided false information. Nevertheless, when data is inaccurate or incomplete, for instance because it is outdated, such predictions may become less accurate, resulting in wrong conclusions about and decisions on people.
The right to change your mind?
Given the rapid changes in Big Data and data analysis, it is strange that consent is usually provided
In a number of places, there are other mechanisms to ask for renewed consent. For instance, when downloading updates of apps or software, users may have to approve revised permissions. Along with such technological changes users have to re-consent. Also, renewed consent is sometimes asked for when users do not use an app or website within a number of days or weeks after registering. It should be noted, however, that in case of software updates, providing new consent for a new purpose does not always imply that the previous consent for other purposes ends. New consent can be additional consent instead of revised consent.
Consent of a data subject must cover the actual data processing. However, data processing activities may change and expand over time and therefore it is important that consent is up to date. While there is, for instance, no explicit requirement in the current EU legislation mandating that consent is ‘up to date’, such a requirement may be inferred from the principles of purpose specification and use limitation (i.e. the data collected may only be used for the purposes specified in advance) as set forth in Article 6(1)(b) of the current EU directive on personal data protection. If consent is given by the data subject for a specific, well-defined purpose, any substantial deviation from this purpose will require a renewal or confirmation of the consent. When consent no longer matches the actual data processing or the user preferences, it is outdated.
Many websites and Internet services argue that a user can at any time withdraw consent that was provided earlier. This is also explicitly mentioned in the proposed EU regulation on personal data protection. This is a regular practice in many areas of society: many contracts are for an unlimited time but can be terminated by either party, usually with a particular term of notice. There are some problems with this argument, though. First of all, it creates a take-it-or-leave-it situation as mentioned above: disagreeing with some aspects would imply not being able to use the entire website or service anymore. This is not a particularly good solution, as it may be too drastic in many situations. For instance, people may not like some aspects of Facebook, but would still like to have an account. Another issue with revoking consent by cancelling your account is that it assumes that nothing changes during the length of the contract. But in areas where technologies are rapidly changing, there may be changes in practices that are not covered by the initial consent. A third issue is that, in practice, most users do not seem to cancel their free accounts anyway. They do not actually and actively leave the service, but simply do not use it any more. In that case, however, data controllers can continue to use their personal data. Hence we would like to raise the question whether it would be more appropriate to limit the duration of consent to a maximum amount of time. In other words, should there be expiry dates for consent?
In the light of rapid technological changes, it may be suggested to include a provision in the existing legal frameworks that consent, when not renewed, expires after some time, say after two or three years. When people are regularly asked to renew their consent, the more engaged users may realize they have changed their mind after some time, for instance, because the ways in which their personal data is processed have changed. People may also better understand the consequences of their consent after being a user of a website or service for two or three years than when they registered for the first time.
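By way of illustration only, the expiry rule suggested above can be expressed as a simple date check. The sketch below is not drawn from any legal framework or existing system: the two-year validity period, the names CONSENT_VALIDITY and consent_is_valid, and the idea of storing the date on which consent was last confirmed are all assumptions made for the example.

```python
from datetime import date, timedelta

# Assumption: consent lapses two years after it was last confirmed.
# The text mentions "two or three years"; the exact period is a policy choice.
CONSENT_VALIDITY = timedelta(days=2 * 365)


def consent_is_valid(last_confirmed: date, today: date,
                     validity: timedelta = CONSENT_VALIDITY) -> bool:
    """Return True while consent is still within its validity period.

    `last_confirmed` is the date of the initial consent or of its most
    recent renewal. Once the period has passed without renewal, consent
    counts as expired and processing should stop.
    """
    return today < last_confirmed + validity


# A user who registered on 1 March 2014 and never renewed consent:
registered = date(2014, 3, 1)
still_valid_after_one_year = consent_is_valid(registered, date(2015, 3, 1))
still_valid_after_three_years = consent_is_valid(registered, date(2017, 3, 1))
```

Under these assumptions, renewal simply resets last_confirmed to the renewal date, so an engaged user extends the period while an inactive user's consent lapses by default.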
Having suggested this, we should also point out some practical drawbacks of this proposal. The idea behind taking back consent or letting it expire is to block further use and reuse of personal data. At first sight, data subjects would be more ‘in control’. The debate on the right to be forgotten (Ausloos, 2012; Graux et al., 2012; Koops, 2011) shows, however, how complex it is to have data deleted. Introducing consent expiry dates may yield similar database management problems. Giving people the option to provide bits and pieces of partial consent via “privacy settings” creates a complicated world. It would involve the collection of a lot of metadata on which personal data can be used for which purposes before which expiry date. In fact, such metadata may also reveal privacy preferences of data subjects, yielding less privacy rather than more privacy, as privacy preferences can be used for personalization or profiling. Expiry dates may also require metadata, but we think that general expiry dates may reduce part of this complexity. Note that when consent for processing personal data expires, the anonymized data may still be used for profiling and statistics. Expiry dates will not solve this issue.
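The difference in metadata between partial consent and a general expiry date can be made concrete with a small sketch. Everything in it is hypothetical: the class names GranularConsent and GeneralConsent, the data items and the purposes are invented for illustration and do not describe how any actual data controller stores consent.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Dict, Tuple


@dataclass
class GranularConsent:
    """Partial consent: one expiry date per (data item, purpose) pair."""
    # Hypothetical keys such as ("email", "advertising").
    grants: Dict[Tuple[str, str], date] = field(default_factory=dict)

    def allows(self, item: str, purpose: str, today: date) -> bool:
        # Processing is allowed only if this pair was granted and has not expired.
        expiry = self.grants.get((item, purpose))
        return expiry is not None and today < expiry


@dataclass
class GeneralConsent:
    """General consent: a single expiry date covering all processing."""
    expiry: date

    def allows(self, today: date) -> bool:
        return today < self.expiry


granular = GranularConsent({
    ("email", "order updates"): date(2018, 1, 1),
    ("email", "advertising"): date(2016, 6, 1),
})
general = GeneralConsent(expiry=date(2018, 1, 1))
```

The granular record grows with every data item and purpose and itself describes the data subject's privacy preferences, whereas the general record holds a single date and reveals nothing beyond when consent lapses.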
Another issue with updating consent is that it follows the same process as the initial consent. For some users it may create more awareness as mentioned above, but for many users it may just be another checkbox to click on without reading it. We admit that expiry dates do not solve all abovementioned issues with consent, but for the ‘disengaged’ users expiry dates may still be helpful in those situations in which users no longer actively use their accounts. Their inactivity is then automatically interpreted as a revocation of their consent, blocking further use of their data.
A third issue may be more practical: when consent expires or is withdrawn, data controllers may fail to remove the data or to stop using it. A typical example is the data breach of Ashley Madison, a website enabling extramarital affairs (Thomsen, 2015). Because of the website’s policy not to delete users’ personal data, many user details were disclosed.
Conclusion
Consent is rarely renewed, despite the fact that it may become outdated quickly in the context of Big Data developments. Expiry dates for consent could further enable people to change their mind and further exercise their rights. Also, expiry dates would improve the likelihood that consent is up to date, similar to food expiry dates and periodic vehicle inspections. This will also reduce the risk of function creep (when data is used for purposes other than those for which it was initially collected). Consent that is more frequently renewed may also (potentially) create further awareness among engaged users on how their data are processed: when people provide consent when registering somewhere, they may not yet be able to foresee the consequences of their consent, but after one or two years they may have a different perspective and increased awareness (see also Zwitter, 2014). As such, renewed consent decreases the delay between consent and its consequences. For disengaged users, renewed consent may simply be another checkbox to blindly accept, but expiry dates may better reflect their preferences in case they are no longer actively using their accounts.
Expiry dates for consent will not solve the many issues of informed consent in the era of Big Data, but they may be a useful tool in some situations. Rather than arguing that expiry dates for consent will settle questions, we acknowledge the limitations of this approach. We hope, however, that this paper is a start for the discussion of how long consent should be valid in situations in which technologies to collect and analyze data are rapidly changing. Whatever the answer may be, we think consent should not be forever.
Footnotes
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) received no financial support for the research, authorship, and/or publication of this article.
