Abstract
Introduction
Today, the omnipresence of sensors tracking our social whereabouts has led to the production of digital traces at a speed and volume commonly referred to as ‘big data’. While big data is celebrated for its potentialities, scholars have also asserted that this growing body of digital traces, because it commonly originates as a by-product of already existing processes, is often used ‘out of context’, which decreases its ‘meaning and value’ (Boyd and Crawford, 2012: 670). This lack of
Figure 1 presents a simple example of such decontextualization from a big dataset containing spatial path points of persons moving within a store as recorded by video sensors. To calculate the movements, the video footage of the sensors has been reduced to decontextualized numeric traces of the interaction that took place, thereby enabling insights that someone ‘moved’ somewhere. Context – what was accomplished, by whom and where, how and why – needs to be re-created for ‘the data to carry meaning’. As Blank (2008: 540) commented: ‘With many interesting variables unavailable, people are, at best, thinly described. Because of these problems many forms of electronic record are very difficult for researchers to use.’ While big datasets are extensive in volume and granularity, these qualities often extend along only one dimension, so that the datasets appear ‘thin’ to the analysts and researchers working with them.
Data within red square represents person moving from A to B as captured by video analytics. While a person’s path is highly detailed, data traces offer no information describing the context of the path.
Figure 2 combines these two distinctions: thin/thick–extensive/small into a matrix. The two red areas define the main data sources the blending methodology engages with.
Splitting the data universe by the two distinctions, thin/thick–extensive/small. Four common methods for collecting data have been added to the figure to illustrate how it is possible to think about highly different data sources along these lines. The two red areas in, respectively, the extensive–thin and thick–small quadrants define the data sources that the blending methodology has been developed from and where the complementarity is strongest. Authors’ model.
In the top left corner, we find the big-thin data sources that Blank talked about. These data sources are extensive in numbers but also overly thin, with very little context linked to them. Most big datasets built from sensors, such as location data collected from a GPS sensor, belong to this group of data. While less talked about than, e.g. social media data due to its less clear application, this group of data sources is by far the fastest growing, with new sensors steadily appearing to track our everyday life. To address the challenge of thin data, ‘data scholars’ have suggested complementing big data with sources of highly contextualized thick data (Blok and Pedersen, 2014; Boellstorff, 2013; Curran, 2013; Ford, 2014; Stoller, 2013; Wang, 2013).
We use the term ‘thick data’ for the group of data sources located at the opposite end of the coordinate system. ‘Thick data’ is synonymous with ethnographically collected and analysed observational data in the tradition of Clifford Geertz (1977), who described how thick descriptions of human behaviour include detailed data collection and analysis of the context in which a behaviour occurs. Thick data is defined by its contextual complexity, which enables the researcher to reflect upon how and why people do what they do. Small data is opposed to big data in comprising a low number of instances. It is of course possible to have a small amount of thin data, though this would in most cases be useless. Thus, thick and small data do not share the same epistemological status. However, it is often the case that the collection and analysis of thick data produced as human actions and interactions provides a relatively small collection of thick phenomena, as opposed to big data’s millions of thin nodes. Context, however, is not a simple matter (Duranti and Goodwin, 1992). Particularly strong analytical perspectives for analysing thick ethnographically collected data are ethnomethodology (EM) (Garfinkel, 1967), conversation analysis (CA) (Sacks et al., 1974) and multimodal interaction analysis (Streeck et al., 2011), which rest on the collection of naturally occurring data through video recordings. In this tradition, some argue that the only relevant context in social interaction is the utterance before a new utterance is produced in a sequential environment (Schegloff, 1987, 1997). However, others in the ethnographic version of the EM/CA tradition (Arminen, 2005; Atkinson et al., 2001; Heath et al., 2010; Moerman, 1988) use a broader definition and put emphasis on the situated encounters in the tradition from Goffman (1964). The collection of thick data by ethnographic methodology is primarily based on (video/photo-) observations, field records and interviews.
All sorts of contextual knowledge can potentially be relevant concerning these situated encounters, but in the tradition of EM/CA, we emphasize that the most relevant context for the analysis is primarily the issues that participants themselves somehow orient to (Heritage, 1984). Thus, some qualitative techniques might be used in this process, but thick data is not, generally speaking, provided by any and all qualitative methods. By Big–Thick Blending we specifically intend to focus on the blending of ethnographically collected thick observational data.
The blending methodology rests on the complementarity between these highly heterogeneous data sources. While this is the focus in the paper, it should be noted that the methodology is not limited to these extremes and it can surely be productive to blend either ‘less thin’ big data sources, e.g. social media data, or ‘less thick’ thick data, e.g. interview data.
We are far from the first to argue for important positive complementarities to arise from mixing these two types of data. It has, e.g. been argued that the mixing of big decontextualized data with highly contextualized thick data can help ‘uncover the meaning behind Big Data visualization and analysis’ (Wang, 2013). Others have hypothesized how ‘entirely new interferences and polyphonies’ can arise […] given that these two types of data are ‘mixed with care’ (Blok and Pedersen, 2014: 1).
There are also empirical experiments: Researchers from Berkeley have, e.g. studied space usage within homes by mixing big data from behavioural tracking sensors with ethnographic observations (Anderson et al., 2009). Similarly, tracking using mobile-embedded Bluetooth sensors was conducted by Girardin (2013) to study congestion and space usage at the Louvre museum, using observations provided by the massive security staff guarding the valuables of the museum to qualify and extend the thin data traces. A similar ‘qualifying role’ was reached by Hsu (2014), who made use of GPS data from Myspace to ‘navigate’ her ethnographic mapping of online music communities. In contrast to the Louvre study, where thick descriptions were used to qualify analytical results built from thin big data, big geographical knowledge on the location of individuals was, in Hsu’s study, used to qualify and contextualize her local ethnographic work. Finally, Blok et al. (2017) have recently used the setting of a party to study possible complementary effects (and the absence thereof) when combining big and thick observations.
However, none of these engage directly with the goal of describing a practical method for integrating big and thick data (cf. Girardin, 2013). The development of applicable methodologies has thus been largely absent, leaving scholars like ourselves in the dark as to how, in practice, billions of thin digital data points should be mixed together with ethnographic accounts (the only notable exception is the method of ethno-mining that we will discuss further below). In this paper, we report on how we, a team of ethnographers and big data analysts, have during the last three years developed methodological conventions on how to blend big and thick analytical results.
The remainder of the paper falls into four parts. First, we position the blending methodology within the multimethodology framework. Second, we develop the methodological concept of ‘blending’ as a technique for bringing big and thick analytical insights into shared analytical spaces. Third, we explore this method in relation to two analytical examples, extracting important insights on how best to integrate big and thick data. We end by discussing how the behavioural and temporal granularity intrinsic to most big datasets plays a crucial role in affording the blending of insights built from big and thick data.
Establishing a framework for blending
The blending terminology is borrowed from Fauconnier’s (1997, 2001) and Fauconnier and Turner’s (1998, 2002) research in the field of cognitivism and linguistics. In their terminology, blending is a cognitivistic process assumed to be ubiquitous to everyday thought that people apply to combine elements from diverse scenarios into new elements – so-called blended spaces. The theory thus provides a terminology for the cognitive process of developing new concepts (cf. Koestler, 1964). In the Big–Thick Blending methodology, we draw on this terminology, but in a slightly different manner as we extend the concept of blending from being primarily a cognitive process into one that also covers intentional and strategic processes such as research. We use blending to describe the analytical process in which insights based on big and thick data are brought together into new conceptualizations through deliberate actions performed by researchers. We argue that this move is theoretically appropriate and within the conceptual nature of the terminology (see also other uses, such as in Hougaard (2005) and Hutchins (2005)). Figure 3 shows Fauconnier and Turner’s terminology with a simple example showing the construction of a ‘lamp–chair’.
Blending of two elements into a new third one (see Fauconnier and Turner, 1998; Authors’ model, Due, 2014).
The figure shows how a common generic space exists: a schematic frame of shared elements. In this case, the common generic space is (at least) the category: ‘furniture’, the shared colour and the wooden material. The blending process then consists in partially matching the two inputs ‘lamp’ and ‘chair’ and projecting selectively from these two input spaces into a new space, the blended space. In the blended space, we have a new type of furniture. This construal is emergent in the blend but it also remains connected to the original inputs by specific affordances: the lamp–chair is a new emergent construct, but the specific affordances of, for instance, the light bulb and the chair legs remain the same.
The example is a simple case of blending. Two inputs share properties that might be blended. They are linked by a cross-space mapping and elements are projected selectively to a blended space. The projection of these specific elements allows an emergent structure to develop. Thus, the blending process can derive concepts from the input spaces to provide relations that do not exist in the separate inputs (Fauconnier and Turner, 2003).
The input spaces that are blended in the Big–Thick Blending methodology consist of analytical insights, which are built on data materiality with different affordances. Rather than mixing different methods with different disciplinary constraints, the Big–Thick methodology focuses upon the blending of insights. The actual blending can thus be described as an interpretative, distributed cognitive and embodied process conducted by the researchers. Consequently, the blending must happen iteratively and at a rapid pace to counter how analytical insights tend to stabilize over time. The blending thus needs to take place before the analysis in each input space is finished to secure the full potential of the blending process.
Positioning Big–Thick Blending within a multimethodology framework
As a method, the blending methodology carries obvious similarities to the idea of ‘mixed methods’. At its core, mixed method (or multimethodology) is about reaching a more comprehensive or ‘true’ view on reality by linking different bits and pieces of data – often generated by very different methods (Brewer and Hunter, 1989). Within this broad definition many distinct approaches exist, varying across methodological choices as to
The previous theoretical and empirical engagement in mixing big and thick data has neglected the establishment of methodological recommendations for how to carry out such mixing. In our search for mixed method approaches with an attentiveness to the unique affordances of big and thick data, we managed to identify only one method, termed ‘ethno-mining’. First suggested by Aipperspach et al. (2006) and later further developed in Anderson et al. (2009), ethno-mining sets out to combine thin big data collected from different sensors (data mining) with ethnographic descriptions of the same settings (ethnography). As in Big–Thick Blending, ethno-mining takes seriously the possibilities of harvesting the complementarity between the heterogeneous data worlds and works to craft hybrids in which ‘traces of each of the ingredients can still be seen’ but the different inputs ‘cannot be separated out’ (Anderson et al., 2009: 125). Ethno-mining also describes the iterative and rapid loops necessary in the process.
However, the two approaches differ in three respects. First, where Big–Thick Blending attempts to integrate analytical results, but never the method or data itself, the ethno-mining approach attempts to engage both the data and the process (Aipperspach et al., 2006). While this might be possible in some special cases with an abundance of available resources, we fear that this focus runs the risk of underestimating the importance of the ‘expertise’ needed to fully master both big and thick data. Second, where the main reason for mixing data in the ethno-mining perspective is ‘exposing the biases inherent in either type of data alone’, a type of triangulation, Big–Thick Blending focuses on the crafting of entirely new analytical results (blended spaces) through, e.g. complementarity, extension and calibration. Third, ethno-mining only vaguely offers a terminology as to how big and thick data should be integrated. The blending terminology presented here thus fills the critical void of suggestions as to how one could approach mixing big and thick data.
The goal of the following two cases is to develop this terminology further as well as show it in action. Here we will demonstrate how blending occurs as (1) a departure from a generic space of interest and data complementarity, (2) different input spaces with findings from, respectively, big and thick data analysis and (3) a selective projection of some of these findings into a blended space with emergent properties. We concentrate on the structural elements of the blending process, somewhat rigidly restricting attention to the input spaces and the blend with emergent properties. There are many other small steps in the iterative progression that are important, but impossible to discuss within the limited scope of this article.
Case 1: Blending big and thick insights from video recordings
This case was developed in close collaboration with a Danish optometrist chain that wanted to improve the in-store experience of their customers when visiting one of their 100 brick-and-mortar shops. Blending big and thick data became especially crucial as we wanted to both quantify and qualify customers’ physical interactional paths and in-store actions to identify crucial points for enhanced customer interactions.
We collected thick data from 11 stores using observations, shadowing, contextual inquiries, interviews, video recordings and mystery shopping (acting like a normal shopper while observing and taking notes). All employees had signed informed consent forms, and customers were informed through visible signs and gave verbal consent. We collected more than 1000 hours of video footage from the stores through mounted and hand-held devices. In a single optometry shop, video cameras were also used to quantify the in-store movement by applying novel video analytics and face recognition, which converted selected recordings into measures of physical in-store movement. The cameras remained in the shop for three months, covering most of the store space during business hours. By tracking movement in the recorded video footage, video analytics transformed the movement into spatial coordinates, which were combined to depict the totality of movement in the store, in effect quantifying the physical customer paths and turning them into routes fit for statistical manipulation. While the technology has been used for several years in security (Regazzoni et al., 2010), warfare (Bowman et al., 2017) and certain retail applications (Battiato et al., 2016; Huang et al., 2017; Musalem et al., 2015), it has only recently been adapted to mid-range camera technology, making it viable for use as more than a niche product. From this project, we show two examples in which blending came to play a key role.
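To make the step from tracked coordinates to a map of store activity concrete, the following is a minimal sketch of one way such spatial path points could be binned into a grid of counts. It is not the video analytics pipeline used in the project; the tuple format `(person_id, x, y)`, the function name `heatmap_counts` and the one-metre cell size are assumptions made purely for illustration.

```python
from collections import Counter


def heatmap_counts(path_points, cell_size=1.0):
    """Count recorded path points per square grid cell.

    path_points: iterable of (person_id, x, y) tuples, with x and y in
    metres (a hypothetical format, not the project's actual data schema).
    Returns a Counter mapping (column, row) -> number of points.
    """
    counts = Counter()
    for _person_id, x, y in path_points:
        cell = (int(x // cell_size), int(y // cell_size))
        counts[cell] += 1
    return counts


# Hypothetical traces for two persons moving through a small floor area.
points = [
    ("p1", 0.2, 0.3), ("p1", 0.8, 0.9), ("p1", 1.4, 0.7),
    ("p2", 1.1, 0.2), ("p2", 1.9, 0.4),
]
grid = heatmap_counts(points, cell_size=1.0)
```

The resulting counts illustrate the thinness discussed above: each cell records that movement occurred, but nothing about who moved or why.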
Example 1a: Identifying the importance of tables and charts
Through ethnographic observations in the shop we identified the tables as important interactional touchpoints and wanted to look more into them.
Input space 1: Analysis of thick interactional data of customer interaction
Using video recordings of the interactions around the tables we did a fine-grained multimodal interaction analysis (Mondada, 2014; Streeck et al., 2011). Figure 4 shows one of our initial (thick) micro-analyses of the interaction between a customer and a sales person.
A detailed ‘Jeffersonian’ transcription (Jefferson, 2004) of the interactions occurring at the interview tables in the store. The transcription reveals how the diagram creates a misalignment between the employee and the customer.
Through the analysis of the thick data we identified the diagram as a focal point at the tables. After proposing what the optician would recommend, the conversation ends by the optician asking the customer what she thinks about ‘that’ (line 62). There is a very long pause of 2.6 seconds before the customer initiates an entirely new topic. The long pause and the unrest in the customer’s embodied actions display how she probably does not understand what ‘that’ is. She demonstrably does not respond to the question put forth by the employee, although she orients through body posture and gaze to the diagram, thereby making it relevant for the interaction. However, the diagram is not used as a helpful resource in situ. Instead, the different symbols on the diagram seem to be part of the misalignment between the employee and the customer, as the customer silently stares at the symbols without making any verbal account. Hence, this example displays problems in the interaction relating to explaining products using the chart.
Input space 2: Analysis of customers’ big data behaviour
While departing from the same overall generic space, the analysis of the video analytics big data was simultaneously able to pinpoint the specific behavioural patterns concerned with the tables. During this analysis, we found that human activity centred around the checkout counter and the interview tables, the latter a rather obscure area of the store with no products (light red areas of Figure 5).
The big data video analytics mappings are shown with a focus on the tables.
Blending findings from big and thick analysis
Figure 6 shows the ingredients of the blending. Input space 1 consists of the multimodal interaction analysis identifying the diagram as a crucial actor in the table activity, while input space 2 consists of the tracked movement paths in the store, identifying the store tables as an interactional hub. This supplied us with empirical grounding for zooming further in on the interactions at the tables, leading to the subsequent identification of similar examples where participants displayed perplexity towards the diagram central to the activity at the tables. From these two input spaces, a blended space of the activity at the tables qualified as ‘relevant’ was thus created.
Blending big patterns of in-store behaviour with thick descriptions of activity surrounding the store tables.
The shift between data modes allowed us to identify and innovate on an overlooked diagram crucial to the customer’s interactional trajectories. However, while our fine-grained analysis exposed the workings of the diagram in social interaction and selling practices, the tables only became a ‘relevant’ area of interest through the work of the big data camera patterns. By using the granularity of data and the blending of big and thick analytical results, a thick result was thus qualified as central to the store flow through the frequency count of customers in the area, separating ‘real’ issues from non-real ones. There were many findings not shown here due to space limits, but as shown, specific findings from the input spaces were selectively projected into the blended space, which then dynamically developed an emergent structure, in this example providing managers with solid ground for doing something about the use of charts.
Example 1b: Identifying the most relevant activity at glass walls
In the second example from the same case, the generic space and analytical aim was to explore how customers interacted with the shop’s glass walls that exhibited diverse product categories such as ‘contact lenses’, ‘trendy male glasses’, and so on. While the company management was aware that wall content and design were important and attracted different customers, they relied solely upon gut feelings to direct the interior design of their 100 shops.
Input space 1: Big data analysis of in-store customer paths
Using video analytics, we produced several compelling heat maps covering store activity. Analysis of the data revealed great differences concerning customer path behaviour and time spent in front of glass walls. But skewness in data, common to most datasets of digital traces, made it difficult to conduct any nuanced comparison of diverse map areas. Additionally, the measured activities in several zones deviated greatly from our expectations, with zones at the periphery of the shop showing up as intense interaction zones, while zones near the entrance were nearly empty (see Figure 7).
Diagram shows video analytics exploring in-store behaviour in a specific optical store before (left) and after (right) blending, thus leading to more relevant and precise numbers.
Input space 2: Thick data analysis
Through analysis of the ethnographic material (video recordings and field notes), we divided the shop into analytically relevant zones. Figure 6 shows some of the different materials that we employed in the subsequent blending process. One of the central findings from the analysis was the way the customers orient to the glass wall by primarily looking at main height where, e.g. signs with glass-category descriptions hung. The analytical process also resulted in many findings about the type of interaction occurring while trying out glasses at the walls, e.g. the focus on how glasses are passed between customers and employees (Due, 2018a, 2018b).
The blending strategies of calibration and contextualization
Very different input spaces generated findings from, respectively, big and thick data analysis. Through the process, ethnographic findings about customer behaviour in the shop became standards that the big data results could be evaluated and negotiated against and vice versa. This process thus resembled the scientific process of ‘calibration’, in which the measurements of an instrument, in the current example the video analytics of the customer paths, are stabilized by alternately comparing the results and adjusting the instrument (Franklin, 1997; cf. Bateson, 1978). From the thick input space specific findings were projected into the blended space: the ethnographic analysis revealed how the extremely high readings for the sunglasses product category, despite its position at the shop’s periphery, were to be expected during the summer months, with sunglasses being the only product category able to attract the attention of passing customers. The ethnographic analysis also pointed out how the corridor to the eye-testing area was heavily trafficked by staff members, and that the consequently high numbers in nearby zones misrepresented customer activity levels unless properly adjusted for when defining/drawing zones in the shop. Through such calibration, based on selectively projected inputs from the different data worlds and types of analysis, blending transformed the untested digital trace into a somewhat reliable measure of behavioural activity, thereby constructing the blended space with emergent properties as a novel result.
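The zone adjustment described above can be illustrated with a small sketch. It assumes, purely for illustration, rectangular zones and path points of the form `(person_id, x, y)`; the function `visitors_in_zone`, the coordinates and the idea of redrawing the zone to leave out a staff corridor are hypothetical stand-ins for the calibration step, not the procedure actually used. Note that the data alone cannot tell staff from customers; the decision on where to redraw the zone comes from the ethnographic input space.

```python
def visitors_in_zone(path_points, zone):
    """Return the set of person ids with at least one point inside a zone.

    zone: (x_min, y_min, x_max, y_max), a hypothetical rectangular zone.
    """
    x_min, y_min, x_max, y_max = zone
    return {
        pid
        for pid, x, y in path_points
        if x_min <= x < x_max and y_min <= y < y_max
    }


# Hypothetical traces: two customers at the wall, two staff in the corridor.
points = [
    ("c1", 1.0, 1.0), ("c2", 1.5, 2.0),
    ("s1", 3.5, 1.0), ("s2", 3.8, 2.5),
]

raw_zone = (0.0, 0.0, 4.0, 3.0)         # zone as first drawn, overlapping the corridor
calibrated_zone = (0.0, 0.0, 3.0, 3.0)  # zone redrawn after ethnographic input

raw_count = len(visitors_in_zone(points, raw_zone))
calibrated_count = len(visitors_in_zone(points, calibrated_zone))
```

In this toy example the redrawn zone halves the reading, which shows how a thick observation can act on the instrument rather than on the data themselves.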
Figure 7 also illustrates the thinness of big data as contextless numbers. We can start by asking whether 1405 persons in front of the male designer glass wall is above or below expectations (see Figure 7, right). The question is rhetorical because the number ‘1405’ without further information is without meaning in itself. Contextualization is strongly needed. From a different input space the ethnographic analysis revealed that the more expensive and trendy glasses (e.g., male designer glasses) often were considered mere attention attractors. As these glasses were beautiful, but priced well above what most customers could afford, the ethnographic analysis uncovered why many of the optometrists accepted low sales numbers from this category. Viewed within this blended space, the low numbers of customers found in the trendy product zone were in fact surprising and thus highly relevant to the manager.
Through the blending processes, the ethnographer’s analysis of the shop’s layout thus re-contextualized the otherwise arbitrary numbers of customers in front of a glass wall. This knowledge was far more than an appealing supplement but rather an indispensable complementary ‘thickness’ that entered into the final visualization alongside the quantitative (see Figure 8, ‘blended space’). Such challenges of big datasets are extremely common when working with thin big data (cf. Porway, 2013; Blok et al., 2017).
Model summarizes the blending process. By blending the data visualization of movement in the different zones of the store (input space 1) with thick analytical findings (input space 2), the blended space of Figure 8 is formulated.
Case 2: Blending insights from big sensor data with thick ethnographic data
To show how the blending process is not only applicable using video analytics, we briefly present a second case. This case concerns an evaluation study of bike signs initiated by the municipality of Copenhagen. To ease cyclists’ navigation in the city the municipality put up hundreds of bike signs showing the direction and travel distance to key places around the city. The municipality wanted to evaluate the effect of the signs and how the cyclists made use of them. Several conventional methods, including ethnographic observation and shadowing of cyclists, interviews and an online survey, were applied to evaluate the usage of the signs. On top of this, the team also used location data from 371 individual cyclists who installed a specially designed app on their smartphone that collects and transmits GPS data on their journeys through the city. Thus, this project originated in a generic space of shared interest and data complementarity: the object of the study, the cyclists’ naturally occurring paths and usage of signs were shared across both data sources but the methods for collecting insights originate from very different methodologies and data worlds.
Input space 1: Analysing big data from GPS trackers
The analysts drew a heat map of the average length (km) of the routes the participants followed (Figure 9). The map revealed how morning and afternoon commuting follow routes of vastly different length. This pattern seemed to persist when zooming in on the paths of some of the individual cyclists, using the granularity of the dataset. Looking at individual GPS-identified paths thus revealed that the two trips often appeared to follow entirely different routes (see e.g. Figure 10).
Average length (km) of bike routes. Morning commuting appears to follow longer routes than in the afternoon.
Visualization of an individual cyclist’s trip through Copenhagen. The visualization reveals that the specific cyclist, like many others, follows different paths to and from work.

While the trackers clearly indicated that biking to work differed greatly from the act of biking from work, the analysts were puzzled by the fact that the busy morning commutes should be the long routes, rather than the afternoon ones. Against this background, the analysts developed an alternative explanation in which the shortness of the afternoon routes was a result of stops along the route. This would split the route into multiple routes that individually were shorter than the morning routes but combined would be longer.
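One way such an explanation could be examined is to merge consecutive trip segments from the same cyclist when the gap between them is short, and then compare average lengths before and after merging. The sketch below is a hypothetical illustration only: the segment format `(cyclist_id, start_min, end_min, km)`, the 30-minute gap threshold, the noon cut-off and all numbers are assumptions, not values or results from the study.

```python
from statistics import mean


def merge_segments(segments, max_gap_min=30):
    """Merge consecutive segments of the same cyclist separated by a short stop.

    segments: list of (cyclist_id, start_min, end_min, km) tuples, with times
    in minutes after midnight (hypothetical format).
    """
    merged = []
    for cyclist, start, end, km in sorted(segments):
        last = merged[-1] if merged else None
        if last and last[0] == cyclist and start - last[2] <= max_gap_min:
            # Short stop: extend the previous trip instead of starting a new one.
            merged[-1] = (cyclist, last[1], end, last[3] + km)
        else:
            merged.append((cyclist, start, end, km))
    return merged


def mean_km(trips, afternoon, noon_min=720):
    """Mean trip length for trips starting before or after noon."""
    lengths = [km for _c, start, _e, km in trips if (start >= noon_min) == afternoon]
    return mean(lengths)


# Hypothetical data: one morning trip, one afternoon trip split by a 15-minute stop.
segments = [
    ("a", 480, 505, 7.0),
    ("a", 960, 975, 3.0),
    ("a", 990, 1010, 5.0),
]
trips = merge_segments(segments)

afternoon_raw = mean_km(segments, afternoon=True)
afternoon_merged = mean_km(trips, afternoon=True)
morning = mean_km(trips, afternoon=False)
```

In this toy data the unmerged afternoon segments look shorter than the morning trip, while the merged afternoon trip is longer, which is the pattern the alternative explanation would predict.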
Input space 2: Analysing thick data
That cyclists followed different routes according to the hour of the day was also found by the ethnographers by following the cyclists as they navigated the city. Through this shadowing and contextual inquiry, analysis showed that many citizens developed multiple routes to and from the same destination. Thick data consisted not just of interviews in survey form, but also contextual inquiries accomplished during the field work, where the ethnographers would ask the cyclists why they chose the paths they cycled. As a 30-year-old local woman explained on her way to work: ‘I always bike the same route to work because it is the fastest. […] I need to cross the river, so I always bike across ‘cykelslangen’ [ed. bridge in Copenhagen]. That is clearly the fastest’ (see Figure 11). While speed is thus the primary factor for this cyclist’s choice of route in the morning, this and encounters with other cyclists revealed how speed is a much less important factor in the afternoon, with cyclists developing secondary routes based upon feelings of safety, shopping possibilities along the way and green surroundings. As the 30-year-old local woman describes her secondary route: ‘If the sun shines, then I like to take some time instead of biking home directly, and then I will go by another way, gaze a bit and listen to music.’
A woman shadowed and interviewed about her route preferences.
The blended space of mixing thick and big descriptions of cyclists led to a more pluralistic understanding of cyclists’ choice of routes, adjusting the understanding of what counts as an effective sign.

Blended space: Emergent results from big and thick data analysis
The different input spaces resulted in different and yet at the same time very complementary findings because of the shared generic space, and these findings were then selectively projected into a blended space with emergent properties. Through the blending of the two input spaces, a deeper understanding of the city’s cyclists emerged. The blending thus revealed that while the increased speed and more direct routes provided by the signs might be useful in most situations, path choices are often more complicated, with multiple factors informing the final choice of the route in the afternoon. To improve the signs would thus not only mean optimizing their ‘effectiveness’, but would also necessitate a consideration of how cyclists more interested in green surroundings and traffic might best be assisted. Figure 12 summarizes the blending.
This case represents an example in which reciprocal effects are produced through the blending. By blending big and thick findings the big data patterns of the bike journeys are enriched with an explanation through the thick observation of many cyclists’ reliance upon multiple routes to and from their home. In this sense, the thick observations work on the big by adding a ‘why’ to the big ‘what’, a relationship that has also been brought forward in prior big data studies (e.g., Kitchin, 2014; Porway, 2013). The relationship is, however, also reciprocal, since the same blending process also extends the thick observations with knowledge on the generalizability of the behaviour of using multiple routes to and from one’s home.
The blending thus exploits the unique granularity of most big datasets (Ruppert et al., 2013), which allows us to re-identify, within the big datasets, selected behavioural traits of the population built from thick descriptions. In this specific case, we identify both thick and big observations of having multiple routes. However, in contrast to conventional quantitative data sources, the extreme granularity of big data allows us to aggregate individual observations into basic statistics for the behaviour without having to disconnect from the individual behaviour, as illustrated in Figure 13. After identifying a specific behaviour within the data, i.e. having multiple bike routes, we thus count the number of people who, according to the data, make use of multiple routes in order to evaluate the extent of the phenomenon.
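The counting step described here can be illustrated with a minimal sketch. It is not the implementation used in the study: the record layout (cyclist identifier, origin, destination, route label), the function names and the toy records are all assumptions made for illustration. The sketch also assumes that route labels are comparable across directions and that the two endpoints of a journey are treated as an unordered pair, so that a different way home counts as a second route.

```python
from collections import defaultdict


def routes_per_cyclist(trips):
    """Group the distinct route labels each cyclist used between each pair of endpoints."""
    grouped = defaultdict(lambda: defaultdict(set))
    for cyclist_id, origin, destination, route_id in trips:
        # Assumption: direction is ignored, so home->work and work->home
        # belong to the same pair of endpoints.
        endpoints = frozenset((origin, destination))
        grouped[cyclist_id][endpoints].add(route_id)
    return grouped


def multi_route_summary(trips):
    """Count cyclists with more than one route between the same endpoints,
    while keeping their identifiers so individual behaviour stays traceable."""
    grouped = routes_per_cyclist(trips)
    multi = sorted(
        cyclist_id
        for cyclist_id, by_endpoints in grouped.items()
        if any(len(route_ids) > 1 for route_ids in by_endpoints.values())
    )
    total = len(grouped)
    share = len(multi) / total if total else 0.0
    return {"cyclists": total, "multi_route_ids": multi, "share": share}


# Invented toy records (hypothetical, not data from the study):
# (cyclist_id, origin, destination, route_id)
trips = [
    ("c1", "home", "work", "bridge"),
    ("c1", "work", "home", "park"),
    ("c2", "home", "work", "bridge"),
    ("c2", "work", "home", "bridge"),
    ("c3", "home", "school", "main_street"),
]

summary = multi_route_summary(trips)
print(summary)
```

Because the summary keeps the identifiers of the multi-route cyclists alongside the aggregate share, the statistic can be traced back to individual journeys, which is the property the paragraph attributes to the granularity of big data.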
Illustration of linking big-and-thick insights through shared behavioural traits.
This strategy is thus not unlike the very common mixed method strategy of exploring the extent of an identified phenomenon by, e.g., following up the observation with a representative survey (Bryman, 2006). However, in contrast to such mixed methods, which commonly end up working on different populations and within different settings, both setting and population remain closely linked to each other in the blending approach, as the point of departure lies within the generic space with similar structural data properties. Through this move, it is possible to reach what has fittingly been described as a quali-quantitative perspective (Latour et al., 2012) with numbers and stories co-appearing within the same blend.
Concluding remarks
For researchers and analysts, the complementary nature of big and thick data suggests moving towards more and deeper integration. While scholars have previously engaged empirically and theoretically with the task of integrating big and thick data worlds, none have attempted to develop a systematic method for this process. An important contribution of our paper is therefore the introduction of methodological specificity, backed by empirical cases, to the much-talked-about but little-practiced process of complementing big and thick insights. Under the concept of blending, we have reported on our own experiments for engaging analytical insights grounded in big and thick data, conceptually linking insights based on highly heterogeneous datasets.
Summing up, the Big–Thick Blending methodology proposed here is about blending analytical insights. The blending rests upon the contribution from two (or more) separate input spaces containing, respectively, thick and big data analytical insights which share some conceptual associations in a generic space. While the methodology can be applied to blend other data types as well, our interest has been to highlight the unique complementary effects that arise when one attempts to blend thin big data with small thick data. The method unfolds by selectively projecting insights from these input spaces, thereby leading to the creation of new blended spaces with the construction of novel results. From the outside, this process bears resemblance to the basic dialectic method of thesis, antithesis, synthesis. However, rather than being based on opposition and conflict (in the Hegelian sense), blending is based on complementarity and extension as the main elements of the creation of analytically interesting cross-space mappings.
Through two cases we have demonstrated how analytical insights built from heterogeneous big-and-thick data sources can qualify and guide each other’s focus through blending processes with the goal of constructing novel results in emergent blended spaces. For simplicity, the presentation of the examples followed a linear, step-by-step progression. While this can be an effective strategy when the objective is to make a specific point, blending processes hardly ever consist of such linear progressions. We fully agree with Elgaard et al.’s (2017) suggestion that mixing should follow the iterative and rapid act of slaloming down a steep hill. The blending processes should also rely on iterative and rapid exchanges between big and thick data insights where the researchers deliberately blend inputs with shared properties.
Common blending strategies and their usage.
As a multimethodology, Big–Thick Blending distinguishes itself from most other approaches because of (1) the attentiveness to data affordances and data complementarity in the generic space, (2) the speed and use of iterations in the method and (3) the respect for divergent analytical competencies represented through divergent types of analysis and spaces.
Attentiveness to data affordances relates not only to the use of data sources that are complementary in their shape (thin-big versus thick); thin big data and thick data are also commonly connected through a shared focus on observing physical behaviour. What is mixed in Big–Thick Blending are analytical results that share a focus on observing physical behaviour, with the implication that both the base of participants and its context are shared across the big and the thick approach. This leads to much more integrated analytical results in which the different contributions cannot easily be separated out, nor exist by themselves (cf. Anderson, 2009). This is described within the methodology as shared structure in the generic space.
Dealing with massive datasets also requires specially developed expertise, in the same way that the practice of ethnography and micro-analysis of video recordings requires prior training. Against this background, we firmly believe that blending processes should seek to honour these differences in expertise, shifting the focus towards analytical outcomes of diverse methods. Blending thus joins the growing chorus of digital-based scholars who suggest that social scientists abandon the historical ideal of the renaissance person, bound to the individual genius scholar who masters all methods and theories needed (e.g., Ford, 2014; King, 2014; Marres, 2013; Venturini et al., 2017).
