Sensor network–based applications, in which data streams gathered from distributed sensors must be processed and analyzed with timely responses, have gained increasing attention. Distributed complex event processing is an effective technology for handling these data streams by matching incoming events against persistent pattern queries. A well-managed parallel processing scheme is therefore required to improve both the performance and the quality-of-service guarantees of the system. However, the specific properties of pattern operators make parallel processing difficult to implement. To address this issue, a new parallelization model and three parallel processing strategies are proposed for distributed complex event processing systems. The new parallelization model accounts for the effects of temporal constraints, for example, sliding windows, so that the processing load induced by each input event for the overlap between windows of a batch is shared by the downstream machines without omitting events that may result in wrong decisions. The proposed parallel strategies keep the complex event processing system working stably and continuously during the elapsed time. Finally, experiments on the StreamBase system demonstrate the applicability of our work regardless of increases in the input rate of the stream or the time window size of the operator.
Wireless sensor networks (WSNs), composed of geographically distributed autonomous sensor nodes, cooperatively monitor the physical environment. WSNs have been gaining importance in a variety of applications, including health-care monitoring, fire detection, environmental monitoring, habitat monitoring, financial services, military surveillance, vehicle tracking systems, and earthquake observation. Distributed sensors in these applications often generate massive data in the form of streams, where processing, analyzing, managing, and detecting patterns from the data streams are unavoidably complex.1 Therefore, extracting valuable information from the data streams with timely responses plays a very important role. There are two main technologies for processing and analyzing the data streams: data stream processing (DSP)2–7 and complex event processing (CEP).8–15
CEP is a method of processing and analyzing the data streams of information by making use of patterns over sequential primitive events for detecting and reporting composite events.16 Because of advantages such as an expressive rule language and an efficient event detection model, CEP systems have been applied extensively in both academic circles and industry in recent times.17–23 In CEP systems, event streams are processed in or near real time for a variety of applications ranging from wireless sensor networks to financial tickers.24–27 In those fields of application, continuous and highly available event stream processing with high throughput is critical to deal with real-world events.
Subsequently, in DSP systems, many types of parallel strategies have been developed to handle massive amounts of distributed data streams.28–38 Because of the differences between DSP and CEP systems,16,26 most of these strategies, which focus exclusively on aggregate queries or binary equi-joins in DSP systems, cannot be simply and directly used in CEP systems, which focus on multi-relational non-equi-joins in the time dimension, possibly with temporal ordering constraints, such as sequences (SEQ) and conjunctions (AND). For example, Figure 1 illustrates a concept hierarchy of medical diagnosis and treatment work and eight pattern queries for monitoring patients' conditions within 24 h. In this example, let us assume that the patients and equipment are radio-frequency identification (RFID)-tagged, and the information is gathered and collected using the tags. When the information from the RFID tags is received by the RFID readers, the system makes decisions by matching the data streams against the pre-determined patterns. The structure of the query language and the definitions of the operations in the queries are explained in section "Preliminaries." Some of the detailed descriptions, including the time constraints of the queries, are omitted from Figure 1 for simplicity. Such complex pattern queries involve nests of sequences (SEQ) and conjunctions (AND), which can have negative event type(s), and combinations of them.11,13,15,39 However, the volume and input rates of the data can become large, as in other event stream processing settings, especially for big data applications.40,41 Increasing the size of the time window of an operator or the input rate of a stream may cause bottlenecks, which gives rise to query results of poor quality, losing the quality-of-service (QoS) guarantees of the system.
Example of concept hierarchy and pattern queries for medical diagnosis and treatment work.
To address these issues, we propose a new parallelization model and three parallel processing strategies that can be used to keep the CEP system running stably and continuously. Specifically, the CEP system based on the proposed parallelization model splits the input stream into parallel sub-streams to execute a continuous pattern query over event streams using a scalable model. The three parallel processing strategies keep the CEP system working stably and continuously during the elapsed time under the increased time window size of the operator and input rate of the stream. Our work is validated through experiments on the StreamBase21 system.
The contributions of this study are as follows:
We propose a novel parallelization model that takes imposed temporal constraints, for example, sliding windows, into account, so that the processing load induced by each input event for the overlap between windows of a batch is shared by the downstream machines.
Applying the proposed parallelization model, we design three parallel processing strategies for the CEP system, that is, Round-Robin (RR), Join-the-Shortest-Queue (JSQ), and Least-Loaded-Server-First (LLF), which keep the CEP system running stably and continuously.
Using the StreamBase system,21 we perform empirical studies to validate our work, with increased input rates and larger time windows.
The rest of this article is organized as follows: The related work is discussed in section “Related work.” The background for this study is briefly introduced in section “Preliminaries.” A new parallelization model and three parallel processing strategies are proposed for distributed CEP systems in section “The proposed methods.” The model is validated against the experiments on the StreamBase system and the results are presented in section “Experimental results.” Finally, the conclusions and future work are presented in section “Conclusion and future work.”
Related work
CEP plays a very important role in detecting and integrating events by pattern matching in situation-aware applications from financial trading to health care.8–15 Most of the conventional CEP systems are centralized, running on a single machine. With the need for high performance, distributed CEP systems are developed by distributing the detection logic over a network of operators, where individual operators can be a bottleneck, requiring operator parallelization.42 As far as we know, only a few recent works focus on parallelization of pattern-matching processing in which pattern matching is processed as a stateful operator in a general-purpose streaming system. Hirzel30 exploits the partitioning constructs provided by the queries, such as PARTITION BY. However, this approach is sufficient only when queries contain such constructs. Wu et al.31 proposed a parallelization framework for stateful stream processing operators that split events using a round-robin methodology to replicate different versions of an operator with a shared state. Despite being an effective parallelization framework, their framework is not feasible for pattern matching. Schneider et al.36 implemented intra-operator parallelism through data-partitioning; they introduced a compiler and a run time system that can automatically extract data parallelism from streaming applications. Brenna et al.38 proposed a novel approach to non-deterministic finite automata (NFA)-based distributed event processing where the NFA is decomposed into separate states running on different machines. In contrast to other related works, they also focus on running multiple queries in parallel. Balkesen et al.43 proposed a run-based intra-query parallelism (RIP) technique for scalable pattern matching in event streams. RIP distributes input events that belong to individual run instances of a pattern’s finite state machine (FSM) to different processing units.
In this study, the focus is the parallelization of a pattern operator previously described in the literature.10–12,15 This approach is both orthogonal and complementary to the previous approaches. Our approach can be used even to parallelize a pattern operator with CEP queries that contain no "PARTITION BY" clauses. In addition, our parallelization approach takes imposed temporal constraints, for example, sliding windows, into account, where the overlap between the windows of each batch influences the processing load induced by the following events. With our approach, composite events generated by matching incoming events from different streams are produced on different machines, where one input stream is replicated and sent to the downstream machines and the other input stream is split and sent to the downstream machines. Therefore, the processing load is shared by the downstream machines, while the approach avoids omitting the detection of events that may result in incorrect decisions.
Preliminaries
In this section, we briefly introduce the basic event model, nested pattern query language, and pattern operators based on related studies.10–12,15,39
Event model
An event, which is an atomic instance, is an occurrence of interest at a point in time. Basically, events can be classified into primitive events and composite events. A primitive event instance is a pre-defined single occurrence of interest that cannot be split further into smaller events. A composite event instance occurs over an interval and is composed of primitive events.
Definition 1
A primitive event is typically modeled multi-dimensionally and is denoted as e = (E, ts, te, a_1, …, a_n), where, for simplicity, we use a subscript attached to a primitive event e to denote its timestamp, E is the event type that describes the essential features of e, ts is the start timestamp of e, te is the end timestamp of e, and a_1, …, a_n are the other attributes of e; the number of attributes n gives the dimensions of interest.
Definition 2
Based on Definition 1, a composite event is denoted as ce = (e_1, e_2, …, e_m), that is, a collection of primitive events occurring over an interval.
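The two definitions above can be sketched as plain data types. This is a minimal illustrative sketch, not the paper's implementation; the class and field names are our own, and the composite event's interval is derived from its parts as Definitions 1 and 2 describe:

```python
from dataclasses import dataclass
from typing import List, Tuple

@dataclass
class PrimitiveEvent:
    """e = (E, ts, te, a_1, ..., a_n): type, start/end timestamps, other attributes."""
    etype: str          # E, the event type
    ts: int             # start timestamp of e
    te: int             # end timestamp of e
    attrs: Tuple = ()   # remaining dimensions of interest

@dataclass
class CompositeEvent:
    """A composite event occurs over the interval spanned by its primitive parts."""
    parts: List[PrimitiveEvent]

    @property
    def ts(self) -> int:
        # the interval starts at the earliest start timestamp of any part
        return min(p.ts for p in self.parts)

    @property
    def te(self) -> int:
        # and ends at the latest end timestamp of any part
        return max(p.te for p in self.parts)
```

For example, composing a primitive event at time 1 with one spanning times 3–4 yields a composite event over the interval [1, 4].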
Nested pattern query language
The following nested complex event query language is used to specify nested pattern queries:
PATTERN (Event Expression: composite event expressed by the nesting of SEQ and AND, which can have negative event type(s), and their combination operators);
WHERE (Qualification: value constraint);
WITHIN (Window: time constraint).
The composite event expression in the PATTERN clause specifies the nested pattern queries, which support nests of SEQ and AND that can have negative event type(s), and their combination operators, as explained above. Sub-expressions denote inner parts of a pattern query expression. The value constraint in the WHERE clause defines the context for the composite events by imposing predicates on event attributes. The time constraint in the WITHIN clause describes the time window during the time difference between the first and the last event instances, which is matched by a pattern query that falls within the window.
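As an illustration, a nested query over the medical scenario of Figure 1 might be written as follows. This is a hypothetical example: the event types, the "!" marker for a negative event type, and the predicate are illustrative assumptions, not queries taken from the paper:

```
PATTERN SEQ(HighFever f, AND(LowOxygen o, !Checkup c))
WHERE   f.patient_id = o.patient_id
WITHIN  24 h
```

The query would match a high-fever reading followed, within 24 h and for the same patient, by a low-oxygen reading with no intervening checkup event.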
Pattern operators
We define the operators that appear in the PATTERN clause of a query. More details can be found in the literature.39 In the following, E_i denotes an event type and e_i denotes an instance of E_i.
Definition 3
An SEQ operator10,15 specifies a particular order, according to the start timestamp, in which the events must occur to match the pattern and thus form a composite event:

SEQ(E_1, E_2, …, E_n) = {(e_1, e_2, …, e_n) | e_1.ts < e_2.ts < … < e_n.ts}.
Definition 4
An AND operator15 takes n event types as input; the events must occur within a specified time window, but without a specified time order:

AND(E_1, E_2, …, E_n) = {(e_1, e_2, …, e_n) | the instances e_1, …, e_n all occur within the time window, in any order}.
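The SEQ semantics above can be sketched with a naive scan over one window. This is a toy illustration under our own assumptions (events reduced to (type, start-timestamp) pairs, first-match semantics); a real CEP engine would use an automaton-based evaluation rather than this linear scan:

```python
def seq_match(window, types):
    """Naive SEQ: find one event per expected type, ordered by start timestamp.

    window : list of (etype, ts) pairs already restricted to one time window.
    types  : the event types E_1, ..., E_n the SEQ pattern expects, in order.
    Returns the matched (etype, ts) pairs, or None if the pattern fails.
    """
    matched, i = [], 0
    for etype, ts in sorted(window, key=lambda e: e[1]):  # order by start timestamp
        if i < len(types) and etype == types[i]:
            matched.append((etype, ts))
            i += 1
    return matched if i == len(types) else None
```

On the window {B@3, A@1, C@5}, SEQ(A, B) matches because A starts before B, while SEQ(B, A) fails, reflecting the ordering constraint in Definition 3.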
The proposed methods
In this section, a new parallelization model and three parallel processing strategies are proposed for distributed CEP systems. The proposed strategies enable scalable execution of a continuous pattern query over event streams by splitting the input stream into parallel sub-streams.
Parallelization model
We propose a new parallelization model for pattern operators. An example scenario of an application with a pattern operator using our proposed framework for parallelizing the pattern operator is shown in Figure 2. In this example, we assume that two input streams, S1 and S2, are sent to the CEP system. Because of the specific property of a pattern operator, as described in section "Preliminaries," we cannot split both of the inputs S1 and S2 at the same time: doing so may miss some events, resulting in an incorrect decision. Once an event of S2 arrives, the compute function of the pattern operator is initiated. Then, the pattern operator creates a new window for every input tuple of S2. The input stream S1 is split into parallel sub-streams with input rate λ1, while the replicate of input S2 is sent to the back-end operators with input rate λ2. The assembly facilitates the parallelization of pattern operators.
An example scenario of (a) a pattern operator and (b) its parallelized implementation.
Split-(process*)-merge
The assembly shown in Figure 2(b) replaces the pattern operator in the application data flow shown in Figure 2(a). In the parallelized version of the application data flow, S1 is split to the back-end process operators, and the output of the pattern operator in Figure 2(a) is replaced by the output coming from the merge operator, as shown in Figure 2(b).
Split. The split operator splits an input stream into parallel sub-streams. The split operator outputs the incoming events to a number of back-end pattern operators by an event splitting policy that will be explained in section “Parallel processing strategies.”
Process. The process operator processes the events received from the front-end split operator. Multiple process operators with the same function execute in parallel.
Merge. The merge operator consumes the output events from the process operators to generate the final output events. The merge operator by default simply forwards the output events to the output port.
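The split-(process*)-merge assembly can be sketched as a toy pipeline. This is a minimal sketch under our own assumptions: `process` merely stands in for the replicated pattern-matching operator, and the stages run sequentially here rather than on separate machines:

```python
import itertools

def split(events, n):
    """Split one input stream into n parallel sub-streams (round-robin here;
    any event splitting policy could be plugged in instead)."""
    subs = [[] for _ in range(n)]
    for k, e in enumerate(events):
        subs[k % n].append(e)
    return subs

def process(sub):
    """Stand-in for the replicated pattern operator run by each back-end server."""
    return [e.upper() for e in sub]

def merge(outputs):
    """By default simply forward the output events of every process operator."""
    return list(itertools.chain.from_iterable(outputs))

# one input stream flows through the assembly with two parallel process operators
result = merge(process(s) for s in split(["a", "b", "c", "d"], 2))
```

The merge stage here just concatenates the sub-results, mirroring the default forwarding behavior described above.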
Intra-operator parallelization
The analysis in section "Split-(process*)-merge" showed that this method works efficiently with two input streams. If the operator in Figure 2(a) has multiple input streams, rather than just S1 and S2, intra-operator parallelization is applied in advance. Using this approach, each operator with more than two input streams (as shown in Figure 3) is separated into multiple assemblies, as shown in Figure 4.
A pattern operator with multiple input streams.
The use of the proposed framework for parallelizing the pattern operator with multiple input streams.
Parallel processing strategies
Round-Robin
Round-Robin (RR) means that an incoming event is sent to a downstream server with equal probability. This policy equalizes the expected number of events at each server.
Algorithm 1 is the implementation of the RR strategy, where count is used to count the number of arrived events, e represents the event, i denotes the server label, N is the total number of servers, and b denotes the expected number of events, which can be pre-defined by the users. As shown in Algorithm 1, after receiving an event e, the algorithm determines whether the expected number of events has arrived at the system (Lines 1–3). If so, the next b expected events are sent to the updated downstream server (Lines 4–10). Otherwise, it continues to send arrived events to the current server and counts the arrived events (Lines 12–13).
Algorithm 1. An algorithm for realizing the RR strategy.
Require: Initializing count = 0, i = 1, and the event e; Setting the expected number of events, b; Obtaining the number of all servers, N;
Ensure:
 1: for each arriving event e do
 2:   receive the event e;
 3:   if count == b then
 4:     count ← 0;
 5:     if i == N then
 6:       i ← 1, and send the event to downstream server S_i;
 7:     else
 8:       i ← i + 1;
 9:       send the event to downstream server S_i;
10:     end if
11:   else
12:     send the event to downstream server S_i;
13:     count ← count + 1;
14:   end if
15: end for
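The batched RR splitting policy of Algorithm 1 can be sketched in Python as follows. This is an illustrative sketch, not the paper's StreamBase implementation: the function name and the `batch` parameter (the expected number of events b) are our own, and the function records server labels instead of actually forwarding events:

```python
def round_robin_split(events, n_servers, batch):
    """Send each consecutive batch of `batch` events to the next server in turn.

    Returns, for illustration, the server label assigned to each event; a real
    split operator would forward the event to that downstream server instead.
    """
    assignments, server, count = [], 0, 0
    for _ in events:
        assignments.append(server)
        count += 1
        if count == batch:                      # expected number of events arrived
            count = 0
            server = (server + 1) % n_servers   # move to the next downstream server
    return assignments
```

With two servers and a batch size of 2, six events are dispatched as [0, 0, 1, 1, 0, 0], cycling through the servers batch by batch.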
For simplicity and without loss of generality, we use an example of the flow chart of parallel processing (as shown in Figure 5) in terms of two inputs S1 and S2 with input rates λ1 and λ2, respectively, based on the parallelization model in Figure 2(b).
The flow chart of the RR parallel processing strategy.
The variable count is used to count the number of arrived events of S1, which is split so as to equalize the expected number of events at downstream App1 and App2, while S2 is replicated and sent to downstream App1 and App2. Note that App1 and App2 have the same process functionality. Subsequently, the output is generated by merging the results from the upstream servers.
Join-the-Shortest-Queue
Join-the-Shortest-Queue (JSQ) means that the expected number of events is assigned to the downstream server with the shortest queue length for further processing.
Algorithm 2 implements the JSQ strategy, where count is used to count the arrived events, i denotes the label of the server, N represents the total number of servers, e represents the event, Q_i denotes the queue length of the downstream server i, app is the preferred server to be used for processing the next expected batch of events, and b denotes the expected number of events, which can be pre-defined by the users. Algorithm 2 consists of two functions, ProcessEvents and UpdateServer, where ProcessEvents is responsible for processing the arrived events and UpdateServer is responsible for selecting the downstream server with the shortest queue length. In the ProcessEvents function, after receiving an event e, the algorithm judges whether the expected number of events has arrived (Lines 1–3). If so, it calls the UpdateServer function to obtain the preferred server app (Lines 4–5). Then, the next b expected events are sent to the preferred downstream server (Line 6). Otherwise, the algorithm continues to send arrived events to the current server i and adds to the count of arrived events (Lines 8–9). In the UpdateServer function, for the N servers, the algorithm obtains the queue length of every downstream server, storing each value of Q_j (j = 1, …, N; Lines 1–3). Next, it compares the queue lengths of all downstream servers and obtains the minimum value of Q_j (Line 4). The value of app is then updated to the label of the server with the minimum Q_j (Line 5).
Algorithm 2. An algorithm for realizing the JSQ strategy.
Require: Initializing count = 0, i = 1, Q_j, and app; Setting the expected number of events, b; Obtaining the number of all servers, N;
Ensure:
 1: ProcessEvents(e, b);
 2: UpdateServer(Q, N);
ProcessEvents(e, b)
 1: for each arriving event e do
 2:   receive the event e;
 3:   if count == b then
 4:     call UpdateServer(N);
 5:     count ← 0, i ← app;
 6:     send the event to the downstream server S_i with the shortest queue length;
 7:   else
 8:     send the event to downstream server S_i;
 9:     count ← count + 1;
10:   end if
11: end for
UpdateServer(N)
 1: for (j = 1; j ≤ N; j++) do
 2:   store the value of Q_j;
 3: end for
 4: obtain the minimum of Q_1, …, Q_N;
 5: update the value of app;
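The JSQ policy of Algorithm 2 can be sketched in Python as follows. This is an illustrative sketch under our own assumptions: the queue lengths are modeled as a mutable list that grows as events are assigned, whereas in the real system they would be monitored on the downstream servers:

```python
def jsq_split(events, queue_lengths, batch):
    """Assign each batch of events to the server whose queue is currently shortest.

    queue_lengths is mutated in place to model events joining the queues.
    Returns the server label assigned to each event, for illustration.
    """
    def shortest():
        # the UpdateServer step: pick the server with the minimum queue length
        return min(range(len(queue_lengths)), key=queue_lengths.__getitem__)

    assignments, count = [], 0
    current = shortest()
    for _ in events:
        assignments.append(current)      # the ProcessEvents step: send to current
        queue_lengths[current] += 1
        count += 1
        if count == batch:               # expected number of events arrived
            count = 0
            current = shortest()         # re-select the shortest queue
    return assignments
```

Starting from queue lengths [1, 0] with a batch size of 2, the first batch goes to the initially shorter server 1, after which server 0 becomes the shorter queue and receives the next batch.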
For simplicity and without loss of generality, the flow chart for parallel processing shown in Figure 6 in terms of two inputs S1 and S2 with input rates λ1 and λ2, respectively, based on the parallelization model in Figure 2(b) is used as an example. The variable count is used to count the events of S1. Unlike the RR strategy, the queue lengths of downstream App1 and App2 are monitored in real time. The queue lengths of downstream App1 and App2, Q1 and Q2, are stored, and the label of the server with the shortest queue length (app = 1 or 2) is obtained and assigned to app. The UpdateServer process is used to update app for the next expected number of events, and the value of app is assigned to i to enable the selection of the preferred downstream server to process the expected number of events. Meanwhile, S2 is replicated and sent to downstream App1 and App2, which have the same process functionality. After that, the output is generated by merging the upstream servers' results.
Flow chart of the JSQ parallel processing strategy.
Least-Loaded-Server-First
The Least-Loaded-Server-First (LLF) strategy dynamically assigns the expected number of events to the downstream server with the smallest load. Here, the least-loaded server is the server with the least used memory.
The LLF strategy is illustrated in Algorithm 3, in which count is used to count the arrived events, i denotes the label of the server, N is the total number of servers, e is the event, M_i is the used memory of the downstream server i, app represents the preferred server to be used for processing the next expected number of events, and b is the expected number of events, which can be pre-defined by the users. As shown in Algorithm 3, the LLF strategy consists of two functions, ProcessEvents and UpdateServer, where ProcessEvents processes the arrived events and UpdateServer selects the downstream server with the least used memory. Specifically, in the ProcessEvents function, after receiving an event e, the algorithm judges whether the expected number of events has arrived at the system (Lines 1–3). If so, it calls the UpdateServer function to obtain the preferred server app (Lines 4–5). The next b expected events are then sent to the preferred downstream server (Line 6). Otherwise, arrived events continue to be sent to the current server i while the arrived events are counted (Lines 8–9). The UpdateServer function obtains the used memory of all the downstream servers and stores each value of M_j (j = 1, …, N; Lines 1–3). The used memory values of all downstream servers are compared, and the minimum value of M_j (Line 4) indicates the next preferred server. Therefore, the value of app is updated to indicate the server with the minimum M_j (Line 5).
Algorithm 3. An algorithm for realizing the LLF strategy.
Require: Initialize count = 0, i = 1, M_j, and app; Set the expected number of events, b; Obtain the number of all servers, N;
Ensure:
 1: ProcessEvents(e, b);
 2: UpdateServer(M, N);
ProcessEvents(e, b)
 1: for each arriving event e do
 2:   receive the event e;
 3:   if count == b then
 4:     call UpdateServer(N);
 5:     count ← 0, i ← app;
 6:     send the event to the downstream server S_i with the least used memory;
 7:   else
 8:     send the event to downstream server S_i;
 9:     count ← count + 1;
10:   end if
11: end for
UpdateServer(N)
 1: for (j = 1; j ≤ N; j++) do
 2:   store the value of M_j;
 3: end for
 4: obtain the minimum of M_1, …, M_N;
 5: update the value of app;
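The LLF policy of Algorithm 3 can be sketched in the same style as the previous strategies. This is an illustrative sketch: the `probe` callable, which stands in for the real-time memory monitoring of the downstream servers, and the function name are our own assumptions:

```python
def llf_split(events, probe, n_servers, batch):
    """Assign each batch to the server that `probe` reports as least loaded.

    probe(j) returns the currently used memory of server j; re-sampling it at
    every batch boundary mirrors the real-time monitoring of the servers.
    Returns the server label assigned to each event, for illustration.
    """
    assignments, count = [], 0
    current = min(range(n_servers), key=probe)   # UpdateServer: least used memory
    for _ in events:
        assignments.append(current)              # ProcessEvents: send to current
        count += 1
        if count == batch:                       # expected number of events arrived
            count = 0
            current = min(range(n_servers), key=probe)  # re-sample the loads
    return assignments
```

With static memory readings [5, 2, 9], every batch is routed to server 1, the least-loaded server; in practice the readings change over time and the preferred server changes with them.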
For simplicity and without loss of generality, the flow chart of parallel processing (as shown in Figure 7) in terms of two inputs S1 and S2 with input rates λ1 and λ2, respectively, based on the parallelization model in Figure 2(b) is used as an example. The events of S1 are counted by updating count. Unlike the JSQ strategy, the used memory of downstream App1 and App2 is monitored in real time. The values of the used memory of downstream App1 and App2, that is, M1 and M2, are stored, while the label of the server with the least used memory (app = 1 or 2) is obtained and assigned to the app variable. The app variable is updated using UpdateServer for the next expected number of events, and the value of app is assigned to i to enable the preferred downstream server to be selected for processing the expected number of events. Meanwhile, S2 is replicated and sent to downstream App1 and App2, which have the same process functionality. The output is generated by merging the upstream servers' results.
Flow chart of the LLF parallel processing strategy.
Experimental results
Experimental setup
Based on the parallelization model in Figure 2, we implemented experiments on the StreamBase21 system for a nested pattern query.
To validate our proposal, we compared the three proposed strategies, RR, JSQ, and LLF, with the base method. In the base method, the operator relies on an NFA to represent the structure of an event sequence.12 That is, the base method expresses a pattern as a non-deterministic finite automaton and manages and matches the events by making use of stacks. The strategies RR, JSQ, and LLF are mutually independent. The relative performances of RR, JSQ, and LLF are determined in varying experimental environments. The experiments are run on machines with AMD Opteron(tm) 6376 processors and 4.00 GB of main memory. The streams used in the experiments are synthetically generated, where for each stream we set three attributes, namely, the event type, timestamp, and event id. To implement the base method, we used three machines: one machine to create the input data, one to process the input data, and the third to receive data and output throughput. In the parallel experiments, for simplicity and without loss of generality, we used four machines for the RR, JSQ, and LLF strategies: one machine to create input data, two machines to process the input data in parallel, and the fourth machine to receive data and output throughput. We define the throughput as the number of events processed per second.
Comparison of the methods under an increased input rate environment
When the time window size is 1 s and the input rate is 400 tuples/s, the data in Figure 8(a) show that the RR, JSQ, and LLF strategies result in higher throughput than the base method. The data in Figure 8(b) show that LLF results in the highest throughput because it assigns events to the downstream server with the least load for further processing. Furthermore, as we increase the input rate, the LLF strategy runs stably and continuously; however, the increased input rate caused bottlenecks when the base method was implemented, as shown by the green stars in Figure 9.
Comparison of the throughput of each method under an increased input rate environment.
Comparison of the LLF strategy with the base method under an increased input rate environment.
Comparison of the methods under an increased time window size environment
The data in Figure 10(a) and (b) show that the LLF strategy outperformed the JSQ strategy, the RR strategy, and the base method, even with experimental settings of 16 s for the time window size and 100 tuples/s for the input rate. In addition, with increased time window size, the LLF strategy ran stably and continuously; however, the increased time window size resulted in bottlenecks in the base method, shown as blue circles in Figure 11.
Comparison of the throughput of each method under an increased time window size environment.
Comparison of the LLF strategy with the base method under an increased time window size environment.
However, whether the input rate of the streams increased to 400 tuples/s or the time window size of the operators increased to 16 s, the throughput of the JSQ and LLF strategies was higher than that of the RR strategy, as shown in Figures 8 and 10. This is because the JSQ and LLF strategies achieve better load balance than the RR strategy.
Conclusion and future work
In the beginning of this article, we identified the general problems of parallel processing with respect to pattern operators. We proposed a new parallelization model for a pattern operator and three parallel processing strategies, RR, JSQ, and LLF, for distributed CEP systems. The three parallel processing strategies were designed to achieve stable and continuous event stream processing. The model was validated through experiments on the StreamBase system.
For future work, we propose the design of an adaptation mechanism that fully leverages the proposed parallel processing strategies to realize distributed CEP, deciding the degree of parallelization and adopting real streaming data in the experiments. Furthermore, as discussed by Flouris et al.,44 distributed monitoring over probabilistic event streams is also an open and interesting issue that should be explored further. Conventional CEP systems inevitably involve various types of intrinsic uncertainties, such as imprecision, fuzziness, and incompleteness, caused by the sensor environment. Mathematical tools such as D-numbers and evidence theory are widely used to handle the uncertainties and the data fusion in these systems.45–48 Accordingly, a fuzzy approach to distributed CEP systems is a promising direction for future research. In the future, we also intend to consider the hardware implementation and energy requirements of distributed CEP systems.
Footnotes
Academic Editor: James Brusey
Declaration of conflicting interests
The author(s) declared no potential conflicts of interest with respect to the research, authorship, and/or publication of this article.
Funding
The author(s) disclosed receipt of the following financial support for the research, authorship, and/or publication of this article: This research is supported by the Fundamental Research Funds for the Central Universities (Grant Nos XDJK2015C107, SWU115008, XDJK2016C043, and SWU115091), the Education Teaching Reform Program of Higher Education (No. 2015JY030), the 1000-Plan of Chongqing by Southwest University (No. SWU116007), the National Natural Science Foundation of China (Nos. 61672435, 61702427, and 61702426), the Project Funded by the Priority Academic Program Development of Jiangsu Higher Education Institutions (PAPD), and the Jiangsu Collaborative Innovation Center on Atmospheric Environment and Equipment Technology (CICAEET).
References
1.
BhargaviRVaidehiV. Semantic intrusion detection with multisensor data fusion using complex event processing. Sadhana2013; 38(2): 169–185.
2.
AbadiDJAhmadYBalazinskaMet al. The design of the Borealis stream processing engine. In: Proceedings of the second biennial conference on innovative data systems research, Asilomar, CA, 4–7 January 2005, pp.277–289, www.cidrdb.org.
3.
AbadiDJCarneyDÇetintemelUet al. Aurora: a new model and architecture for data stream management. VLDB J2003; 12(2): 120–139.
4.
BalazinskaMBalakrishnanHMaddenSRet al. Fault-tolerance in the Borealis distributed stream processing system. ACM T Database Syst2008; 33(1): 3.
5.
ChandrasekaranSCooperODeshpandeAet al. TelegraphCQ: continuous dataflow processing for an uncertain world. In: Proceedings of the first biennial conference on innovative data systems research (CIDR), Asilomar, CA, 5–8 January 2003, www.cidrdb.org.
6.
HwangJ-HXingYÇetintemelUet al. A cooperative, selfconfiguring high-availability solution for stream processing. In: Proceedings of the 23rd international conference on data engineering (ICDE 2007), Istanbul, 11–15 April 2007, pp.176–185. New York: IEEE.
7.
ShahMAHellersteinJMBrewerE. Highly available, fault-tolerant, parallel dataflows. In: Proceedings of the ACM SIGMOD international conference on management of data (SIGMOD 2004), Paris, 13–18 June 2004, pp.827–838. New York: ACM.
8.
DiaoYStahlbergPAndersonG. SASE: complex event processing over streams. In: Proceedings of the 3rd biennial conference on innovative data systems research (CIDR 2007), Asilomar, CA, 7–10 January 2007, pp.407–411, www.cidrdb.org
9.
DemersAJGehrkeJPandaBet al. Cayuga: a general purpose event monitoring system. In: Proceedings of the 3rd biennial conference on innovative data systems research (CIDR 2007), Asilomar, CA, 7–10 January 2007, pp.412–422, www.cidrdb.org
10.
LiuMRundensteinerEGreenfieldKet al. E-cube: multi-dimensional event sequence analysis using hierarchical pattern query sharing. In: Proceedings of the ACM SIGMOD international conference on management of data (SIGMOD 2011), Athens, 12–16 June 2011, pp.889–900. New York: ACM.
11.
MeiYMaddenS. Zstream: a cost-based query processor for adaptively detecting composite events. In: Proceedings of the ACM SIGMOD international conference on management of data (SIGMOD 2009), Providence, RI, 29 June–2 July 2009, pp.193–206. New York: ACM.
12.
WuEDiaoYRizviS. High-performance complex event processing over streams. In: Proceedings of the ACM SIGMOD international conference on management of data (SIGMOD 2006), Chicago, IL, 27–29 June 2006, pp.407–418. New York: ACM.
AgrawalJDiaoYGyllstromDet al. Efficient pattern matching over event streams. In: Proceedings of the ACM SIGMOD international conference on management of data (SIGMOD 2008), Vancouver, BC, Canada, 9–12 June 2008, pp.147–160. New York: ACM.
15.
LiuMRundensteinerEDoughertyDet al. High-performance nested CEP query processing over event streams. In: Proceedings of the 27th international conference on data engineering (ICDE 2011), Hannover, 11–16 April 2011, pp.123–134. New York: IEEE.
16. Cugola G, Margara A. Processing flows of information: from data stream to complex event processing. ACM Comput Surv 2012; 44(3): 15.
Terroso-Sáenz F, Valdés-Vela M, Sotomayor-Martínez C, et al. A cooperative approach to traffic congestion detection with complex event processing and VANET. IEEE Intell Transp 2012; 13(2): 914–929.
25. Gu Y, Yu G, Li C. Deadline-aware complex event processing models over distributed monitoring streams. Math Comput Model 2012; 55(3): 901–917.
Hirzel M. Partition and compose: parallel complex event processing. In: Proceedings of the 6th ACM international conference on distributed event-based systems (DEBS 2012), Berlin, 16–20 July 2012, pp.191–200. New York: ACM.
31. Wu S, Kumar V, Wu K-L, et al. Parallelizing stateful operators in a distributed stream processing system: how, should you and how much? In: Proceedings of the 6th ACM international conference on distributed event-based systems (DEBS 2012), Berlin, 16–20 July 2012, pp.278–289. New York: ACM.
32. Johnson T, Muthukrishnan MS, Shkapenyuk V, et al. Query-aware partitioning for monitoring massive network data streams. In: Proceedings of the 24th international conference on data engineering (ICDE 2008), Cancún, Mexico, 7–12 April 2008, pp.1135–1146. New York: ACM.
33. Liu B, Rundensteiner EA. Revisiting pipelined parallelism in multi-join query processing. In: Proceedings of the 31st international conference on very large data bases (VLDB 2005), Trondheim, 30 August–2 September 2005, pp.829–840. New York: VLDB Endowment.
34. Chaiken R, Jenkins B, Larson P-Å, et al. Scope: easy and efficient parallel processing of massive data sets. Proc VLDB Endow 2008; 1(2): 1265–1276.
35. Upadhyaya P, Kwon Y, Balazinska M. A latency and fault-tolerance optimizer for online parallel query plans. In: Proceedings of the ACM SIGMOD international conference on management of data (SIGMOD 2011), Athens, 12–16 June 2011, pp.241–252. New York: ACM.
36. Schneider S, Hirzel M, Gedik B, et al. Auto-parallelizing stateful distributed streaming applications. In: Proceedings of the 21st international conference on parallel architectures and compilation techniques (PACT 2012), Minneapolis, MN, 19–23 September 2012, pp.53–64. New York: ACM.
37. Safaei AA, Haghjoo MS. Dispatching stream operators in parallel execution of continuous queries. J Supercomput 2012; 61(3): 619–641.
38. Brenna L, Gehrke J, Hong M, et al. Distributed event stream processing with non-deterministic finite automata. In: Proceedings of the third ACM international conference on distributed event-based systems (DEBS 2009), Nashville, TN, 6–9 July 2009, p.3. New York: ACM.
39. Xiao F, Aritsugi M. Nested pattern queries processing optimization over multi-dimensional event streams. In: 37th annual IEEE computer software and applications conference (COMPSAC 2013), Kyoto, Japan, 22–26 July 2013, pp.74–83. New York: IEEE.
40. Carney D, Çetintemel U, Cherniack M, et al. Monitoring streams: a new class of data management applications. In: Proceedings of the 28th international conference on very large data bases (VLDB 2002), Hong Kong, China, 20–23 August 2002, pp.215–226. Burlington, MA: VLDB Endowment.
41. Xiao F, Kitasuka T, Aritsugi M. Economical and fault-tolerant load balancing in distributed stream processing systems. IEICE T Inf Syst 2012; 95(4): 1062–1073.
42. Mayer R, Koldehofe B, Rothermel K. Predictable low-latency event detection with parallel complex event processing. IEEE Internet Things 2015; 2(4): 274–286.
43. Balkesen C, Dindar N, Wetter M, et al. Rip: run-based intraquery parallelism for scalable complex event processing. In: Proceedings of the 7th ACM international conference on distributed event-based systems, Arlington, TX, 29 June–3 July 2013, pp.3–14. New York: ACM.
44. Flouris I, Giatrakos N, Deligiannakis A, et al. Issues in complex event processing: status and prospects in the big data era. J Syst Software 2017; 127: 217–236.
45. Zhou X, Deng X, Deng Y, et al. Dependence assessment in human reliability analysis based on D numbers and AHP. Nucl Eng Des 2017; 313: 243–252.
46. Zhou X, Shi Y, Deng X, et al. D-DEMATEL: a new method to identify critical success factors in emergency management. Safety Sci 2017; 91: 93–104.
47. Mo H, Deng Y. A new aggregating operator in linguistic decision making based on D numbers. Int J Uncertain Fuzz 2016; 24(6): 831–846.
48. Wang J, Xiao F, Deng X, et al. Weighted evidence combination based on distance of evidence and entropy function. Int J Distrib Sens N. Epub ahead of print 25 July 2016. DOI: 10.1177/155014773218784.