25:
353:
involved databases. But, since the data of a data streams is unknown in advance, there are no such statistics in a DSMS. However, it is possible to observe a data stream for a certain time to obtain some statistics. Using these statistics, the query can also be optimized later. So, in contrast to a DBMS, some DSMS allows to optimize the query even during runtime. Therefore, a DSMS needs some plan migration strategies to replace a running query plan with a new one.
379:
of the graph is connected to the outgoing sinks, which may be for example a visualization. Since most DSMSs are data-driven, a query is executed by pushing the incoming data elements from the source through the query plan to the sink. Each time when a data element passes an operator, the operator performs its specific operation on the data element and forwards the result to all successive operators.
217:
the continuous calculation of an average. Instead of memorizing each data point, the synopsis only holds the sum and the number of items. The average can be calculated by dividing the sum by the number. However, it should be mentioned that synopses cannot reflect the data accurately. Thus, a processing that is based on synopses may produce inaccurate results.
231:
windows. There are, for example, approaches that use timestamps or time intervals for system-wide windows or buffer-based windows for each single processing step. Sliding-window query processing is also suitable to being implemented in parallel processors by exploiting parallelism between different windows and/or within each window extent.
361:
Since a logical operator is only responsible for the semantics of an operation but does not consist of any algorithms, the logical query plan must be transformed into an executable counterpart. This is called a physical query plan. The distinction between a logical and a physical operator plan allows
352:
Furthermore, there are also cost-based optimization techniques like in DBMS, where a query plan with the lowest costs is chosen from different equivalent query plans. One example is to choose the order of two successive join operators. In DBMS this decision is mostly done by certain statistics of the
225:
Instead of using synopses to compress the characteristics of the whole data streams, window techniques only look on a portion of the data. This approach is motivated by the idea that only the most recent data are relevant. Therefore, a window continuously cuts out a part of the data stream, e.g. the
216:
The idea behind compression techniques is to maintain only a synopsis of the data, but not all (raw) data points of the data stream. The algorithms range from selecting random data points called sampling to summarization using histograms, wavelets or sketching. One simple example of a compression is
378:
Since the physical query plan consists of executable algorithms, it can be directly executed. For this, the physical query plan is installed into the system. The bottom of the graph (of the query plan) is connected to the incoming sources, which can be everything like connectors to sensors. The top
230:
lists or tumbling windows that cut out disjoint parts. Furthermore, the windows can also be differentiated into element-based windows, e.g., to consider the last ten elements, or time-based windows, e.g., to consider the last ten seconds of data. There are also different approaches to implementing
114:
One important feature of a DSMS is the possibility to handle potentially infinite and rapidly changing data streams by offering flexible processing at the same time, although there are only limited resources such as main memory. The following table provides various principles of DSMS and compares
101:
that is not only performed once, but is permanently installed. Therefore, the query is continuously executed until it is explicitly uninstalled. Since most DSMS are data-driven, a continuous query produces new results as long as new data arrive at the system. This basic concept is similar to
328:
In the next step, the declarative query is translated into a logical query plan. A query plan is a directed graph where the nodes are operators and the edges describe the processing flow. Each operator in the query plan encapsulates the semantic of a specific operation, such as filtering or
207:
One of the biggest challenges for a DSMS is to handle potentially infinite data streams using a fixed amount of memory and no random access to the data. There are different approaches to limit the amount of data in one pass, which can be divided into two classes. For the one hand, there are
349:, a query optimizer can use the algebraic equivalences to optimize the plan. These may be, for example, to push selection operators down to the sources, because they are not so computationally intensive like join operators.
243:
processing in DBMS by using declarative languages to express queries, which are translated into a plan of operators. These plans can be optimized and executed. A query processing often consists of the following steps.
226:
last ten data stream elements, and only considers these elements during the processing. There are different kinds of such windows like sliding windows that are similar to
362:
more than one implementation for the same logical operator. The join, for example, is logically the same, although it can be implemented by different algorithms like a
256:
in DBMS. Since there are no standardized query languages to express continuous queries, there are a lot of languages and variations. However, most of them are based on
333:, so that there are operators for selection, projection, join, and set operations. This operator concept allows the very flexible and versatile processing of a DSMS.
341:
The logical query plan can be optimized, which strongly depends on the streaming model. The basic concepts for optimizing continuous queries are equal to those from
97:. A DBMS also offers a flexible query processing so that the information needed can be expressed using queries. However, in contrast to a DBMS, a DSMS executes a
275:
The language strongly depends on the processing model. For example, if windows are used for the processing, the definition of a window has to be expressed. In
325:
This stream continuously calculates the average value of "price" of the last 10 tuples, but only considers those tuples whose prices are greater than 100.0.
208:
compression techniques that try to summarize the data and for the other hand there are window techniques that try to portion the data into (finite) parts.
451:
508:
370:. Notice, these algorithms also strongly depend on the used stream and processing model. Finally, the query is available as a physical query plan.
438:
471:
698:
679:
476:
272:. There are also graphical approaches where each processing step is a box and the processing flow is expressed by arrows between the boxes.
601:
46:
639:
68:
720:
227:
329:
aggregation. In DSMSs that process relational data streams, the operators are equal or similar to the operators of the
424:
39:
33:
90:
239:
Since there are a lot of prototypes, there is no standardized architecture. However, most DSMS are based on the
50:
741:
498:
261:
103:
503:
345:. If there are relational data streams and the logical query plan is based on relational operators from the
269:
640:"Chandrasekaran, S. et al, "TelegraphCQ: Continuous Dataflow Processing for an Uncertain World." CIDR 2003"
572:
530:"Parallel Patterns for Window-Based Stateful Operators on Data Streams: An Algorithmic Skeleton Approach"
240:
446:
421:
397:
435:
577:
456:
549:
346:
330:
736:
694:
675:
646:
541:
363:
442:
367:
342:
714:
593:
730:
486:
553:
481:
413:
NiagaraST: A Research Data Stream
Management System at Portland State University
402:
86:
545:
417:
715:
Processing Flows of
Information: From Data Stream to Complex Event Processing
461:
279:, a query with a sliding window for the last 10 elements looks like follows:
276:
265:
388:
252:
The formulation of queries is mostly done using declarative languages like
94:
623:
594:"NiagaraCQ: A Scalable Continuous Query System for Internet Databases"
407:
717:- Survey article on Data Stream and Complex Event Processing Systems
466:
412:
93:(DBMS), which is, however, designed for static data in conventional
529:
257:
253:
18:
430:
392:
592:
Jianjun Chen; David J. DeWitt; Feng Tian; Yuan Wang (2000).
85:(DSMS) is a computer software system to manage continuous
528:
De
Matteis, Tiziano; Mencagli, Gabriele (25 March 2016).
723:- Introduction to streaming data management with SQL
106:
so that both technologies are partially coalescing.
626:
STREAM: The
Stanford Data Stream Management System.
427:-based framework for Data Stream Management Systems
196:Variable data arrival and data characteristics
534:International Journal of Parallel Programming
8:
153:(Theoretically) unlimited secondary storage
628:Technical Report. 2004, Stanford InfoLab.
576:
69:Learn how and when to remove this message
509:Relational data stream management system
164:Consideration of the order of the input
117:
32:This article includes a list of general
569:Aurora: A Data Stream Management System
520:
172:Potentially extremely high update rate
693:. Waterloo, USA: Morgan and Claypool.
689:Golab, Lukasz; Ă–zsu, M. Tamer (2010).
124:Data stream management system (DSMS)
7:
672:Data Streams: Models and Algorithms
161:Only the current state is relevant
121:Database management system (DBMS)
38:it lacks sufficient corresponding
14:
248:Formulation of continuous queries
188:Assumes outdated/inaccurate data
599:. Computer Sciences Department.
23:
602:University of Wisconsin–Madison
203:Processing and streaming models
177:Little or no time requirements
1:
83:data stream management system
129:Persistent data (relations)
670:Aggarwal, Charu C. (2007).
457:SAS Event Stream Processing
193:Plannable query processing
169:Relatively low update rate
758:
721:Stream processing with SQL
447:webMethods Business Events
115:them to traditional DBMS.
91:database management system
546:10.1007/s10766-016-0413-x
357:Transformation of queries
262:Continuous Query Language
499:Complex Event Processing
441:24 December 2016 at the
393:StreamBase Systems, Inc.
281:
104:Complex event processing
16:Computer software system
504:Event stream processing
337:Optimization of queries
180:Real-time requirements
53:more precise citations.
691:Data Stream Management
674:. New York: Springer.
132:Volatile data streams
487:WSO2 Stream Processor
89:. It is similar to a
408:NIAGARA Query Engine
398:Hortonworks DataFlow
374:Execution of queries
156:Limited main memory
110:Functional principle
567:Abadi; et al.
185:Assumes exact data
148:Continuous queries
652:on 7 February 2014
624:Arasu, A., et al.
347:Relational algebra
331:Relational algebra
140:Sequential access
700:978-1-608-45272-9
681:978-0-387-47534-9
200:
199:
145:One-time queries
79:
78:
71:
749:
704:
685:
662:
661:
659:
657:
651:
645:. Archived from
644:
636:
630:
621:
615:
614:
612:
610:
598:
589:
583:
582:
580:
564:
558:
557:
525:
364:Nested loop join
343:database systems
321:
318:
315:
312:
309:
306:
303:
300:
297:
294:
291:
288:
285:
235:Query processing
118:
99:continuous query
74:
67:
63:
60:
54:
49:this article by
40:inline citations
27:
26:
19:
757:
756:
752:
751:
750:
748:
747:
746:
742:Data management
727:
726:
711:
701:
688:
682:
669:
666:
665:
655:
653:
649:
642:
638:
637:
633:
622:
618:
608:
606:
596:
591:
590:
586:
571:. SIGMOD 2003.
566:
565:
561:
527:
526:
522:
517:
495:
443:Wayback Machine
385:
376:
368:Sort-merge join
359:
339:
323:
322:
319:
316:
313:
310:
307:
304:
301:
298:
295:
292:
289:
286:
283:
250:
237:
223:
214:
205:
112:
75:
64:
58:
55:
45:Please help to
44:
28:
24:
17:
12:
11:
5:
755:
753:
745:
744:
739:
729:
728:
725:
724:
718:
710:
709:External links
707:
706:
705:
699:
686:
680:
664:
663:
631:
616:
584:
578:10.1.1.67.8671
559:
540:(2): 382–401.
519:
518:
516:
513:
512:
511:
506:
501:
494:
491:
490:
489:
484:
479:
474:
469:
464:
459:
454:
449:
433:
428:
415:
410:
405:
400:
395:
384:
381:
375:
372:
358:
355:
338:
335:
282:
260:, such as the
249:
246:
236:
233:
222:
219:
213:
210:
204:
201:
198:
197:
194:
190:
189:
186:
182:
181:
178:
174:
173:
170:
166:
165:
162:
158:
157:
154:
150:
149:
146:
142:
141:
138:
137:Random access
134:
133:
130:
126:
125:
122:
111:
108:
77:
76:
31:
29:
22:
15:
13:
10:
9:
6:
4:
3:
2:
754:
743:
740:
738:
735:
734:
732:
722:
719:
716:
713:
712:
708:
702:
696:
692:
687:
683:
677:
673:
668:
667:
648:
641:
635:
632:
629:
627:
620:
617:
604:
603:
595:
588:
585:
579:
574:
570:
563:
560:
555:
551:
547:
543:
539:
535:
531:
524:
521:
514:
510:
507:
505:
502:
500:
497:
496:
492:
488:
485:
483:
480:
478:
477:StreamInsight
475:
473:
470:
468:
465:
463:
460:
458:
455:
453:
450:
448:
444:
440:
437:
434:
432:
429:
426:
423:
419:
416:
414:
411:
409:
406:
404:
401:
399:
396:
394:
390:
387:
386:
382:
380:
373:
371:
369:
365:
356:
354:
350:
348:
344:
336:
334:
332:
326:
302:examplestream
280:
278:
273:
271:
267:
263:
259:
255:
247:
245:
242:
234:
232:
229:
220:
218:
211:
209:
202:
195:
192:
191:
187:
184:
183:
179:
176:
175:
171:
168:
167:
163:
160:
159:
155:
152:
151:
147:
144:
143:
139:
136:
135:
131:
128:
127:
123:
120:
119:
116:
109:
107:
105:
100:
96:
92:
88:
84:
73:
70:
62:
59:November 2018
52:
48:
42:
41:
35:
30:
21:
20:
690:
671:
654:. Retrieved
647:the original
634:
625:
619:
607:. Retrieved
600:
587:
568:
562:
537:
533:
523:
377:
360:
351:
340:
327:
324:
274:
251:
238:
224:
215:
206:
113:
98:
87:data streams
82:
80:
65:
56:
37:
609:21 November
482:TelegraphCQ
472:StreamGlobe
431:Pipeline DB
422:open source
403:IBM Streams
51:introducing
731:Categories
515:References
34:references
656:26 August
573:CiteSeerX
462:SQLstream
277:StreamSQL
266:StreamSQL
95:databases
737:Big data
605:. SIGMOD
493:See also
439:Archived
418:Odysseus
383:Examples
212:Synopses
452:QStream
264:(CQL),
221:Windows
47:improve
697:
678:
575:
554:255600
552:
467:STREAM
389:AURORA
284:SELECT
36:, but
650:(PDF)
643:(PDF)
597:(PDF)
550:S2CID
436:PIPES
420:, an
366:or a
308:value
305:WHERE
293:price
241:query
695:ISBN
676:ISBN
658:2011
611:2018
425:Java
311:>
299:FROM
268:and
228:FIFO
542:doi
314:100
287:AVG
270:ESP
258:SQL
254:SQL
733::
548:.
538:45
536:.
532:.
445:,
391:,
81:A
703:.
684:.
660:.
613:.
581:.
556:.
544::
320:0
317:.
296:)
290:(
72:)
66:(
61:)
57:(
43:.
Text is available under the Creative Commons Attribution-ShareAlike License. Additional terms may apply.