A Flink Library to Support Reasoning on Uncertain Events


Stream processing is concerned with analyzing data as they are created, and many use cases require such on-the-fly analysis. IoT applications in smart cities [1], smart homes [2], and healthcare [3] are just a few examples. In all of these scenarios, data originate from sensors. Because most sensors communicate their readings wirelessly, there is considerable potential for interference. Moreover, sensors might malfunction and start to produce inaccurate readings. All of these are sources of data uncertainty [4,5]. In general, there are two main types of uncertainty: value uncertainty and existential uncertainty [6]. An event has value uncertainty when its value is represented either by a probability density function (PDF) or by discrete samples. An event has existential uncertainty when the sum of the existential probabilities of its possible values is less than 1. Ultimately, such readings are consumed by stream processing pipelines. Without a built-in mechanism to guard against uncertainty or errors in sensor readings, serious consequences can follow. Imagine, for example, a faulty blood pressure sensor that reports a normal pressure for a patient whose actual pressure is not normal.
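To make the two notions of uncertainty concrete, the following is a minimal sketch in plain Python. It is not part of Flink or of the proposed library: the class name UncertainEvent, its fields, and its methods are hypothetical and chosen only for illustration. It assumes the discrete-sample representation, in which each possible value carries an existential probability.

```python
from dataclasses import dataclass
from typing import Dict


@dataclass(frozen=True)
class UncertainEvent:
    """Hypothetical event whose value is described by discrete samples."""

    sensor_id: str
    # Maps each possible value to its existential probability.
    alternatives: Dict[float, float]

    def existence_probability(self) -> float:
        """Sum of the existential probabilities of all possible values."""
        return sum(self.alternatives.values())

    def has_existential_uncertainty(self, tol: float = 1e-9) -> bool:
        """True when the probabilities sum to less than 1."""
        return self.existence_probability() < 1.0 - tol

    def has_value_uncertainty(self) -> bool:
        """True when more than one value is possible."""
        return len(self.alternatives) > 1

    def expected_value(self) -> float:
        """Mean value, conditioned on the event having occurred."""
        total = self.existence_probability()
        if total <= 0.0:
            raise ValueError("event has no probability mass")
        weighted = sum(v * p for v, p in self.alternatives.items())
        return weighted / total


# A blood pressure reading with two candidate values; the remaining
# probability mass (0.25) is the chance that no real event occurred.
reading = UncertainEvent("bp-1", {118.0: 0.5, 122.0: 0.25})

# A reading with a single value that is certain to exist.
certain = UncertainEvent("bp-2", {120.0: 1.0})
```

Here, reading has both value uncertainty and existential uncertainty, while certain has neither. A downstream operator could, for example, discard events whose existence probability falls below an application-specific threshold; that policy is an assumption of this sketch, not something prescribed above.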

Current stream processing systems provide generic computational capabilities that allow developers to apply one or more transformations to the data. One aim of this work is to bring probabilistic management of data streams to a large-scale stream processing system. Another is to build a comprehensive library of uncertainty-handling techniques that stream application developers can use to obtain reliable results.


[1] Zanella, Andrea, et al. "Internet of things for smart cities." IEEE Internet of Things Journal 1.1 (2014): 22-32.

[2] Alaa, Mussab, et al. "A review of smart home applications based on Internet of Things." Journal of Network and Computer Applications 97 (2017): 48-65.

[3] Muhammed, Thaha, et al. "UbeHealth: A Personalized Ubiquitous Cloud and Edge-Enabled Networked Healthcare System for Smart Cities." IEEE Access 6 (2018): 32258-32285.

[4] Mao, Na, and Jie Tan. "Complex Event Processing on uncertain data streams in product manufacturing process." Advanced Mechatronic Systems (ICAMechS), 2015 International Conference on. IEEE, 2015.

[5] Tran, Thanh T., et al. "CLARO: modeling and processing uncertain data streams." The VLDB Journal—The International Journal on Very Large Data Bases 21.5 (2012): 651-676.

[6] Dayarathna, Miyuru, and Srinath Perera. "Recent advancements in event processing." ACM Computing Surveys (CSUR) 51.2 (2018): 33.