Azure Stream Analytics

Setumo Raphela
Aug 1, 2022

· Stream processing refers to the continuous ingestion, transformation, and analysis of data streams generated by applications, IoT devices and sensors, and other sources to derive actionable insights in near-real-time.

· An event is a small packet of information (a datagram) that contains a notification.

· Events can be published individually or in batches, but a single publication (individual or batch) can’t exceed 1 MB.
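To make the size limit concrete, here is a minimal sketch of how a publisher might group events so that no single publication goes over the ceiling. The function name `batch_events` and the constant `MAX_BATCH_BYTES` are hypothetical, the 1,000,000-byte figure is only a stand-in for the 1 MB limit, and a real client library does its own size accounting (including protocol overhead that this sketch ignores).

```python
import json

# Assumed stand-in for the 1 MB publication limit; not an official constant.
MAX_BATCH_BYTES = 1_000_000


def batch_events(events, max_bytes=MAX_BATCH_BYTES):
    """Group JSON-serialisable events into batches whose payload stays within max_bytes."""
    batches, current, current_size = [], [], 0
    for event in events:
        size = len(json.dumps(event).encode("utf-8"))
        if size > max_bytes:
            # A single oversized event can never be published, batched or not.
            raise ValueError("single event exceeds the publication limit")
        if current and current_size + size > max_bytes:
            # Adding this event would overflow the batch, so close it and start a new one.
            batches.append(current)
            current, current_size = [], 0
        current.append(event)
        current_size += size
    if current:
        batches.append(current)
    return batches


if __name__ == "__main__":
    sample = [{"id": i, "payload": "x" * 400_000} for i in range(5)]
    print([len(b) for b in batch_events(sample)])
```

Each sample event is roughly 400 KB, so only two fit in one publication and the fifth event ends up in a batch of its own.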

· Azure Stream Analytics provides you with the ability to ingest, process, and analyze streaming data from Azure Event Hubs (including Azure Event Hubs from Apache Kafka) and Azure IoT Hub.

· Stream Analytics also supports a variety of inputs and outputs, and it can call Azure Machine Learning functions, which makes it a robust tool for analyzing data streams.

· Job-level billing keeps startup costs low (three Streaming Units by default), and jobs can scale up to 192 Streaming Units to provide the performance needed to run even the most demanding jobs effectively.

· Streaming Units (SUs) represent the computing resources allocated to execute a Stream Analytics job.

· Increasing the number of SUs means more CPU and memory resources are allocated to the job.

· Azure Stream Analytics jobs perform all processing in memory to achieve the low latency required for efficient stream processing.

· Windowing functions are operations performed against the data contained within a temporal or time-boxed window.

· By default, windows include the end of the window and exclude the beginning, so a window covers the interval (start, end].

· Tumbling window functions segment a data stream into a contiguous series of fixed-size, non-overlapping time segments and operate against them.
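The tumbling behaviour can be simulated in a few lines of Python. This is only a sketch of the semantics, not how Stream Analytics is implemented: `tumbling_counts` is a hypothetical helper, timestamps are assumed to be whole seconds, and each window is taken to be (end - size, end], keyed by its end time.

```python
from collections import Counter


def tumbling_counts(timestamps, size):
    """Count events per tumbling window of `size` seconds.

    Keys are window end times; each window covers (end - size, end].
    Roughly what GROUP BY TumblingWindow(second, size) with COUNT(*) computes.
    """
    counts = Counter()
    for t in timestamps:
        # Smallest multiple of size that is >= t (integer ceiling division).
        end = -(-t // size) * size
        counts[end] += 1
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    print(tumbling_counts([1, 5, 10, 11, 20, 21], 10))
```

An event stamped exactly 10 lands in the window that ends at 10, not the one that starts there, and every event is counted exactly once.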

· Hopping window functions model scheduled, overlapping windows that jump forward in time by a fixed period.
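A hopping window differs from a tumbling one in that a single event can fall into several windows. The sketch below assumes integer-second timestamps and windows of the form (end - size, end] that end on multiples of the hop. `hopping_windows` and `hopping_counts` are hypothetical names used for illustration.

```python
from collections import Counter


def hopping_windows(t, size, hop):
    """Return the end times of every hopping window (end - size, end] that contains t."""
    first_end = -(-t // hop) * hop  # smallest multiple of hop that is >= t
    # A window contains t while end - size < t, i.e. while end < t + size.
    return list(range(first_end, t + size, hop))


def hopping_counts(timestamps, size, hop):
    """Count events per hopping window.

    Roughly what GROUP BY HoppingWindow(second, size, hop) with COUNT(*) computes.
    """
    counts = Counter()
    for t in timestamps:
        for end in hopping_windows(t, size, hop):
            counts[end] += 1
    return dict(sorted(counts.items()))


if __name__ == "__main__":
    print(hopping_windows(7, 10, 5))
    print(hopping_counts([1, 7, 12], 10, 5))
```

With a 10-second window and a 5-second hop, each event is counted twice. Setting the hop equal to the window size reduces this to a tumbling window.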

· Sliding windows generate events for the points in time when the content of the window actually changes. To limit the number of windows it needs to consider, Azure Stream Analytics outputs events only for those points in time when an event enters or exits the window.
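The "only when something enters or exits" rule is easier to see in a simulation. The following is a simplified model under stated assumptions: integer-second timestamps, a window of (T - size, T] evaluated at time T, and no output for empty windows. `sliding_counts` is a hypothetical helper, not a Stream Analytics API.

```python
def sliding_counts(timestamps, size):
    """Count events in the sliding window at every point where its content changes.

    Roughly what GROUP BY SlidingWindow(second, size) with COUNT(*) computes.
    """
    ts = sorted(timestamps)
    # Content changes when an event enters (at t) or exits (at t + size).
    change_points = sorted(set(ts) | {t + size for t in ts})
    result = {}
    for T in change_points:
        n = sum(1 for t in ts if T - size < t <= T)
        if n:  # empty windows produce no output in this model
            result[T] = n
    return result


if __name__ == "__main__":
    print(sliding_counts([1, 5, 12], 10))
```

For events at seconds 1, 5 and 12 with a 10-second window, output appears at 1, 5, 11, 12 and 15. The exit of the last event at second 22 leaves an empty window, so nothing is emitted there.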

· Session window functions cluster together events that arrive at similar times, filtering out periods of time where there is no data. A session window has three primary parameters: timeout, maximum duration, and an optional partitioning key.
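How timeout and maximum duration interact can be sketched as follows. This is a deliberately simplified model: the real service has more precise rules for when a session closes at the maximum duration. `sessionize` is a hypothetical helper, and the partitioning key is left out (it would amount to running the same logic once per key).

```python
def sessionize(timestamps, timeout, max_duration):
    """Group timestamps into sessions.

    A new session starts when the gap since the previous event exceeds `timeout`
    or when the session would grow longer than `max_duration`.
    Loosely modelled on SessionWindow(second, timeout, max_duration).
    """
    sessions = []
    for t in sorted(timestamps):
        if sessions:
            current = sessions[-1]
            within_timeout = t - current[-1] <= timeout
            within_duration = t - current[0] <= max_duration
            if within_timeout and within_duration:
                current.append(t)
                continue
        sessions.append([t])  # first event, or the previous session has closed
    return sessions


if __name__ == "__main__":
    print(sessionize([1, 3, 4, 20, 22, 50], timeout=5, max_duration=10))
    print(sessionize([0, 4, 8, 12, 16], timeout=5, max_duration=10))
```

The first call splits on quiet gaps. The second shows the maximum duration cutting a steady stream of events into separate sessions even though no gap exceeds the timeout.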

· Snapshot windows group events that have identical timestamp values. Unlike the other windowing types, no specific window function (such as SessionWindow()) is required: you can employ a snapshot window by adding System.Timestamp() to your query’s GROUP BY clause.
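In plain Python terms, a snapshot window is just a group-by on the timestamp itself. The sketch below assumes events arrive as (timestamp, value) pairs; `snapshot_groups` is a hypothetical name.

```python
from collections import defaultdict


def snapshot_groups(events):
    """Group (timestamp, value) pairs that share exactly the same timestamp.

    Roughly what GROUP BY System.Timestamp() does in a query.
    """
    groups = defaultdict(list)
    for ts, value in events:
        groups[ts].append(value)
    return dict(sorted(groups.items()))


if __name__ == "__main__":
    print(snapshot_groups([(1, "a"), (1, "b"), (2, "c")]))
```

Events stamped even slightly apart end up in different groups, which is why this window type suits sources that emit several readings under one shared timestamp.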

· There are two approaches to processing data streams: live and on-demand.

· Stream Analytics guarantees exactly-once event processing and at-least-once event delivery, so events are never lost.
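At-least-once delivery means a downstream system may occasionally see the same output record more than once. One common way to cope, sketched here as an assumption rather than anything Stream Analytics prescribes, is to make the sink idempotent by writing with a stable key. `IdempotentSink` and the key format are hypothetical.

```python
class IdempotentSink:
    """Toy in-memory sink that upserts by key, so redelivered records do not duplicate."""

    def __init__(self):
        self.rows = {}

    def write(self, key, value):
        # Writing the same key again overwrites the row instead of adding a new one.
        self.rows[key] = value


if __name__ == "__main__":
    sink = IdempotentSink()
    deliveries = [("device1@10", 3), ("device2@10", 5), ("device1@10", 3)]  # one redelivery
    for key, value in deliveries:
        sink.write(key, value)
    print(len(sink.rows))
```

Three deliveries arrive but only two rows are stored, because the repeated record maps onto the key it was first written under.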
