Power BI Real-time Streaming
With Power BI real-time streaming, you can stream data and update dashboards in real-time. Any visual or dashboard that you can create in Power BI can also be built to display and update real-time data. When learning how to do this, I found it a bit difficult, so I wanted to write this blog to hopefully make it easier for you.
At a high level, you create real-time visuals by logging into the Power BI Service and choosing “Streaming dataset” from the Create menu, choosing the streaming dataset type (API, Azure Stream, or PubNub), then adding report visuals or tiles that use that streaming dataset to your dashboard. You will then push data into the streaming dataset using one of several methods (Power BI REST APIs, Streaming Dataset UI, Azure Stream Analytics).
Note that you can’t create a streaming dataset in Power BI Desktop, but you can connect to a streaming dataset created in the Power BI Service.
First off, let’s define the three categories of real-time datasets that are designed for display on real-time dashboards. These are not specific options you will find in Power BI; rather, the choices you make when building the streaming dataset will result in the dataset fitting into one of these categories:
- Push dataset – data is pushed into the Power BI service. When the dataset is created, the Power BI service automatically creates a new database in the service to store the data. Since there is an underlying database that continues to store the data as it comes in (up to 5M rows per table), reports can be created with the data. These reports and their visuals are just like any other report visuals, which means you can use all of Power BI’s report building features to create visuals, including custom visuals, data alerts, pinned dashboard tiles, and more. Once a report is created using the push dataset, any of its visuals can be pinned to a dashboard. On that dashboard, visuals update in real-time whenever the data is updated. Within the service, the dashboard triggers a tile refresh every time new data is received. The push dataset is a special case of the streaming dataset in which you enable Historic data analysis in the Streaming data source configuration dialog.
- Streaming dataset – data is also pushed into the Power BI service, with an important difference: Power BI only stores the data in a temporary cache, which quickly expires. The temporary cache is only used to display visuals that have some transient sense of history, such as a line chart with a time window of one hour (there is currently no way to clear data from a streaming dataset, though the data will clear itself after an hour). With a streaming dataset, there is no underlying database and therefore limited history, so you cannot build report visuals using the data that flows in from the stream. As such, you cannot make use of report functionality such as filtering, custom visuals, and other report functions. The only way to visualize a streaming dataset is to edit a dashboard, choose “Add tile”, choose “Custom Streaming Data” under “Real-time Data”, and then choose the streaming dataset. The custom streaming tile that is based on a streaming dataset is optimized for quickly displaying real-time data. There is very little latency between when the data is pushed into the Power BI service and when the visual is updated, since there’s no need for the data to be entered into or read from a database. In practice, streaming datasets and their accompanying streaming visuals are best used when it is critical to minimize the latency between when data is pushed and when it is visualized (they update on change, meaning that if your data changes every second, so will the tiles). In addition, it’s best practice to have the data pushed in a format that can be visualized as-is, without any additional aggregations. Examples of data that’s ready as-is include temperatures and pre-calculated averages. To create a streaming dataset, you disable Historic data analysis in the Streaming data source configuration dialog (but you can always change this afterwards to switch to a push dataset).
- PubNub streaming dataset – the Power BI web client uses the PubNub SDK to read an existing PubNub data stream, and no data is stored by the Power BI service. As with the streaming dataset, there is no underlying database in Power BI, so you cannot build report visuals against the data that flows in, and cannot take advantage of report functionality such as filtering, custom visuals, and so on. As such, the PubNub streaming dataset can also only be visualized by adding a tile to the dashboard and configuring a PubNub data stream as the source. Tiles based on a PubNub streaming dataset are optimized for quickly displaying real-time data. Since Power BI is directly connected to the PubNub data stream, there is very little latency between when the data is pushed into the Power BI service and when the visual is updated. PubNub is a third-party data service (http://pubnub.com).
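The three categories above can be condensed into a small lookup. This is only a summary sketch of the descriptions in this post: the class name, field names, and dictionary keys are made up for illustration and are not part of any Power BI API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RealTimeDataset:
    """Illustrative summary of one category; not a Power BI type."""
    name: str
    storage: str          # where Power BI keeps the incoming data
    report_visuals: bool  # whether report visuals can be built on the data


# Values paraphrase the three bullet points above.
CATEGORIES = {
    "push": RealTimeDataset(
        "Push dataset", "database in the service (Historic data analysis on)", True
    ),
    "streaming": RealTimeDataset(
        "Streaming dataset", "temporary cache that quickly expires", False
    ),
    "pubnub": RealTimeDataset(
        "PubNub streaming dataset", "nothing stored by Power BI", False
    ),
}


def supports_reports(kind):
    """Return True if the category allows report visuals, not just tiles."""
    return CATEGORIES[kind].report_visuals
```

Read this way, the main trade-off is that only the push dataset gives you report features such as filtering, while the other two can only be shown through dashboard tiles.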
To clarify, using report visuals (e.g., a line chart), which requires Historic data analysis to be turned on, gives you added functionality like filtering, but they do not update as fast as tiles (in my testing, pushing data every 1-2 seconds will update 6-8 points on the report visual every 6-8 seconds). Using tiles with Historic data analysis turned on will update immediately (but tiles are limited to only showing the current value). Using tiles with Historic data analysis turned off did not result in a faster update, so it seems you should always turn it on (unless you want to save some storage space).
Tiles will update faster when using streaming datasets or PubNub datasets instead of push datasets. Note that when using a tile, you can choose from five different visualization types: Card, Line chart, Clustered bar chart, Clustered column chart, and Gauge. These “real time tiles” will have a lightning bolt on the upper left of the tile when displayed on a dashboard.
There are three primary ways you can push data into a dataset (notice there is no need for you to create a database to handle the streaming data). Note that each of these options can also be used to create the dataset:
- Using the Power BI REST APIs – Can be used to create and send data to push datasets and to streaming datasets. Once a dataset is created, use the REST APIs to push data using the PostRows API.
- Using the Streaming Dataset UI – In the Power BI Service, choose “Streaming dataset” from the Create menu, choose the streaming dataset type of “API”, then configure the values to be used in the stream. Select Create, and you will be given a Push URL (REST API URL endpoint). Then create an application (e.g., C# in Azure Functions) that uses POST requests to the Push URL to push the data. Another option is to choose the streaming dataset type of PubNub and follow the same instructions.
- Using Azure Stream Analytics – You can add Power BI as an output within Azure Stream Analytics (ASA), which uses the Power BI REST APIs to create its output data stream to Power BI. ASA creates the dataset, which stores 200,000 rows, and after that limit is reached, rows are dropped in a first-in first-out (FIFO) fashion.
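As a rough illustration of the Streaming Dataset UI option, here is a minimal Python sketch that sends rows to a Push URL with a POST request. It rests on several assumptions: the `PUSH_URL` value is a placeholder you replace with the URL you are given after selecting Create, the field names `timestamp` and `temperature` are hypothetical and must match the values you configured for your stream, and the body is sent as a JSON array of row objects (check the sample request shown alongside your Push URL to confirm the exact shape).

```python
import json
import urllib.request
from datetime import datetime, timezone

# Placeholder: paste the Push URL given to you after selecting Create.
PUSH_URL = "PASTE-YOUR-PUSH-URL-HERE"


def make_row(temperature, now=None):
    """Build one row; 'timestamp' and 'temperature' are assumed field names."""
    now = now or datetime.now(timezone.utc)
    return {"timestamp": now.isoformat(), "temperature": temperature}


def build_payload(rows):
    """Serialize a list of row dicts as the JSON array used for the POST body."""
    return json.dumps(rows).encode("utf-8")


def push_rows(push_url, rows):
    """POST the rows to the Push URL and return the HTTP status code."""
    request = urllib.request.Request(
        push_url,
        data=build_payload(rows),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        return response.status
```

With a real Push URL in place, calling `push_rows(PUSH_URL, [make_row(21.5)])` sends a single reading. Calling it in a loop with a short sleep between calls approximates the "push data every 1-2 seconds" testing described above.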
Hope this helps!
More info:
Create a Power BI streaming dataset for real-time dashboards
Real-time streaming in Power BI
Power BI Streaming Data Sets: The Good, the Great, and the Gotchas
Thanks James! Does Microsoft have a time-series analytics strategy? This seems like a big part of it. I’m doing research on this topic and wonder who the best person to talk to about it is.
Thanks for a good explanation. There is obviously a hole when it comes to near real-time capabilities in BI-related Microsoft Azure services, so it is definitely an interesting direction to follow. One of the bigger concerns regarding Push (and Streaming) is the restriction on the max rate of data ingestion, which is 1 request per second for Push and 5 requests per second for Streaming, which makes it hardly usable for big enterprises.