
Addressing The Streaming Video Data Challenges

There is no doubt that data is the lifeblood of streaming video. Although having the right content is critical to attracting and keeping subscribers, data provides the insight into what that content library should be. In the streaming video workflow, dozens of components throw off data that provides insight into everything from viewer behavior to revenue opportunities to performance. Imagine the components of the streaming workflow as islands, all connected by bridges (APIs) suspended over a roaring river of data. And it’s important to remember that the data relevant to streaming video isn’t just from within the workflow. That river is fed by countless tributaries, including sources such as Google Ad Manager and content delivery networks. The fact that this is a river of data, and not just a trickle, emphasizes how challenging it is for streaming operators to make sense of the information within it, to take action against what might amount to billions of data points. It’s like trying to identify two fish from the same clutch just by looking at the water. And yet that’s exactly what many streaming operators try to do in real-time: make sense of all the connections within the massive river of data.

The Impact Of Issues With Data

Of course, handling a large volume of data is only one challenge. There are others (as detailed below), but ultimately, any of these challenges result in one thing: slowing down the ability to take action. These challenges represent blockers to using the data in real-time to make critical business decisions. If there’s too much data coming at one time (rather than a sample of the data, for example), it can take too long to process and display in a visualization tool. Even if the visualization tool is connected to unlimited computational resources, it still takes time to process the data. And, of course, unlimited resources are often not available, so processing massive amounts of data can add significant time. This, and other kinds of delays in handling the data from the streaming workflow, keeps operators from putting that data to use, which undermines the value of the data in the first place. Consider this example: understanding five minutes after an outage where the outage happened doesn’t mitigate customer discontent. But the outage didn’t just suddenly happen. There was probably data which hinted at the impending problem, yet it was lost in the river. Only when the outage occurred or was noticed (minutes after processing completed), and the aftermath was evident (such as a sudden spike in customer emails), did the problem become impossible to miss.

Understanding the Challenges of Streaming Video Data

As was already pointed out, the volume of data is only one of the challenges facing streaming operators with respect to putting data to use. There are several other challenges which can have just as much of a negative impact as having too much data:

  • Delivery time. How fast does the data need to get where it’s going? Many streaming operators employ software in their player to capture information about the viewer experience. But what if that data arrives two, three, or even 10 minutes after an issue is detected? Of course, the issue is not in the player. It’s most likely upstream. But having visibility into the viewer experience provides an indicator of other problems in the workflow. So the data needs to be delivered as quickly as it’s needed. And those time constraints are not one-size-fits-all: different data needs to be delivered at different speeds.

  • Post-processing. Countless hours are spent processing data once it has been received. That post-processing may be automated, such as through programming attached to a datalake or a visualization dashboard, or it may be manual. However it’s carried out, it takes time. But that post-processing must happen to turn the data into usable information. For example, it doesn’t help the ad team to hand them raw numbers on time spent watching a particular ad. What helps is telling them whether a particular ad, across all views, has hit a certain threshold of viewing percentage (which is probably a contractual number; see the sketch after this list). In other words, post-processing makes data usable. But when it takes too much time, the value of the data can diminish.

  • Standardization. Streaming video can be a unique monster when it comes to data sets. Lots of providers collect similar (if not identical) data but may represent it differently. When this happens, that data must be sanitized and scrubbed (post-processed) to ensure that it can be compared with similar values from other providers and used as part of larger roll-ups, such as KPIs. Content delivery network logs are a great example. Without any standardized approach to variable representation, streaming operators are forced to come up with their own lingua franca, which then has to be maintained and enforced with each new provider.
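
To make the post-processing example concrete, here is a minimal sketch in Python (with hypothetical field names and an assumed 75% contractual threshold) of rolling raw watch-time events up into the number the ad team actually needs:

```python
from collections import defaultdict

# Hypothetical raw events: one record per ad view.
# "watched_s" is seconds watched; "duration_s" is the ad's full length.
events = [
    {"ad_id": "ad-123", "watched_s": 28.5, "duration_s": 30.0},
    {"ad_id": "ad-123", "watched_s": 12.0, "duration_s": 30.0},
    {"ad_id": "ad-456", "watched_s": 15.0, "duration_s": 15.0},
]

CONTRACT_THRESHOLD = 0.75  # assumed contractual viewing percentage

def completion_rates(events):
    """Share of views per ad that reached the contractual threshold."""
    totals = defaultdict(int)
    qualified = defaultdict(int)
    for e in events:
        totals[e["ad_id"]] += 1
        if e["watched_s"] / e["duration_s"] >= CONTRACT_THRESHOLD:
            qualified[e["ad_id"]] += 1
    return {ad: qualified[ad] / totals[ad] for ad in totals}

print(completion_rates(events))
# {'ad-123': 0.5, 'ad-456': 1.0}
```

The raw events alone tell the ad team nothing; the roll-up is what they can act on, and producing it takes time.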

Yes, data is critical to the success of streaming platforms. But actually using that data in a meaningful way is fraught with challenges: volume, delivery, processing, and standardization. So just as important as identifying and gathering the right sources of data is having a strategy to deal with these challenges. With the right data and the right strategy, streaming operators can ensure that their viewers always have the best experience, because the operator has access to the right amount of data, optimized, transformed, and delivered right where, and when, it’s needed.


In the next blog post of this series, we’ll take a look at data volume in more detail and the ways it might be mitigated. Getting the right data to the right people is critical for streaming platform success. But that’s sometimes easier said than done.


Solving The Challenges of SSAI Measurement

Many streaming operators, especially those offering live streaming and FAST services, are moving to server-side ad insertion (SSAI). There are a multitude of reasons for this. First, SSAI offers more of a broadcast-like ad experience, as ads are “stitched” into the stream in real-time. Second, because ad insertion happens prior to the player, there is less impact on the client from fetching and integrating ads (which can require additional code). And finally, perhaps most important, ads served through SSAI are not addressable by client-side ad blockers. Despite these pros, there is a significant con to SSAI: measurement accuracy. Thankfully, there is a solution to filling the gaps in SSAI ad measurement.

Understanding the Measurement Issues with SSAI

The core measurement issue revolves around how SSAI is integrated into the ad tech stack. Ads delivered in this way require an Ad Decisioning Server, whose job is to supply the ad for insertion. Ads are supplied to an Ad Insertion Server which transcodes the ad to match the video’s bitrate ladder, packages the ad, produces a manifest for the ad, and then stitches the ad’s manifest to the video manifest.
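
To illustrate what that stitching step produces, here is a heavily simplified Python sketch, assuming HLS media playlists where the ad’s segments are spliced into the content playlist at an ad break, separated by discontinuity tags (real insertion servers handle far more: encryption, timing signals, and many other tags):

```python
def stitch_ad(content_lines, ad_lines, break_index):
    """Splice an ad's HLS segment lines into a content media playlist.

    content_lines: lines of the content playlist (without #EXT-X-ENDLIST)
    ad_lines: the ad's #EXTINF/segment lines at a matching bitrate
    break_index: index in content_lines where the ad break occurs
    """
    stitched = content_lines[:break_index]
    stitched.append("#EXT-X-DISCONTINUITY")  # signal codec/timestamp reset
    stitched.extend(ad_lines)                # the transcoded ad's segments
    stitched.append("#EXT-X-DISCONTINUITY")  # return to the content stream
    stitched.extend(content_lines[break_index:])
    return stitched

content = ["#EXTINF:6.0,", "content_001.ts", "#EXTINF:6.0,", "content_002.ts"]
ad = ["#EXTINF:6.0,", "ad_001.ts", "#EXTINF:6.0,", "ad_002.ts"]
print("\n".join(stitch_ad(content, ad, break_index=2)))
```

Note what is missing here: nothing in this server-side operation can confirm that the spliced segments were ever fetched and rendered by a real viewer.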

The issue is the data on which the operator has to rely to verify a view or impression. Data from the Ad Decisioning Server can only tell them that an ad was requested and sent to the Ad Insertion Server. The Ad Insertion Server can only tell them that the ad was received (or not) and stitched into the manifest (or not). But there is nothing which indicates whether the ad actually played for the viewer.

The solution to this actually lies in the client!

Solving the Server Issues With a Player Solution

At the heart of understanding the ad experience, and, more importantly, determining ad playback and if it counts towards a contractual view or impression, is the client. Datazoom can already collect a variety of client-side data elements to help in ad tracking. But we have recently announced client-side data collection specifically for SSAI.

https://www.youtube.com/watch?v=JEyh5iklFmg

As you can see in the demo video, there are a number of ways this new capability can improve ad data for operators employing SSAI:

  • Visibility into ad delivery. See exactly where and when ads are being delivered, making it easier to optimize ad delivery and troubleshoot issues. Applicable client-side data points: device_type, user_agent, OS, custom_metadata.

  • Measuring viewability. Data about which ads were actually viewed can only be gathered at the client. When collected with Datazoom, this data can be married with the data from the server (using an ad ID) in a single dashboard (such as a Looker Block) to provide true measurement of ad views. Applicable client-side data points: player_viewable, player_viewable_percent.

  • Accurate ad impression tracking. Instead of relying on tracking from the server, ad impressions and other key metrics can be tracked with parity to CSAI. Applicable client-side data points: ad_impression, ad_id (coupled with enrichment).

  • End-to-end measurement. Client-side tracking removes limitations on measuring the entire ad delivery process, from ad insertion to delivery on the client, making it easier to understand the full user experience. Applicable client-side data point: ad_session_id, encompassing the entire event stream.

With Datazoom’s new beta release, our customers can capture verified impressions for video ads embedded or inserted within a live or on-demand video stream, using our SDKs and libraries.
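
Conceptually, the client-side piece looks something like the sketch below. This is a hypothetical payload, not Datazoom’s actual SDK API: a listener fires on ad playback and emits an event carrying the data points above, keyed by the same ad ID the server side already knows:

```python
import json
import time

def on_ad_impression(player_state, ad_id, session_id):
    """Build a hypothetical client-side impression payload for an SSAI ad."""
    return {
        "event": "ad_impression",
        "ad_id": ad_id,               # join key back to server-side ad logs
        "ad_session_id": session_id,  # ties the whole ad event stream together
        "timestamp_ms": int(time.time() * 1000),
        "player_viewable": player_state["viewable"],
        "player_viewable_percent": player_state["viewable_percent"],
        "device_type": player_state["device_type"],
    }

payload = on_ad_impression(
    {"viewable": True, "viewable_percent": 100, "device_type": "smart_tv"},
    ad_id="ad-123",
    session_id="sess-789",
)
print(json.dumps(payload, indent=2))  # would be sent to the collection endpoint
```

Because the payload carries the ad ID, it can be joined against the Ad Decisioning and Ad Insertion Server logs to close the measurement gap.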

Improving SSAI Targeting With More Client Data

Using client-side data to augment SSAI for better measurement also has another benefit: improving SSAI ad delivery. One of the main arguments for CSAI is contextualization. There is a host of data which can be gathered from the client and used to better target ads. This can, ultimately, help increase the value of ad placement for the operator. This new feature from Datazoom provides a unique way to add that same value to ads delivered through SSAI. For example, device-level data, such as IP address and device type, can be fed from the client to the SSAI decisioning server, where ad selection occurs.
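
In practice, this often means decorating the ad decisioning request with client-supplied parameters. Here is a hedged sketch (the endpoint and parameter names are hypothetical; a real integration follows the ad server’s own spec):

```python
from urllib.parse import urlencode

def build_ad_request(base_url, client_data):
    """Append client-side targeting data to an SSAI ad decisioning request."""
    params = {
        "ip": client_data["ip"],  # device IP, e.g. for geo targeting
        "device_type": client_data["device_type"],
        "ua": client_data["user_agent"],
    }
    return f"{base_url}?{urlencode(params)}"

url = build_ad_request(
    "https://ads.example.com/decision",  # hypothetical decisioning endpoint
    {"ip": "203.0.113.7", "device_type": "smart_tv", "user_agent": "RokuOS/12"},
)
print(url)
```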

By connecting SSAI and client-side ad data, operators can now see, in near real-time, a user event stream that shows exactly when an ad break starts, when the ad request is made, when the impression is recorded, and how long a user sticks around before clicking the ad_skip button. Each of those events is decorated with a number of dimensions, making it possible to determine where ads are working and where they are not.

CSAI Is No Longer The Clear Winner

The choice of CSAI over SSAI has long been driven by the prospects of personalization and the increase in the value of each impression. But with this new Datazoom feature, that tradeoff no longer needs to happen. Streaming operators can choose SSAI to avoid ad-blockers, for example, but still get access to valuable client-side data which can be used to target ads before they are stitched into the final stream.


Interested in seeing this in action? Reach out to Datazoom today to schedule a demo.


So You Want To Personalize The Streaming Experience...

When it comes to the core difference between traditional broadcast and streaming, it’s all about data. Broadcast provides very little information about the viewer. Even for those with Nielsen boxes attached to their television sets, it’s only basic demographic information tied to what’s been watched. It’s near impossible to understand the relationship between those demographics, viewer behavior, and everything else the viewer might be doing across the digital universe. But with streaming, it’s completely different. Viewing behavior within streaming content can be related to other digital activities to create comprehensive insights which can be used to recommend or suggest new content, tailor ads for greater impact, and more. Many would argue that personalization is not only the future of the streaming video experience, but a critical element that streaming operators must provide to improve engagement and reduce subscriber churn. Below, we’ve provided a step-by-step understanding of how you can find the data, collect it, and piece it together into a richer data picture of individual viewers, which you can use to personalize the streaming video experience.


Why Personalization?

It’s not hard to understand why a personalized video experience might be better than a generic one. When a viewer feels that an app interface, or content recommendation, or the player experience is tailored for them, they are probably more likely to stay engaged longer (whether that’s in the streaming platform or with a specific piece of content). And for ad-supported models, that means more impressions, because that viewer is going to watch more video for longer. There isn’t much data yet about the impact of personalization on viewer engagement and subscriber retention, but some early research, such as a survey by Concentrix, indicates that “over 70% of subscribers say they only engage with personalized messaging, and nearly 65% will stop buying from brands that use poor personalization tactics.”


The Many Opportunities of Personalization

Personalization, though, isn’t limited to just one dimension of the video experience. One of those dimensions, of course, is content. Raviteja Dadda, a Forbes Council Member, identified a number of ways content can be personalized to a viewer, including curation, recommendations, and promotions. For FAST providers, curating content could mean a personalized channel for each viewer that includes both recommended and promoted assets. But there are other places to personalize as well, including advertising, the interface itself (adding contextual menus for content categories the viewer often watches), and even commerce opportunities. For example, a streaming platform could blend third-party data, such as purchasing history at a major retailer, with first-party data about content watching, to suggest purchases during the video experience itself. In all of these cases, the personalized experience communicates to the viewer that the streaming brand (the platform operator) understands their wants and needs. This makes the subscription all the more valuable.

Convincing the streaming industry that personalization is the future, though, isn’t hard. What’s hard is understanding how to carry it out and getting the data you need to make it happen.


What’s The Data And Where Can You Find It

At the heart of personalizing the streaming experience is data. Data about what’s watched (and for how long), when a viewer abandons a video (and where in the video that was), how other people like the viewer are behaving (cohorts), and third-party data such as that from Google, Amazon, and other sources. Although where you source any third-party data is up to you, there is quite a lot of first-party data that might already be available to pull from within your streaming video technology stack.

When we built the Streaming Video Datatecture, we tried to capture all of the various systems that throw off data which might be used to personalize the video experience.

There are not only multiple categories of data but multiple providers within each category. Depending upon how your stack is put together, you may be looking at 10, 20, or even 30 sources of data which can be connected and correlated to provide a more in-depth picture of an individual user. The data is not just about what happens in the video player, though. It’s also about other metrics such as user behavior within the interface, abandonment, clicks, and even the bitrate ladder. For example, if a viewer often selects 4K videos but the data shows that bitrate is never reached, it might make more sense to suggest content to them that is 1080p HDR (which may also reveal a wider library of assets for the viewer).
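
As a toy sketch of that bitrate example (the thresholds are assumed for illustration), the logic boils down to checking whether a viewer’s sessions ever sustain a 4K-class bitrate:

```python
# Assumed thresholds; real ladders vary by codec and service.
UHD_BITRATE_KBPS = 15000
SUSTAIN_RATIO = 0.8  # fraction of a session that must hold the UHD bitrate

def preferred_tier(sessions):
    """Pick a recommendation tier from observed per-session bitrates."""
    uhd_capable = sum(
        1 for s in sessions
        if s["time_at_or_above_uhd_s"] / s["duration_s"] >= SUSTAIN_RATIO
    )
    return "4K" if uhd_capable / len(sessions) >= 0.5 else "1080p HDR"

sessions = [
    {"duration_s": 3600, "time_at_or_above_uhd_s": 400},
    {"duration_s": 1800, "time_at_or_above_uhd_s": 200},
]
print(preferred_tier(sessions))  # "1080p HDR": 4K is selected but never sustained
```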

To personalize the video experience, there are three key steps. First, collect the data. Second, correlate the data. And third, connect the resulting data set to your video platform in real-time.


Step 1: Collecting the Data for Streaming Personalization

It might seem like an easy task, but it’s actually not. Although many components within the streaming video technology stack make their data available (often through APIs), some don’t. And even if you can gather the data easily, it’s not a one-off task. You’ll need to build a proper data pipeline with connectors to each data source. This can be quite resource-intensive out of the gate. You’ll need developers to build those connectors and ensure that they can be maintained easily going forward in the event that an API call or endpoint is changed by the component owner.
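
Here is a minimal sketch of what one of those connectors might look like (the source and field names are hypothetical). The point is that every source needs its own fetch-and-translate layer, and that layer is what your developers will maintain as upstream APIs change:

```python
from abc import ABC, abstractmethod

class Connector(ABC):
    """One connector per data source in the pipeline."""

    @abstractmethod
    def fetch(self) -> list[dict]:
        """Pull raw records from the upstream API."""

    @abstractmethod
    def translate(self, raw: dict) -> dict:
        """Map source-specific fields onto the pipeline's common schema."""

class CdnLogConnector(Connector):
    def fetch(self) -> list[dict]:
        # Placeholder: real code would page through the CDN's log API.
        return [{"clientIP": "203.0.113.7", "bytesOut": 1048576}]

    def translate(self, raw: dict) -> dict:
        # Isolate upstream naming here, so an API rename touches one place.
        return {"client_ip": raw["clientIP"], "bytes_delivered": raw["bytesOut"]}

records = [c.translate(r) for c in [CdnLogConnector()] for r in c.fetch()]
print(records)
```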


Step 2: Correlate the Data for Streaming Personalization

Once you have access to all of the data in a pipeline, you’ll need to send it somewhere for post-processing. Usually, that’s a visualization tool. And although such a tool may be useful for analytics purposes, it’s not necessarily useful for real-time personalization. Because of that, your data pipeline will need to feed a datalake where business rules can act on the data, normalizing it and stitching it together (imagine data cubes for each individual viewer). In some cases, this may require standardizing specific metrics, like a user ID. Because of the need to normalize, just like with Step 1, there’s some heavy lifting that needs to happen up-front. But once you get your business logic set up, you probably won’t need to change it very often unless the data sources themselves change.
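
As a small example of that normalization step, here is a sketch (with hypothetical field names) that maps each source’s differing user identifier onto one canonical key so records can be stitched into a per-viewer cube:

```python
def normalize_user_id(record, source):
    """Map each source's user identifier onto one canonical key."""
    if source == "player":
        return record["viewerId"]      # e.g. "viewer-42"
    if source == "ad_server":
        return record["uid"].lower()   # e.g. "VIEWER-42" -> "viewer-42"
    raise ValueError(f"unknown source: {source}")

def stitch(records_by_source):
    """Merge records from all sources into one cube per viewer."""
    cubes = {}
    for source, records in records_by_source.items():
        for r in records:
            cubes.setdefault(normalize_user_id(r, source), {}).update(r)
    return cubes

cubes = stitch({
    "player": [{"viewerId": "viewer-42", "watch_time_s": 5400}],
    "ad_server": [{"uid": "VIEWER-42", "ads_seen": 12}],
})
print(cubes["viewer-42"])  # one merged record spanning both sources
```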


Streaming Personalization Won’t Succeed Without A Good Data Model

The business logic that you build around individual viewers isn’t just about normalizing and processing. It’s also about thresholds, or indexes. For example, you’ll want to build a data model for personalization that includes scores for specific personalized elements. When thinking about personalizing the user interface, each dynamic menu item will need a score to appear. So if a specific viewer has an index of .33 for a given interface item, it won’t display. But a viewer with an index of .79 would see that interface element. To make this more concrete, imagine that your interface includes menu items for specific genres of video. If the index is above the threshold for a viewer in a specific genre, that genre menu item would appear, helping them more easily discover content that the data indicates they prefer to watch. This applies to recommended content as well. Just because a viewer has watched one content asset doesn’t mean they are immediately going to be interested in something related (either by actor or genre, or within their cohort). Each content asset should, for each viewer, eventually have an index that determines how it fits into individual cubes of viewer data. As you can imagine, this is no easy task, but it pays off in the long run because those index scores become the knobs and switches that you can use to optimize personalization.
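
Here is a compact sketch of that scoring model. The 0.5 display threshold and the watch-time-share index are assumed for illustration; the threshold simply sits between the .33 and .79 examples above:

```python
DISPLAY_THRESHOLD = 0.5  # assumed; in the text, .33 hides and .79 shows

def genre_index(viewer, genre):
    """Toy index: share of the viewer's watch time spent in this genre."""
    total = sum(viewer["watch_time_by_genre"].values())
    return viewer["watch_time_by_genre"].get(genre, 0) / total if total else 0.0

def visible_menu_items(viewer, genres):
    """Only genres whose index clears the threshold appear in the menu."""
    return [g for g in genres if genre_index(viewer, g) >= DISPLAY_THRESHOLD]

viewer = {"watch_time_by_genre": {"horror": 7200, "comedy": 1800}}
print(visible_menu_items(viewer, ["horror", "comedy", "drama"]))
# ['horror'] -> 7200/9000 = 0.8 clears the threshold; comedy at 0.2 does not
```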


Step 3: Connecting The Personalized Data To Your Video Platform

It’s probably safe to say that your platform isn’t built with personalization in mind. Although some elements, such as a video asset reel, are probably dynamic and can be easily integrated with a personalized data set, your interface may not be. So you’ll invariably have to do some development work to enable personalization for other aspects of your platform. But, once this is done, you’ll be able to leverage those individual viewer datasets to truly create a customized experience.


Streaming Personalization Isn’t a One-And-Done Activity

Even if you follow those three steps, your work isn’t done. That’s because the underlying challenges aren’t just acquiring, normalizing, and integrating the data. There’s another obstacle that can actually undermine the whole effort: real-time. Personalization can’t happen once for each viewer. It must be continually updated. And, to make it truly impactful, it should be carried out even as the viewer is interacting with the platform. Think about it this way: each piece of content the viewer watches, each click, each ad viewed, adds to their cube and can affect recommended content or change an interface element. For example, what if watching a horror movie finally tips a viewer past the threshold for displaying the “horror” category in their interface menu? You wouldn’t want them to wait until logging in again to see that. You would want to display it immediately, perhaps even with a “new” badge, so that they would jump right into the next piece of content without having to find it.
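
And here is a sketch of that real-time path, reusing the assumed threshold from the sketch above: each completed view updates the viewer’s cube immediately, and a threshold crossing pushes the interface change (with its “new” badge) right away:

```python
DISPLAY_THRESHOLD = 0.5  # same assumed threshold as in the earlier sketch

def on_view_completed(viewer, genre, watch_time_s):
    """Update the cube in place and report a newly unlocked menu item."""
    genres = viewer["watch_time_by_genre"]
    total_before = sum(genres.values())
    index_before = genres.get(genre, 0) / total_before if total_before else 0.0

    genres[genre] = genres.get(genre, 0) + watch_time_s
    index_after = genres[genre] / sum(genres.values())

    if index_before < DISPLAY_THRESHOLD <= index_after:
        return {"show_menu_item": genre, "badge": "new"}  # push to the UI now
    return None

viewer = {"watch_time_by_genre": {"horror": 4000, "comedy": 4500}}
print(on_view_completed(viewer, "horror", 6000))
# {'show_menu_item': 'horror', 'badge': 'new'}
```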

But getting data in real-time can be difficult. Most streaming operators aren’t staffed to create real-time pipelines. Yet if personalization is truly the next evolution of the streaming experience, no operator can afford to ignore the need to build out a data function within the organization, one that can create connectors, build business logic, define a data model, and integrate everything with the platform itself. It’s hard work, no doubt, but the benefits of improved engagement rates and reduced subscriber attrition are well worth the effort.

Datazoom specializes in enabling streaming operators to create data pipelines. If you’re interested in understanding more, and even seeking a resource to help you collect the data you need (and assist you in building a real-time data pipeline), contact us today!


