Thoughts Cory Carpenter

How to build a new Data Infrastructure to support your streaming strategy

We all agree that a business’ performance is only as good as the data backing its insights. Yet, many streaming media organizations struggle with approaching, let alone identifying, the best way to renew their data strategy. Change may be the only certainty in the market, but that doesn’t mean all change strategies are the same. 

When you aren’t sure which combination of data, metrics, and tools you want to use, rebuilding your data strategy becomes even more of a headache. It’s a classic “chicken-and-egg” problem: you need data to understand which metrics and tools you want to use, but you also need those metrics and tools in place before you can access any data at all.

A New Approach to Data

In response to that conundrum, our team at Datazoom wants to share an approach that’s been successful for some of our customers who are currently planning their renewed data strategy with us, while also facing resource constraints. The strategy rests on establishing the minimum datapoints and toolsets for their web platform and then replicating and expanding onto other platforms like mobile and OTT. 

Initiating your Datazoom experience with our web SDKs is a worthwhile place to start because they: (1) are drop-in integrations; (2) can be updated from the Datazoom UI to add or remove data points and Connectors; and (3) let our team act as a resource, coding and building out net-new data points for your chosen Collectors, which can then be activated from our UI with zero development or code changes.

A Plan for Action

Once we’ve established the right combination of datapoints and Connectors for the web properties, we use this as the foundation for developing, replicating, and expanding into other Collectors for efficient development and timely deployment.

For the above approach we:

1) Scope out 30 minutes with the web development team to perform the integration and “turn it on.”

2) Identify one SDK to remove/intermediate, using 3-5 “sanity metrics” to ensure parity between our SDK and the native SDK before phasing out the latter. 

3) Repeat this process with other Connectors, one by one.

4) Deliver a finalized list of Data Dictionary datapoints that all Collectors should supply, then begin the gap analysis (and any development planning) for the other Collectors.
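Step 2’s parity check is easy to make concrete. The sketch below compares a handful of sanity metrics reported by the native SDK against the replacement; the metric names and the 2% tolerance are illustrative assumptions, not Datazoom specifics.

```python
# Hypothetical parity check between a native analytics SDK and its replacement.
# Metric names and the 2% tolerance are illustrative, not Datazoom specifics.

SANITY_METRICS = ["play_requests", "play_starts", "minutes_watched", "rebuffer_ratio"]
TOLERANCE = 0.02  # accept up to 2% relative difference per metric

def check_parity(native: dict, replacement: dict) -> dict:
    """Return per-metric relative differences and whether each is within tolerance."""
    report = {}
    for metric in SANITY_METRICS:
        a, b = native[metric], replacement[metric]
        rel_diff = abs(a - b) / a if a else abs(b)
        report[metric] = {"relative_diff": rel_diff,
                          "within_tolerance": rel_diff <= TOLERANCE}
    return report

# Example numbers from a hypothetical side-by-side run:
native = {"play_requests": 10000, "play_starts": 9600,
          "minutes_watched": 52000, "rebuffer_ratio": 0.011}
replacement = {"play_requests": 10110, "play_starts": 9570,
               "minutes_watched": 51480, "rebuffer_ratio": 0.0112}

report = check_parity(native, replacement)
ready_to_phase_out = all(r["within_tolerance"] for r in report.values())
```

Only when every sanity metric stays within tolerance would you phase out the native SDK.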

This approach minimizes the use of your team’s development resources and enables an iterative process for scoping and ongoing product planning. Defining the product and analytics stack before the data is akin to “putting the cart before the horse,” and it can cause problems when the strategy is deployed: compatibility issues, for example, or timelines that turn out to be unachievable.

Using Datazoom to plan and test implementations gives you a place to experiment while gathering feasibility and timing feedback for structured, realistic deployments.


You can set this plan into action right now: visit app.datazoom.io/signup and begin your 15-day, 5GB free trial of Datazoom. Reach out to us if you want more information on getting started or customizing your plan. We would love to hear about your use case so that, together, we can create an action plan to help you operationalize your video data.


Improving the performance of CDNs with Real-Time Video Data

CDNs and Publishers

Unless your business has access to significant resources and has a large footprint like that of Facebook, Amazon, Netflix or Google, delivering video to global audiences is a particular type of challenge. To reach audiences at all times, video content publishers (VCPs) like NBC or Hulu use one or more public Content Delivery Networks (CDNs), which take on the task of transporting content on their behalf.

The relationship between VCP and CDN is often quite strategic. While prices in the CDN market have decreased significantly, these costs remain among the largest cost centers for VCPs. Considering the significant responsibility placed on CDNs, one would assume that VCPs would have mechanisms in place to monitor CDN performance. But this is not the case. 

VCPs know that maintaining a high quality of experience (QoE) is mandatory if they want to keep audiences engaged. Unfortunately, CDN vendors are often the first parties blamed by VCPs when QoE issues arise, even when the issue originated at another link in the delivery chain. 

Recently, I was part of a discussion with the CTO of a growing OTT video company. He expressed to me a concern weighing heavily on his mind – a lack of useful data. He said: 

We have plenty of analytics that tells us if there’s an issue with our streaming experience. We know that some of our devices and viewers experience (QoE) issues, such as buffering (among other) problems…because analytics provide the metrics that can tell us that. But we don’t know why devices are buffering. We don’t know why it’s happening, and if the problem is caused by [the CDN] or another part of the video delivery ecosystem.

This problem persists because the video delivery ecosystem has become much more complex. 

The Quest for Better Data

In response, CDNs developed combined offerings covering the core components video publishers needed: CDNs, video players, and analytics. These all-in-one solutions could leverage the same data flowing between systems to make adjustments and thus create a homogenous video technology stack. However, adoption of these end-to-end offerings has not been widespread. VCPs have instead opted to build their own heterogeneous mix of what has grown to become 10-20 “best-of-breed” core technologies (CDNs among them), to customize and better differentiate their services for end users.

However, the rise of superior independent technologies obscured a bigger picture: these systems have to work in sync, and aligning such different and diverging systems requires the kind of standardized data that all-in-one offerings were built to provide.

With regard to data, few standards were proposed, and almost none have been adopted en masse to encourage interoperability and communication between systems. Incongruencies in data mean that VCPs are blind to the pain points in their own video delivery chain; attempts to identify the root causes of failure within this mix of diverging systems become impossible.

For CDNs, a lack of consistent data translates into a lack of visibility into how video reaches audiences, and VCPs are left without the resources required for the difficult task of matching the QoE viewers have come to expect from traditional television. For instance, identifying a specific “smoking gun” node or peering issue with a CDN can take hours, even days, and requires the laborious task of manually organizing and interpreting CDN logs.

Matters for CDNs are complicated further because they themselves rely on other entities, like transit providers and ISPs, to complete the delivery chain, and VCPs have no insight into what happens to their content during these exchanges. In response to CDN instability, VCPs adopted CDN-switching technologies. Yet, while such technologies help VCPs avoid certain types of delivery failures, switches are based on subjective metrics rather than objective data. The crux of the issue remains unaddressed: the lack of homogenous, centrally collected, and contextualized data.
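To make the subjective-versus-objective distinction concrete, here is a minimal sketch of a switching decision driven by measured QoE rather than judgment calls. The metric names, thresholds, and CDN labels are hypothetical.

```python
# Illustrative sketch: choosing a CDN from objective, real-time QoE measurements
# rather than a subjective judgment call. Metric names, thresholds, and CDN
# labels are assumptions for the example, not a real product's API.

def pick_cdn(qoe_by_cdn: dict, max_rebuffer_ratio: float = 0.02) -> str:
    """Prefer the CDN with the lowest rebuffer ratio among those under the
    threshold; fall back to the overall best if none qualifies."""
    healthy = {cdn: m for cdn, m in qoe_by_cdn.items()
               if m["rebuffer_ratio"] <= max_rebuffer_ratio}
    pool = healthy or qoe_by_cdn
    return min(pool, key=lambda cdn: (pool[cdn]["rebuffer_ratio"],
                                      pool[cdn]["startup_time_ms"]))

qoe = {
    "cdn_a": {"rebuffer_ratio": 0.031, "startup_time_ms": 900},
    "cdn_b": {"rebuffer_ratio": 0.012, "startup_time_ms": 1400},
    "cdn_c": {"rebuffer_ratio": 0.015, "startup_time_ms": 800},
}
best = pick_cdn(qoe)  # cdn_b: lowest rebuffer ratio among the healthy CDNs
```

The point is not the specific rule but that the decision is reproducible and auditable, because it is derived from a shared dataset.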

Win-Win: Aligning the Video Delivery Stack

Streaming video providers need a tool for centralizing data collection and control. With a common dataset as a shared point of reference, CDNs and VCPs can begin to construct a transparent, contextualized, and collaborative dialogue.

By enabling a two-way data share and building a real-time feedback loop between VCP and CDN, a video data control platform offers the centralized control required to accomplish this task, and can serve as a foundation for truly innovative projects such as incorporating AI/ML into video delivery. Such tools securely capture granular data from disparate sources and normalize that expanse of information into a correlatable dataset from which to draw insight and put to work.
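The normalization step described above can be sketched simply: events from different sources arrive in different shapes, and each is mapped into one common schema so records can be correlated on a shared key. All field names here are hypothetical, not a real Datazoom schema.

```python
# Minimal sketch of normalizing heterogeneous events (player-side telemetry vs.
# CDN-side logs) into one common schema so they can be correlated on session_id.
# All field names are hypothetical, not a real Datazoom schema.

def normalize_player_event(e: dict) -> dict:
    return {"session_id": e["sessionId"], "source": "player",
            "event": e["type"].lower(), "timestamp_ms": e["ts"]}

def normalize_cdn_log(line: dict) -> dict:
    return {"session_id": line["sid"], "source": "cdn",
            "event": line["action"], "timestamp_ms": int(line["epoch"] * 1000)}

events = [
    normalize_player_event({"sessionId": "s1", "type": "Buffer_Start",
                            "ts": 1700000000123}),
    normalize_cdn_log({"sid": "s1", "action": "cache_miss",
                       "epoch": 1700000000.0}),
]

# One chronological, per-session timeline spanning both sources:
timeline = sorted(events, key=lambda e: (e["session_id"], e["timestamp_ms"]))
```

Once both sources share a schema and a clock, a buffering event on the player can be lined up against what the CDN saw moments earlier.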

CDNs and VCPs can build a real-time feedback loop for mutual improvement so long as they agree on a central “point of exchange” through which captured data can be collected and standardized. It is a win-win scenario for every player in the delivery ecosystem, from CDN to publisher to end user.


Configuring Splunk for Video Analytics with Datazoom

Visualize real-time, video data to extract deep insights

Splunk Cloud captures, indexes, and correlates real-time data in a searchable repository from which it can generate graphs, reports, alerts, dashboards, and visualizations. When integrated with Datazoom’s real-time, video data infrastructure, the combination becomes a powerful pair for visualizing video streaming analytics.

In this article, we will review the process for setting up Splunk Cloud + Datazoom at a high level. Whether you’re an existing Splunk Cloud user interested in expanding your current use of the platform to include video analytics or a prospective Splunk user, it’s easy to get started – click here and sign up for a Datazoom account (complete with free access + 5 GB of data for your first 15 days). If you’re totally new to Splunk Cloud (and Datazoom) but also interested in seeing this combination in action, you can sign up for Splunk’s own 15-day trial here. Both trials offer the full functionality of Datazoom and Splunk Cloud, respectively.


Before we get started…

Please review the following prerequisite steps necessary before you can visualize video data in Splunk Cloud. As always, our Help Center is a valuable resource for additional information along the way.


Building Metrics

Splunk has its own query language for writing the expressions used to calculate metrics. We’ll review four queries now.

Creating a metric for the Number of Minutes Watched:

1) Login to your Splunk Cloud instance and navigate to Search.
2) Copy and paste the following into Search

index="<your configured index>" 
| stats  sum(event.metrics.timeSinceLastFluxData) AS timeWatched  
| eval  timeWatched=timeWatched /1000/60

3) Select the time period you’d like to search against in the right side of the “Search” tab found in the upper-left of the Splunk UI, then click the hourglass icon.
4) Select the Visualization tab from the results window and then format as necessary. We changed the result’s precision to two decimal places and added the unit “Minutes” after the result.

5) Select “Save As” under the “Search and Reporting” text in the upper right to save the search as a Dashboard Panel.
6) Select “New” then enter a name for your new dashboard next to “Dashboard Title” then click “Save.”
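The SPL query above sums `event.metrics.timeSinceLastFluxData` (reported in milliseconds) and divides by 1000 and then 60 to get minutes. As a sanity check on what the search is doing, here is the same aggregation sketched in Python over raw events; the event shapes are illustrative.

```python
# The SPL search sums event.metrics.timeSinceLastFluxData (milliseconds) across
# all events, then converts to minutes. The same aggregation over raw events:

def minutes_watched(events: list[dict]) -> float:
    total_ms = sum(e.get("metrics", {}).get("timeSinceLastFluxData", 0)
                   for e in events)
    return round(total_ms / 1000 / 60, 2)  # two decimal places, as in the dashboard

# Illustrative events; only some carry playback-time data:
events = [
    {"metrics": {"timeSinceLastFluxData": 90_000}},   # 1.5 minutes
    {"metrics": {"timeSinceLastFluxData": 120_000}},  # 2.0 minutes
    {"type": "Play_Request"},                         # no playback time recorded
]
```

Events without the field simply contribute zero, just as they drop out of the SPL `sum`.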

Creating a metric for the Number of Play Requests:

1) Copy and paste the following into Search:

index="<your index here>" 
| spath "event.type" 
| search "event.type"=Play_Request 
| stats count

2) Select the time period you’d like to search against in the right side of Search then click the hourglass icon.
3) Select the Visualization tab from the results window. Format as necessary. We added the unit “Play Requests” after the result.

4) Select “Save As” under the “Search and Reporting” text in the upper right to save the search as a Dashboard Panel.
5) Select “Existing” then select the name of the dashboard you created previously.

Creating a metric for the Number of Play Starts:

1) Copy and paste the following into Search:

index="<your configured index>"
| spath "event.type"
| search "event.type"=First_Frame
| stats count

2) Select the time period you’d like to search against in the right side of Search then click the hourglass icon.
3) Select the Visualization tab from the results window. Format as necessary. We added the unit “Play Starts” after the result.

4) Select “Save As” under the “Search and Reporting” text in the upper right to save the search as a Dashboard Panel.
5) Select “Existing” then select the name of the dashboard you created previously.
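The two counting searches above follow the same pattern: filter on `event.type`, then `stats count`. The Python sketch below mirrors both counts over raw events and derives a play-start rate from them; the event shapes are illustrative.

```python
# The two SPL searches filter on event.type (Play_Request, First_Frame) and
# tally matches. A sketch of the same counts over raw events, plus a derived
# play-start rate:

from collections import Counter

def count_event_types(events: list[dict]) -> Counter:
    return Counter(e.get("event", {}).get("type") for e in events)

# Illustrative events:
events = [
    {"event": {"type": "Play_Request"}},
    {"event": {"type": "Play_Request"}},
    {"event": {"type": "First_Frame"}},
]

counts = count_event_types(events)
play_requests = counts["Play_Request"]
play_starts = counts["First_Frame"]
start_rate = play_starts / play_requests if play_requests else 0.0
```

Keeping both counts on the same dashboard makes the gap between requests and starts visible at a glance.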


Creating a metric to visualize User Location(s):

1) Copy and paste the following into Search:

index="<your configured index>"
| spath "user_details.session_id"
| search "user_details.session_id">"0"
| dedup user_details.session_id 
| geostats count latfield=geo_location.latitude  longfield=geo_location.longitude

2) Select the time period you’d like to search against in the right side of Search then click the hourglass icon.
3) Select the Visualization tab from the results window. Format as necessary.

4) Select “Save As” under the “Search and Reporting” text in the upper right to save the search as a Dashboard Panel.
5) Select “Existing” then select the name of the dashboard you created previously.
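The location search above dedupes on `user_details.session_id` before `geostats`, so each session is counted once at its coordinates. The same dedup-then-count logic, sketched in Python with illustrative event shapes:

```python
# The SPL search dedupes on user_details.session_id before geostats, so each
# session contributes one count at its latitude/longitude. Equivalent logic:

def viewers_by_location(events: list[dict]) -> dict:
    seen, counts = set(), {}
    for e in events:
        sid = e.get("user_details", {}).get("session_id")
        if not sid or sid in seen:
            continue  # skip events without a session, and repeats of a session
        seen.add(sid)
        loc = (e["geo_location"]["latitude"], e["geo_location"]["longitude"])
        counts[loc] = counts.get(loc, 0) + 1
    return counts

# Two sessions from the same location; s1 fires twice but counts once:
events = [
    {"user_details": {"session_id": "s1"},
     "geo_location": {"latitude": 40.7, "longitude": -74.0}},
    {"user_details": {"session_id": "s1"},
     "geo_location": {"latitude": 40.7, "longitude": -74.0}},
    {"user_details": {"session_id": "s2"},
     "geo_location": {"latitude": 40.7, "longitude": -74.0}},
]
counts = viewers_by_location(events)
```

Without the dedup step, chatty sessions would inflate their location’s count.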

Editing the Dashboard

Once you’ve created and saved your searches, you can click on the Dashboard tab and navigate to the dashboard you created earlier. From here you can edit the dashboard further and adjust as necessary.


The Advantages of Custom Metric Design

Following these steps will yield a basic dashboard for tracking some simple video statistics. At Datazoom, we’re currently amassing a database of sample queries you can use as a starting point to building more sophisticated dashboards and reports.

Using customized queries, such as Splunk’s SPL, Datadog’s query language, and New Relic’s NRQL, gives users the ability to obtain more objective metrics tailored to their team’s specifications. Today, even well-established video analytics vendors use subjective formulas to compute metrics which do not align when compared against one another. While out-of-the-box analytics packages are convenient, it is high time the industry pivoted away from one-size-fits-all approaches to online video.
