
Why Video Data Telemetry is Critical to DevOps Success (And How to Get It)

DevOps has been a revolution in software development. To support agile methodologies, in which developers incrementally release features that are tested by users in real time, developers needed an increasingly larger role in operationally supporting their code. Waiting for operations to make configuration changes to environments or push out new code didn't fit the new scrum worldview. So the DevOps role was born, and how software was created and released never looked the same. But the success of this new role, part programmer and part operations, depends on data, and that is especially true in streaming video. New features or application versions can negatively impact Quality of Service (QoS) or Quality of Experience (QoE) and ultimately influence whether subscribers keep paying. Video data telemetry is therefore critical for DevOps teams managing components within the streaming video technology stack.

Not All Data is Created Equal

There are several kinds of data which can influence software development:

  • User feedback. This is direct feedback from users, which can be instrumental in understanding the impact of a new feature or version. Although this data is very helpful, it isn't quantitative, which makes it difficult to use objectively. Often, the mechanism by which this kind of data is collected (e.g., a survey) can limit its usefulness.

  • Performance metrics. This data is very useful in determining how changes to software affect the overall user experience and how well new software is operating. For example, if parts of a feature or interface are dynamic, are network factors inhibiting the loading of certain elements? How quickly does the software respond to user interaction? With this information, DevOps can ensure that software is operating within acceptable levels of speed.

But when it comes to streaming video, there is another kind of data which is far more valuable in understanding the impact of new features or software on the end-user experience: experience data.

The Data of Experience

In streaming video, the software experience is concentrated in the player, which can reside on a variety of client devices ranging from computers to smartphones to boxes that plug directly into the television. This player is full of additional software, encapsulated in Software Development Kits (SDKs), that provides both functionality and data collection. Often supplied by third parties, these SDKs are installed directly within the player. This software architecture is well understood by DevOps: additional functionality within the player can easily be encapsulated in SDKs and deployed quickly.

But these SDKs can also provide that crucial experience data. Many of the components loaded by the video player send back telemetry that is critical for understanding overall QoS and QoE. This data can tell operations, or DevOps, how critical KPIs such as rebuffer ratio are trending, which provides insight into the overall user experience. Of course, that is somewhat subjective, as each user has a different threshold for certain metrics, but the data ultimately provides a general indicator of the positive or negative impact on the user experience.
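To make that concrete, consider how a KPI like rebuffer ratio is derived from the telemetry a player SDK sends: time spent stalled divided by total session time. The sketch below is a minimal, hypothetical aggregation; the event names and fields are assumptions for illustration, not any particular SDK's API.

```python
from dataclasses import dataclass

@dataclass
class PlaybackEvent:
    """One telemetry event from a (hypothetical) player SDK."""
    kind: str         # e.g. "play", "stall_start", "stall_end", "heartbeat"
    timestamp: float  # seconds since session start

def rebuffer_ratio(events: list[PlaybackEvent]) -> float:
    """Rebuffer ratio = total stalled time / total session time."""
    stalled = 0.0
    stall_started_at = None
    session_end = 0.0
    for ev in sorted(events, key=lambda e: e.timestamp):
        session_end = max(session_end, ev.timestamp)
        if ev.kind == "stall_start":
            stall_started_at = ev.timestamp
        elif ev.kind == "stall_end" and stall_started_at is not None:
            stalled += ev.timestamp - stall_started_at
            stall_started_at = None
    return stalled / session_end if session_end > 0 else 0.0

# Example: a 100-second session with one 4-second rebuffer -> ratio of 0.04
events = [
    PlaybackEvent("play", 0.0),
    PlaybackEvent("stall_start", 30.0),
    PlaybackEvent("stall_end", 34.0),
    PlaybackEvent("heartbeat", 100.0),
]
print(rebuffer_ratio(events))  # 0.04
```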

A System for Collecting and Normalizing Data

Rather than building new SDKs or player functionality to collect that data, DevOps needs access to technology that can be easily integrated into existing systems. When a streaming operator has already embraced the idea of a video data platform, the mechanisms by which data is collected from components within the streaming technology stack, such as a player, are well defined. DevOps needs access to such a system so that approved and proven approaches can be employed as part of the development process to ensure user feedback, performance metrics, and experience data can be collected.

Collecting data isn't the only consideration, though. For the data to be useful to DevOps, and to the rest of operations, it must first be normalized. With a video data platform in place, this is easily accomplished through a standardized schema that takes data elements from a variety of SDK sources and processes them before dropping the data into a storage destination such as a cloud data warehouse like Google BigQuery.
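As an illustration of what that normalization step looks like, the sketch below maps events from two hypothetical SDK vendors onto a single standard schema before the rows would be loaded into a warehouse such as BigQuery. The vendor payloads, field names, and schema are invented for the example.

```python
# Minimal sketch of normalizing events from different player SDKs into one
# schema ahead of warehouse loading. All vendor names and fields are invented.

STANDARD_FIELDS = ["session_id", "device", "rebuffer_ms", "startup_ms"]

FIELD_MAP = {
    "vendor_a": {"sid": "session_id", "dev": "device",
                 "stall_msec": "rebuffer_ms", "join_time": "startup_ms"},
    "vendor_b": {"sessionId": "session_id", "deviceType": "device",
                 "bufferingMs": "rebuffer_ms", "startupTimeMs": "startup_ms"},
}

def normalize(vendor: str, payload: dict) -> dict:
    """Map one vendor-specific event onto the standard schema."""
    mapping = FIELD_MAP[vendor]
    row = {std: None for std in STANDARD_FIELDS}
    for src_key, value in payload.items():
        if src_key in mapping:
            row[mapping[src_key]] = value
    return row

print(normalize("vendor_a", {"sid": "abc", "dev": "roku", "stall_msec": 4000}))
# {'session_id': 'abc', 'device': 'roku', 'rebuffer_ms': 4000, 'startup_ms': None}
```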

Without the Right Tools…

DevOps is the way software is created now: programmers with operational knowledge and access to the tools and systems for deployment. But to be truly effective, their efforts must be tied into larger systems like video data platforms. In this way, the software they create, whether new versions, new applications, or new features, can also be instrumented to collect the data necessary to evaluate the impact on both quality of service and quality of experience.


Understanding the Datatecture Part 1: The Core Categories

The relationship of technologies within the streaming video stack is complex. Although there might seem to be a linear progression of components, from content acquisition to playback, in many workflows the connections between the technologies are far from linear. Data from one piece of technology may be critical for another to function optimally. APIs can be leveraged to connect optimization data from different technologies to each other, and to higher-level systems like monitoring tools and operational dashboards. That's why the datatecture was created: to better visualize the interconnections between the technologies and ultimately document the data they make available.

How to Visualize the Datatecture

Think of the datatecture as a fabric that lies over the entire workflow and represents the underlying flow of data within the technology stack. How the datatecture is organized, then, is critical not only to understanding the basis of that fabric but also to categorizing your own version of it, suited specifically to your business. Regardless of the individual technologies you end up employing in your datatecture, they will ultimately fall into three major areas: Operations, Infrastructure, and Workflow.

Datatecture Core Category: Operations

A major category within any streaming workflow is the group of technologies that help operate the service. The subgroups and technologies within this group are critical to ensuring a great viewer experience:

  • Analytics. One of the primary categories within the Operations group, this subgroup includes a host of components found in any streaming video technology stack. The technologies found here include tools for analyzing logs (such as CDN logs), tools for analyzing data received from the video player, and even data useful in understanding viewer behavior regarding product features and subscriber identity. Without these and other technologies in this subgroup, it would be nearly impossible to provide the high-quality viewing experience subscribers demand.

  • Configuration Management. Although not as sexy as analytics, this is a critical subgroup of the Operations category, as it covers technologies such as multi-CDN solutions. Many streaming providers employ multiple CDNs to deliver the best experience, but switching from one CDN to another can be complex. The technologies in this subgroup make that switching far easier (a minimal sketch of the underlying decision follows this list).

  • Monitoring. Perhaps one of the linchpins of the streaming video technology stack, this subgroup enables operational engineers to continually assess the performance of every other technology within the stack, whether on-prem, cloud-based, or even third-party. The data pulled into these monitoring tools ensures operations and support personnel can optimize and tune the workflow for the best possible user experience.
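To illustrate the configuration-management point above, here is a minimal sketch of the kind of decision a multi-CDN switching layer makes: given recent per-CDN performance in a region, pick the healthiest candidate. The metrics, thresholds, and CDN names are illustrative assumptions, not any vendor's actual logic.

```python
# Illustrative multi-CDN selection: prefer CDNs under an error-rate cap,
# then pick the one with the lowest p95 latency for the region.

def choose_cdn(stats: dict[str, dict[str, float]],
               max_error_rate: float = 0.02) -> str:
    """Return the CDN with the lowest latency among those under the error cap."""
    healthy = {name: s for name, s in stats.items()
               if s["error_rate"] <= max_error_rate}
    candidates = healthy or stats  # fall back to "least bad" if none are healthy
    return min(candidates, key=lambda name: candidates[name]["p95_latency_ms"])

region_stats = {
    "cdn_a": {"error_rate": 0.001, "p95_latency_ms": 180.0},
    "cdn_b": {"error_rate": 0.030, "p95_latency_ms": 90.0},   # fast but erroring
    "cdn_c": {"error_rate": 0.002, "p95_latency_ms": 150.0},
}
print(choose_cdn(region_stats))  # "cdn_c"
```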

Read more about the Operations category.

Datatecture Core Category: Infrastructure

Underlying the entire workflow is the infrastructure. From databases to storage to virtualization, these fundamental technologies power the very heart of the streaming stack:

  • Containers and Virtualization. As the streaming stack has moved to the cloud, virtualization, along with the tools to manage containers and instances, has become crucial. These technologies ensure scalability and elasticity as well as providing a means to quickly and easily deploy new versions of workflow components.

  • Storage and Caching. At its heart, streaming video is about data. Whether it's the segments that comprise an HTTP-chunked file or the data gathered from throughout the workflow, it's all about bits and bytes. The challenge is how to store them and, in the case of caching, how to make them available to the users and applications that need them. The subgroups and technologies in this group are critical to storing and managing that data.

  • Queueing Systems. Scale is a major challenge for streaming providers. How do you handle all the user requests for content, and the influx of QoE and QoS data, when parallel sessions climb into the millions or tens of millions? Queueing systems provide a means to organize and buffer those requests so that systems such as caches or databases aren't overrun and tipped over (see the sketch following this list).
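Here is a minimal sketch of that queueing idea: telemetry writes are buffered in a bounded queue and drained in batches so a spike in sessions doesn't overwhelm the database behind it. The queue size, batch size, and print-based "flush" are illustrative stand-ins for a real bulk insert.

```python
import queue
import threading

events: queue.Queue = queue.Queue(maxsize=10_000)  # bounded: backpressure, not overrun

def writer() -> None:
    """Drain the queue in batches instead of hitting the database per event."""
    batch = []
    while True:
        item = events.get()
        if item is None:               # sentinel to stop the worker
            break
        batch.append(item)
        if len(batch) >= 500:
            print(f"flushing {len(batch)} events")  # stand-in for a bulk insert
            batch.clear()
    if batch:
        print(f"flushing {len(batch)} events")

t = threading.Thread(target=writer)
t.start()
for i in range(1200):
    events.put({"session": i})         # blocks if the queue is full (backpressure)
events.put(None)
t.join()
```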

Read more about the Infrastructure category.

Datatecture Core Category: Workflow

This core category is where the magic happens. It contains all of the subgroups and technologies that enable the transformation, security, delivery, and playback of streaming video, so it makes sense that it's the deepest category with the most technologies:

  • Delivery. From CDNs to peer-to-peer, this category deals with well-known and established technologies for getting streaming segments to the users requesting them. This subgroup also contains other technologies, such as multicast ABR and ultra-low latency, which are becoming increasingly important in delivering a great viewer experience, with high-quality video, at scale, especially for live events such as sports.

  • Security. The streaming industry has long contended with piracy. That's because of the nature of streams: they are just data. It is much more difficult to pirate a broadcast feed because the signal can't simply be copied off the wire. But with streaming, which employs well-known web technologies like HTTP, that's not the case. So this subgroup includes technologies like DRM, watermarking, and geo-IP to do everything from encrypting the content to determining where it can be played.

  • Playback. Without a player, there would be no streaming video. This subgroup addresses the myriad playback options, from mobile devices to gaming consoles. Players also come in many shapes and sizes. While some are commercially available and come with substantial support, others are open source and offer highly configurable options for streaming providers that want a greater degree of control, with less support, than commercial players typically provide.

  • Transformation. Content for streaming doesn't come ready-made out of the camera. Just like broadcast, it must be encoded into bitrates appropriate for transmission to viewers (a hypothetical bitrate ladder is sketched after this list). But unlike broadcast, the players and devices used to consume those streams may require very specific packages or formats, some of which require licenses to decode while others are open source. The subgroups in this category cover everything from encoding to packaging and even metadata, the information that is critical for streaming providers to categorize and organize content.

  • Monetization. Of course, most streaming providers aren't giving away their content for free. They have some sort of strategy to generate revenue, ranging from subscription services to advertising. The subgroups in this category cover a broad range of monetization technologies, from subscription management to the many components of advertising integration, such as SSAI and DAI, and ad tracking.

  • Content Recommendations. This small subgroup is becoming increasingly important in streaming platforms. Suggesting content to viewers, whether based on their past viewing behavior or the viewing behavior of similar users, is critical to keeping users engaged, which ultimately impacts attrition.
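As promised above, here is a hypothetical example of what transformation produces: an adaptive bitrate (ABR) ladder, plus a small helper showing why multiple rungs matter downstream at playback. The specific resolutions and bitrates are illustrative assumptions, not a recommendation.

```python
# A hypothetical ABR ladder of the kind an encoder produces during
# transformation. Values are for illustration only.

LADDER = [
    {"name": "1080p", "resolution": (1920, 1080), "video_kbps": 5000},
    {"name": "720p",  "resolution": (1280, 720),  "video_kbps": 3000},
    {"name": "480p",  "resolution": (854, 480),   "video_kbps": 1500},
    {"name": "360p",  "resolution": (640, 360),   "video_kbps": 800},
]

def best_rendition(measured_kbps: float, headroom: float = 0.8) -> dict:
    """Pick the highest rung the measured bandwidth can sustain with headroom."""
    for rung in LADDER:        # ladder is ordered highest to lowest
        if rung["video_kbps"] <= measured_kbps * headroom:
            return rung
    return LADDER[-1]          # fall back to the lowest rung

print(best_rendition(4200)["name"])  # "720p": 4200 * 0.8 = 3360 >= 3000
```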

Read more about the Workflow category.

No One Core Category is More Important Than Another In Datatecture

You may be wondering if you can scrimp on one core category, like Operations, for the sake of another, such as Workflow. The short answer is no. These datatecture core categories are intricately connected, hence the Venn diagram structure. Operations depends on Infrastructure to store all of the data, while Workflow depends on Operations to find and resolve root-cause issues. There are countless other examples, but you get the picture: these core categories are joined at the hip. So when you are examining your own streaming platform or planning your service, building your own datatecture depends on understanding the relationships between these core categories and giving each of them the proper balance.


Observability: The Mindset And Resources OTT Must Bring Onboard To Achieve It

The Growth of OTT Demands a New Mindset From Streaming Providers

Streaming has grown immensely over the past 12 months. Viewing time for U.S. streaming services in June was 50 percent above 2019 levels, likely the result of services launched before and during the pandemic, including Apple TV+, Disney+, HBO Max (AT&T), and Peacock (Comcast). A new forecast projects that SVOD subscribers will nearly double, from 650 million worldwide at the end of 2020 to 1.25 billion by the end of 2024. As the number of subscribers grows, so does the revenue opportunity: in 2020, the global video streaming market was valued at USD 50.11 billion and is expected to expand at a compound annual growth rate (CAGR) of 21.0 percent from 2021 to 2028.

So, what does this continued growth actually mean for the industry?

The growth in the number of subscribers, and the resulting revenue, is forcing OTT providers to take a hard look at how they provide their service. Unlike broadcast television, where the network operator had visibility all the way down to the set-top box and could guarantee a three-, four-, or five-nines level of service, OTT providers don't have that luxury. They must often cobble together monitoring systems that link dozens of separate technologies.

To Support These Growing Services, OTT Providers Must Put Data First

Variables. Countless variables to track and monitor. This is the new norm in streaming.

When OTT first began, monitoring (and the data needed for observability into the streaming experience) was really a secondary thought. The primary focus was reliability: keep the service up and running by whatever means necessary. Sometimes that involved a little bubblegum and duct tape. But, as the stats above show, OTT providers are now global. The demands of providing a consistent and reliable service at global scale mean that data must become a primary focus. This is the mindset driving today's streaming platforms.

But this situation is more complicated than simply elevating the importance of data in the operation of streaming platforms. There is no consistency around the data: different vendors, providing different technologies within the video stack, all have their own data points. Sometimes those data points, although named differently, represent the same variable. It's an issue that complicates the entire process of holistically monitoring the workflow.
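As a simple illustration of that naming problem, the sketch below resolves vendor-specific field names to a canonical variable. Every name here is invented for the example rather than taken from a real product.

```python
# Illustrative alias table: different vendors report the same underlying
# variable under different names. All names are hypothetical.

ALIASES = {
    "rebuffer_ratio": {"bufferRatio", "stall_pct", "rebufferingPercentage"},
    "startup_time_ms": {"joinTime", "video_start_time", "timeToFirstFrame"},
    "bitrate_kbps": {"avgBitrate", "playback_bitrate", "indicated_bitrate"},
}

CANONICAL = {alias: canon for canon, names in ALIASES.items() for alias in names}

def canonicalize(event: dict) -> dict:
    """Rename vendor-specific fields to their canonical equivalents."""
    return {CANONICAL.get(key, key): value for key, value in event.items()}

print(canonicalize({"joinTime": 1350, "avgBitrate": 4800, "device": "tv"}))
# {'startup_time_ms': 1350, 'bitrate_kbps': 4800, 'device': 'tv'}
```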

Without a commitment to standardizing, enriching, and centralizing information, the industry is contending with an explosion of data and no way to really put it to use. It's like trying to capture all the water from a leaky dam with a thimble. The result is that many operators are blocked from the OODA loop (Observe, Orient, Decide, Act). To put video stack data to use, platform operators must be able to see the forest and the trees: they need to identify patterns while also being able to track down individual user sessions. Doing that requires consolidating the massive amount of fragmented data coming out of the workflow.

Just consider these examples: at the device level, operators have to be conscious of device OS, screen size, memory capacity, and supported protocols, all of which can affect client-side caching algorithms, adaptive bitrate ladders, and changes to the delivery protocol. Networks can experience sudden peaks in congestion, so delivery adaptability, such as choosing a different CDN within a specific region, becomes very important to ensuring a great viewer experience. And the content itself can be transformed to accommodate different connection speeds and bitrates.

Addressing any of those examples requires not only a lot of data; the variables must also be standardized and normalized. Without that, there is no way to get a clear picture of how efficiently or effectively the technologies within the stack are operating, how well third-party partners such as CDNs are performing, and how the viewer is experiencing the video.

Standardized Data + Insights = Observability

The ultimate goal of elevating the importance of data within streaming platform operations is achieving observability. Just having a lot of data is nice. Having a lot of standardized data is better. But being able to derive insights, in real time, gives the streaming operator the observability they need to provide the consistent, reliable, and scalable service their viewers expect. Observability also gives the business the ability to make critical decisions about advertising, marketing, and subscriber retention with more certainty and accuracy.

The Datatecture is a Data Landscape for Streaming Operators

Today, our industry needs to know how to optimize the creation, delivery, transformation, and operationalization of streaming video. Doing so requires harnessing all the data from within the video technology stack: operations, infrastructure, and the delivery workflow components. Each of these technologies, whether vendor-supplied or open-source, can throw off data critical to seeing the big picture of delivery and QoE as well as individual viewer sessions. But which technology throws off which data?

That's where the Streaming Video Datatecture comes in.

After seven years in the industry, it was clear that in order to talk about data, there needed to be a way to map out what was actually happening. There wasn't a single resource that showed all of the components in those three main stack categories, or that kept up with the ever-changing technologies. Having a resource that provides an up-to-date picture of the technology landscape was a critical first step to harnessing all the data within the workflow.

But the concept of the datatecture is more than just an industry landscape. It works as a tool that streaming operators can use to build their own datatectures. Because there is no standardization within the streaming industry, most OTT platforms are different; every operator has figured out a way to make their technology stack work for them. But the increasing need for observability isn't specific to one provider. Every provider needs to put data first, and doing that means understanding all of the technologies in the stack that can provide data to add to that observability. This industry-wide datatecture is a map that providers can use to build their own and envision what their datatecture could be.

Release Video Data From Its Silos: What Senior Leaders Can Do

Although it's the engineers who will make the most use of the datatecture, executives and managers within the organization need to help with the "data first" transition. Hiring the right people, and ensuring they are focused on data, will help make sure new software has data collection as a priority. Another strategy for senior leaders is to make sure all future hires, no matter the department, understand that the stability and growth of the business rely on observability to drive business decisions. If all teams, whether in advertising, marketing, security, or content groups, understand the importance of removing data from its silos, then informed business decisions can be made.

In many ways, data can help bring teams together. When everyone speaks the same language and has access to the same information, it naturally sparks more collaboration. That's not to say everyone needs to be aware of everything, but context is important. There's nothing like data to provide context for decision-making and ensure an organization, with all its stakeholders and moving parts, is going in the same direction.

Make Data (And the Datatecture) The Core of Your Streaming Business

The datatecture we've created (and the datatecture you will create) should be at the epicenter of your streaming operations, software development, and business objectives. Without a deep understanding of the role each component within your video stack plays in your observability, it will be far more difficult to make the business and technology decisions that drive your platform forward.
