Understanding the Datatecture Part 1: The Core Categories
The relationships among technologies within the streaming video stack are complex. Although the workflow might seem to progress linearly from content acquisition to playback, in practice the connections between technologies rarely follow a straight line. Data from one piece of technology may be critical for another to function optimally, and APIs can connect optimization data from different technologies to each other and to higher-level systems like monitoring tools and operational dashboards. That’s why the datatecture was created: to better visualize the interconnections between these technologies and ultimately document the data they make available.
How to Visualize the Datatecture
Think of the datatecture as a fabric which lays over the entire workflow and represents the underlying flow of data within the technology stack. How the datatecture is organized, then, is critical to not only understanding the basis of that fabric but how to categorize your own version of it, suited specifically to your business. Regardless of the individual technologies you end up employing in your datatecture, they will be ultimately categorized into three major areas: Operations, Infrastructure, and Workflow.
Datatecture Core Category: Operations
A major category within any streaming workflow is the group of technologies that help operate the service. The subgroups and technologies within this group are critical to ensuring a great viewer experience:
Analytics. One of the primary subgroups within the Operations group, this includes a host of components found in any streaming video technology stack: tools for analyzing logs (such as CDN logs), tools for analyzing data received from the video player, and even data useful in understanding viewer behavior around product features and subscriber identity. Without these and the other technologies in this subgroup, it would be nearly impossible to provide the high-quality viewing experience subscribers demand.
Configuration Management. Although not as glamorous as analytics, this is a critical subgroup of the Operations category because it covers technologies such as multi-CDN solutions. Many streaming providers employ multiple CDNs to deliver the best experience, but switching traffic from one CDN to another can be complex. The technologies in this subgroup make that switching far simpler.
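To make this concrete, here is a minimal sketch of one common multi-CDN switching approach: weighted selection driven by recent performance scores. The CDN names and scores are hypothetical, and real solutions factor in cost, geography, and real-time QoE data.

```python
import random

# Hypothetical per-CDN performance scores (e.g., derived from recent
# QoE metrics such as rebuffering ratio); higher is better.
cdn_scores = {"cdn-a": 0.92, "cdn-b": 0.85, "cdn-c": 0.78}

def pick_cdn(scores):
    """Weighted random selection: better-performing CDNs are chosen
    more often, but traffic is still spread across all of them so
    performance data stays fresh for every CDN."""
    cdns = list(scores)
    weights = [scores[c] for c in cdns]
    return random.choices(cdns, weights=weights, k=1)[0]

print(pick_cdn(cdn_scores))
```

A production configuration-management layer would update these weights continuously from monitoring data rather than hard-coding them.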
Monitoring. Perhaps one of the linchpins of the streaming video technology stack, this subgroup enables operations engineers to continually assess the performance of every other technology within the stack, whether on-prem, cloud-based, or third-party. The data pulled into these monitoring tools ensures operations and support personnel can optimize and tune the workflow for the best possible user experience.
Read more about the Operations category.
Datatecture Core Category: Infrastructure
Underlying the entire workflow is the infrastructure. From databases to storage to virtualization, these fundamental technologies power the very heart of the streaming stack:
Containers and Virtualization. As the streaming stack has moved to the cloud, virtualization, along with the tools to manage containers and instances, has become a crucial technology. These technologies ensure scalability and elasticity and provide a means to quickly and easily deploy new versions of workflow components.
Storage and Caching. At its heart, streaming video is about data. Whether those are the segments that comprise a chunked HTTP stream or the data gathered from throughout the workflow, it’s all about bits and bytes. The challenge is how to store them and, in the case of caching, how to make them available to the users and applications that need them. The subgroups and technologies in this group are critical to storing and managing that data.
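The caching side of this subgroup can be sketched with a simple least-recently-used (LRU) eviction policy, the strategy behind many segment caches. The segment names and capacity here are illustrative only.

```python
from collections import OrderedDict

class SegmentCache:
    """Minimal LRU cache sketch for video segments: keeps the most
    recently requested segments and evicts the least recently used
    one once capacity is reached."""
    def __init__(self, capacity=3):
        self.capacity = capacity
        self.store = OrderedDict()

    def get(self, key):
        if key not in self.store:
            return None  # cache miss: a real cache would fetch from origin
        self.store.move_to_end(key)  # mark as recently used
        return self.store[key]

    def put(self, key, data):
        self.store[key] = data
        self.store.move_to_end(key)
        if len(self.store) > self.capacity:
            self.store.popitem(last=False)  # evict the LRU entry

cache = SegmentCache(capacity=2)
cache.put("seg1.ts", b"video-bytes")
cache.put("seg2.ts", b"video-bytes")
cache.get("seg1.ts")                 # touch seg1 so it stays warm
cache.put("seg3.ts", b"video-bytes") # evicts seg2, the least recently used
print(cache.get("seg2.ts"))          # → None (miss)
```

Real caches layer in TTLs, memory limits, and origin-fetch logic, but the popularity-based eviction idea is the same.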
Queueing Systems. Scale is a major challenge for streaming providers. How do you handle all the user requests for content, and the influx of QoE and QoS data, when parallel sessions climb into the millions or tens of millions? Queueing systems provide a means to organize and handle those requests so that systems such as caches or databases are not overrun and tipped over.
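A minimal sketch of that buffering role, using Python’s standard-library queue: a bounded buffer absorbs bursts of incoming QoE events so a downstream database is written to at a steady rate. The event fields and sizes are hypothetical; production systems use brokers like Kafka or RabbitMQ.

```python
from queue import Queue

# Bounded buffer between fast producers (player beacons) and a
# slower consumer (bulk database writes).
events = Queue(maxsize=1000)

def enqueue(event):
    """Producer side: shed load (or sample) when the buffer is full
    rather than letting a backlog overwhelm downstream systems."""
    if events.full():
        return False
    events.put(event)
    return True

def drain(batch_size=100):
    """Consumer side: pull up to batch_size events for one bulk write."""
    batch = []
    while not events.empty() and len(batch) < batch_size:
        batch.append(events.get())
    return batch

for i in range(5):
    enqueue({"session": i, "metric": "rebuffer"})
print(len(drain()))  # → 5
```

The key design choice is decoupling: producers never wait on the database, and the consumer controls its own write rate.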
Read more about the Infrastructure category.
Datatecture Core Category: Workflow
This core category is where the magic happens. It contains all of the subgroups and technologies that enable the transformation, security, delivery, and playback of streaming video, so it makes sense that it’s the deepest category with the most technologies:
Delivery. From CDNs to peer-to-peer, this subgroup covers well-known, established technologies for getting streaming segments to the users requesting them. It also contains newer technologies, such as multicast ABR and ultra-low-latency streaming, which are becoming increasingly important for delivering a great viewer experience, with high-quality video, at scale, especially for live events such as sports.
Security. The streaming industry has long contended with piracy because of the nature of streams: they are just data. Pirating a traditional broadcast feed is harder because intercepting the signal requires specialized equipment, but streaming, which employs well-known web technologies like HTTP, offers no such barrier. This subgroup includes technologies like DRM, watermarking, and Geo IP to do everything from encrypting the content to determining where it can be played.
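The Geo IP piece of this subgroup reduces, in its simplest form, to an allow-list check once a viewer’s IP has been resolved to a country. The country codes below are examples only; real services use commercial IP-geolocation databases and combine this with DRM license policies.

```python
# Territories where the content is licensed for playback
# (illustrative ISO country codes, not a real rights agreement).
LICENSED_TERRITORIES = {"US", "CA", "GB"}

def playback_allowed(country_code):
    """Return True if the viewer's resolved country is licensed.
    Assumes an upstream Geo IP lookup already mapped IP → country."""
    return country_code.upper() in LICENSED_TERRITORIES

print(playback_allowed("us"))  # → True
print(playback_allowed("fr"))  # → False
```

In practice this check runs at the CDN edge or license server, before any DRM keys are issued.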
Playback. Without a player, there would be no streaming video. This subgroup addresses the myriad playback options, from mobile devices to gaming consoles. Players also come in many shapes and sizes: some are commercially available and well supported, while others are open source and offer highly configurable options for streaming providers that want a greater degree of control, albeit with less support, than commercial players provide.
Transformation. Content for streaming doesn’t come ready-made out of the camera. Just like broadcast, it must be encoded into bitrates appropriate for transmission to viewers. But unlike broadcast, the players and devices used to consume those streams may require very specific packaging formats, some of which require licenses to decode while others are open source. The subgroups in this category cover everything from encoding to packaging and even metadata, the information that is critical for streaming providers to categorize and organize content.
Monetization. Of course, most streaming providers aren’t giving away their content for free; they have some strategy to generate revenue, whether subscriptions, advertising, or both. The subgroups in this category cover a broad range of monetization technologies, from subscription management to the many components of advertising integration, such as SSAI and DAI, and ad tracking.
Content Recommendations. This small subgroup is becoming increasingly important to streaming platforms. Suggesting content to viewers, whether based on their past viewing behavior or the viewing behavior of similar users, is critical to keeping them engaged, which ultimately reduces churn.
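The “viewers like you” approach can be sketched in a few lines: find the viewer whose watch history overlaps most with yours, then suggest titles they watched that you haven’t. The viewer names and titles are hypothetical, and real recommenders use far richer signals (watch time, recency, content metadata) at much larger scale.

```python
# Hypothetical watch histories keyed by viewer.
histories = {
    "alice": {"Show A", "Show B", "Show C"},
    "bob":   {"Show B", "Show C", "Show D"},
    "carol": {"Show E"},
}

def recommend(user):
    """Nearest-neighbor sketch: pick the viewer with the largest
    history overlap, then suggest their titles that `user` lacks."""
    mine = histories[user]
    best, best_overlap = None, 0
    for other, theirs in histories.items():
        if other == user:
            continue
        overlap = len(mine & theirs)
        if overlap > best_overlap:
            best, best_overlap = other, overlap
    return sorted(histories[best] - mine) if best else []

print(recommend("alice"))  # → ['Show D']
```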
Read more about the Workflow category.
No One Core Category Is More Important Than Another in the Datatecture
You may be wondering whether you can scrimp on one core category, like Operations, for the sake of another, such as Workflow. The short answer is no. These datatecture core categories are intricately connected, hence the Venn diagram structure: Operations depends on Infrastructure to store all of its data, while Workflow depends on Operations to find and resolve the root causes of issues. There are countless other examples, but you get the picture: these core categories are joined at the hip. So when you are examining your own streaming platform or planning your service, building your own datatecture depends on understanding the relationships between these core categories and giving each its proper weight.