Monday, January 31, 2022

Base Conceptual Media Tracking Data Model

Over the past years I have been involved in the media world needing either to consume or master all the details of a movie, series, video, ad, or trailer.  I found the data vendors I pulled from didn't quite handle all the requirements for tracking all the various meta data about a video.  Nor did they do a good job setting it up to be extendable and to be populated manually or by an AI engine.  So I created this model as a start to attempt this effort.

This model is mostly focused on the cardinality between core entities.  Typically I would "Subtype" all the various Articles, Content, & Concepts to capture all the "Attributes" that vary from "Subtype" to "Subtype" in order to full document them.   But for the sake of clarity and simplicity of communications in this blog, I reduce the "Subtypes" as a "Details" Attribute that would typically be physically rendered as a JSON formatted column making it extremely flexible and extendable making it more "Future Proof" and less rigid.  This is a choice of mine to mix structured with semi-structured in attempt to use the benefits of both worlds and reduce each of their limitations.  A separate document can be created to document the "Details" for each "Subtype" introduced as you add them into your system. 






Scenarios:

1:  Capture Source and Programing Schedule for any Video: Series, Movie, Trailer, Ad, etc...
2:  Capture any talent, credits, and content relating to any Video 
3:  Capture any aggregated metrics that may be performed on a Single or collection of Videos.
4:  Enable Manual or AI to populate discovered content
5:  Enable any kind of content to be defined and captured.  Talent, Topic, Diarization, Brand, Prop...
6:  Enable captured data to be captured manually and used to feed into a Labeling system
7:  Enable any kind of collection tree structure for organizing Videos.   


How To: Search for overlapping time windows

This logic is particularly useful when looking for all the events that overlap a specific time window.   For example:  You want to know all the TV watch sessions in your DVRs that you have distributed across the world which overlap a range of time.

Here is a Snowflake SQL example of performing this operation complete with test cases as proof it works.

/* This proof tests the logic for Overlapping window filters */
SET FILTER_START = '2020-02-01 00:00:00 +0000';
SET FILTER_END = '2020-02-20 23:59:59 +0000';

CREATE OR REPLACE TEMPORARY TABLE TEST (TEST_TYPE STRING, START_TIMESTAMP TIMESTAMP_NTZ, END_TIMESTAMP TIMESTAMP_NTZ);

INSERT INTO TEST
SELECT COLUMN1 AS TEST_TYPE, COLUMN2::TIMESTAMP_NTZ AS START_TIMESTAMP, COLUMN3::TIMESTAMP_NTZ AS END_TIMESTAMP
 FROM (VALUES('OUTSIDE LEFT', '2020-01-01 00:00:00 +0000', '2020-01-02 00:00:00 +0000'),  
             ('OUTSIDE RIGHT', '2020-02-24 00:00:00 +0000', '2020-02-25 00:00:00 +0000'),  
             ('STRADDLE LEFT', '2020-01-15 00:00:00 +0000', '2020-02-02 00:00:00 +0000'),  
             ('STRADDLE RIGHT', '2020-02-10 00:00:00 +0000', '2020-02-25 00:00:00 +0000'),  
             ('INSIDE', '2020-02-10 00:00:00 +0000', '2020-02-11 00:00:00 +0000'),  
             ('STRADDLE BOTH SIDES', '2020-01-01 00:00:00 +0000', '2020-03-25 00:00:00 +0000')
      );
      
SELECT *
  FROM TEST
 WHERE (END_TIMESTAMP > $FILTER_START AND START_TIMESTAMP < COALESCE($FILTER_END,DATEADD(DAY, 1, $FILTER_START)));