Friday, February 18, 2022

Lessons Learned: An eight year venture in creating data products

Preface

I worked for a data product & AI company for eight years. The company made many attempts to define a series of data products that could be sold to the media industry. We were successful in many ways on a technical level, but in the end we failed as a business. Many things contributed to that failure. This post is a memorial to all the hard work we did, and a way to mourn the loss we all suffered when we closed.

My hat goes off to our CEO and COO for their dedication and commitment to the company. To our scientists, I sit in awe at what they accomplished in creating machine learning packages that detected so many things we as humans take for granted. To our data engineers who tirelessly built and integrated all the subsystems of our service. To our full stack developer who made all our data visible and beautiful. To our account success managers who walked and talked with our customers through our data products. To our public relations personnel who converted our complex jargon into words that our customers could understand and made YouTube video presentations about the media industry. To our sales people who opened doors and closed deals and kept the wheels moving for our company. To our VCs who had faith in us and the technology and financed our way. To all of you, thank you! You rock and I will remember you all.

The Customer

We had strong ties to some studios and networks through our CEO and COO. It was obvious that we would capitalize on these relationships to get through doors that would otherwise be impenetrable for most startups. We were very successful at getting conversations going and starting pilot projects with Disney, Paramount, Universal, HBO, Warner Brothers, Sinclair, Weather Channel and others. It looked very promising and we were all very excited. And because of that success, we got support from venture capitalists to help us develop our platform and data products.

It turns out that these media studios and networks were all very curious. They were keenly interested in collecting actionable metrics on their movies, series, or news broadcasts.

Metrics Like the Following:
  • Predict Video Ratings before Release
  • Detect Talent Screen Time
  • Calculate Talent Diversity Scores
  • Detect Topics Presented
  • Detect News Coverage
  • Find Props & Set Pieces
  • Collect the Movie Credits on Videos in their Vault
  • Suggest the Appropriate amount of On Site coverage required for news (On Site is expensive)
  • Suggest the Appropriate amount of Spanglish for a mixed-language audience
  • Suggest the Appropriate amount of News Coverage for a topic    
  • Measure the Retention power of each 10 seconds of the video
  • Measure the Retention power of each episode of a series
  • Determine what Type of Content had Statistically Significant Retention for the Audience
The list could go on and on. So many questions, but which ones would provide meaningful value for our customers and a sustainable future for our company?

Customers Needed:
  • Help increasing sales or lower their costs  
  • A data product that fits their budget
  • A data product that was simple to understand  
  • A high quality data product 
  • A durable data product that would be available for years to come
  • A data product that could easily integrate & have low friction with their daily business process 
  • A data product with 24/7 technical support
We Needed:
  • High sales in a large enough market space to support the company long term
  • A data product that had a low cost to manufacture 
  • A data product that had a high enough profit margin to survive if not thrive
  • High subscription renewals
  • Low sales turnaround time
  • A data product we exclusively owned
Once we understood a customer's specific needs and gathered the necessary requirements, we would start by creating a pilot with that customer: we processed a short list of videos and gave them the results to review. This is where it got "interesting". On the one hand, all our customers were surprised it could be done. They were satisfied with the results and gave high praise to our work both inside and outside their company. But on the other hand, after the pilot was over, the customers would typically delay signing any contracts for months. This delay resulted in one of two outcomes. One outcome was that they would smile, thank us for the work, and state that they didn't have the budget. The second outcome was that we would find out that our main "point-of-contact" at the studio had moved on to another position and was no longer able to assist us in moving forward with a contract. From a sales perspective this basically meant we had to start the sales process all over again.

What We Built

First of all, we had to analyze the video and audience behavior and report back to the customer a meaningful and actionable result. At the same time we needed to create a platform that enabled us to create reproducible results in a systematic, predictable, and high-volume way. So we built and productionalized an AI service that could process video and combine it with second-by-second audience behavior to determine what resonated with the audience. A visualization UI was built to allow our customers to see the second-by-second timeline of all the detections in sync with the video playback. And we summarized a series of scores that enabled our customers to see, at a glance, the performance of the video for talent and content. It was a dream and now it was a reality. We created a fully productionalized service that was processing our customers' videos every day and reporting the final results back to them.

Our detectors could do the following:
  • Talent
    • Time on Screen 
    • Time Speaking
  • Style
    • Pace
    • Complexity in Vernacular and Length of Sentences
    • Language Classification of use of Personal Viewpoint (Me, I, Feel, Believe, etc...)
  • Emotions
    • Sad, Angry, Fear, Positivity, Surprise
  • Talent's Setting
    • News
      • Local
      • On Site
      • National
    • Scripted
      • Outside Country
      • Outside City
      • Indoors
  • Shot 
  • Screen Motion
  • Topic Classification 
  • News Coverage
    • Crime, Weather, Politics, Lifestyle, Social Justice, Sports, Traffic, Tech/Science, Economy/Business, Education, Public Health
  • Scripted Storyline
    • Custom per series or genre 

The Ultimate Tragedy 

I don't think I could point to just one thing that was the source of our ultimate demise. We were technically up to the challenge of a hard problem. We were well networked with the media industry and we were financed well enough to execute. At our height we had three data scientists, three data engineers, one full stack (UI) developer, one user experience (UX) specialist, one product manager (PM), two sales representatives, one account manager, one public relations specialist, an excellent COO, and an awesome and seasoned CEO. Even with a couple of temporary advisors, we ran pretty cheap in comparison to similar startups. So what happened?

If one is to do a little bit of "Monday Morning Quarterbacking" here, I would summarize it this way:

ONE: Stock Market Instability
We had VCs willing to continue funding our efforts and had $1.3 million promised to fund us through to the end of 2022. But the stock market instability made it difficult for the VCs to make the funds available. Either their funds dried up or access to them was too distant in the future.

TWO: Too early to the video content analysis game using AI
We couldn't get enough foundational customers willing to subscribe to our service long term at a rate that could support us. The traditional media world was in the curious stage and wasn't ready for a long term commitment. We got a lot of feedback about what questions they wanted answered. But nothing seemed to keep a customer coming back for more on a daily or weekly basis at a price they found acceptable. It seemed that once the questions were answered their curiosity was over and they moved on to something else.

THREE: Couldn't solidify a strong enough use case for the data products
Neither we nor our customers could figure out what data products would help them in their daily workflow. Their curiosity was not enough. It turned out that by the time a movie was in a form our service could process, it was too late to make any meaningful contribution to most of the steps of film creation (The Seven Steps to Movie Making). Maybe we could have used our data products to suggest to a studio the renditions for other countries, but there is a firmly established process there too. Or maybe we could have helped a studio's library department analyze the inventory of its existing films, but this was a very finite venture with an even more finite budget.

The facts are that the traditional media world is 100 years old, strongly networked, and heavily committed to previous contracts; their processes are firmly established and their budgets are tight. A lot of their money is spent on actors, directors, marketing, FX, sets, etc. To squeeze the cost of an "Unproven" data product into a firmly established process and budget is a very hard nut to crack. It's like getting your first credit card. You have to prove you are reliable and will make your payments, but you can't prove that without already having a card. A Catch-22 scenario.

FOUR: Competition Grew Fast
AI competition grew in a very short period of time. We dedicated our whole company to this effort from 2018 until we closed our doors in 2022. In that period of time, a lot of competing companies and money consolidated and networked faster than we could.

FIVE: Cloud Providers were setting the cost expectations to customers
In the beginning, our services were priced at $30 per minute of video. At that time there wasn't much competition out there. Available identity detection services tended to be hit and miss. We, on the other hand, used a mixed approach combining person tracking, face detection, and identity detection to create a much more accurate representation of a talent on screen, even when the face was obscured. Our customers compared us with AWS, Azure, and Google, and our accuracy beat our competition hands down. But these cloud providers make money through the usage of compute and storage services, which means that the cost of all the other supporting services they provide is offset. Consequently AWS's service for identity detection is extremely cheap at $0.15 a minute of video. That sets the customer's expectation of the cost per minute of video processed to be under $1 a minute and much closer to $0.15. Ironically we were using AWS compute and storage services ourselves. Although our direct customers understood that our costs included compute and storage, it was still hard for them to justify the large price differential within their own organization.
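
To make the price gap concrete, here is a small back-of-the-envelope calculation in Python. The two per-minute rates are the figures quoted above; the 90-minute running time is only an illustrative assumption, not a number from any customer.

```python
# Per-minute prices quoted in the text, in US dollars.
OUR_RATE_PER_MIN = 30.00
CLOUD_RATE_PER_MIN = 0.15


def processing_cost(minutes: float, rate_per_min: float) -> float:
    """Cost in dollars to process `minutes` of video at a flat per-minute rate."""
    return minutes * rate_per_min


# Assumption for illustration only: a 90-minute feature film.
FILM_MINUTES = 90

our_cost = processing_cost(FILM_MINUTES, OUR_RATE_PER_MIN)
cloud_cost = processing_cost(FILM_MINUTES, CLOUD_RATE_PER_MIN)
price_ratio = OUR_RATE_PER_MIN / CLOUD_RATE_PER_MIN

print(f"Our price for one film:   ${our_cost:,.2f}")
print(f"Cloud price for one film: ${cloud_cost:,.2f}")
print(f"Price ratio:              {price_ratio:.0f}x")
```

Whatever running time you plug in, the ratio between the two rates stays the same, and that ratio is the number our contacts had to defend internally.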

SIX: Legal Access to Video was Difficult
There was a bit of friction in gaining access to videos. This friction slowed our delivery considerably, by a month or two. If we asked the customer to provide videos, we had to jump through a lot of legal hoops and pay lawyer fees to make this happen. Then once we had that nailed down, the method of transport was always different per customer due to security requirements. Sometimes the customer didn't want to go through their own red tape at all. So we had to approach it several different ways, which added to our complexity. We finally settled on capturing video from OTA (Over the Air), RTMP, HLS, and S3 buckets. Each one came with its own technical issues and video recording quality problems. But at least we could move forward with a pilot.

SEVEN: Customers Want to Build it Themselves
Time and time again, it turned out that companies just wanted to build it themselves. They would turn to our company to pilot an idea and see whether it could be done. Once satisfied, they would proceed to build it themselves. The tools and technology now available give them the ability to create their own solutions and to outsource the work to places where wages can be 1/4 to 1/3 of what they are in the USA. They gain full control of the intellectual property, and they can build it the way they want using the tools and platforms they feel have a long-term future for them.
    

Just Buy Our Company

You would think that several companies would be knocking down our door offering to buy the company just for its efficient and effective data science and data engineering team, let alone the platform we pioneered. But no, that wasn't going to happen. That is not to say there wasn't a desire from our partners and customers to do so. It gets complicated when a company buys another company. You have to get alignment among all stakeholders on both sides of the deal, and they all must have the willpower to overcome any obstacles. It only takes one stakeholder to significantly slow or spoil the effort. And it did.


Twenty / Twenty Hindsight

Now that I have time to reflect upon all the years we put into building a platform and data products, I think I have a better grasp of where the money is for these kinds of data products. If you study The Seven Steps to Movie Making you will find that the budget is pretty tight across the steps. And through experience, I witnessed that it was hard for a customer to find the desire and justification to adjust their budget for any of their steps to accommodate our data products. But there was one area we neither looked into nor had access to: the Distribution Step.

Distribution is where the industry makes its money. The film must be distributed for the producers to make their money back. They make lucrative distribution deals with cinemas and streaming services such as Amazon Prime, Netflix, HBO, etc. These deals also extend a "Film’s Reach" so that it rakes in enough money to ensure a return on investment.

There are 14 billion episodes and 100 million movies internationally (I suspect that those numbers include renditions). How in the world does anyone make a decision on what licenses to buy with that volume? You'd have to classify all of them in a consistent and reliable way. Humans are OK at classification, but scaling up to hundreds, if not tens of thousands, of paid people who classify the videos in a consistent manner is not possible. The only real solution is to apply a machine learning approach that can classify the existing catalog and keep up with the steadily increasing volume of new episodes and movies coming out each month internationally.
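
A rough estimate shows why human classification does not scale. The sketch below uses the catalog sizes quoted above; the minutes spent per title and the working hours per year are my own assumptions, chosen to be generous to the human approach.

```python
# Catalog sizes quoted in the text.
EPISODES = 14_000_000_000
MOVIES = 100_000_000

# Assumptions for illustration only (not measured figures):
MINUTES_PER_TITLE = 5        # a quick skim per title, far less than watching it
WORK_HOURS_PER_YEAR = 2_000  # roughly one full-time person

total_titles = EPISODES + MOVIES
labeling_hours = total_titles * MINUTES_PER_TITLE / 60
person_years = labeling_hours / WORK_HOURS_PER_YEAR

print(f"Titles to classify: {total_titles:,}")
print(f"Labeling effort:    {person_years:,.0f} person-years")
```

Even at a five-minute skim per title, the effort runs to hundreds of thousands of person-years, before counting new releases or the work of keeping the labels consistent across people.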

After understanding the distribution space, I have personally come to the conclusion that we were chasing the wrong customers. We needed to look into partnering with, or becoming, a content licensing exchange and research service that brokers license deals for distribution. We could have used the AI's analytical summary of a video to help prospective license buyers make the best buying choices, providing them with a searchable interface with filter boxes to find the right videos to examine more closely.

Possible filters:
  • Actual Social Rating Score
  • Predicted Rating Score
  • Standardized set of Genres
  • A Specific Talent's Screen Time
  • Various Diversity scores of the film
  • Pace of the Film
  • Vernacular Complexity of the Film
  • Topic Coverage  
  • Percentage of Emotions Covered
  • Localized and Internationalized Content Rating 
  • Percentage of Each Language Spoken
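
A search over filters like these could be sketched roughly as follows. This is not something we built; the `VideoSummary` fields, their scales, the `search` function, and the catalog entries are all hypothetical and only cover a few of the filters listed above.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class VideoSummary:
    """Hypothetical per-title summary; field names and scales are assumptions."""
    title: str
    genre: str
    predicted_rating: float  # assumed 0-10 scale
    pace: float              # assumed 0-1 scale, higher means faster
    talent_screen_time: Dict[str, float] = field(default_factory=dict)  # name -> seconds


def search(
    catalog: List[VideoSummary],
    genre: Optional[str] = None,
    min_predicted_rating: Optional[float] = None,
    talent: Optional[str] = None,
    min_talent_seconds: float = 0.0,
) -> List[VideoSummary]:
    """Return the titles that pass every filter the buyer filled in."""
    results = []
    for video in catalog:
        if genre is not None and video.genre != genre:
            continue
        if min_predicted_rating is not None and video.predicted_rating < min_predicted_rating:
            continue
        if talent is not None:
            # The talent must appear in the title for at least the requested time.
            seconds = video.talent_screen_time.get(talent)
            if seconds is None or seconds < min_talent_seconds:
                continue
        results.append(video)
    return results


# Made-up catalog entries, for illustration only.
catalog = [
    VideoSummary("Title A", "Drama", 8.1, 0.4, {"Talent X": 1800.0}),
    VideoSummary("Title B", "Drama", 6.2, 0.7, {"Talent Y": 900.0}),
    VideoSummary("Title C", "Comedy", 7.5, 0.8, {"Talent X": 300.0}),
]

strong_dramas = search(catalog, genre="Drama", min_predicted_rating=7.0)
talent_x_leads = search(catalog, talent="Talent X", min_talent_seconds=600.0)
print([v.title for v in strong_dramas])
print([v.title for v in talent_x_leads])
```

Filters left empty are simply skipped, which matches how a buyer would use a page of optional filter boxes.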
But there is still one really big problem. How does one source and temporarily capture 14 billion videos and process them to create the classifications for the potential buyers to filter on? And what about the costs of doing so? I have no answer; not one that is legal, that is. OK, I lie, there is one way: create a platform that enables anyone to create their own custom streaming service, complete with custom curated licensed content from anywhere around the world, with a business model that is equitable to the content owners, the custom streaming service creators, and the audience. I can only imagine the amount of funding required to accomplish this "Think Big" idea.

One Not So Little Idea Left on the Table

As our runway for launching a successful data product was coming to a close, we were in the midst of defining a data product that would help short form video content creators on YouTube and Facebook build their audience. There is huge money to be made in short form video. MrBeast, for example, grossed $54 million in 2021, making him the highest earning YouTube content creator. This gives you an idea of how much money can be generated in short form video, and it's growing. We determined that the first 10-30 seconds of a video are the most critical in attracting and retaining an audience. If we could analyze the first 30 seconds of a creator's video and compare it with the videos of their best competitors, we could provide them with useful measurements and suggestions before they even publish their video.
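
We never got to build this, so the following is only a sketch of what such a comparison might look like. It assumes detector output shaped like our shot and time-speaking detections; the `VideoDetections` structure, the two metrics, and the sample numbers are all hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Tuple

OPENING_SECONDS = 30.0  # opening window length taken from the text above


@dataclass
class VideoDetections:
    """Hypothetical detector output for one video; all times are in seconds."""
    shot_change_times: List[float]
    speaking_intervals: List[Tuple[float, float]]  # (start, end) of talent speech


def opening_metrics(video: VideoDetections, window: float = OPENING_SECONDS) -> Dict[str, float]:
    """Summarize pace and speech within the first `window` seconds."""
    cuts = sum(1 for t in video.shot_change_times if 0.0 <= t < window)
    # Clip each speaking interval to the opening window before summing.
    speaking = sum(
        max(0.0, min(end, window) - max(start, 0.0))
        for start, end in video.speaking_intervals
    )
    return {
        "cuts_per_minute": cuts / (window / 60.0),
        "speaking_fraction": speaking / window,
    }


def compare_openings(draft: VideoDetections, competitor: VideoDetections) -> Dict[str, float]:
    """Positive values mean the draft scores higher than the competitor."""
    mine = opening_metrics(draft)
    theirs = opening_metrics(competitor)
    return {name: mine[name] - theirs[name] for name in mine}


# Made-up detections, for illustration only.
draft = VideoDetections([4.0, 12.5, 21.0, 40.0], [(2.0, 10.0), (25.0, 45.0)])
competitor = VideoDetections([2.0, 5.0, 9.0, 14.0, 20.0, 26.0, 33.0], [(0.0, 24.0)])

gap = compare_openings(draft, competitor)
for name, value in gap.items():
    print(f"{name}: {value:+.2f}")
```

Because it relies only on detections from the video itself, a comparison like this can run on an unpublished draft, which is the point of giving feedback before publishing.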

So Long and Thanks for all the Fish

It was fun, it was exhilarating, it was fascinating, it was humbling, it was exhausting, it was terrifying, it was the experience of a lifetime. I would not give it up for anything. What a trip. I think I can speak for all of us at the company when I say that we gained so much experience with the media industry and the challenges it faces now in the 21st century. We are forever humbled by the natural biological functions of our minds and how easily they can detect and classify things in an actionable way on just a tiny energy budget. We totally respect anyone who can make a successful data product and keep it relevant in an ever changing world, in a sea of data products.

These are my points of advice for building data products to anyone stumbling upon my post:  
  • Understand ALL the major processes in a vertical industry
  • Determine the most painful process points
  • Become a part of the industry's process flow
  • Target the RIGHT customer
  • Have the goal to be THE Source of Truth for a prospective dataset
  • Keep it a simple build 
  • Make it a quick and easy sell
  • Keep the "Friction" low for customers to adopt into their daily lives
  • Provide data that enables customers to take action on a daily or weekly basis
  • Network and market like crazy and fast
  • Expect your first several data products to fail
  • Keep churning out data products until something sticks
  • You'll have tons of competition, so expect the big boys to move in on your territory
Never give up! Never surrender!  -- Galaxy Quest





