Saturday, February 12, 2022

Baseline Model: ARTIST's Ingestion, Data Mastering, & Data Ticketing

Curating media information about talent, series, movies, and credits in a growing international media industry is challenging for the following reasons:

  • Yet to be published data
  • Sparsely populated data
  • Slowly changing data
  • Incorrect data
  • Duplicate data
  • Insufficient Synopsis or Bio summaries  
  • Localization Issues - Bad translations or lack of translation
  • Data "Easter Eggs" - Data that someone put in that is not appropriate for the customer
  • Missing Talent and Character Portrait Images
  • Missing Movie, Series, and Episode Posters
  • Image capture, persistence, and refresh - Talent portraits, for example
  • Image "Easter Eggs" - Images that someone put in that are not appropriate for the customer

Data Ingestion Process Model

The following model covers Ingestion, Conforming, Data Shaping, Title Matching, the Easter Egg Hunt, and Data Quality Measuring. Notice the different storage layers and the human-in-the-loop requirement: the curator reviews, edits, and approves the data before it goes into production. Eventually this structure should allow an extension for AI auto-editing and release. My brain is just not quite there yet on that process.
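To make the flow a little more concrete, here is a minimal Python sketch of a few of the stages. Everything in it is an assumption for illustration: the Record fields, the three Store names, the BLOCKLIST words, and the scoring rule are hypothetical and not taken from the model above. Data Shaping and Title Matching are left out to keep it short.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List


class Store(Enum):
    RAW = "raw"                # data as received from the source (assumed name)
    STAGING = "staging"        # processed data awaiting curator review
    PRODUCTION = "production"  # approved data visible to customers


@dataclass
class Record:
    title: str
    synopsis: str = ""
    store: Store = Store.RAW
    flags: List[str] = field(default_factory=list)
    quality: float = 0.0


# Hypothetical markers for data "Easter Eggs"; a real list would be curated.
BLOCKLIST = {"lorem", "test123"}


def conform(rec: Record) -> Record:
    """Normalize whitespace so later stages compare like with like."""
    rec.title = " ".join(rec.title.split())
    rec.synopsis = rec.synopsis.strip()
    return rec


def easter_egg_hunt(rec: Record) -> Record:
    """Flag records whose synopsis contains a blocklisted word."""
    if set(rec.synopsis.lower().split()) & BLOCKLIST:
        rec.flags.append("easter_egg")
    return rec


def measure_quality(rec: Record) -> Record:
    """Score the record as the fraction of populated fields."""
    filled = [bool(rec.title), bool(rec.synopsis)]
    rec.quality = sum(filled) / len(filled)
    return rec


def run_pipeline(rec: Record) -> Record:
    """Run the automated stages, then park the record in staging."""
    for stage in (conform, easter_egg_hunt, measure_quality):
        rec = stage(rec)
    rec.store = Store.STAGING
    return rec


def curator_approve(rec: Record, approved: bool) -> Record:
    """Human in the loop: only approved, unflagged records reach production."""
    if approved and not rec.flags:
        rec.store = Store.PRODUCTION
    return rec
```

The point of the sketch is the gate at the end: nothing moves from staging to production without the curator, which is also where an AI auto-edit step could later plug in.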







Data Issue & Ticketing Process Model

The following is a general baseline process for managing data problems. The main point is to separate reporting an issue from issuing a ticket. Many people can report an issue, but only one ticket should be created for a given field in a given release. State management and workflow are critical as well.
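As a rough sketch of the "many reports, one ticket" rule, the Python below keys tickets on entity, field, and release, so repeated reports attach to the existing ticket instead of opening a new one. The class names, the three states, and the allowed transitions are assumptions made for illustration; the actual workflow may have more states.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import Dict, List, Tuple


class TicketState(Enum):
    OPEN = "open"
    IN_REVIEW = "in_review"
    RESOLVED = "resolved"


# Assumed workflow: which states each state may move to.
TRANSITIONS = {
    TicketState.OPEN: {TicketState.IN_REVIEW},
    TicketState.IN_REVIEW: {TicketState.OPEN, TicketState.RESOLVED},
    TicketState.RESOLVED: set(),
}


@dataclass
class Ticket:
    entity_id: str
    field_name: str
    release: str
    state: TicketState = TicketState.OPEN
    reporters: List[str] = field(default_factory=list)

    def move_to(self, new_state: TicketState) -> None:
        """Change state, rejecting transitions the workflow does not allow."""
        if new_state not in TRANSITIONS[self.state]:
            raise ValueError(f"cannot go from {self.state.value} to {new_state.value}")
        self.state = new_state


class TicketBoard:
    """Holds at most one ticket per (entity, field, release)."""

    def __init__(self) -> None:
        self._tickets: Dict[Tuple[str, str, str], Ticket] = {}

    def report_issue(self, entity_id: str, field_name: str,
                     release: str, reporter: str) -> Ticket:
        """Record a report; create the ticket only if none exists yet."""
        key = (entity_id, field_name, release)
        ticket = self._tickets.get(key)
        if ticket is None:
            ticket = Ticket(entity_id, field_name, release)
            self._tickets[key] = ticket
        ticket.reporters.append(reporter)
        return ticket

    def __len__(self) -> int:
        return len(self._tickets)
```

For example, two people reporting the synopsis of the same movie for the same release end up on one ticket, while a report against the title of that movie opens a second one.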



Data Issue & Ticketing Data Model

The following is a general baseline data model for tracking issues and tickets. Mapping each issue and ticket back to the entity and field is a critical point. Tracking the original value in the ticket is also important, and not just for posterity: we can use it in machine learning to detect issues before data is even released to the public or reaches the curators' review process.
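One way to picture the mapping and the original-value tracking is the small Python sketch below. The record shapes (FieldRef, IssueRecord, TicketRecord) and the labeling scheme are hypothetical, chosen only to show how resolved tickets could become labeled examples for a model; they are not the actual data model.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class FieldRef:
    """Points back to the exact entity and field a problem concerns."""
    entity_type: str  # e.g. "talent", "series", "movie"
    entity_id: str
    field_name: str


@dataclass
class IssueRecord:
    """One person's report; many of these can point at the same field."""
    ref: FieldRef
    reporter: str
    description: str


@dataclass
class TicketRecord:
    """The single unit of work for a field in a release."""
    ref: FieldRef
    release: str
    original_value: str                    # value at the time the ticket was opened
    corrected_value: Optional[str] = None  # filled in once the curator fixes it


def training_examples(tickets: List[TicketRecord]) -> List[Tuple[str, str, int]]:
    """Turn resolved tickets into (field, value, label) rows.

    Label 1 marks a value that was reported and replaced; label 0 marks the
    curator's correction. Unresolved tickets are skipped.
    """
    rows = []
    for ticket in tickets:
        if ticket.corrected_value is None:
            continue
        rows.append((ticket.ref.field_name, ticket.original_value, 1))
        rows.append((ticket.ref.field_name, ticket.corrected_value, 0))
    return rows
```

Without the original value on the ticket, only the corrected rows would survive, and there would be nothing for a model to learn the bad cases from.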




Media Data Model

The following is a general baseline data model for media data. Please see this post for further details.




