Netflix's Essential Suite: An Artwork Production Assistant



Essential Suite — Artwork Producer Assistant


Gingo 白果

Netflix Technology Blog


Feb 7, 2020 · 8 min read

By: Hamid Shahid & Syed Haq

Introduction

Netflix continues to invest in content for a global audience with a diverse range of unique tastes and interests. Correspondingly, the member experience must also evolve to connect this global audience to the content that most appeals to each of them. Images that represent titles on Netflix (what we at Netflix call “artwork”) have proven to be one of the most effective ways to help our members discover the content they love to watch. We thus need to have a rich and diverse set of artwork that is tailored for different parts of the Netflix experience (what we call product canvases). We also need to source multiple images for each title representing different themes so we can present an image that is relevant to each member’s taste.


Manual curation and review of these high quality images from scratch for a growing catalog of titles can be particularly challenging for our Product Creative Strategy Producers (referred to as producers in the rest of the article). Below, we discuss how we’ve built upon our previous work of harvesting static images directly from video source files and our computer vision algorithms to produce a set of artwork candidates that covers the major product canvases for the entire content catalog. The artwork generated by this pipeline is used to augment the artwork typically sourced from design agencies. We call this suite of assisted artwork “The Essential Suite”.


Supplement, not replace

Producers from our Creative Production team are the ultimate decision makers when it comes to the selection of artwork that gets published for each title. Our use of computer vision to generate artwork candidates from video sources is thus focused on alleviating the workload for our Creative Production team. The team would rather spend its time on creative and strategic tasks than sift through thousands of frames of a show looking for the most compelling ones. With the “Essential Suite”, we are providing an additional tool in the producers’ toolkit. Through testing we have learned that, with proper checks and human curation in place, assisted artwork candidates can perform on par with agency-designed artwork.


Design Agencies

Netflix uses best-in-class design agencies to provide artwork that can be used to promote titles on and off the Netflix service. Netflix producers work closely with design agencies to request, review and approve artwork. All artwork is delivered through a web application made available to the design agencies.


The computer-generated artwork can be considered artwork provided by an “internal agency”. The idea is to generate artwork candidates from video source files and “bubble them up” to the producers on the same artwork portal where they review all other artwork. Ideally, producers do not know whether a given piece is agency-produced or internally curated, so they select what goes on the product purely on the creative quality of the image.


Assisted Artwork Generation Workflow

The artwork generation process involves several steps, starting with the arrival of the video source files and culminating in generated artwork being made available to producers. We use Netflix Conductor, an open source workflow engine, to run the orchestration. The whole process can be divided into two parts:


  1. Generation
  2. Review

1. Generation

This article on AVA provides a good explanation of our technology for extracting interesting images from video source files. The artwork generation workflow takes it a step further. For a given product canvas, it selects, from the hundreds of video stills, a handful of images most suitable for that particular canvas. The workflow then crops and color-corrects the selected image, picks out the best spot to place the movie’s title based on negative space, selects and resizes the movie title and places it onto the image.
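To make the idea of title placement in negative space concrete, here is a minimal sketch. It is not the workflow's actual algorithm, which the article does not describe. It assumes the image is available as a grid of grayscale values and uses a stand-in measure of "busyness" (the sum of differences between neighboring pixels) to find the calmest window of the size the title needs. The names `busyness` and `find_negative_space` are hypothetical.

```python
from typing import List, Tuple

Gray = List[List[int]]  # assumed input: rows of grayscale pixel values


def busyness(gray: Gray, left: int, top: int, width: int, height: int) -> int:
    """Sum of absolute differences between adjacent pixels inside the window."""
    total = 0
    for y in range(top, top + height):
        for x in range(left, left + width):
            if x + 1 < left + width:   # horizontal neighbor inside the window
                total += abs(gray[y][x] - gray[y][x + 1])
            if y + 1 < top + height:   # vertical neighbor inside the window
                total += abs(gray[y][x] - gray[y + 1][x])
    return total


def find_negative_space(gray: Gray, width: int, height: int) -> Tuple[int, int]:
    """Return (left, top) of the least busy window of the requested size."""
    rows, cols = len(gray), len(gray[0])
    if width > cols or height > rows:
        raise ValueError("title box is larger than the image")
    best_score, best_pos = None, (0, 0)
    for top in range(rows - height + 1):
        for left in range(cols - width + 1):
            score = busyness(gray, left, top, width, height)
            if best_score is None or score < best_score:
                best_score, best_pos = score, (left, top)
    return best_pos


if __name__ == "__main__":
    # Toy image: a noisy left half and a flat right half.
    image = [
        [0, 255, 0, 100, 100, 100],
        [255, 0, 255, 100, 100, 100],
        [0, 255, 0, 100, 100, 100],
        [255, 0, 255, 100, 100, 100],
    ]
    print(find_negative_space(image, width=3, height=2))  # (3, 0)
```

A production system would more plausibly combine such a measure with face and object detections and the canvas layout. The sketch only shows the shape of the search.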


Here is an illustration of what these steps would look like if done manually:

a. Image selection

b. Identify areas of interest


c. Cropped, color-corrected & title placed in the negative space


Image Selection / Analyze Image

Selection of the right still image is essential to generating good quality artwork. A lot of work has already been done in AVA to extract a few hundred frames from the hundreds of thousands of frames present in a typical video source. Broadly speaking, we use two methods to extract movie stills from a video source.


  1. AVA: AVA is primarily a character-based algorithm. It picks frames with a clear facial shot, taking into account actors, facial expression and shot detection.
  2. Cinematics: Cinematics picks aesthetically pleasing cinematic shots.

The combination of these two approaches produces a few hundred movie stills from a typical video source. For a season, this would be a few hundred shots for each episode. Our work here is to pick the stills that work best for the desired canvas.

Both of the above algorithms use a few heatmaps, which define what kinds of images have proven to work best in different canvases. The heatmaps are designed by internal artists who are experienced in designing promotional artwork/posters.
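The article does not say how the heatmaps are encoded or applied, so the following is only a sketch under stated assumptions. It assumes a heatmap is a grid of non-negative weights marking where the subject of an image should sit for a given canvas, and it scores a candidate by the share of heatmap weight covered by the subject's bounding box. The function name `heatmap_score` and the box format are hypothetical.

```python
from typing import List, Tuple

Heatmap = List[List[float]]      # assumed: rows of non-negative weights
Box = Tuple[int, int, int, int]  # assumed: (left, top, right, bottom), right/bottom exclusive


def heatmap_score(heatmap: Heatmap, subject_box: Box) -> float:
    """Fraction of the total heatmap weight that falls inside the subject box."""
    left, top, right, bottom = subject_box
    total = sum(sum(row) for row in heatmap)
    if total == 0:
        return 0.0
    inside = sum(
        heatmap[y][x]
        for y in range(top, bottom)
        for x in range(left, right)
    )
    return inside / total


if __name__ == "__main__":
    # Toy heatmap for a right-aligned canvas: weight only on the right half.
    right_aligned = [
        [0.0, 0.0, 1.0, 1.0],
        [0.0, 0.0, 1.0, 1.0],
    ]
    print(heatmap_score(right_aligned, (2, 0, 4, 2)))  # subject on the right
    print(heatmap_score(right_aligned, (0, 0, 2, 2)))  # subject on the left
```

Under this assumption, a still whose subject sits on the right scores higher for a right-aligned canvas than one whose subject sits on the left.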



Heatmap for a Billboard

We make use of meta-information such as the size of the desired canvas, the “unsafe regions” and the “regions of interest” to identify which image would serve best. “Unsafe regions” are areas in the image where badges such as the Netflix logo, new episodes, etc., are placed. “Regions of interest” are areas that are always displayed in multi-purpose canvases. These details are stored as metadata for each canvas type and passed to the algorithm by the workflow. Some of our canvases are cropped dynamically for different user interfaces. For such images, the “region of interest” is the area that is always displayed in each crop.
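The two region types can be sketched with plain rectangles. This is an illustrative model, not the metadata format used by the workflow. It assumes each dynamic crop and each badge area is an axis-aligned rectangle in canvas pixels, computes the region of interest as the intersection of all crops (the area shown in every crop), and checks whether a subject's bounding box collides with an unsafe region. `Rect`, `region_of_interest` and `overlaps_unsafe` are hypothetical names.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class Rect:
    left: int
    top: int
    right: int   # exclusive
    bottom: int  # exclusive

    def intersect(self, other: "Rect") -> Optional["Rect"]:
        """Return the overlapping rectangle, or None if there is no overlap."""
        left, top = max(self.left, other.left), max(self.top, other.top)
        right, bottom = min(self.right, other.right), min(self.bottom, other.bottom)
        if left >= right or top >= bottom:
            return None
        return Rect(left, top, right, bottom)


def region_of_interest(crops: Sequence[Rect]) -> Optional[Rect]:
    """Area that is visible in every crop, or None if the crops share no area."""
    if not crops:
        return None
    roi: Optional[Rect] = crops[0]
    for crop in crops[1:]:
        roi = roi.intersect(crop)
        if roi is None:
            return None
    return roi


def overlaps_unsafe(subject: Rect, unsafe_regions: Sequence[Rect]) -> bool:
    """True if the subject box touches any badge area."""
    return any(subject.intersect(region) is not None for region in unsafe_regions)


if __name__ == "__main__":
    crops = [Rect(0, 0, 800, 400), Rect(200, 0, 1000, 400)]  # two UI crops
    logo_badge = Rect(0, 0, 250, 80)                         # top-left badge
    print(region_of_interest(crops))
    print(overlaps_unsafe(Rect(300, 100, 600, 350), [logo_badge]))
```

Keeping these rectangles as per-canvas data rather than code means a new canvas only needs new metadata.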


Unsafe regions

This data-driven approach allows for fast turnaround for additional canvases. While selecting images, the algorithms also return suggested coordinates within each image for cropping and title placement. Finally, they associate a “score” with each selected image. This score is the algorithm’s “confidence” in how well the candidate image could perform on the service, based on previously collected stats.


Image Creation

The artwork generation workflow collates image selection results from each video source and picks the top “n” images based on confidence score.
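The collation step can be sketched as a top-n selection over the pooled candidates. The record shape below (`Candidate` with a source id, frame index and score) is an assumption for illustration; the article only states that each selected image carries a confidence score.

```python
import heapq
from dataclasses import dataclass
from typing import Dict, List


@dataclass(frozen=True)
class Candidate:
    source_id: str    # hypothetical: which video source (e.g. episode) it came from
    frame_index: int  # hypothetical: position of the still in that source
    score: float      # confidence reported by the selection algorithm


def top_candidates(results_by_source: Dict[str, List[Candidate]], n: int) -> List[Candidate]:
    """Pool candidates from every source and keep the n highest-scoring ones."""
    pooled = (c for candidates in results_by_source.values() for c in candidates)
    return heapq.nlargest(n, pooled, key=lambda c: c.score)


if __name__ == "__main__":
    results = {
        "episode-1": [Candidate("episode-1", 10, 0.4), Candidate("episode-1", 55, 0.9)],
        "episode-2": [Candidate("episode-2", 7, 0.7)],
    }
    for candidate in top_candidates(results, n=2):
        print(candidate)
```

`heapq.nlargest` returns the results in descending score order and avoids sorting the full pool, which matters when a season contributes a few hundred stills per episode.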


The selected image is then cropped and color-corrected based on coordinates passed by the algorithm. Some canvases also need the movie title to be placed on the image. The process makes use of the heatmap provided by our designers to perform cropping and title placement. As an example, the “Billboard” canvas shown on a movie’s landing page is right aligned, with the title and synopsis shown on the left.


Billboard Canvas

The workers that crop and color-correct images are made available as separate Titus jobs. The workflow invokes the jobs, stores each output in the artwork asset management system and passes it on for review.


2. Review

For each artwork candidate generated by the workflow, we want to get as much feedback as possible from the Creative Production team because they have the most context about the title. However, getting producers to provide feedback on hundreds of generated images is not scalable. For this reason, we have split the review process in two rounds.
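The two-round structure can be sketched as a simple filter chain: a cheap first round removes candidates before the second, more expensive round ever sees them, and every rejection is kept with its reasons. The reviewer callables and the `Review` record are hypothetical stand-ins for human decisions captured by the review applications.

```python
from dataclasses import dataclass, field
from typing import Callable, List, Tuple


@dataclass
class Review:
    approved: bool
    reasons: List[str] = field(default_factory=list)  # filled in on rejection


Reviewer = Callable[[str], Review]       # takes a candidate id, returns a decision
Rejection = Tuple[str, str, List[str]]   # (candidate id, round name, reasons)


def two_round_review(
    candidates: List[str], first_round: Reviewer, second_round: Reviewer
) -> Tuple[List[str], List[Rejection]]:
    """Run the second round only on candidates that pass the first round."""
    approved: List[str] = []
    rejections: List[Rejection] = []
    for candidate in candidates:
        first = first_round(candidate)
        if not first.approved:
            rejections.append((candidate, "first", first.reasons))
            continue
        second = second_round(candidate)
        if second.approved:
            approved.append(candidate)
        else:
            rejections.append((candidate, "second", second.reasons))
    return approved, rejections


if __name__ == "__main__":
    def first(c: str) -> Review:
        return Review(True) if c != "img-2" else Review(False, ["eyes closed"])

    def second(c: str) -> Review:
        return Review(True) if c != "img-3" else Review(False, ["spoiler"])

    print(two_round_review(["img-1", "img-2", "img-3"], first, second))
```

Recording the round and reasons for each rejection keeps the feedback usable later, for example as labels for improving the selection algorithms.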


Technical Quality Control (QC)

This round of review filters out images that look obviously wrong to the human eye. Images with features such as human actors with an open mouth, inappropriate facial expressions or an incorrect body position, etc., are filtered out in this round.


To review these images, we use a video/image annotation application that provides a simple interface to add tags for a given list of videos or images. For each image, we ask the very basic question: “Should this image be used for artwork?”



The team reviewing these assets treats each image individually and looks only at technical aspects of the image, regardless of the theme or genre of the title, or the quantity of images presented for a given title.


When an image is rejected, a few follow up questions are asked to ascertain why the image is not suitable to be used as artwork.


All this review data is fed back to the image selection, cropping and color correction algorithms to train and improve them.


Editorial QC

Unlike technical QC, which is title agnostic, editorial QC is done by producers who are deeply familiar with the themes, storylines and characters in the title, to select artwork that will represent the title best on the Netflix service.


The application used to review generated artwork is the same application that producers use to place and review artwork requests fulfilled by design agencies. A screenshot of how generated artwork is presented to producers is shown below.


Similar to technical QC, the option here for each artwork is whether to approve or reject the artwork. The producers are encouraged to provide reasons why they are rejecting an artwork.


Approved artwork makes its way to the artwork’s asset management system, where it resides alongside other agency-fulfilled artwork. From here, producers have the ability to publish it to the Netflix service.


Conclusion

We have learned a lot from our work on generating artwork. Artwork that looks good might not be the best depiction of the title’s story, and a very clear character image might be a content spoiler. All of these decisions are best made by humans and we intend to keep it that way.

However, assisted artwork generation has a place in supporting our creative team by providing them with another avenue to pick up their assets from, and with careful supervision will help in their challenge of sourcing artwork at scale.