Google使用GAN创建怪兽

AI画图  收藏
0 / 565

Using GANs to Create Fantastical Creatures

Tuesday, November 17, 2020

Posted by Andeep Singh Toor, Stadia Software Engineer and Fred Bertsch, Software Engineer, Google Research, Brain TeamCreating art for digital video games takes a high degree of artistic creativity and technical knowledge, while also requiring game artists to quickly iterate on ideas and produce a high volume of assets, often in the face of tight deadlines. What if artists had a paintbrush that acted less like a tool and more like an assistant? A machine learning model acting as such a paintbrush could reduce the amount of time necessary to create high-quality art without sacrificing artistic choices, perhaps even enhancing creativity.

Today, we present Chimera Painter, a trained machine learning (ML) model that automatically creates a fully fleshed out rendering from a user-supplied creature outline. Employed as a demo application, Chimera Painter adds features and textures to a creature outline segmented with body part labels, such as “wings” or “claws”, when the user clicks the “transform” button. Below is an example using the demo with one of the preset creature outlines.
作者:Stadia软件工程师Andeep Singh Toor和Brain Team Google Research软件工程师Fred Bertsch为数字视频游戏创作艺术需要高度的艺术创造力和技术知识,同时还要求游戏美术师经常迭代创意并产生大量资产(通常面对紧迫的期限)。如果艺术家使用的画笔不像工具,而更像助手,该怎么办?充当此类画笔的机器学习模型可以减少创建高质量艺术所需的时间,而不会牺牲艺术选择,甚至可能增强创造力。

今天,我们介绍Chimera Painter,这是一种训练有素的机器学习(ML)模型,该模型可以根据用户提供的生物轮廓自动创建完全充实的渲染。作为演示应用程序,当用户单击“变换”按钮时,Chimera Painter会将特征和纹理添加到按身体部位标签(例如“机翼”或“爪”)分割的生物轮廓。以下是使用带有预设生物轮廓之一的演示的示例。


Using an image imported to Chimera Painter or generated with the tools provided, an artist can iteratively construct or modify a creature outline and use the ML model to generate realistic looking surface textures. In this example, an artist (Lee Dotson) customizes one of the creature designs that comes pre-loaded in the Chimera Painter demo.
使用导入到Chimera Painter或使用提供的工具生成的图像,艺术家可以迭代地构造或修改生物轮廓,并使用ML模型生成逼真的表面纹理。在此示例中,艺术家(李·多森)(Lee Dotson)自定义了Chimera Painter演示中预先加载的其中一种生物设计。

In this post, we describe some of the challenges in creating the ML model behind Chimera Painter and demonstrate how one might use the tool for the creation of video game-ready assets.
在这篇文章中,我们描述了在Chimera Painter背后创建ML模型时遇到的一些挑战,并演示了如何使用该工具来创建可用于视频游戏的资产。

Prototyping for a New Type of Model 新型模型的原型
In developing an ML model to produce video-game ready creature images, we created a digital card game prototype around the concept of combining creatures into new hybrids that can then battle each other. In this game, a player would begin with cards of real-world animals (e.g., an axolotl or a whale) and could make them more powerful by combining them (making the dreaded Axolotl-Whale chimera). This provided a creative environment for demonstrating an image-generating model, as the number of possible chimeras necessitated a method for quickly designing large volumes of artistic assets that could be combined naturally, while still retaining identifiable visual characteristics of the original creatures.
在开发ML模型以生成可用于视频游戏的生物图像时,我们围绕将生物组合成新的可相互竞争的混合体的概念,创建了一个数字纸牌游戏原型。在此游戏中,玩家将以现实世界中的动物(例如,x或鲸)的牌开始,并可以通过组合它们(使它们成为可怕的Axolotl-Whale嵌合体)来使其更加强大。这提供了一个演示图像生成模型的创造性环境,因为可能的嵌合体数量需要一种方法来快速设计可自然组合的大量艺术资产,同时仍保留原始生物的可识别视觉特征。

Since our goal was to create high-quality creature card images guided by artist input, we experimented with generative adversarial networks (GANs), informed by artist feedback, to create creature images that would be appropriate for our fantasy card game prototype. GANs pair two convolutional neural networks against each other: a generator network to create new images and a discriminator network to determine if these images are samples from the training dataset (in this case, artist-created images) or not. We used a variant called a conditional GAN, where the generator takes a separate input to guide the image generation process. Interestingly, our approach was a strict departure from other GAN efforts, which typically focus on photorealism.
由于我们的目标是在艺术家的指导下创建高质量的生物卡片图像,因此我们在艺术家反馈的指导下尝试了生成对抗网络(GAN),以创建适合于我们的幻想卡片游戏原型的生物图像。GAN将两个卷积神经网络相互配对:用于生成新图像的生成器网络和用于确定这些图像是否是训练数据集中的样本(在这种情况下,是由艺术家创建的图像)的鉴别器网络。我们使用了一种称为条件GAN的变体,其中生成器采用单独的输入来指导图像生成过程。有趣的是,我们的方法严格偏离了其他GAN的工作,后者通常侧重于写实主义。

To train the GANs, we created a dataset of full color images with single-species creature outlines adapted from 3D creature models. The creature outlines characterized the shape and size of each creature, and provided a segmentation map that identified individual body parts. After model training, the model was tasked with generating multi-species chimeras, based on outlines provided by artists. The best performing model was then incorporated into Chimera Painter. Below we show some sample assets generated using the model, including single-species creatures, as well as the more complex multi-species chimeras.
为了训练GAN,我们创建了全彩色图像的数据集,其中具有根据3D生物模型改编的单物种生物轮廓。该生物轮廓线描绘了每个生物的形状和大小,并提供了可识别各个身体部位的分割图。经过模型训练后,模型的任务是根据艺术家提供的轮廓生成多物种的嵌合体。然后将性能最好的模型合并到Chimera Painter中。下面,我们显示了使用该模型生成的一些示例资产,包括单物种生物以及更复杂的多物种嵌合体。


Generated card art integrated into the card game prototype showing basic creatures (bottom row) and chimeras from multiple creatures, including an Antlion-Porcupine, Axolotl-Whale, and a Crab-Antion-Moth (top row). More info about the game itself is detailed in this Stadia Research presentation. |
生成的纸艺已集成到纸牌游戏原型中,其中显示了基本生物(下排)和来自多个生物的嵌合体,包括蚂蚁-豪猪,A,鲸鱼和螃蟹-Antion-飞蛾(上排)。在Stasta Research的演讲中,详细介绍了有关游戏本身的更多信息。

Learning to Generate Creatures with Structure 学习生成具有结构的生物
使用GAN生成生物时,一个问题是,渲染图像的细微或低对比度部分时,可能会失去解剖和空间连贯性,尽管这些对人类而言具有很高的感知重要性。这样的示例可以包括眼睛,手指,甚至在具有相似纹理的重叠身体部位之间进行区分(请参阅下面的亲切名称BoggleDog)。
An issue with using GANs for generating creatures was the potential for loss of anatomical and spatial coherence when rendering subtle or low-contrast parts of images, despite these being of high perceptual importance to humans. Examples of this can include eyes, fingers, or even distinguishing between overlapping body parts with similar textures (see the affectionately named BoggleDog below).


|GAN-generated image showing mismatched body parts.
| GAN生成的图像显示了不匹配的身体部位。 |

Generating chimeras required a new non-photographic fantasy-styled dataset with unique characteristics, such as dramatic perspective, composition, and lighting. Existing repositories of illustrations were not appropriate to use as datasets for training an ML model, because they may be subject to licensing restrictions, have conflicting styles, or simply lack the variety needed for this task.

To solve this, we developed a new artist-led, semi-automated approach for creating an ML training dataset from 3D creature models, which allowed us to work at scale and rapidly iterate as needed. In this process, artists would create or obtain a set of 3D creature models, one for each creature type needed (such as hyenas or lions). Artists then produced two sets of textures that were overlaid on the 3D model using the Unreal Engine — one with the full color texture (left image, below) and the other with flat colors for each body part (e.g., head, ears, neck, etc), called a “segmentation map” (right image, below). This second set of body part segments was given to the model at training to ensure that the GAN learned about body part-specific structure, shapes, textures, and proportions for a variety of creatures.
生成嵌合体需要一个新的非摄影幻想风格的数据集,该数据集必须具有独特的特征,例如戏剧性的视角,构图和照明。现有的插图存储库不适合用作训练ML模型的数据集,因为它们可能会受到许可限制,样式冲突或仅缺少此任务所需的多样性。

为了解决这个问题,我们开发了一种新的艺术家主导的半自动化方法,用于从3D生物模型创建ML训练数据集,这使我们能够进行大规模工作并根据需要快速迭代。在此过程中,艺术家将创建或获得一组3D生物模型,每种所需的生物类型(例如鬣狗或狮子)都应建立一个模型。然后,艺术家使用虚幻引擎制作了两套叠加在3D模型上的纹理—一种具有全彩色纹理(下图,左图),另一种具有每个身体部位(例如头,耳,颈等)的单色,称为“分割图”(下图,右图)。在训练中将第二部分身体部位细分提供给模型,以确保GAN了解到各种生物特定于身体部位的结构,形状,纹理和比例。

Example dataset training image and its paired segmentation map.

The 3D creature models were all placed in a simple 3D scene, again using the Unreal Engine. A set of automated scripts would then take this 3D scene and interpolate between different poses, viewpoints, and zoom levels for each of the 3D creature models, creating the full color images and segmentation maps that formed the training dataset for the GAN. Using this approach, we generated 10,000+ image + segmentation map pairs per 3D creature model, saving the artists millions of hours of time compared to creating such data manually (at approximately 20 minutes per image).
再次使用虚幻引擎,将3D生物模型都放置在简单的3D场景中。然后,一组自动化脚本将采用此3D场景,并在每个3D生物模型的不同姿势,视点和缩放级别之间进行插值,以创建全色图像和分割图,从而形成GAN的训练数据集。使用这种方法,我们为每个3D生物模型生成了10,000+张图像+分割图对,与手动创建此类数据(每张图像大约20分钟)相比,为艺术家节省了数百万小时的时间。

Fine Tuning 微调

The GAN had many different hyper-parameters that could be adjusted, leading to different qualities in the output images. In order to better understand which versions of the model were better than others, artists were provided samples for different creature types generated by these models and asked to cull them down to a few best examples. We gathered feedback about desired characteristics present in these examples, such as a feeling of depth, style with regard to creature textures, and realism of faces and eyes. This information was used both to train new versions of the model and, after the model had generated hundreds of thousands of creature images, to select the best image from each creature category (e.g., gazelle, lynx, gorilla, etc).

We tuned the GAN for this task by focusing on the perceptual loss. This loss function component (also used in Stadia’s Style Transfer ML) computes a difference between two images using extracted features from a separate convolutional neural network (CNN) that was previously trained on millions of photographs from the ImageNet dataset. The features are extracted from different layers of the CNN and a weight is applied to each, which affects their contribution to the final loss value. We discovered that these weights were critically important in determining what a final generated image would look like. Below are some examples from the GAN trained with different perceptual loss weights.
GAN具有许多可以调整的超参数,从而导致输出图像的质量不同。为了更好地了解该模型的哪个版本比其他版本更好,向艺术家提供了这些模型生成的不同生物类型的样本,并要求他们将其归纳为几个最佳示例。我们收集了有关这些示例中存在的所需特征的反馈,例如深度感,有关生物纹理的样式以及面部和眼睛的真实感。此信息既用于训练模型的新版本,又用于模型生成成千上万的生物图像之后,从每个生物类别(例如瞪羚,山猫,大猩猩等)中选择最佳图像。

我们通过关注感知损失来调整GAN来完成此任务。该损失函数组件(也用于Stadia的Style Transfer ML中)使用从单独的卷积神经网络(CNN)提取的特征来计算两幅图像之间的差异,该卷积神经网络先前已对ImageNet数据集中的数百万张照片进行了训练。从CNN的不同层提取特征,并对每个特征施加权重,这会影响特征对最终损耗值的贡献。我们发现这些权重对于确定最终生成的图像是至关重要的。以下是来自GAN的一些示例,这些示例使用不同的感知损失权重进行了训练。

Dino-Bat Chimeras generated using varying perceptual loss weights.

Some of the variation in the images above is due to the fact that the dataset includes multiple textures for each creature (for example, a reddish or grayish version of the bat). However, ignoring the coloration, many differences are directly tied to changes in perceptual loss values. In particular, we found that certain values brought out sharper facial features (e.g., bottom right vs. top right) or “smooth” versus “patterned” (top right vs. bottom left) that made generated creatures feel more real.

Here are some creatures generated from the GAN trained with different perceptual loss weights, showing off a small sample of the outputs and poses that the model can handle.

上面图像中的某些变化是由于数据集为每个生物包括多个纹理(例如,蝙蝠的带红色或带灰色的版本)而造成的。然而,忽略颜色,许多差异与感知损失值的变化直接相关。特别是,我们发现某些值带来了更鲜明的面部特征(例如,右下与右上)或“平滑”与“有图案”(右上与左下),使生成的生物感觉更真实。

这是从GAN训练产生的一些生物,它们具有不同的感知损失权重,展示了模型可以处理的输出样本和姿势的一小部分。

Creatures generated using different models.
A generated chimera (Dino-Bat-Hyena, to be exact) created using the conditional GAN. Output from the GAN (left) and the post-processed / composited card (right).

Chimera Painter
The trained GAN is now available in the Chimera Painter demo, allowing artists to work iteratively with the model, rather than drawing dozens of similar creatures from scratch. An artist can select a starting point and then adjust the shape, type, or placement of creature parts, enabling rapid exploration and for the creation of a large volume of images. The demo also allows for uploading a creature outline created in an external program, like Photoshop. Simply download one of the preset creature outlines to get the colors needed for each creature part and use this as a template for drawing one outside of Chimera Painter, and then use the “Load’ button on the demo to use this outline to flesh out your creation.

It is our hope that these GAN models and the Chimera Painter demonstration tool might inspire others to think differently about their art pipeline. What can one create when using machine learning as a paintbrush?
Chimera Painter
现在可以在Chimera Painter演示中使用经过训练的GAN ,允许艺术家迭代使用该模型,而不必从头开始绘制数十种相似的生物。艺术家可以选择一个起点,然后调整生物零件的形状,类型或位置,从而可以快速探索并创建大量图像。该演示还允许上传在外部程序(如Photoshop)中创建的生物轮廓。只需下载预设的生物轮廓之一以获得每个生物部分所需的颜色,并将其用作在Chimera Painter外部绘制一个的模板,然后使用演示中的“加载”按钮使用该轮廓充实您的创建。

我们希望这些GAN模型和Chimera Painter演示工具可以激发其他人对他们的艺术创作思路有所不同。使用机器学习作为画笔时,可以创建什么?

Acknowledgments
This project is conducted in collaboration with many people. Thanks to Ryan Poplin, Lee Dotson, Trung Le, Monica Dinculescu, Marc Destefano, Aaron Cammarata, Maggie Oh, Richard Wu, Ji Hun Kim, Erin Hoffman-John, and Colin Boswell. Thanks to everyone who pitched in to give hours of art direction, technical feedback, and drawings of fantastic creatures.

致谢
此项目是与许多人合作进行的。感谢Ryan Poplin,Lee Dotson,Trung Le,Monica Dinculescu,Marc Destefano,Aaron Cammarata,Maggie Oh,Richard Wu,Ji Hun Kim,Eri​​n Hoffman-John和Colin Boswell。感谢所有参与其中的人,他们提供了数小时的艺术指导,技术反馈以及奇幻生物的绘画。