Using GANs to Create Fantastical Creatures

Tuesday, November 17, 2020

Posted by Andeep Singh Toor, Stadia Software Engineer and Fred Bertsch, Software Engineer, Google Research, Brain Team

Creating art for digital video games takes a high degree of artistic creativity and technical knowledge, while also requiring game artists to quickly iterate on ideas and produce a high volume of assets, often in the face of tight deadlines. What if artists had a paintbrush that acted less like a tool and more like an assistant? A machine learning model acting as such a paintbrush could reduce the amount of time necessary to create high-quality art without sacrificing artistic choices, perhaps even enhancing creativity.

Today, we present Chimera Painter, a trained machine learning (ML) model that automatically creates a fully fleshed out rendering from a user-supplied creature outline. Employed as a demo application, Chimera Painter adds features and textures to a creature outline segmented with body part labels, such as “wings” or “claws”, when the user clicks the “transform” button. Below is an example using the demo with one of the preset creature outlines.

Using an image imported to Chimera Painter or generated with the tools provided, an artist can iteratively construct or modify a creature outline and use the ML model to generate realistic looking surface textures. In this example, an artist (Lee Dotson) customizes one of the creature designs that comes pre-loaded in the Chimera Painter demo.

In this post, we describe some of the challenges in creating the ML model behind Chimera Painter and demonstrate how one might use the tool for the creation of video game-ready assets.

Prototyping for a New Type of Model
In developing an ML model to produce video-game ready creature images, we created a digital card game prototype around the concept of combining creatures into new hybrids that can then battle each other. In this game, a player would begin with cards of real-world animals (e.g., an axolotl or a whale) and could make them more powerful by combining them (making the dreaded Axolotl-Whale chimera). This provided a creative environment for demonstrating an image-generating model, as the number of possible chimeras necessitated a method for quickly designing large volumes of artistic assets that could be combined naturally, while still retaining identifiable visual characteristics of the original creatures.

Since our goal was to create high-quality creature card images guided by artist input, we experimented with generative adversarial networks (GANs), informed by artist feedback, to create creature images that would be appropriate for our fantasy card game prototype. GANs pair two convolutional neural networks against each other: a generator network to create new images and a discriminator network to determine if these images are samples from the training dataset (in this case, artist-created images) or not. We used a variant called a conditional GAN, where the generator takes a separate input to guide the image generation process. Interestingly, our approach was a strict departure from other GAN efforts, which typically focus on photorealism.
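To make the generator/discriminator pairing concrete, here is a minimal sketch of the two losses in a conditional GAN. The post does not say which loss formulation was used, so this assumes the standard binary cross-entropy form with a non-saturating generator loss. The function names are illustrative, and the discriminator outputs are assumed to be probabilities that an (outline, image) pair is artist-created.

```python
import math


def bce(prob, target):
    """Binary cross-entropy for one predicted probability and a 0/1 target."""
    eps = 1e-12  # guards against log(0)
    prob = min(max(prob, eps), 1.0 - eps)
    return -(target * math.log(prob) + (1.0 - target) * math.log(1.0 - prob))


def discriminator_loss(d_real, d_fake):
    """Assumed formulation: push real pairs toward 1 and generated pairs toward 0.

    d_real: discriminator probabilities for (outline, artist image) pairs.
    d_fake: discriminator probabilities for (outline, generated image) pairs.
    """
    real_term = sum(bce(p, 1.0) for p in d_real) / len(d_real)
    fake_term = sum(bce(p, 0.0) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_loss(d_fake):
    """Non-saturating generator loss: low when the discriminator is fooled."""
    return sum(bce(p, 1.0) for p in d_fake) / len(d_fake)
```

In a conditional setup, the same outline is fed to both networks, so the discriminator judges whether an image is plausible for that particular outline rather than plausible in general.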

To train the GANs, we created a dataset of full color images with single-species creature outlines adapted from 3D creature models. The creature outlines characterized the shape and size of each creature, and provided a segmentation map that identified individual body parts. After model training, the model was tasked with generating multi-species chimeras, based on outlines provided by artists. The best performing model was then incorporated into Chimera Painter. Below we show some sample assets generated using the model, including single-species creatures, as well as the more complex multi-species chimeras.

Generated card art integrated into the card game prototype, showing basic creatures (bottom row) and chimeras from multiple creatures, including an Antlion-Porcupine, an Axolotl-Whale, and a Crab-Antlion-Moth (top row). More info about the game itself is detailed in this Stadia Research presentation.

Learning to Generate Creatures with Structure
An issue with using GANs for generating creatures was the potential for loss of anatomical and spatial coherence when rendering subtle or low-contrast parts of images, despite these being of high perceptual importance to humans. Examples of this can include eyes, fingers, or even distinguishing between overlapping body parts with similar textures (see the affectionately named BoggleDog below).

GAN-generated image showing mismatched body parts.

Generating chimeras required a new non-photographic fantasy-styled dataset with unique characteristics, such as dramatic perspective, composition, and lighting. Existing repositories of illustrations were not appropriate to use as datasets for training an ML model, because they may be subject to licensing restrictions, have conflicting styles, or simply lack the variety needed for this task.

To solve this, we developed a new artist-led, semi-automated approach for creating an ML training dataset from 3D creature models, which allowed us to work at scale and rapidly iterate as needed. In this process, artists would create or obtain a set of 3D creature models, one for each creature type needed (such as hyenas or lions). Artists then produced two sets of textures that were overlaid on the 3D model using the Unreal Engine: one with the full color texture (left image, below) and the other with flat colors for each body part (e.g., head, ears, neck), called a “segmentation map” (right image, below). This second set of body part segments was given to the model at training to ensure that the GAN learned about body part-specific structure, shapes, textures, and proportions for a variety of creatures.


Example dataset training image and its paired segmentation map.
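As an illustration of how a flat-color segmentation map can be turned into model input, the sketch below converts each pixel's color into a one-hot vector over body parts. The palette, the part names, and the one-hot encoding are all assumptions made for this example. The post does not describe the actual colors or the encoding used.

```python
# Hypothetical palette: these colors and part names are illustrative only.
PART_COLORS = {
    (255, 0, 0): "head",
    (0, 255, 0): "ears",
    (0, 0, 255): "neck",
}

# Stable integer id per part, assigned in alphabetical order.
PART_IDS = {name: i for i, name in enumerate(sorted(set(PART_COLORS.values())))}


def segmentation_to_one_hot(seg_image):
    """Convert rows of RGB tuples into per-pixel one-hot vectors.

    Pixels whose color is not in the palette (for example, background)
    become all-zero vectors.
    """
    num_parts = len(PART_IDS)
    encoded = []
    for row in seg_image:
        encoded_row = []
        for pixel in row:
            vector = [0] * num_parts
            part = PART_COLORS.get(tuple(pixel))
            if part is not None:
                vector[PART_IDS[part]] = 1
            encoded_row.append(vector)
        encoded.append(encoded_row)
    return encoded
```

With this encoding, every body part gets its own channel, so the generator can tell "head" pixels from "neck" pixels without relying on how similar their template colors happen to be.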

The 3D creature models were all placed in a simple 3D scene, again using the Unreal Engine. A set of automated scripts would then take this 3D scene and interpolate between different poses, viewpoints, and zoom levels for each of the 3D creature models, creating the full color images and segmentation maps that formed the training dataset for the GAN. Using this approach, we generated more than 10,000 pairs of images and segmentation maps per 3D creature model, saving the artists millions of hours of time compared to creating such data manually (at approximately 20 minutes per image).
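The interpolation step can be pictured as sweeping a small grid of scene parameters. The sketch below is not the Unreal Engine scripting used in the project. It only shows, under assumed parameter names (`pose_blend`, `yaw_degrees`, `zoom`), how linear interpolation over poses, viewpoints, and zoom levels multiplies into a large capture schedule.

```python
from itertools import product


def lerp(start, end, t):
    """Linear interpolation between start and end for t in [0, 1]."""
    return start + (end - start) * t


def steps(count):
    """Return `count` evenly spaced values covering [0, 1]."""
    if count <= 1:
        return [0.0]
    return [i / (count - 1) for i in range(count)]


def capture_schedule(yaw_range, zoom_range, n_pose, n_yaw, n_zoom):
    """Enumerate every combination of pose blend, camera yaw, and zoom.

    All parameter names are hypothetical. Each entry describes one camera
    shot, which would be rendered twice: once with the full color texture
    and once with the flat-color segmentation texture.
    """
    shots = []
    for pose_t, yaw_t, zoom_t in product(steps(n_pose), steps(n_yaw), steps(n_zoom)):
        shots.append({
            "pose_blend": pose_t,
            "yaw_degrees": lerp(yaw_range[0], yaw_range[1], yaw_t),
            "zoom": lerp(zoom_range[0], zoom_range[1], zoom_t),
        })
    return shots
```

Because the counts multiply, even modest settings add up quickly: 25 pose blends, 20 viewpoints, and 20 zoom levels already yield 10,000 shots for a single creature model.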

Fine Tuning

The GAN had many different hyper-parameters that could be adjusted, leading to different qualities in the output images. To understand which versions of the model performed best, artists were provided samples for different creature types generated by these models and asked to cull them down to a few best examples. We gathered feedback about desired characteristics present in these examples, such as a feeling of depth, style with regard to creature textures, and realism of faces and eyes. This information was used both to train new versions of the model and, after the model had generated hundreds of thousands of creature images, to select the best image from each creature category (e.g., gazelle, lynx, gorilla).

We tuned the GAN for this task by focusing on the perceptual loss. This loss function component (also used in Stadia’s Style Transfer ML) computes a difference between two images using extracted features from a separate convolutional neural network (CNN) that was previously trained on millions of photographs from the ImageNet dataset. The features are extracted from different layers of the CNN and a weight is applied to each, which affects their contribution to the final loss value. We discovered that these weights were critically important in determining what a final generated image would look like. Below are some examples from the GAN trained with different perceptual loss weights.

Dino-Bat Chimeras generated using varying perceptual loss weights.

Some of the variation in the images above is due to the fact that the dataset includes multiple textures for each creature (for example, a reddish or grayish version of the bat). However, ignoring the coloration, many differences are directly tied to changes in perceptual loss values. In particular, we found that certain values brought out sharper facial features (e.g., bottom right vs. top right) or shifted textures between “smooth” and “patterned” (top right vs. bottom left), which made generated creatures feel more real.
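The weighted perceptual loss described above can be sketched as a weighted sum of per-layer feature differences. This example assumes the CNN features have already been extracted into flat lists, uses mean squared error as the per-layer difference (the post does not name the distance used), and uses hypothetical layer names.

```python
def mean_squared_difference(a, b):
    """Mean squared difference between two equal-length activation lists."""
    if len(a) != len(b):
        raise ValueError("feature vectors must have the same length")
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def perceptual_loss(features_a, features_b, layer_weights):
    """Weighted sum of per-layer feature differences between two images.

    features_a, features_b: dicts mapping a layer name to that layer's
        flattened activations for each image (assumed precomputed by a
        pretrained CNN).
    layer_weights: dict mapping a layer name to its weight; a weight of
        zero removes that layer from the loss.
    """
    total = 0.0
    for layer, weight in layer_weights.items():
        total += weight * mean_squared_difference(features_a[layer], features_b[layer])
    return total
```

Shifting weight between early and late layers changes which kinds of mismatch the generator is penalized for most, which is one way to read the visual differences between the models above.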

Here are some creatures generated from the GAN trained with different perceptual loss weights, showing off a small sample of the outputs and poses that the model can handle.



Creatures generated using different models.
A generated chimera (Dino-Bat-Hyena, to be exact) created using the conditional GAN. Output from the GAN (left) and the post-processed / composited card (right).

Chimera Painter
The trained GAN is now available in the Chimera Painter demo, allowing artists to work iteratively with the model, rather than drawing dozens of similar creatures from scratch. An artist can select a starting point and then adjust the shape, type, or placement of creature parts, enabling rapid exploration and the creation of a large volume of images. The demo also allows for uploading a creature outline created in an external program, like Photoshop. To do so, download one of the preset creature outlines to get the colors needed for each creature part, use it as a template for drawing an outline outside of Chimera Painter, and then use the “Load” button in the demo to flesh out your creation from that outline.

It is our hope that these GAN models and the Chimera Painter demonstration tool might inspire others to think differently about their art pipeline. What can one create when using machine learning as a paintbrush?

This project was conducted in collaboration with many people. Thanks to Ryan Poplin, Lee Dotson, Trung Le, Monica Dinculescu, Marc Destefano, Aaron Cammarata, Maggie Oh, Richard Wu, Ji Hun Kim, Erin Hoffman-John, and Colin Boswell. Thanks to everyone who pitched in to give hours of art direction, technical feedback, and drawings of fantastic creatures.
