Pretrained Anime StyleGAN2 — Converting to PyTorch and Editing Images with an Encoder
Allen Ng
Jan 31, 2020
Using a pretrained anime StyleGAN2: converting it to PyTorch, tagging the generated images, and using an encoder to modify them.
Recently Gwern released a pretrained StyleGAN2 model for generating anime portraits. The results look quite good, so let’s try it out.
Convert to PyTorch
Since the official implementation of StyleGAN2 depends heavily on GPU machines, getting it to run on CPU can be painful and requires a lot of code modification.
I found a very good repository that can convert pretrained StyleGAN2 weights from TensorFlow to PyTorch, and it reproduces the official FFHQ results in PyTorch.
You can start by playing with the official FFHQ model first, but here we will go straight to the anime model.
After cloning the repository and downloading the model from here, we can simply run
python run_convert_from_tf.py --input=2020-01-11-skylion-stylegan2-animeportraits-networksnapshot-024664.pkl --output checkpoint
The converted model should be stored inside the checkpoint folder.
Let’s try generating some images:
python run_generator.py generate_images --network=checkpoint/Gs.pth --seeds=66,230,389,1518 --truncation_psi=1.0
You should be able to see these 4 images generated under the result folder.
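If you want to sample from the converted model directly in Python rather than through the CLI, a minimal sketch could look like this. Note the assumptions: it presumes Gs.pth deserializes into a full torch.nn.Module via torch.load, and that the generator takes a latent batch directly; the repository may ship its own loader, so check its README for the exact API.

```python
import torch

device = torch.device('cpu')              # no GPU needed after conversion
G = torch.load('checkpoint/Gs.pth', map_location=device)
G.eval()

torch.manual_seed(66)
z = torch.randn(1, 512)                   # 512 is the StyleGAN2 default latent size

with torch.no_grad():
    img = G(z)                            # (1, 3, H, W), values roughly in [-1, 1]

# rescale to [0, 255] uint8 for viewing/saving
img = ((img.clamp(-1, 1) + 1) * 127.5).to(torch.uint8)
```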
Tagging generated images
The idea also comes from Gwern’s website: we can use KichangKim’s DeepDanbooru to tag the generated images.
I am using the v2 model this time. After downloading it, we can see it’s a CNTK model, which is not so friendly to Mac users like me (there is still no official CNTK wheel for macOS, so we can only run it inside Docker).
Okay, I don’t want to spin up Docker, so I will convert the CNTK model to ONNX and run it via onnxruntime.
The conversion code can be found here. The good thing is that both CNTK and ONNX are Microsoft products, so the conversion is very easy. I did it in a Kaggle notebook, which gives me a convenient Linux environment without having to set up Docker.
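For reference, the conversion itself boils down to a couple of lines of CNTK. This is a sketch with illustrative file names, assuming the DeepDanbooru release is a standard CNTK v2 model file:

```python
import cntk as C

# Load the DeepDanbooru CNTK model and re-save it in ONNX format.
model = C.Function.load('deepdanbooru-v2.model')
model.save('deepdanbooru-v2.onnx', format=C.ModelFormat.ONNX)
```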
Let’s use the ONNX model to tag some images!
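A minimal inference sketch with onnxruntime might look like the following. The input size, tensor layout, and tag-file handling follow DeepDanbooru’s conventions but are assumptions here; check the model’s documentation for the exact preprocessing.

```python
import numpy as np
import onnxruntime as ort
from PIL import Image

sess = ort.InferenceSession('deepdanbooru-v2.onnx')
input_name = sess.get_inputs()[0].name

# Load and preprocess one generated image (size/layout are assumptions;
# the CNTK export may expect channel-first NCHW instead of NHWC).
img = Image.open('result/seed0066.png').convert('RGB').resize((512, 512))
x = (np.asarray(img, dtype=np.float32) / 255.0)[np.newaxis, ...]

scores = sess.run(None, {input_name: x})[0][0]   # one confidence per tag

# The tag list ships with the DeepDanbooru model, one tag per line.
tags = [line.strip() for line in open('tags.txt')]
for tag, score in zip(tags, scores):
    if score > 0.5:                              # simple confidence threshold
        print(f'{tag}: {score:.3f}')
```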
OK, it seems to be working, so we can now proceed to batch-tagging the generated images. Here we follow Puzer’s stylegan-encoder and generate the same number of samples (n=20000), though judging from halcy’s previous work, n=6000 may already be enough.
As StyleGAN2 has a similar structure to StyleGAN, we need to store 3 things: the latent vectors *z*, the d-latent vectors *d*, and the tag scores. The detailed code can be found in this notebook. The process can take up to 4–5 hours on a single-GPU machine; if you are running on CPU, try reducing the sample size.
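Conceptually, the loop looks something like the sketch below. Every name in it (the G_mapping/G_synthesis attributes, the tag_image helper wrapping the ONNX step above, the output file names) is an assumption for illustration; the notebook linked above has the actual code.

```python
import numpy as np
import torch

# G is the converted generator from earlier; tag_image wraps the
# onnxruntime call from the previous snippet.
n_samples = 20000                     # or ~6000, per halcy's earlier results
zs, ws, scores = [], [], []

for _ in range(n_samples):
    z = torch.randn(1, 512)
    with torch.no_grad():
        w = G.G_mapping(z)            # the d-latents / w vector
        img = G.G_synthesis(w)
    zs.append(z.numpy())
    ws.append(w.numpy())
    scores.append(tag_image(img))

np.save('zs.npy', np.concatenate(zs))
np.save('dlatents.npy', np.concatenate(ws))
np.save('tag_scores.npy', np.stack(scores))
```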
Train the encoder
A