Choosing Parameters for Beginners
By default, almost any image can be used as either the content or the style input. That said, content and style images with similar shapes, features, colors, and content-to-detail ratios (for example, a painted portrait and a photo of someone, where both figures are roughly the same size with similarly sized backgrounds) have been found to work particularly well.
It is important to ensure that your style image(s) are high quality. The quality of the content image will influence the output image to a varying degree, but the style image(s) you use will determine the quality of your output image. Defects like motion blur in the content image will not result in an output image with motion blur.
- You can see an example of how content image quality affects output images here.
- You can see an example of how style image quality affects output images here.
You can use thousands of style images if you choose, and doing so will not increase memory usage. However, it takes time to load style images into the network, so using more style images will take longer to load.
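Multiple style images are passed as a comma-separated list, and -style_blend_weights can optionally control how strongly each one contributes. The image names and weights below are only illustrative:
python3 neural_style.py -content_image content.jpg -style_image style1.jpg,style2.jpg,style3.jpg -style_blend_weights 6,3,1 -output_image out.png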
Total variation denoising should be set to 0 with -tv_weight 0 if you wish to get as sharp a result as possible. When using the NIN model, or another model that produces noticeable artifacts, you can use a very low TV weight to help remove them.
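For example, a run with the NIN model might keep a very small amount of denoising to suppress its artifacts. The model path, layer names, and TV value below are assumptions to experiment with rather than fixed recommendations:
python3 neural_style.py -content_image content.jpg -style_image style.jpg -model_file models/nin_imagenet.pth -content_layers relu0,relu3,relu7,relu12 -style_layers relu0,relu3,relu7,relu12 -tv_weight 0.00001 -output_image out.png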
For this guide, we are going to use the same parameters for each multiscale generation step, but others have found great success with changing some parameters at each step.
For the basic multiscale generation example, we will add a seed value to make our results repeatable. We will also set -tv_weight to 0, to make sure our results look as sharp as possible. Total variation denoising (TV) works by blurring and smoothing an image, so if you do not want that, make sure it's set to 0.
python3 neural_style.py -seed 876 -tv_weight 0 -output_image out1.png -image_size 512
python3 neural_style.py -seed 876 -tv_weight 0 -output_image out2.png -init image -init_image out1.png -image_size 720
python3 neural_style.py -seed 876 -tv_weight 0 -output_image out3.png -init image -init_image out2.png -image_size 1024
python3 neural_style.py -seed 876 -tv_weight 0 -output_image out4.png -init image -init_image out3.png -image_size 1536
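Note that the commands above omit the content and style image flags for brevity. A complete first step might look like the following, with placeholder image names:
python3 neural_style.py -content_image content.jpg -style_image style.jpg -seed 876 -tv_weight 0 -output_image out1.png -image_size 512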
Choosing the content and style weights is an important balancing act that greatly influences your output image. A higher style weight, such as -content_weight 10 -style_weight 4000, will make the output look a lot more like the style image(s) than weights like -content_weight 50 -style_weight 100.
In addition to the ratio of content to style weights, the values themselves are also important:
- Ex: -content_weight 5 -style_weight 10 will produce a different output than -content_weight 50 -style_weight 100.
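To see the effect of the absolute scale for yourself, you could compare two runs that keep the same 1:2 ratio but differ in magnitude (placeholder image names again):
python3 neural_style.py -content_image content.jpg -style_image style.jpg -content_weight 5 -style_weight 10 -output_image low_scale.png
python3 neural_style.py -content_image content.jpg -style_image style.jpg -content_weight 50 -style_weight 100 -output_image high_scale.png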
You can also disable the content loss module with -content_weight 0. As long as you are also using -init image, the chosen style weight will still be extremely important, as the content image still exists in the network.
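A sketch of such a run, with the content loss disabled but the content image still anchoring the result through -init image (the style weight here is only an example value):
python3 neural_style.py -content_image content.jpg -style_image style.jpg -content_weight 0 -style_weight 1000 -init image -output_image out.png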
In general higher style weights seem to produce better results, as long as the content weight isn't too high as well.
Both the VGG and NIN model architectures are hierarchical, meaning that each layer is connected to the layer above it and the layer below it. Lower layers focus on details like texture and color, higher layers focus on objects, and even higher layers focus on large objects or even the entire scene in the image.
For the purpose of style transfer, the chosen style layers will affect the output image far more than the chosen content layers.
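If you want to experiment with which layers are used, the defaults can be overridden with -content_layers and -style_layers. The layer names below should be the VGG defaults, and adding or removing layers is a matter of experimentation:
python3 neural_style.py -content_image content.jpg -style_image style.jpg -content_layers relu4_2 -style_layers relu1_1,relu2_1,relu3_1,relu4_1,relu5_1 -output_image out.png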
Generally, you want your artwork to look less like the result of a cheap filter and more like a new and wonderful work of art.
There is also the issue of whether your artwork looks unique enough compared to your input images. You can use https://www.tineye.com/ and Google Reverse Image Search to see if they can figure out what your content image is, and then judge how "unique" your artwork is based on that. If you are using personal images that aren't easily found on the internet, then neither TinEye nor Google Reverse Image Search will work for this test.
The -cudnn_autotune flag uses a little extra memory to try to speed things up. You can omit this parameter if you wish to use a bit less GPU memory.
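For example, a run using the cuDNN backend with autotuning enabled might look like this, assuming a CUDA-capable GPU at index 0:
python3 neural_style.py -content_image content.jpg -style_image style.jpg -gpu 0 -backend cudnn -cudnn_autotune -output_image out.png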
You can find other examples of parameters used by others at the following links (using each location's local search feature):