An implementation of neural style in TensorFlow.
This implementation is considerably simpler than many of the others available, thanks to TensorFlow's clean API and automatic differentiation.
TensorFlow doesn't support L-BFGS (which is what the original authors used), so we use Adam. This may require a little bit more hyperparameter tuning to get nice results.
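For readers unfamiliar with Adam, the update rule can be sketched in plain NumPy as below. This is a minimal illustration of the standard algorithm, not the code used by neural_style.py. The function name adam_minimize, the toy quadratic objective, and all hyperparameter values are assumptions chosen for the example.

```python
import numpy as np

def adam_minimize(grad_fn, x0, learning_rate=0.1, beta1=0.9, beta2=0.999,
                  epsilon=1e-8, iterations=1000):
    """Minimize a function with Adam, given a function returning its gradient.

    Hypothetical helper for illustration; the values above are not the
    script's defaults.
    """
    x = np.asarray(x0, dtype=np.float64).copy()
    m = np.zeros_like(x)  # running mean of gradients (first moment)
    v = np.zeros_like(x)  # running mean of squared gradients (second moment)
    for t in range(1, iterations + 1):
        g = grad_fn(x)
        m = beta1 * m + (1 - beta1) * g
        v = beta2 * v + (1 - beta2) * g * g
        # Bias correction compensates for m and v starting at zero.
        m_hat = m / (1 - beta1 ** t)
        v_hat = v / (1 - beta2 ** t)
        x -= learning_rate * m_hat / (np.sqrt(v_hat) + epsilon)
    return x

# Toy objective: f(x) = sum((x - 3)^2), whose gradient is 2 * (x - 3).
result = adam_minimize(lambda x: 2.0 * (x - 3.0), x0=np.zeros(4))
```

The learning rate scales every step directly, which is one reason --learning-rate is among the options worth tuning.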
See here for an implementation of fast (feed-forward) neural style in TensorFlow.
python neural_style.py --content <content file> --styles <style file> --output <output file>
Run python neural_style.py --help to see a list of all options.
Use --checkpoint-output and --checkpoint-iterations to save checkpoint images.
Use --iterations to change the number of iterations (default 1000). For a 512×512 pixel content file, 1000 iterations take 2.5 minutes on a GeForce GTX Titan X GPU, or 90 minutes on an Intel Core i7-5930K CPU.
Running it for 500-2000 iterations seems to produce nice results. With certain images or output sizes, you might need some hyperparameter tuning (especially --content-weight, --style-weight, and --learning-rate).
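To see why these weights interact, it can help to picture the objective as a weighted sum. The sketch below assumes the total loss is simply the content loss and the style loss scaled by their respective weights; the script may include further terms, and the function name and numbers here are hypothetical.

```python
def total_loss(content_loss, style_loss, content_weight, style_weight):
    """Combine the two loss terms (illustrative sketch, not the script's code).

    Raising style_weight relative to content_weight pushes the optimizer
    toward matching the style at the expense of the content, and vice versa.
    """
    return content_weight * content_loss + style_weight * style_loss

# Hypothetical values, chosen only to show the arithmetic.
loss = total_loss(content_loss=2.0, style_loss=3.0,
                  content_weight=5.0, style_weight=100.0)
```

Only the ratio of the two weights changes the balance between content and style; scaling both by the same factor mainly changes the gradient magnitude, which interacts with the learning rate.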
The following example was run for 1000 iterations to produce the result (with default parameters):
These were the input images used (me sleeping at a hackathon and Starry Night):
The following example demonstrates style blending, and was run for 1000 iterations to produce the result (with style blend weight parameters 0.8 and 0.2):
The content input image was a picture of the Stata Center at MIT:
The style input images were Picasso's "Dora Maar" and Starry Night, with the Picasso image having a style blend weight of 0.8 and Starry Night having a style blend weight of 0.2:
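One way to think about style blending is as a weighted sum of per-style losses. The sketch below assumes each style contributes a squared Gram-matrix difference scaled by its normalized blend weight; gram_matrix and blended_style_loss are hypothetical names, and the actual script may compute this differently.

```python
import numpy as np

def gram_matrix(features):
    """Turn an (H, W, C) feature map into a (C, C) Gram matrix, scaled by size."""
    h, w, c = features.shape
    flat = features.reshape(h * w, c)
    return flat.T @ flat / flat.size

def blended_style_loss(image_features, style_features_list, blend_weights):
    """Weighted sum of squared Gram differences, one term per style image."""
    total = float(sum(blend_weights))
    weights = [w / total for w in blend_weights]  # normalize to sum to 1
    image_gram = gram_matrix(image_features)
    loss = 0.0
    for style_features, weight in zip(style_features_list, weights):
        diff = image_gram - gram_matrix(style_features)
        loss += weight * np.sum(diff ** 2)
    return loss

# Toy feature maps standing in for network activations.
style_a = np.ones((2, 2, 3))
style_b = np.zeros((2, 2, 3))
loss = blended_style_loss(style_a, [style_a, style_b], [0.8, 0.2])
```

Because the weights are normalized in this sketch, passing 8 and 2 would give the same result as 0.8 and 0.2.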
The --style-layer-weight-exp command line argument can be used to tweak how "abstract" the style transfer should be. Lower values favor style transfer of finer features over coarser features, and vice versa. The default value is 1.0, which treats all layers equally. Somewhat extreme examples of what you can achieve:
(left: 0.2 - finer features style transfer; right: 2.0 - coarser features style transfer)
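The effect of the exponent can be sketched as geometric layer weighting. This example assumes each successive (deeper, coarser) style layer has its weight multiplied by the exponent and that the weights are then normalized; the layer names and the function style_layer_weights are hypothetical and not taken from the script.

```python
def style_layer_weights(layers, weight_exp=1.0):
    """Assign weight_exp**i to the i-th layer, then normalize to sum to 1.

    Layers are assumed to be ordered from shallow (fine features) to
    deep (coarse features).
    """
    raw = [weight_exp ** i for i in range(len(layers))]
    total = sum(raw)
    return {layer: w / total for layer, w in zip(layers, raw)}

# Hypothetical layer names, ordered shallow to deep.
LAYERS = ["relu1_1", "relu2_1", "relu3_1", "relu4_1", "relu5_1"]

equal = style_layer_weights(LAYERS, 1.0)   # every layer gets the same weight
fine = style_layer_weights(LAYERS, 0.2)    # shallow layers dominate
coarse = style_layer_weights(LAYERS, 2.0)  # deep layers dominate
```

Under these assumptions, an exponent below 1.0 concentrates the weight on the early layers, which matches the finer-feature behavior described above.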
--content-weight-blend specifies the coefficient of content transfer layers. With the default value of 1.0, style transfer tries to preserve finer-grained content details. The value should be in the range [0.0, 1.0].
(left: 1.0 - default value; right: 0.1 - more abstract picture)
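One plausible reading of this option is a linear blend between two content layers. The sketch below assumes the coefficient is applied to a finer content layer and its complement to a coarser one; the layer names and the function content_layer_weights are assumptions for illustration, not confirmed details of the script.

```python
def content_layer_weights(blend, fine_layer="relu4_2", coarse_layer="relu5_2"):
    """Split the content weight between a finer and a coarser layer.

    blend = 1.0 puts all weight on the finer layer; smaller values shift
    weight toward the coarser layer. Layer names are hypothetical.
    """
    if not 0.0 <= blend <= 1.0:
        raise ValueError("blend must be in the range [0.0, 1.0]")
    return {fine_layer: blend, coarse_layer: 1.0 - blend}

default_weights = content_layer_weights(1.0)
abstract_weights = content_layer_weights(0.1)
```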
--pooling selects which pooling layers to use (specify either max or avg). The original VGG topology uses max pooling, but the style transfer paper suggests replacing it with average pooling. The outputs are perceptually different: max pooling generally transfers finer style detail but can have trouble at the lower-frequency detail level:
(left: max pooling; right: average pooling)
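The difference between the two pooling modes is easy to see on a small array. This is a NumPy sketch of 2×2 pooling with stride 2 on a single-channel feature map; it illustrates the operation only and is not how the script builds the network. The function name pool2x2 is hypothetical.

```python
import numpy as np

def pool2x2(feature_map, mode="max"):
    """Apply 2x2 pooling with stride 2 to a 2-D array."""
    h, w = feature_map.shape
    h2, w2 = h - h % 2, w - w % 2  # drop a trailing odd row or column
    # Axes 1 and 3 index the position inside each 2x2 block.
    blocks = feature_map[:h2, :w2].reshape(h2 // 2, 2, w2 // 2, 2)
    if mode == "max":
        return blocks.max(axis=(1, 3))
    if mode == "avg":
        return blocks.mean(axis=(1, 3))
    raise ValueError("mode must be 'max' or 'avg'")

x = np.array([[1, 2, 5, 6],
              [3, 4, 7, 8],
              [0, 0, 1, 1],
              [0, 4, 1, 1]], dtype=float)

pooled_max = pool2x2(x, "max")  # [[4, 8], [4, 1]]
pooled_avg = pool2x2(x, "avg")  # [[2.5, 6.5], [1, 1]]
```

In the lower-left block, max pooling keeps the single strong activation (4) while average pooling smooths it to 1.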
The --preserve-colors boolean command line argument adds a post-processing step that combines the colors from the original image with the luma from the stylized image (YCbCr color space), producing a color-preserving style transfer:
(left: original stylized image; right: color-preserving style transfer)
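The post-processing idea can be sketched in NumPy as follows. This example assumes full-range BT.601 YCbCr (the JPEG convention) and two RGB arrays of the same shape. The helper names are hypothetical, and the script itself may perform the conversion differently, for example through Pillow.

```python
import numpy as np

def rgb_to_ycbcr(rgb):
    """Convert an (H, W, 3) RGB array to full-range BT.601 YCbCr (floats)."""
    rgb = rgb.astype(np.float64)
    r, g, b = rgb[..., 0], rgb[..., 1], rgb[..., 2]
    y = 0.299 * r + 0.587 * g + 0.114 * b
    cb = 128.0 - 0.168736 * r - 0.331264 * g + 0.5 * b
    cr = 128.0 + 0.5 * r - 0.418688 * g - 0.081312 * b
    return np.stack([y, cb, cr], axis=-1)

def ycbcr_to_rgb(ycbcr):
    """Inverse of rgb_to_ycbcr, clipped to the valid 0-255 range."""
    y = ycbcr[..., 0]
    cb = ycbcr[..., 1] - 128.0
    cr = ycbcr[..., 2] - 128.0
    r = y + 1.402 * cr
    g = y - 0.344136 * cb - 0.714136 * cr
    b = y + 1.772 * cb
    return np.clip(np.stack([r, g, b], axis=-1), 0.0, 255.0)

def preserve_colors(original_rgb, stylized_rgb):
    """Keep luma (Y) from the stylized image and chroma (Cb, Cr) from the original."""
    original = rgb_to_ycbcr(original_rgb)
    stylized = rgb_to_ycbcr(stylized_rgb)
    combined = np.concatenate([stylized[..., :1], original[..., 1:]], axis=-1)
    return np.rint(ycbcr_to_rgb(combined)).astype(np.uint8)
```

If the two images differ in size, one of them would need to be resized first; that step is omitted here.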
- TensorFlow
- NumPy
- SciPy
- Pillow
- Pre-trained VGG network (MD5 8ee3263992981a1d26e73b3ca028a123); put it in the top level of this repository, or specify its location using the --network option.
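The MD5 checksum above can be used to confirm that the network file downloaded correctly. A minimal sketch using only the standard library follows; the function names are hypothetical, and the path passed to verify_network is whatever location you saved the file to.

```python
import hashlib

# Checksum quoted in the requirements list above.
EXPECTED_MD5 = "8ee3263992981a1d26e73b3ca028a123"

def md5_of_file(path, chunk_size=1 << 20):
    """Compute the MD5 hex digest of a file, reading it in chunks."""
    digest = hashlib.md5()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()

def verify_network(path, expected=EXPECTED_MD5):
    """Return True if the file at path matches the expected checksum."""
    return md5_of_file(path) == expected
```

Reading in chunks keeps memory use low, which matters because the network file is large.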
If you use this implementation in your work, please cite the following:
@misc{athalye2015neuralstyle,
author = {Anish Athalye},
title = {Neural Style},
year = {2015},
howpublished = {\url{https://github.com/anishathalye/neural-style}},
note = {commit xxxxxxx}
}
Copyright (c) 2015-2017 Anish Athalye. Released under GPLv3. See LICENSE.txt for details.