Bibtex: @inproceedings{shaham2021asapnet, booktitle = {Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR)}, author = {Tamar Rott Shaham and Michael Gharbi and Richard Zhang and Eli Shechtman and Tomer Michaeli}, title = {Spatially-Adaptive Pixelwise Networks for Fast Image Translation}, year = {2021}, url = {http://arxiv.org/abs/2012.02992v1}, entrytype = {inproceedings}, id = {shaham2021asapnet} }

Abstract

We introduce a new generator architecture, aimed at fast and efficient high-resolution image-to-image translation. We design the generator to be an extremely lightweight function of the full-resolution image. In fact, we use pixel-wise networks; that is, each pixel is processed independently of others, through a composition of simple affine transformations and nonlinearities. We take three important steps to equip such a seemingly simple function with adequate expressivity. First, the parameters of the pixel-wise networks are spatially varying so they can represent a broader function class than simple 1x1 convolutions. Second, these parameters are predicted by a fast convolutional network that processes an aggressively low-resolution representation of the input; Third, we augment the input image with a sinusoidal encoding of spatial coordinates, which provides an effective inductive bias for generating realistic novel high-frequency image content. As a result, our model is up to 18x faster than state-of-the-art baselines. We achieve this speedup while generating comparable visual quality across different image resolutions and translation domains.

Citation Graph
(Double click on nodes to open corresponding papers' pages)

* Showing citation graph for papers within our database. Data retrieved from Semantic Scholar. For full citation graphs, visit ConnectedPapers.