Spatial Transformer Networks - ShortScience.org

#### Problem addressed:
A module that spatially transforms feature maps, conditioned on the feature maps themselves. It attempts to improve rotation, scale and shift invariance in neural nets.

#### Summary:
This paper introduces Spatial Transformer Networks (STN) for rotation, shift and scale invariance. The module consists of three parts: a localization function, grid point generation and sampling. Each of these parts is differentiable, so the module can be inserted at any point in a standard neural network architecture. The only constraint is that the learnt spatial transform must be parametrized (for example, as an affine transform). The localization function learns these parameters by looking at the previous layer's output (typically HxWxC for convolutional layers) and regressing to the parameters using FC layers or convolutional layers. The grid generator has no separately learnt parameters: it applies the predicted transform to a regular grid over the output to obtain the sampling locations in the source feature map. Given the source feature map and this sampling grid, the output of the STN is constructed by sampling the source at those locations, using any differentiable kernel (e.g. bilinear).

#### Novelty:
A new module is introduced to increase invariance to rotation, scale and shift.

#### Drawbacks:
Because grid generation selects only some points in the source feature maps, it is unclear how the error is backpropagated to the previous layers.

#### Datasets:
Distorted MNIST, CUB-200-2011 Birds, SVHN

#### Resources:
http://arxiv.org/pdf/1506.02025v1.pdf

#### Presenter:
Bhargava U. Kota
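
#### Example sketch:
The three parts described in the summary (localization, grid generation, sampling) can be illustrated with a small NumPy forward pass. This is only a sketch under stated assumptions, not the paper's implementation: the transform is assumed to be a 2x3 affine matrix, the sampling kernel is assumed to be bilinear with border clamping, and `localize` is a hypothetical stand-in (global average pooling plus one linear layer) for the FC or convolutional regressor. All function names are illustrative.

```python
import numpy as np

def localize(feat, weight, bias):
    """Hypothetical localization function: pool the HxWxC input to C values,
    then regress the 6 affine parameters with a single linear layer."""
    pooled = feat.mean(axis=(0, 1))                  # (C,)
    return (weight @ pooled + bias).reshape(2, 3)    # theta, shape (2, 3)

def affine_grid(theta, out_h, out_w):
    """Grid generation: push a regular target grid through theta.
    Coordinates are normalized to [-1, 1]; returns (out_h, out_w, 2) as (x_s, y_s)."""
    ys, xs = np.meshgrid(np.linspace(-1.0, 1.0, out_h),
                         np.linspace(-1.0, 1.0, out_w), indexing="ij")
    target = np.stack([xs, ys, np.ones_like(xs)], axis=-1)  # homogeneous coords
    return target @ theta.T

def bilinear_sample(feat, grid):
    """Sampling: read feat (H, W, C) at the source locations in grid.
    Assumption: locations outside the map are clamped to the border."""
    h, w, _ = feat.shape
    x = np.clip((grid[..., 0] + 1.0) * (w - 1) / 2.0, 0, w - 1)
    y = np.clip((grid[..., 1] + 1.0) * (h - 1) / 2.0, 0, h - 1)
    x0, y0 = np.floor(x).astype(int), np.floor(y).astype(int)
    x1, y1 = np.minimum(x0 + 1, w - 1), np.minimum(y0 + 1, h - 1)
    wx, wy = (x - x0)[..., None], (y - y0)[..., None]
    # Each output value is a weighted sum of its four nearest source pixels.
    top = (1 - wx) * feat[y0, x0] + wx * feat[y0, x1]
    bottom = (1 - wx) * feat[y1, x0] + wx * feat[y1, x1]
    return (1 - wy) * top + wy * bottom

def spatial_transformer(feat, weight, bias, out_h, out_w):
    theta = localize(feat, weight, bias)
    return bilinear_sample(feat, affine_grid(theta, out_h, out_w))

# Usage: zero weights and an identity bias make the module start as a pass-through.
feat = np.arange(4 * 5 * 2, dtype=float).reshape(4, 5, 2)
weight = np.zeros((6, 2))
bias = np.array([1.0, 0.0, 0.0, 0.0, 1.0, 0.0])
out = spatial_transformer(feat, weight, bias, out_h=4, out_w=5)
```

With the identity parameters above, `out` matches `feat` up to floating-point rounding. In this sketch every output value is a weighted sum of four neighbouring source pixels, with weights that vary continuously with the sampling coordinates. That is the property that allows (sub-)gradients to reach both the source feature map and the transform parameters when the same computation is written in an autodiff framework.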