Open
Description
I would like to implement something like this: https://arxiv.org/pdf/1511.06309.pdf
Essentially, the idea is to nest a temporal autoencoder (probably using LSTMs) inside the spatial autoencoder, in order to warp the encoding to predict the NEXT frame in a video sequence.