- 【论文集】The classical papers and codes about generative adversarial nets
2.【博客】Deep Learning Paper Implementations: Spatial Transformer Networks - Part I
The first three blog posts in my “Deep Learning Paper Implementations” series will cover Spatial Transformer Networks introduced by Max Jaderberg, Karen Simonyan, Andrew Zisserman and Koray Kavukcuoglu of Google Deepmind in 2016. The Spatial Transformer Network is a learnable module aimed at increasing the spatial invariance of Convolutional Neural Networks in a computationally and parameter efficient manner.
3.【博客】GTA V + Universe
The Universe integration with Grand Theft Auto V, built and maintained by Craig Quiter's DeepDrive project, is now open-source. To use it, you'll just need a purchased copy of GTA V, and then your Universe agent will be able to start driving a car around the streets of a high-fidelity virtual world.
GTA V in Universe gives AI agents access to a rich, 3D world. This video shows the frames fed to the agent (artificially slowed to 8FPS, top left), diagnostics from the agent and environment (bottom left), and a human-friendly free camera view (right). The integration modifies the behavior of people within GTA V to be non-violent.
4.【论文&代码】Learning Python Code Suggestion with a Sparse Pointer Network
To enhance developer productivity, all modern integrated development environments(IDEs) include code suggestion functionality that proposes likely next tokens at the cursor. While current IDEs work well for statically-typed languages, their reliance on type annotations means that they do not provide the same level of support for dynamic programming languages as for statically-typed languages. Moreover, suggestion engines in modern IDEs do not propose expressions or multi-statement idiomatic code. Recent work has shown that language models can improve code suggestion systems by learning from software repositories. This paper introduces a neural language model with a sparse pointer network aimed at capturing very longrange dependencies. We release a large-scale code suggestion corpus of 41M lines of Python code crawled from GitHub. On this corpus, we found standard neural language models to perform well at suggesting local phenomena, but struggle to refer to identifiers that are introduced many tokens in the past. By augmenting a neural language model with a pointer network specialized in referring to predefined classes of identifiers, we obtain a much lower perplexity and a 5 percentage points increase in accuracy for code suggestion compared to an LSTM baseline. In fact, this increase in code suggestion accuracy is due to a 13 times more accurate prediction of identifiers. Furthermore, a qualitative analysis shows this model indeed captures interesting long-range dependencies, like referring to a class member defined over 60 tokens in the past.
5.【论文&代码】 StackGAN: Text to Photo-realistic Image Synthesis with Stacked Generative Adversarial Networks
Synthesizing photo-realistic images from text descriptions is a challenging problem in computer vision and has many practical applications. Samples generated by existing text-to-image approaches can roughly reflect the meaning of the given descriptions, but they fail to contain necessary details and vivid object parts. In this paper, we propose stacked Generative Adversarial Networks (StackGAN) to generate photo-realistic images conditioned on text descriptions. The Stage-I GAN sketches the primitive shape and basic colors of the object based on the given text description, yielding Stage-I low resolution images. The Stage-II GAN takes Stage-I results and text descriptions as inputs, and generates high resolution images with photorealistic details. The Stage-II GAN is able to rectify defects and add compelling details with the refinement process. Samples generated by StackGAN are more plausible than those generated by existing approaches. Importantly, our StackGAN for the first time generates realistic 256 × 256 images conditioned on only text descriptions, while state-of-the-art methods can generate at most 128 × 128 images. To demonstrate the effectiveness of the proposed StackGAN, extensive experiments are conducted on CUB and Oxford-102 datasets, which contain enough object appearance variations and are widely-used for text-toimage generation analysis.