1

On Learning Intrinsic Rewards for Policy Gradient Methods

Hierarchical Reinforcement Learning for Zero-shot Generalization with Subtask Dependencies

Self-Imitation Learning

Value Prediction Network

Zero-Shot Task Generalization with Multi-Task Deep Reinforcement Learning

Control of Memory, Active Perception, and Action in Minecraft

Learning Transferrable Knowledge for Semantic Segmentation with Deep Convolutional Neural Network

Action-Conditional Video Prediction using Deep Networks in Atari Games