OCEAN is a principled task inference framework for meta-reinforcement learning. It infers tasks accurately by modeling each one with global and local latent context variables.
Adapting to a new task in meta-reinforcement learning can be cast as an online task inference problem, where the current task identity, represented by a context variable, is estimated from the agent's past experience via probabilistic inference. Previous approaches have employed simple latent distributions, e.g., a Gaussian, to model a single context for the entire task. However, this formulation lacks the expressiveness to capture the composition of, and transitions between, sub-tasks.
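For intuition, here is a minimal PyTorch sketch of such a single-Gaussian context encoder; the module layout, dimensions, and the product-of-Gaussians aggregation are our illustrative assumptions, not the exact implementation of any prior method.

```python
# Minimal sketch of single-Gaussian online task inference, in the spirit
# of prior context-based methods. Module names, sizes, and the posterior
# aggregation rule are illustrative assumptions.
import torch
import torch.nn as nn

class GaussianContextEncoder(nn.Module):
    """Maps past transitions (s, a, r, s') to a Gaussian posterior over a
    single task-level context variable z."""

    def __init__(self, transition_dim, latent_dim, hidden=128):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(transition_dim, hidden), nn.ReLU(),
            nn.Linear(hidden, 2 * latent_dim),  # per-transition mean and log-variance
        )

    def forward(self, transitions):
        # transitions: (num_transitions, transition_dim)
        mu, logvar = self.net(transitions).chunk(2, dim=-1)
        # Combine per-transition Gaussian factors as a product of Gaussians:
        # the posterior precision is the sum of the per-factor precisions.
        prec = torch.exp(-logvar)
        post_prec = prec.sum(dim=0)
        post_mu = (mu * prec).sum(dim=0) / post_prec
        # Reparameterized sample of the context variable z.
        z = post_mu + torch.randn_like(post_mu) / post_prec.sqrt()
        return z
```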
We propose OCEAN, a variational inference framework that performs online task inference for compositional tasks.
OCEAN models global and local context variables in a joint latent space: the global variables represent the mixture of sub-tasks that compose the task, while the local variables capture the transitions between sub-tasks. Our framework supports flexible latent distributions chosen according to prior knowledge of the task structure, including the Dirichlet, Gaussian, categorical, and logit-normal distributions. The whole framework can be trained in an unsupervised manner, without labels for tasks or sub-task transitions.
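To make the factorization concrete, the following is a hedged sketch of how global and local context variables might be sampled with torch.distributions; all dimensions, priors, and names here are assumptions for illustration rather than OCEAN's actual architecture.

```python
# Hedged sketch of the global/local factorization with torch.distributions.
# Dimensions, priors, and the ELBO bookkeeping are assumptions for
# illustration; see the paper for OCEAN's actual architecture.
import torch
import torch.distributions as D

num_subtasks, local_dim = 4, 8

# Global variable: a Dirichlet-distributed mixture over the sub-tasks
# that compose the whole task, sampled once per task.
g = D.Dirichlet(torch.ones(num_subtasks)).rsample()

# Local variable: a per-timestep latent that tracks sub-task transitions;
# here modeled as a Gaussian against a standard-normal prior.
local_prior = D.Normal(torch.zeros(local_dim), torch.ones(local_dim))

def sample_local(enc_mu, enc_logvar):
    """Reparameterized sample of the local latent z_t at one timestep,
    plus the KL term that enters the variational (ELBO) objective."""
    q = D.Normal(enc_mu, (0.5 * enc_logvar).exp())
    z_t = q.rsample()
    kl = D.kl_divergence(q, local_prior).sum()
    return z_t, kl

# The policy is conditioned on the joint context [g, z_t] at every step.
# Given prior knowledge of the task structure, other latent families can
# be used instead (e.g., D.RelaxedOneHotCategorical, D.LogisticNormal).
```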
We created several new environments based on MuJoCo, each of which requires a HalfCheetah or Humanoid agent to complete a sequence of sub-tasks. OCEAN outperforms previous methods in both sample efficiency and asymptotic performance.
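As a rough picture of what such an environment involves, here is a hypothetical gym wrapper that activates one sub-task reward at a time; the sub-task names, switching schedule, and reward functions are placeholders, not the benchmarks from the paper.

```python
# Hypothetical sketch of a compositional environment: sub-task reward
# functions are swapped in sequence on top of a base MuJoCo task. Names,
# schedule, and rewards are assumptions, not the paper's benchmarks.
import gym

class SubTaskSequenceEnv(gym.Wrapper):
    def __init__(self, env, subtask_rewards, steps_per_subtask=100):
        super().__init__(env)
        self.subtask_rewards = subtask_rewards  # one reward function per sub-task
        self.steps_per_subtask = steps_per_subtask
        self.t = 0

    def reset(self, **kwargs):
        self.t = 0
        return self.env.reset(**kwargs)

    def step(self, action):
        obs, _, done, info = self.env.step(action)
        # Reward according to whichever sub-task is active at this step.
        idx = min(self.t // self.steps_per_subtask, len(self.subtask_rewards) - 1)
        self.t += 1
        return obs, self.subtask_rewards[idx](obs, action), done, info

# e.g. SubTaskSequenceEnv(gym.make("HalfCheetah-v3"),
#                         [run_forward, jump, run_backward])
```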
Please refer to our paper for detailed explanations and more results.
The following people contributed to OCEAN:
Hongyu Ren
Yuke Zhu
Jure Leskovec
Anima Anandkumar
Animesh Garg
OCEAN: Online Task Inference for Compositional Tasks with Context Adaptation. H. Ren, Y. Zhu, J. Leskovec, A. Anandkumar, A. Garg. Conference on Uncertainty in Artificial Intelligence (UAI), 2020.