NeurIPS 2019 Disentanglement Challenge Rules

How to participate / summary

Contestants participate by implementing a trainable disentanglement algorithm and submitting its code to AIcrowd. The AIcrowd evaluators run the submitted code on the evaluation server and compute its score (as described below).

The submitted method will access a dataset on the evaluation server. The challenge objective is to let the method automatically determine the dataset's independent factors of variation in an unsupervised fashion.

In order to prevent overfitting, the dataset used to compute the final scores is kept completely hidden from participants until the respective challenge stage is terminated. Participants are encouraged to find robust disentanglement methods that work well without the need for manual adjustments.

Additionally, participants are required to submit a three-page report on their method to OpenReview. Detailed requirements for the report are given below.

The final score used to rank the participants and determine winners is a mixture of several disentanglement metrics. Details about the evaluation procedure can be found below.

The library disentanglement_lib [Link to Github repository: https://github.com/google-research/disentanglement_lib ] provides several datasets, disentanglement methods and evaluation metrics, which gives participants an easy way to get started.

A public leaderboard shows the ranking of all methods submitted so far (without taking into account the quality of the report). The final ranking can differ substantially from the one on the public leaderboard because a different dataset will be used to determine the score.

Challenge stages


The challenge is split into two stages. In each stage there are three different datasets (all of which include labels):

  1. a dataset which is publicly available,
  2. a dataset which is used for the public leaderboard,
  3. a dataset which is used for the private leaderboard.

The participants may use dataset (1) to develop their methods. Each method which is submitted will be retrained and evaluated on dataset (2) as well as on dataset (3).

In the first stage, the goal is to transfer from simulated to real images. Dataset (1) will consist of simplistic simulated images, (2) of more realistically simulated images, and (3) of real images of the same setup.

In the second stage, the goal is to transfer to unseen objects. Dataset (1) will consist of all datasets used in the first stage, (2) will consist of realistically simulated images of objects which are not included in (1), and (3) will consist of real images of those unseen objects.

Note: while the publicly released dataset is ordered according to the factors of variation, the private datasets will be randomly permuted prior to training.
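A practical consequence of this note is that a method should not rely on the order in which samples arrive. The following is a minimal sketch of how a participant might shuffle the public dataset locally to mimic that condition. The function name permute_dataset and the use of a fixed seed are illustrative assumptions; the rules do not specify how the organizers perform the permutation.

```python
import random
from typing import List, Sequence, TypeVar

T = TypeVar("T")


def permute_dataset(samples: Sequence[T], seed: int) -> List[T]:
    """Return a randomly permuted copy of `samples` (hypothetical helper).

    Assumption: a seeded shuffle is used only so that local experiments
    are repeatable; the actual permutation on the server is unknown.
    """
    rng = random.Random(seed)
    shuffled = list(samples)  # copy, so the ordered original stays intact
    rng.shuffle(shuffled)
    return shuffled


# Stand-in for a dataset ordered by its factors of variation.
ordered = list(range(10))
shuffled = permute_dataset(ordered, seed=0)
```

Training on the shuffled copy as well as on the ordered original is a simple way to check that results do not depend on sample order.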


  • June 10th, 2019: Stage 1 starts
  • July 19th, 2019, 11:59pm AoE: Submission deadline for methods, Stage 1
  • August 2nd, 2019, 11:59pm AoE: Submission deadline for reports, Stage 1
  • August 5th, 2019: Stage 2 starts
  • September 10th, 2019, 11:59pm AoE: Submission deadline for methods, Stage 2
  • September 24th, 2019, 11:59pm AoE: Submission deadline for reports, Stage 2

[TODO: Provide details about how exactly the submission takes place. What type of containers are used]


Contestants have to submit a three-page report on openreview.net (plus up to ten additional pages of references and supplementary material) that provides all necessary details of their approach and corresponds to their code submission. This guarantees the reproducibility of the results and the transparency needed to advance the state of the art for learning disentangled representations. The report has to be submitted according to the deadlines provided above.

Participants are required to prepare their reports with a LaTeX template that we provide; changing the formatting is not allowed. The template will be released before Stage 1 ends.

The report has a maximum length of three pages with an appendix of ten pages. However, reviewers are not required to consider the appendix, and all essential information must be contained in the main body of the report. Submissions must fulfil some essential requirements in terms of clarity and precision: the information contained in the report should be sufficient for an experienced member of the community to reimplement the proposed method (including hyperparameter optimization), and reviewers may check coherence between the report and the submitted code.

Reports which do not satisfy those requirements will be disqualified along with the corresponding methods.

We encourage all participants to actively engage in discussions on OpenReview. While not a formal challenge requirement, every participant who submits a report should commit to reviewing or commenting on at least three submissions which are not their own.

Evaluation criteria

Methods are scored as follows:

The model is evaluated on the full dataset using each of the following metrics (as implemented in disentanglement_lib [link to metrics folder on github: https://github.com/google-research/disentanglement_lib/tree/master/disentanglement_lib/evaluation/metrics ]):

  • IRS
  • DCI
  • Factor-VAE
  • MIG
  • SAP-Score

The final score for a method is determined as follows:

  • All participants' methods are ranked independently according to each of the five evaluation metrics
  • These five ranks are summed for each method to give the method's final score (lower is better)
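The rank-sum procedure above can be sketched as follows. This is an illustration under stated assumptions, not the organizers' evaluation code: the function name rank_sum_scores, the example methods and their metric values are all hypothetical, higher metric values are assumed to be better, and ties are broken by input order because the rules do not say how ties are handled.

```python
from typing import Dict, List

# The five metrics named in the rules.
METRICS: List[str] = ["IRS", "DCI", "Factor-VAE", "MIG", "SAP-Score"]


def rank_sum_scores(metric_values: Dict[str, Dict[str, float]]) -> Dict[str, int]:
    """Sum each method's per-metric rank (1 = best); a lower total is better."""
    methods = list(metric_values)
    totals = {method: 0 for method in methods}
    for metric in METRICS:
        # Assumption: a higher metric value means better disentanglement.
        # Assumption: ties keep input order (Python's sort is stable).
        ordered = sorted(
            methods, key=lambda m: metric_values[m][metric], reverse=True
        )
        for rank, method in enumerate(ordered, start=1):
            totals[method] += rank
    return totals


# Made-up values for three hypothetical methods.
example = {
    "method_a": {"IRS": 0.60, "DCI": 0.40, "Factor-VAE": 0.70, "MIG": 0.20, "SAP-Score": 0.10},
    "method_b": {"IRS": 0.55, "DCI": 0.45, "Factor-VAE": 0.65, "MIG": 0.25, "SAP-Score": 0.05},
    "method_c": {"IRS": 0.50, "DCI": 0.30, "Factor-VAE": 0.60, "MIG": 0.10, "SAP-Score": 0.02},
}
scores = rank_sum_scores(example)
print(scores)  # {'method_a': 7, 'method_b': 8, 'method_c': 15}
```

In this example method_a ranks first on three metrics and second on two, giving 7, so it would place ahead of method_b (8) and method_c (15). Because only ranks enter the final score, the margin by which a method leads on any single metric does not matter.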

Teams whose reports do not satisfy basic requirements of clarity and thoroughness (as detailed above) will be disqualified.

Furthermore, the goal of this challenge is to advance the state of the art in representation learning. We therefore reserve the right to disqualify methods which are overly tailored to the type of data used in this challenge. By overly tailored methods we mean methods which will by design not work on slightly different problems, e.g. a slightly different mechanical setup for moving the objects.

The organizers may decide to change the computation of the scores for Stage 2. If so, this will be announced at the end of Stage 1.

Prizes are awarded to the participants in the order of their methods' final scores (the lowest score wins), excluding participants who are not eligible for prizes. Prizes are awarded independently in each of the two challenge stages.


In each of the two stages, the following prizes are awarded to the participants with the best scores.

Stage 1

Winner: 3,000 EUR

Runner-up: 1,500 EUR

Third-place: 1,000 EUR

Best Paper: 3,000 EUR

Runner-up best paper: 1,500 EUR

Stage 2

Winner: 3,000 EUR

Runner-up: 1,500 EUR

Third-place: 1,000 EUR

Best Paper: 3,000 EUR

Runner-up best paper: 1,500 EUR

Additionally, we will try to provide NeurIPS 2019 conference tickets to the top performing teams. Details about how many tickets are awarded will be given before Stage 1 ends.

The reports of the best-performing teams will be published in JMLR proceedings. The number of reports and further details will be announced before Stage 1 ends.

The winners are determined independently in each of the stages. Winners of Stage 1 are not excluded from winning prizes in Stage 2.

We award two prizes for each stage to the papers that were determined by the jury to be the most innovative and best described.

Cash prizes will be paid out to an account specified by the organizer of each team. It is the responsibility of the team organizer to distribute the prize money according to their team-internal agreements.


  • The organizers will not be able to transfer prize money to accounts in any of the following countries or regions. (Please note that residents of these countries or regions are still allowed to participate in the challenge.)
    • The Crimea region of Ukraine
    • Cuba
    • Iran
    • North Korea
    • Sudan
    • Syria
    • Quebec, Canada
    • Brazil
    • Italy
  • Members / employees of the following institutions may participate in the challenge, but are excluded from winning any prizes:
    • Max Planck Institute for Intelligent Systems
    • ETH Zurich
    • Google AI Zurich
  • Reviewers of the paper "On the Role of Inductive Bias From Simulation and the Transfer to the Real World: a New Disentanglement Dataset" may participate in the challenge, but are excluded from winning any prizes

Further rules

  • Participants may participate alone or in a team of up to 6 people in one or both stages of the challenge.
  • Individuals are not allowed to enter the challenge using multiple accounts. Each individual can only be part of one team.
  • To be eligible to win prizes, participants agree to release their code under an OSI-approved license.
  • The organizers reserve the right to change the rules if doing so is absolutely necessary to resolve unforeseen problems.
  • The organizers reserve the right to disqualify participants who violate the rules or engage in scientific misconduct.
  • The organizers reserve the right to disqualify any participant who engages in unscientific behavior or who harms the successful organization of the challenge in any way.