Cross-validating Image Description Datasets and Evaluation Metrics

(@ LREC 2016)

This page provides the full, detailed results of the analysis presented in our paper:

Josiah Wang and Robert Gaizauskas (2016). Cross-validating Image Description Datasets and Evaluation Metrics. LREC 2016 (To Appear).

In the paper, we describe how we used a leave-one-out cross-validation (LOOCV) process to analyse and gain insights into various multiply annotated, human-authored image description datasets, as well as the evaluation metrics commonly used to evaluate image description generation tasks. We use LOOCV to compute:

  1. Human upper-bounds for eight image description datasets, according to several evaluation metrics
  2. Lower-bounds for the image description datasets, again across metrics
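
As a rough illustration of the LOOCV idea for the human upper-bound case, the sketch below holds out each human description of an image in turn, scores it against the remaining descriptions as references, and averages the scores. This is a minimal sketch under our own assumptions, not the exact procedure or code used in the paper: the function names (unigram_precision, loocv_score, dataset_loocv_score) are hypothetical, and the clipped unigram precision is only a toy stand-in for the real evaluation metrics.

```python
from collections import Counter
from statistics import mean


def unigram_precision(candidate, references):
    """Toy stand-in metric (an assumption, not a metric from the paper):
    clipped fraction of candidate tokens that occur in some reference."""
    cand = Counter(candidate.lower().split())
    if not cand:
        return 0.0
    # For each token, the maximum count observed in any single reference.
    max_ref = Counter()
    for ref in references:
        for tok, n in Counter(ref.lower().split()).items():
            max_ref[tok] = max(max_ref[tok], n)
    matched = sum(min(n, max_ref[tok]) for tok, n in cand.items())
    return matched / sum(cand.values())


def loocv_score(descriptions, metric=unigram_precision):
    """Hold out each description in turn, score it against the rest,
    and return the mean score for this image."""
    if len(descriptions) < 2:
        raise ValueError("LOOCV needs at least two descriptions per image")
    scores = []
    for i, held_out in enumerate(descriptions):
        references = descriptions[:i] + descriptions[i + 1:]
        scores.append(metric(held_out, references))
    return mean(scores)


def dataset_loocv_score(dataset, metric=unigram_precision):
    """Average the per-image LOOCV scores over a dataset, given as a
    mapping from image id to a list of human descriptions."""
    return mean(loocv_score(descs, metric) for descs in dataset.values())
```

Any metric with the same (candidate, references) signature could be passed in place of the toy one, for example a wrapper around an existing BLEU or METEOR implementation.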


We are currently formatting all those lovely numbers for your viewing pleasure! Please stay tuned!