Yoctol/ADEM

Name: ADEM

Owner: YOCTOL INFO INC.

Description: TOWARDS AN AUTOMATIC TURING TEST: LEARNING TO EVALUATE DIALOGUE RESPONSES

Created: 2017-07-13 03:34:08.0

Updated: 2018-01-05 07:32:16.0

Pushed: 2017-08-25 07:57:40.0

Homepage:

Size: 1070

Language: Python

README

Towards An Automatic Turing Test: Learning to Evaluate Dialogue Responses

A TensorFlow Implementation of ADEM - An Automatic Dialogue Evaluation Model
Basic information about ADEM
Brief Introduction

ADEM is an automatic evaluation model for dialogue quality. It aims to capture semantic similarity beyond word-overlap metrics (e.g., BLEU, ROUGE, METEOR), which correlate poorly with human judgement, and it computes its score using the conversation context as extra information in addition to the reference response and the model response.

ADEM learns vector representations of the dialogue context $\mathbf{c} \in \mathcal{R}^c$, the model response $\hat{\mathbf{r}} \in \mathcal{R}^m$, and the reference response $\mathbf{r} \in \mathcal{R}^r$ using a hierarchical RNN encoder, and then computes the score as follows:

$$\text{score}(c, r, \hat{r}) = (\mathbf{c}^T M \hat{\mathbf{r}} + \mathbf{r}^T N \hat{\mathbf{r}} - \alpha) / \beta$$

where $M$, $N$ are learned parameters initialized to the identity, and $\alpha$, $\beta$ are scalar constants initialized in the range [0, 5]. The first and second terms of the score function can be interpreted as the similarity of the model response to the context and to the reference response, respectively, under a linear transformation.
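The score function above can be sketched in a few lines of NumPy. This is only an illustration, not the repository's TensorFlow code: the function name `adem_score` and the toy vectors are hypothetical, and the sketch assumes all three embeddings share one dimension so that $M$ and $N$ can start as identity matrices.

```python
import numpy as np

def adem_score(c, r, r_hat, M, N, alpha, beta):
    """Return (c^T M r_hat + r^T N r_hat - alpha) / beta."""
    context_term = c @ M @ r_hat      # similarity of model response to context
    reference_term = r @ N @ r_hat    # similarity of model response to reference
    return (context_term + reference_term - alpha) / beta

# Toy embeddings (hypothetical values, chosen only to keep the arithmetic easy).
c = np.array([1.0, 0.0, 2.0])      # context vector
r = np.array([0.0, 1.0, 1.0])      # reference response vector
r_hat = np.array([1.0, 1.0, 1.0])  # model response vector

# M and N start as the identity, so each term reduces to a plain dot product.
M = np.eye(3)
N = np.eye(3)
alpha, beta = 1.0, 2.0

score = adem_score(c, r, r_hat, M, N, alpha, beta)
print(score)  # (3 + 2 - 1) / 2 = 2.0
```

With identity matrices the two terms are plain dot products; training then adjusts $M$ and $N$ so that the bilinear forms track human ratings more closely.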

ADEM is trained to minimize the squared error between the model predictions and the human scores, with L1 regularization:

$$\mathcal{L} = \sum_{i=1:K}[\text{score}(c_i, r_i, \hat{r}_i) - \text{human\_score}_i]^2 + \gamma \|\theta\|_1$$

where $\theta = \{M, N\}$

where $\gamma$ is a scalar constant. The model is end-to-end differentiable and all parameters can be learned by backpropagation.
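As a rough illustration of the training objective, the NumPy sketch below evaluates the loss for a small batch. The function name `adem_loss` and the numbers are hypothetical, and the sketch assumes that $\|\theta\|_1$ means the sum of the absolute values of all entries of $M$ and $N$; the repository's TensorFlow implementation may differ.

```python
import numpy as np

def adem_loss(scores, human_scores, M, N, gamma):
    """Squared error over the batch plus an L1 penalty on theta = {M, N}."""
    squared_error = np.sum((scores - human_scores) ** 2)
    # Assumption: the L1 norm is taken entrywise over both matrices.
    l1_penalty = gamma * (np.abs(M).sum() + np.abs(N).sum())
    return squared_error + l1_penalty

# Hypothetical batch of K = 2 predicted scores and human ratings.
scores = np.array([3.0, 4.0])
human_scores = np.array([2.0, 4.5])

M = np.eye(2)
N = np.eye(2)
gamma = 0.1

loss = adem_loss(scores, human_scores, M, N, gamma)
# squared error: 1.0 + 0.25 = 1.25; L1 penalty: 0.1 * (2 + 2) = 0.4
print(loss)
```

In actual training the gradient of this quantity with respect to $M$ and $N$ would come from automatic differentiation rather than being computed by hand.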


This work is supported by the National Institutes of Health's National Center for Advancing Translational Sciences, Grant Number U24TR002306. This work is solely the responsibility of the creators and does not necessarily represent the official views of the National Institutes of Health.