Review Report
MetaReview
This paper extends previous work on energy scores to detect OOD utterances in open dialogue systems. All reviewers agree that the proposed approach is an interesting idea. The main concerns are about the experimental comparisons. The authors addressed most of the reviewers’ “reason to reject” comments. However, Reviewer 2’s comments on the weaknesses are not fully addressed, especially those on the variety and coherence of the augmentation.
Overall Recommendation
Review #1: 3.5/5; Review #2: 3/5; Review #3: 3.5/5
Postscript (by Yawen)
———2023-07-26————
The reason OOD utterances akin to IND utterances are more effective can be attributed to spurious correlation. If you are interested in spurious correlation in NLP, these papers may give you more inspiration:
- Competency Problems: On Finding and Removing Artifacts in Language Data
- Counterfactual VQA: A Cause-Effect Look at Language Bias
Besides, I strongly recommend reading The Book of Why by Judea Pearl. Through the lens of causality, everything becomes clear.
———2021-06-25————
Admittedly, I’m not that satisfied with this paper.
There are some claims in the paper that I didn’t explain carefully, such as why OOD utterances akin to IND utterances could be more effective in shaping the energy gap. If you are interested in this problem, these papers may help you:
- Contrastive Training for Improved Out-of-Distribution Detection
  Jim Winkens, Rudy Bunel, Abhijit Guha Roy, Robert Stanforth, Vivek Natarajan, Joseph R. Ledsam, Patricia MacWilliams, Pushmeet Kohli, Alan Karthikesalingam, Simon Kohl, Taylan Cemgil, S. M. Ali Eslami, Olaf Ronneberger
  arXiv:2007.05566
- Prevalence of neural collapse during the terminal phase of deep learning training
  Vardan Papyan, XY Han, David L Donoho
  arXiv:2008.08186
- ReduNet: A White-box Deep Network from the Principle of Maximizing Rate Reduction
  Kwan Ho Ryan Chan, Yaodong Yu, Chong You, Haozhi Qi, John Wright, Yi Ma
  arXiv:2105.10446
I am also exploring how to calibrate the classifier without auxiliary OOD utterances. Let’s see what will happen :)
There are also some typos in our ACL version:
- In Section 3.2, “the energy function $-E(\mathbf{u})$” is wrong; the correct version is “the negative energy function $-E(\mathbf{u})$”.
- The parentheses of the exponential function in Equations 3 and 4 should be “()”, not “[]”.
- In the implementation paragraph in Section 4.1, “$E_y$” should be “$\mathbf{E}_y$”.
- In Table 1, “validation” should be “Validation”.
I am very sorry if these typos confused you in your reading. This is our updated version: link
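Since the first typo above concerns the sign of the energy function, a small sketch may help keep $E(\mathbf{u})$ and $-E(\mathbf{u})$ apart. It assumes the standard energy score from prior work on energy-based OOD detection, $E(\mathbf{u}) = -T \log \sum_y \exp(f_y(\mathbf{u}) / T)$, which may differ in detail from the paper's own Equations 3 and 4. The function names, the example logits, and the threshold are hypothetical and chosen only for illustration.

```python
import math


def energy(logits, temperature=1.0):
    """Assumed standard energy score: E(u) = -T * log(sum_y exp(f_y(u) / T))."""
    scaled = [z / temperature for z in logits]
    m = max(scaled)  # subtract the max before exponentiating, for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(s - m) for s in scaled))
    return -temperature * log_sum_exp


def negative_energy(logits, temperature=1.0):
    """-E(u): larger values suggest the utterance looks more in-domain."""
    return -energy(logits, temperature)


def is_ood(logits, threshold, temperature=1.0):
    """Flag an utterance as OOD when its negative energy falls below the threshold."""
    return negative_energy(logits, temperature) < threshold


# Hypothetical intent-classifier logits (illustrative values, not from the paper).
confident_logits = [9.0, 0.5, 0.2]  # one intent clearly dominates
flat_logits = [0.4, 0.5, 0.3]       # no intent stands out

print(negative_energy(confident_logits) > negative_energy(flat_logits))
print(is_ood(flat_logits, threshold=5.0), is_ood(confident_logits, threshold=5.0))
```

The gap between the two negative energy values is what a threshold separates. In practice the threshold would be tuned on validation data rather than fixed by hand as it is here.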