It is not True that Transformers are Inductive Learners: Probing NLI Models with External Negation

Michael Sullivan

Main: Interpretability and Model Analysis in NLP Oral Paper

Session 10: Interpretability and Model Analysis in NLP (Oral)
Conference Room: Carlson
Conference Time: March 20, 11:00-12:30 (CET) (Europe/Malta)
Abstract: NLI tasks necessitate a substantial degree of logical reasoning; as such, the remarkable performance of SoTA transformers on these tasks may lead us to believe that those models have learned to reason logically. The results presented in this paper demonstrate that (i) models fine-tuned on NLI datasets learn to treat external negation as a distractor, effectively ignoring its presence in hypothesis sentences; (ii) several near-SoTA encoder and encoder-decoder transformer models fail to inductively learn the law of the excluded middle for a single external negation prefix with respect to NLI tasks, despite extensive fine-tuning; (iii) those models that are able to learn the law of the excluded middle for a single prefix are unable to generalize this pattern to similar prefixes. Given the critical role of negation in logical reasoning, we may conclude from these findings that transformers do not learn to reason logically when fine-tuned for NLI tasks. Furthermore, these results suggest that transformers may not be able to inductively learn the role of negation with respect to NLI tasks, calling into question their capacity to fully acquire logical reasoning abilities.