Common flaws in running human evaluation experiments in NLP

C Thomson, E Reiter, A Belz. Computational Linguistics 50 (2), 795-805, 2024.
