Common Flaws in Running Human Evaluation Experiments in NLP. C Thomson, E Reiter, A Belz. Computational Linguistics, 1-11, 2024.