Evaluation 1 Foundational Autoraters: Taming Large Language Models for Better Automatic Evaluation 2024/08/01