FACTS Grounding: A new benchmark for evaluating the factuality of large language models

Our comprehensive benchmark and online leaderboard offer a much-needed measure of how accurately LLMs ground their responses in provided source material and avoid hallucinations.
