FACTS Benchmark Suite: Systematically evaluating the factuality of large language models

Systematically evaluating the factuality of large language models with the FACTS Benchmark Suite.
