New benchmarking tool evaluates the factuality of LLMs
A team of AI researchers and computer scientists from Cornell University, the University of Washington and the Allen Institute for Artificial Intelligence has developed a benchmarking tool called WILDHALLUCINATIONS to evaluate the factuality of multiple […]