hallucinations-leaderboard

community

https://www.neuralnoise.com

AI & ML interests

None defined yet.

Recent Activity

pminervini authored a paper about 20 hours ago

VLM-RobustBench: A Comprehensive Benchmark for Robustness of Vision-Language Models

pminervini authored a paper about 23 hours ago

SCOPE: Self-Play via Co-Evolving Policies for Open-Ended Tasks

aryopg authored a paper about 23 hours ago

SCOPE: Self-Play via Co-Evolving Policies for Open-Ended Tasks

spaces 1

Hallucinations Leaderboard

View and submit LLM evaluations

models 0

None public yet

datasets 2

hallucinations-leaderboard/requests

Updated Oct 31, 2024 • 323

hallucinations-leaderboard/results

Updated Oct 31, 2024 • 28.9k • 2