WildEval

non-profit

AI & ML interests

None defined yet.

Recent Activity

yuntian-deng  authored a paper 1 day ago
Long Grounded Thoughts: Distilling Compositional Visual Reasoning Chains at Scale
yuntian-deng  authored a paper 1 day ago
DetectZoo: A Unified Toolkit for AI-Generated Content Detection Across Text, Audio, and Image Modalities
yuntian-deng  authored a paper 1 day ago
Code2LoRA: Hypernetwork-Generated Adapters for Code Language Models under Software Evolution

Team members: Bill Yuchen Lin, Yuntian Deng, Abhilasha Ravichander, Valentina Pyatkin, Khyathi Raghavi Chandu, Faeze Brahman, Ronan Le Bras, Dongfu Jiang, Chengsong Huang

spaces 1

Zebra Logic Bench 🦓 (pinned)

Explore and evaluate Zebra Logic models

Runtime error • Agents • 6 • Apr 11, 2025

models 0

None public yet

datasets 9

WildEval/ZebraLogic

Viewer • Updated Feb 4, 2025 • 4.26k • 1.33k • 15

WildEval/G-PlanET

Viewer • Updated Aug 1, 2024 • 1.42k • 25 • 1

WildEval/ZeroEval

Viewer • Updated Jul 23, 2024 • 4.61k • 853

WildEval/WildBench-V2

Viewer • Updated May 22, 2024 • 2.05k • 171

WildEval/WildBench-Results-v2-internal

Viewer • Updated May 21, 2024 • 30k • 229

WildEval/WildBench-Results-V2

Viewer • Updated May 20, 2024 • 10.2k • 220

WildEval/WildBench-v2-dev

Viewer • Updated Apr 19, 2024 • 5.99k • 7

WildEval/WildBench-dev

Viewer • Updated Apr 19, 2024 • 14.1k • 13 • 1

WildEval/NaturalChats

Viewer • Updated Apr 18, 2024 • 641k • 5