Om AI Lab

company

https://github.com/om-ai-lab

OmAI_lab

om-ai-lab


AI & ML interests

Multimodal AI, Agents

Recent Activity

tianchez submitted a paper 1 day ago

Which Pretraining Paradigm Better Serves Spatial Intelligence? An Empirical Comparison of Vision-Language and Video Generation Models

kyusonglee updated a model about 2 months ago

omlab/opentrackvla-qwen06b

Zilun updated a dataset 4 months ago

omlab/SARDet_REC6_NORM-FS


Papers

Which Pretraining Paradigm Better Serves Spatial Intelligence? An Empirical Comparison of Vision-Language and Video Generation Models

VLM-FO1: Bridging the Gap Between High-Level Reasoning and Fine-Grained Perception in VLMs


Articles

Trials, Errors, and Breakthroughs: Our Rocky Road to OVD SOTA with Reinforcement Learning

Mar 25, 2025


Improving Object Detection through Reinforcement Learning with VLM-R1

Mar 25, 2025


omlab's Spaces 5

Open Agent Leaderboard

🥇

Open Agent Leaderboard

VLM R1 Referral Expression

💬

Mark regions in images based on text descriptions

OmAgent

💬

Process and answer questions about webpage videos

VLM R1 OVD

👁

VLM-R1 model for Open-Vocabulary Object Detection

README

🏆

Team members 5
