Collections
Collections including paper arxiv:2512.17012
- ReVSeg: Incentivizing the Reasoning Chain for Video Segmentation with Reinforcement Learning
  Paper • 2512.02835 • Published • 10
- Joint 3D Geometry Reconstruction and Motion Generation for 4D Synthesis from a Single Image
  Paper • 2512.05044 • Published • 17
- Entropy Ratio Clipping as a Soft Global Constraint for Stable Reinforcement Learning
  Paper • 2512.05591 • Published • 17
- SpaceControl: Introducing Test-Time Spatial Control to 3D Generative Modeling
  Paper • 2512.05343 • Published • 25

- UniRef++: Segment Every Reference Object in Spatial and Temporal Spaces
  Paper • 2312.15715 • Published • 20
- Spatial-MLLM: Boosting MLLM Capabilities in Visual-based Spatial Intelligence
  Paper • 2505.23747 • Published • 69
- VideoPrism: A Foundational Visual Encoder for Video Understanding
  Paper • 2402.13217 • Published • 40
- Scaling RL to Long Videos
  Paper • 2507.07966 • Published • 162

- MeshCoder: LLM-Powered Structured Mesh Code Generation from Point Clouds
  Paper • 2508.14879 • Published • 69
- VoxHammer: Training-Free Precise and Coherent 3D Editing in Native 3D Space
  Paper • 2508.19247 • Published • 43
- Pixie: Fast and Generalizable Supervised Learning of 3D Physics from Pixels
  Paper • 2508.17437 • Published • 37
- Multi-View 3D Point Tracking
  Paper • 2508.21060 • Published • 23