Projects
WM-ABench
An atomic evaluation of the world modeling abilities of Vision-Language Models
SimWorld
An Open-ended Simulator for Agents in Physical and Social Worlds

Voice-Language Foundation Models for Real-Time Autonomous Interaction and Speech Roleplay
ReasonerAgent
A fully open-source, ready-to-run agent that does research in a web browser and answers your queries
Decentralized Arena
Building Automated, Robust, and Transparent LLM Evaluation for Numerous Dimensions
Pandora
Towards General World Model with Natural Language Actions and Video States
LLM Reasoners
Library and Evaluation of State-of-the-art Advanced Reasoning with LLMs
MMToM-QA
Multimodal Theory of Mind Question Answering