Projects
Decentralized Arena
Building Automated, Robust, and Transparent LLM Evaluation for Numerous Dimensions
Pandora
Towards General World Model with Natural Language Actions and Video States
LLM Reasoners
Library and Evaluation of State-of-the-art Advanced Reasoning with LLMs
MMToM-QA
Multimodal Theory of Mind Question Answering