AVMeme Exam: A Multimodal Multilingual Multicultural Benchmark for LLMs' Contextual and Cultural Knowledge and Thinking Paper • 2601.17645 • Published Jan 25 • 23
LiveMCP-101: Stress Testing and Diagnosing MCP-enabled Agents on Challenging Queries Paper • 2508.15760 • Published Aug 21, 2025 • 47
MMTok: Multimodal Coverage Maximization for Efficient Inference of VLMs Paper • 2508.18264 • Published Aug 25, 2025 • 25