OpenVLThinker: An Early Exploration to Complex Vision-Language Reasoning via Iterative Self-Improvement Paper • 2503.17352 • Published Mar 21 • 24
DuoGuard: A Two-Player RL-Driven Framework for Multilingual LLM Guardrails Paper • 2502.05163 • Published Feb 7 • 22
Flow-DPO: Improving LLM Mathematical Reasoning through Online Multi-Agent Learning Paper • 2410.22304 • Published Oct 29, 2024 • 18
Enhancing Large Vision Language Models with Self-Training on Image Comprehension Paper • 2405.19716 • Published May 30, 2024
MIRAI: Evaluating LLM Agents for Event Forecasting Paper • 2407.01231 • Published Jul 1, 2024 • 18
Mitigating Object Hallucination in Large Vision-Language Models via Classifier-Free Guidance Paper • 2402.08680 • Published Feb 13, 2024 • 1
Robust Learning with Progressive Data Expansion Against Spurious Correlation Paper • 2306.04949 • Published Jun 8, 2023
Rephrase and Respond: Let Large Language Models Ask Better Questions for Themselves Paper • 2311.04205 • Published Nov 7, 2023 • 5
Towards Understanding Mixture of Experts in Deep Learning Paper • 2208.02813 • Published Aug 4, 2022 • 1
Understanding Transferable Representation Learning and Zero-shot Transfer in CLIP Paper • 2310.00927 • Published Oct 2, 2023 • 1
Self-Play Fine-Tuning Converts Weak Language Models to Strong Language Models Paper • 2401.01335 • Published Jan 2, 2024 • 68