ST-Think
openinterx/ST-R1-mcq • 8B • Updated Mar 17, 2025 • 7
openinterx/Ego-ST-video • Viewer • Updated Mar 15, 2025 • 803 • 53 • 1
openinterx/Ego-ST-bench • Viewer • Updated Mar 29, 2025 • 93 • 258 • 1
ST-Think: How Multimodal Large Language Models Reason About 4D Worlds from Ego-Centric Videos • Paper • 2503.12542 • Published Mar 16, 2025 • 1
UGC-VideoCap
openinterx/UGC-VideoCap • Updated Aug 20, 2025 • 379
openinterx/UGC-VideoCaptioner • Video-Text-to-Text • 6B • Updated Jul 19, 2025 • 83 • 3
UGC-VideoCaptioner: An Omni UGC Video Detail Caption Model and New Benchmarks • Paper • 2507.11336 • Published Jul 15, 2025 • 6