Robust Reinforcement Learning from Human Feedback for Large Language Models Fine-Tuning Paper • 2504.03784 • Published Apr 3
AdaDetectGPT: Adaptive Detection of LLM-Generated Text with Statistical Guarantees Paper • 2510.01268 • Published Sep 29
An Instrumental Variable Approach to Confounded Off-Policy Evaluation Paper • 2212.14468 • Published Dec 29, 2022