OpenRLHF/Llama-3-8b-rm-700k • Text Ranking • 8B • 107 • 3
OpenRLHF/Llama-3-8b-rm-mixture • 8B • 64 • 1
OpenRLHF/Llama-2-7b-rm-anthropic_hh-lmsys-oasst-webgpt • 7B • 4 • 1
OpenRLHF/Mistral-7b-PRM-Math-Shepherd • 7B • 12 • 1
OpenRLHF/Llama-3-8b-iter-dpo-179k • Text Generation • 8B • 4
OpenRLHF/Llama-3-8b-rlhf-100k • Text Generation • 8B • 18 • 4
OpenRLHF/Llama-3-8b-sft-mixture • Text Generation • 8B • 1.56k • 1
OpenRLHF/Llama-2-7b-sft-model-ocra-500k • Text Generation • 7B • 10
OpenRLHF/Llama-2-13b-rm-anthropic_hh-lmsys-oasst-webgpt • 13B • 4
OpenRLHF/Llama-2-13b-sft-model-ocra-500k • Text Generation • 13B • 8 • 1