Text Data Filtering
36
Search through the ROOTS corpus using queries
Use AI to translate text between languages
Analyze and visualize dataset characteristics and statistics
Search code snippets in the StarCoder dataset for matches
Explore the OBELICS dataset with an interactive map
Search large text corpora for information
Generate a curated web-text dataset for LLM training