I woke up to another mind-blowing AI breakthrough — this time from Salesforce Research.
They’ve unveiled a new AI training pipeline called Webscale-RL, which matches the performance of traditional continual pre-training while using a tiny fraction of the data.
That’s right: it reaches equivalent results with up to 100× fewer tokens, roughly 99% less training data.
Here’s how the Webscale-RL pipeline works (each stage is sketched in code right after this list):
- 🧠 Extract Value: It draws on massive web-scale datasets, automatically filtering out low-quality content and keeping only documents with verifiable factual grounding.
- 👥 Define Perspective: The pipeline views each document through multiple lenses, assigning “personas” (such as an expert, a novice, or a critic) to draw out a diverse spectrum of questions.
- 🧩 Generate Puzzles: An LLM then generates question–answer pairs — effectively “puzzles” for another AI model to solve during RL training.
- ✅ Verify Integrity: A specialized checker validates that every puzzle can be solved directly from the source text — ensuring it’s not just pattern-matching or word-finding.
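To make the flow concrete, here is a minimal, self-contained sketch of those four stages. Every name in it (call_llm, PERSONAS, the word-count filter, the containment check) is a hypothetical stand-in for illustration, not the actual Salesforce pipeline; the real system uses LLMs for both generation and verification.

```python
# Minimal sketch of the four Webscale-RL stages described above.
# All helpers here are hypothetical stand-ins, not the real pipeline API.
from dataclasses import dataclass

PERSONAS = ["domain expert", "curious novice", "skeptical critic"]  # assumed example personas


@dataclass
class QAPair:
    persona: str
    question: str
    answer: str
    source_doc: str


def call_llm(prompt: str) -> str:
    """Stand-in for a real LLM call; returns a canned reply so the sketch runs end to end."""
    return "Paris"


def extract_value(doc: str) -> bool:
    """Stage 1: keep only documents with enough verifiable factual content (toy heuristic)."""
    return len(doc.split()) > 10 and any(ch.isdigit() or ch.isupper() for ch in doc)


def generate_pairs(doc: str) -> list[QAPair]:
    """Stages 2-3: view the document through several personas and ask an LLM
    for a question-answer pair grounded in the document for each one."""
    pairs = []
    for persona in PERSONAS:
        question = f"As a {persona}, what key fact does this passage establish?"
        answer = call_llm(f"{doc}\n\nPersona: {persona}\nQuestion: {question}")
        pairs.append(QAPair(persona, question, answer, doc))
    return pairs


def verify(pair: QAPair) -> bool:
    """Stage 4: accept a pair only if its answer is supported by the source text
    (a trivial containment check here; the paper uses an LLM-based checker)."""
    return pair.answer.lower() in pair.source_doc.lower()


def pipeline(docs: list[str]) -> list[QAPair]:
    """Run all four stages and return the verified question-answer pairs."""
    dataset = []
    for doc in docs:
        if not extract_value(doc):
            continue
        dataset.extend(p for p in generate_pairs(doc) if verify(p))
    return dataset


if __name__ == "__main__":
    sample = ["The capital of France is Paris, a city of about 2.1 million residents on the Seine."]
    for qa in pipeline(sample):
        print(qa.persona, "->", qa.question, "|", qa.answer)
```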
The result? Models trained via Webscale-RL reach the same benchmark scores as continual pre-training, while consuming a fraction of the data and compute.
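Once the pairs are verified, the RL side is conceptually simple: the policy model answers a question and gets rewarded only when its answer checks out against the reference. The exact-match reward below is an assumption for illustration, not the paper’s actual reward design.

```python
# Toy verifiable-reward function for RL on the generated QA pairs.
# Exact-match scoring is assumed here; the real reward design may differ.

def normalize(text: str) -> str:
    """Lowercase and collapse whitespace so trivial formatting differences don't matter."""
    return " ".join(text.lower().split())


def qa_reward(model_answer: str, reference_answer: str) -> float:
    """Return 1.0 when the policy's answer matches the verified reference, else 0.0."""
    return 1.0 if normalize(model_answer) == normalize(reference_answer) else 0.0


print(qa_reward("  Paris ", "paris"))  # 1.0
print(qa_reward("Lyon", "Paris"))      # 0.0
```

That’s the appeal of verifiable rewards: the signal comes straight from the data, with no separate reward model to train or to game.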
This isn’t just a technical upgrade — it’s a fundamental shift in AI efficiency. If scaling laws defined the last era, data laws might define the next one.
The 1.2M-example Webscale-RL dataset and pipeline are fully open-sourced — available now via @SFResearch.
This could redefine how we think about scaling intelligence — not with more data, but with smarter data.

