xAI’s Grok 4 Fast Reasoning has taken the top position on the Extended New York Times Connections Benchmark, a 759-puzzle test of language-model reasoning, setting a new record on that leaderboard. The result marks a notable step in AI reasoning capabilities and strengthens xAI’s position in a competitive AI landscape.
Breaking Down the Extended NYT Connections Benchmark Results
The Extended NYT Connections Benchmark is a demanding test for large language models (LLMs). It evaluates them on 759 NYT Connections puzzles, each with additional words included to increase difficulty. Unlike the standard benchmark, which has been nearing saturation, the extended version adds up to four extra trick words to each puzzle.
Key Performance Highlights:
- Grok 4 Fast Reasoning: Ranks #1 with record-breaking performance
- Grok 4: Secures #2 position, showcasing xAI’s comprehensive model strength
- Both models outperform major competitors, including OpenAI’s GPT-5 and o3-pro (medium reasoning), Google’s Gemini 2.5 Pro, DeepSeek, and Qwen 3
What Makes This Achievement Significant?
The New York Times Connections puzzle requires abstract reasoning, pattern recognition, and linguistic understanding: the solver must sort a grid of words into four hidden groups of four. Previous research has shown that even the best-performing LLMs could fully solve only 8% of standard Connections games, which puts the current results in context.
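To make "fully solve" concrete, here is a minimal Python sketch of a check that a proposed answer matches all four groups of a puzzle. The function name `is_fully_solved` and the toy puzzle are illustrative assumptions; this is not the benchmark's actual scoring code, which the article does not describe.

```python
from typing import Iterable


def is_fully_solved(proposed: Iterable[Iterable[str]],
                    solution: Iterable[Iterable[str]]) -> bool:
    """Return True only if every proposed group matches a solution group exactly.

    Group order and word order inside a group are ignored; case is normalized.
    """
    def normalize(groups):
        return {frozenset(w.strip().upper() for w in g) for g in groups}

    return normalize(proposed) == normalize(solution)


# Toy puzzle (hypothetical, not taken from the NYT archive).
solution = [
    ["apple", "pear", "plum", "fig"],
    ["red", "blue", "green", "pink"],
    ["cat", "dog", "cow", "pig"],
    ["oak", "elm", "ash", "yew"],
]

# Same groups in a different order: still a full solve.
guess_ok = [list(reversed(g)) for g in reversed(solution)]

# "ash" and "fig" swapped between two groups: not a full solve.
guess_bad = [
    ["apple", "pear", "plum", "ash"],
    ["red", "blue", "green", "pink"],
    ["cat", "dog", "cow", "pig"],
    ["oak", "elm", "fig", "yew"],
]

print(is_fully_solved(guess_ok, solution))   # True
print(is_fully_solved(guess_bad, solution))  # False
```

Under this all-or-nothing definition a single misplaced word fails the whole puzzle, which is one reason full-solve rates can be low even when most groups are found.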
The Enhanced Challenge
The Extended benchmark increases difficulty by:
- Adding up to four extra trick words per puzzle
- Creating more complex semantic relationships
- Testing deeper reasoning capabilities
- Requiring more nuanced pattern recognition
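The first item in the list above can be sketched in a few lines of Python. This is only an illustration of the idea of adding trick words: the function `extend_puzzle`, the distractor pool, and the way extras are chosen are all assumptions, since the benchmark's actual construction procedure is not given here.

```python
import random


def extend_puzzle(groups, distractor_pool, max_extra=4, seed=None):
    """Return (words, extras): the puzzle's words plus up to `max_extra` distractors, shuffled."""
    rng = random.Random(seed)
    base_words = [w for g in groups for w in g]
    # Only consider distractors that are not already part of the puzzle.
    candidates = [w for w in distractor_pool if w not in base_words]
    n_extra = min(max_extra, len(candidates))
    extras = rng.sample(candidates, n_extra)
    words = base_words + extras
    rng.shuffle(words)
    return words, extras


# Hypothetical puzzle and distractor pool, for illustration only.
groups = [
    ["apple", "pear", "plum", "fig"],
    ["red", "blue", "green", "pink"],
    ["cat", "dog", "cow", "pig"],
    ["oak", "elm", "ash", "yew"],
]
pool = ["lime", "bat", "pine", "rose", "red"]  # "red" is skipped: already in the puzzle

words, extras = extend_puzzle(groups, pool, seed=0)
print(len(words))  # 20
```

A solver facing the extended puzzle has to leave the extra words ungrouped, so a distractor such as "lime" (fruit or color?) makes a plausible but wrong grouping more tempting.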
xAI’s Dominance in AI Reasoning
This achievement represents more than just a single benchmark victory. xAI recently launched Grok-4-Fast, delivering top-tier performance at a price per task up to 98% lower than Grok 4’s, which shows the company’s ability to combine efficiency with strong performance.
Technical Advantages of Grok 4 Fast Reasoning
Several reported design choices may contribute to the success of Grok 4 Fast Reasoning:
Unified Architecture: The model merges “reasoning” and “non-reasoning” behaviors into a single set of weights controllable via system prompts, allowing for more flexible and efficient processing.
Advanced Training Methods: The model was trained end-to-end with tool-use reinforcement learning, enabling more sophisticated problem-solving approaches.
Extensive Context Window: Features a 2 million token context window. A single Connections puzzle of 16 to 20 words fits comfortably within it, so long context is more relevant to other workloads than to this benchmark.
Competitive Landscape Analysis
The benchmark results reveal a shifting competitive landscape in AI reasoning:
xAI’s Double Victory
- Grok 4 Fast Reasoning: #1 position
- Grok 4: #2 position
This one-two finish demonstrates xAI’s comprehensive approach to AI development, offering both speed-optimized and performance-maximized solutions.
Major Competitors’ Performance
The results show that established AI leaders like OpenAI, Google, and others are facing increased competition from xAI’s innovative approaches. This benchmark specifically highlights the challenges that even advanced models like GPT-5 and Gemini 2.5 Pro face when confronting complex reasoning tasks.
Implications for AI Development
Cost-Effective Intelligence
Grok 4 Fast performs on par with Grok 4 in most tasks but uses about 40 percent less compute, with price per task dropping by as much as 98 percent. This efficiency breakthrough suggests that high-performance reasoning capabilities can be achieved without proportional increases in computational costs.
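The two figures above can be related with a short calculation. As a rough sketch, assume that price per task is simply compute per task multiplied by a unit price (a simplifying assumption, not something the article states). The function name below is hypothetical.

```python
def implied_unit_price_ratio(compute_saving: float, price_drop: float) -> float:
    """New-to-old unit price ratio, assuming price per task = compute x unit price."""
    remaining_price = 1.0 - price_drop        # fraction of the old per-task price still paid
    remaining_compute = 1.0 - compute_saving  # fraction of the old compute still used
    return remaining_price / remaining_compute


# Figures quoted in the article: about 40% less compute, up to 98% lower price per task.
ratio = implied_unit_price_ratio(compute_saving=0.40, price_drop=0.98)
print(f"implied unit price: {ratio:.1%} of the old unit price")
```

Under that assumption, the compute saving alone would cut the price per task by only 40%; most of the quoted drop would have to come from a much lower unit price.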
Real-World Applications
The success on the Extended NYT Connections Benchmark indicates potential applications in:
- Complex problem-solving scenarios
- Pattern recognition tasks
- Educational assessment tools
- Creative writing and content generation
- Strategic planning and analysis
Technical Specifications and Availability
Model Features
- Performance: Record-breaking Extended NYT Connections Benchmark results
- Efficiency: About 40% less compute than standard Grok 4
- Context: 2 million token context window
- Pricing: $0.20/million input tokens and $0.50/million output tokens
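As a quick illustration of what the listed pricing means in practice, the sketch below estimates the cost of a request from its token counts. The prices come from the list above; the token counts per puzzle are made-up placeholders, not measurements from the benchmark.

```python
# Prices from the list above, in USD per million tokens.
INPUT_PRICE_PER_M = 0.20
OUTPUT_PRICE_PER_M = 0.50


def request_cost_usd(input_tokens: int, output_tokens: int) -> float:
    """Cost of one request at the listed per-million-token prices."""
    return (input_tokens * INPUT_PRICE_PER_M
            + output_tokens * OUTPUT_PRICE_PER_M) / 1_000_000


# Hypothetical token counts for a single puzzle (assumption for illustration).
cost_one = request_cost_usd(input_tokens=500, output_tokens=4_000)
cost_all = 759 * cost_one  # one request per puzzle across the 759-puzzle benchmark

print(f"per puzzle: ${cost_one:.4f}")   # per puzzle: $0.0021
print(f"759 puzzles: ${cost_all:.2f}")  # 759 puzzles: $1.59
```

Because output tokens cost more than input tokens and reasoning models tend to produce long outputs, the output side usually dominates the bill in a setup like this.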
Access Options
Grok 4 is available to SuperGrok and Premium+ subscribers, as well as through the xAI API, with free access available on various platforms.
Looking Forward: The Future of AI Reasoning
This benchmark achievement represents more than a technical milestone; it signals a new era in AI reasoning capabilities. The ability to excel at abstract reasoning tasks like the Extended NYT Connections puzzles suggests that AI models are approaching human-like cognitive flexibility in specific domains.
Industry Impact
The results demonstrate that:
- Competition in AI reasoning is intensifying
- Cost-effective high performance is achievable
- Abstract reasoning capabilities are rapidly advancing
- New benchmarks may be needed as current ones approach saturation
Conclusion
Grok 4 Fast Reasoning’s record-breaking performance on the 759-puzzle Extended NYT Connections Benchmark marks a significant milestone in AI development. By achieving the #1 ranking while maintaining cost efficiency, xAI has shown that progress in artificial intelligence depends not just on raw computational power, but also on optimization and architectural choices.
This achievement, combined with Grok 4’s #2 ranking, establishes xAI as a formidable force in the AI landscape, challenging established players and setting new standards for what’s possible in artificial reasoning capabilities. As these models become more accessible and cost-effective, we can expect to see rapid adoption across industries requiring sophisticated problem-solving and abstract reasoning capabilities.