Grok-2, the latest AI chatbot integrated into and trained on content from X, has just launched in beta, marking a significant improvement over its predecessor. This new version now ranks among top AI chatbots like ChatGPT, Claude, and Google Gemini.
Shortly after its debut, Grok-2 secured a spot in the top five on the LMSYS Chatbot Arena leaderboard, which rates leading language models based on human evaluations. The leaderboard is typically dominated by Google, OpenAI, and Anthropic, so Grok-2’s placement was a major achievement for its creator, xAI.
Grok-2 has also received a visual update that brings it closer to other chatbot interfaces. It can now generate images using Flux, an AI image generation model from Black Forest Labs whose output rivals that of industry leader Midjourney.
After testing Grok-2 for several days, I found it to be as responsive as ChatGPT, with a better sense of humor and the added advantage of responding to real-time events via X.
Grok-2-mini is available to Premium subscribers of X. When you first open Grok, you’ll see a standard “ask” box, a list of suggested ideas, and trending topics from X that Grok can explain or provide answers to.
1. The Ego Check
My initial test was a search for “who is Ryan Morrison,” and I quickly had the humbling experience of sharing a name with someone more famous (in this case, the video game attorney Ryan Morrison). However, adding “AI journalist” to the query yielded accurate results.
Grok pulled in information from my X bio, my Tom’s Guide profile, and other content I’ve posted on X or that others have posted about me—almost entirely related to AI.
It also displayed X posts that referenced me, though only one was relevant, with the rest coming from random users named Ryan who mentioned AI.
I then tested the ‘who is’ query with my more well-known boss, Mark Spoonauer, the Global Editor-in-Chief of Tom’s Guide. Grok provided a detailed bullet-point summary of his career, editorial philosophy, and X posts, along with some unrelated X content.
2. The Coding Test
I asked Grok-2 to create a Python text adventure game called “The Enchanted Forest” with specific features, including interconnected locations, items to collect, and a puzzle to solve. The prompt also requested a player class, a room class, commands for various actions, a clear win condition, and runnable code.
The code worked well, producing a simple text adventure game I could play in the Terminal on my MacBook.
However, when I asked Grok-2 to create a version with a user interface, it generated code that produced multiple errors, which it couldn’t fix. Based on this test, its coding ability seems roughly on par with GPT-3.5.
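To give a sense of what the first, text-only prompt asks for, here is a minimal sketch of that kind of game. It is not the code Grok-2 produced. The room names, the key-and-gate puzzle, and the helper names `build_world`, `handle_command`, and `run` are all hypothetical, chosen only to illustrate a player class, a room class, a few commands, and a win condition.

```python
from dataclasses import dataclass, field


@dataclass
class Room:
    """One location: a description, exits to other rooms, and loose items."""
    name: str
    description: str
    exits: dict = field(default_factory=dict)  # direction -> room name
    items: list = field(default_factory=list)


@dataclass
class Player:
    """Tracks where the player is, what they carry, and whether they won."""
    location: str
    inventory: list = field(default_factory=list)
    won: bool = False


def build_world():
    """Return a tiny hypothetical map: three rooms joined by exits."""
    return {
        "clearing": Room("clearing", "A sunlit clearing.", {"north": "grove"}),
        "grove": Room(
            "grove",
            "A grove of old oaks. A locked gate stands to the east.",
            {"south": "clearing", "east": "shrine"},
            ["key"],
        ),
        "shrine": Room("shrine", "A mossy shrine at the heart of the forest."),
    }


def handle_command(player, world, command):
    """Apply one command and return the text to show the player."""
    words = command.lower().split()
    if not words:
        return "Say something."
    verb, arg = words[0], " ".join(words[1:])
    room = world[player.location]

    if verb == "look":
        items = ", ".join(room.items) or "nothing"
        return f"{room.description} You see: {items}."
    if verb == "inventory":
        return ", ".join(player.inventory) or "You carry nothing."
    if verb == "take":
        if arg in room.items:
            room.items.remove(arg)
            player.inventory.append(arg)
            return f"You take the {arg}."
        return "There is no such item here."
    if verb == "go":
        target = room.exits.get(arg)
        if target is None:
            return "You can't go that way."
        # The puzzle: the shrine gate opens only if the player holds the key.
        if target == "shrine" and "key" not in player.inventory:
            return "The gate is locked."
        player.location = target
        if target == "shrine":
            player.won = True  # win condition: reach the shrine
            return "The gate swings open. You reach the shrine and win!"
        return world[target].description
    return "Unknown command."


def run(commands):
    """Feed commands to the game until the player wins; return all output."""
    world, player = build_world(), Player("clearing")
    output = []
    for command in commands:
        output.append(handle_command(player, world, command))
        if player.won:
            break
    return output
```

Calling `handle_command` inside a loop that reads from `input()` and prints each result would turn this sketch into something playable in a terminal.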
3. Trending Topics
One of Grok’s standout features, including in this latest version, is its ability to analyze trending topics and pull in content from across the X platform.
This feature makes Grok particularly useful for staying updated on news stories. For example, when I asked for information on the release of Luma Labs Dream Machine 1.5, Grok provided a concise summary and X posts showcasing examples. When I requested more details, it offered a bullet-point breakdown of the new features and additional X posts from users demonstrating the new model.
Final Thoughts
I’ve previously said that Grok is a powerful AI search tool mainly because of its integration with X rather than the model itself. But with Grok-2, that narrative changes. The new model is now on par with ChatGPT and Claude in terms of responsiveness, with the added benefit of being more open and less likely to decline a request.
The addition of Flux gives Grok-2 the ability to generate images for the first time, even in connection with news stories, adding a compelling layer to understanding current events.
With version 2, Grok has become a serious contender among the major AI chatbot platforms. It also supports the idea that for an app to be a true “everything app,” it must include robust AI integration, especially with live data access to bring everything together.