X.ai, Elon Musk’s AI startup, has unveiled its latest generative AI model, Grok-1.5. Set to power social network X’s Grok chatbot in the not-too-distant future (“in the next few days,” X.ai writes in a blog post), Grok-1.5 appears to be a measurable upgrade over its predecessor, Grok-1, at least judging by the benchmark results and specifications X.ai has released.
Grok-1.5 benefits from “enhanced reasoning,” according to X.ai, particularly when it comes to coding and math-related tasks. The model more than doubles the Grok-1’s score on a popular math benchmark, MATH, and scores over ten percentage points better on the HumanEval test of programming language generation and problem-solving abilities.
Of course, it is difficult to predict how these results will translate into real-world use. As we wrote recently, commonly used AI benchmarks, which measure things as esoteric as performance on graduate-level chemistry exam questions, don’t quite capture how the average person interacts with models today.
One improvement that should lead to noticeable gains is the amount of context Grok-1.5 can take in compared to Grok-1.
Grok-1.5 has a context window of 128,000 tokens, where “tokens” refers to chunks of raw text (e.g., the word “fantastic” split into “fan,” “tas,” and “tic”). The context window refers to the input data (in this case, text) that a model considers before generating output (more text). Models with small context windows tend to forget the content of even very recent conversations, while models with larger context windows avoid this pitfall and, as an added bonus, better grasp the flow of the data they take in.
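To make the idea of a context window concrete, here is a minimal Python sketch. It is only an illustration: `toy_tokenize` and `visible_context` are hypothetical helpers invented for this example, the whitespace split stands in for a real subword tokenizer, and none of it reflects how Grok-1.5 actually tokenizes or manages context.

```python
from collections import deque


def toy_tokenize(text):
    # Hypothetical stand-in for a real tokenizer: splits on whitespace.
    # Real models use subword tokens, so counts would differ.
    return text.split()


def visible_context(messages, window_size):
    # Keep only the most recent `window_size` tokens; older tokens
    # fall off the left end of the deque and are "forgotten".
    window = deque(maxlen=window_size)
    for message in messages:
        window.extend(toy_tokenize(message))
    return list(window)


conversation = [
    "My name is Ada",
    "I like long walks",
    "What is my name",
]

# A tiny window drops the start of the conversation.
small = visible_context(conversation, window_size=6)
# A 128,000-token window easily holds all of it.
large = visible_context(conversation, window_size=128_000)

print("Ada" in small)  # False: the name was pushed out of the window
print("Ada" in large)  # True: the whole conversation still fits
```

In this toy setup, the small window can no longer answer the final question because the token carrying the answer has been discarded, which is the forgetting behavior described above.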
“[Grok-1.5 can] use information from significantly larger documents,” X.ai writes in the aforementioned blog post. “Furthermore, the model can handle larger and more complex prompts while still maintaining its ability to follow instructions as the context window expands.”
What has historically set X.ai’s Grok models apart from other generative AI models is that they answer questions about topics that are typically off-limits to other models, such as conspiracy theories and more controversial political ideas. The models also answer questions with “a rebellious streak,” as Musk has described it, and with outright rude language if asked.
It is unclear what changes, if any, Grok-1.5 brings to these areas. X.ai doesn’t mention this in the blog post.
Grok-1.5 will soon be available to early testers on X, X.ai says, accompanied by “several new features.” Musk has previously hinted at features for summarizing threads and replies and for suggesting content for posts. We’ll see whether those arrive soon enough.
The announcement of Grok-1.5 comes on the heels of X.ai open sourcing Grok-1, albeit without the code needed to fine-tune or further train it. More recently, Musk said that more users on X, specifically those paying for X’s $8-per-month Premium plan, would gain access to the Grok chatbot, which was previously available only to X Premium+ customers (who pay $16 per month).