Hyperwrite's Reflection 70b, the World's Most Powerful Open Source AI Model

Matt Shumer, co-founder and CEO of AI writing startup HyperWrite, has introduced Reflection 70B, a new large language model (LLM) built on Meta’s Llama 3.1-70B Instruct, which incorporates an innovative error self-correction feature and delivers top-tier performance in third-party benchmarks. Shumer shared on the social media platform X that Reflection 70B may now be “the world’s leading open-source AI model.”

Reflection 70B has been extensively tested on various benchmarks, including MMLU and HumanEval, with results validated by LMSys’s LLM Decontaminator to ensure accuracy. The model consistently outperforms Meta’s Llama series and even competes with high-end commercial models. A demo of the model is available on a playground site, though high demand has caused traffic overload, prompting Shumer’s team to source more GPUs to meet user interest.

Reflection 70B holds its own against even the top closed-source models (Claude 3.5 Sonnet, GPT-4o).

It’s the top LLM in (at least) MMLU, MATH, IFEval, GSM8K.

Beats GPT-4o on every benchmark tested.

It clobbers Llama 3.1 405B. It’s not even close. pic.twitter.com/win7cHUOob
— Matt Shumer (@mattshumer_) September 5, 2024

What sets Reflection 70B apart is its error detection and correction capability, a feature Shumer has been working on for months. Unlike typical LLMs, which can hallucinate without correcting themselves, Reflection 70B can assess its output for accuracy before presenting it to users. This functionality is powered by a technique known as reflection tuning, allowing the model to catch and fix errors during the reasoning process. The model also introduces new special tokens for reasoning and error correction, enhancing user interaction and enabling real-time error resolution.

The playground demo — Hyperwrite's Reflection 70b, the World's Most Powerful Open Source AI Model 2

Shumer revealed that a larger model, Reflection 405B, is in the works, expected to launch soon and surpass the performance of leading closed-source models. HyperWrite plans to integrate Reflection 70B into its AI writing assistant, promising to enhance its capabilities further.

I want to be very clear — @GlaiveAI is the reason this worked so well.

The control they give you to generate synthetic data is insane.

I will be using them for nearly every model I build moving forward, and you should too. https://t.co/I789UIa5Yg
— Matt Shumer (@mattshumer_) September 5, 2024

A crucial factor in Reflection 70B’s rapid development is the synthetic data generated by Glaive, a startup that specializes in creating task-specific datasets. Glaive’s technology accelerated the training process, allowing the HyperWrite team to create custom data in hours instead of weeks. The training of Reflection 70B took three weeks, with five iterations of the model completed during that period, thanks to Glaive’s platform.

Founded in 2020 by Shumer and Jason Kuperberg, HyperWrite initially focused on email generation through a Chrome extension. The company has since expanded its AI-driven offerings, gaining over two million users and securing funding from investors like Madrona Venture Group. With continued emphasis on accuracy and safety, HyperWrite is refining its AI tools based on user feedback.

Looking ahead, Shumer has ambitious plans for the Reflection series, with the upcoming Reflection 405B model expected to outshine even the most advanced closed-source LLMs. This could pose a challenge for companies like OpenAI, Anthropic, and Microsoft, as the balance of power in the generative AI space continues to shift toward open-source models like Reflection. For developers and researchers, the release of Reflection 70B offers access to a powerful, cutting-edge tool that could redefine the potential of open-source AI.

Hyperwrite’s Reflection 70b, the World’s Most Powerful Open Source AI Model

What Made Nvidia the World’s Most Valuable Company

Nvidia Introduces Device Aimed at Small Businesses and Hobbyists

For the First Time in Its History, Tesla’s Sales Declined Year Over Year

To Combat Scams, Telegram Adds Third Party Verification