New OpenAI model said to be “at genius level”, but still fails simple math
Also: Startup prompts its way to antibodies, Meta seeking financial partners for model development, and more.
A new “smartest to date” model from OpenAI naturally attracts some attention, but in this week’s newsletter we’re also zooming in on a lesser-known startup, as well as on what could be read as signs of strain at one of the industry’s larger players.
Apologies for not sending out the newsletter these last two weeks; some personal matters needed my attention.
Number of the Week
136. The IQ of OpenAI’s new o3 model, according to its results on Mensa Norway’s test. For comparison, Albert Einstein is thought to have had an IQ of 160. [Source]
New OpenAI model: “At genius level”, but still fails simple math
OpenAI’s new o3 model has been out for a few days now, and while some call it a “total game-changer”, there are also caveats to be aware of, highlighted both by the company itself and by users scrutinizing the new tool.
According to the company, scientists are telling it that the model produces “legitimately good and useful novel ideas”, as OpenAI’s president Greg Brockman put it during the presentation, and prominent voices are echoing the same message.
In a review on X, AI philosopher David Shapiro notes that it feels like we’ve passed a tipping point with the model:
“An epochal ‘point of no return’ where AI has not just surpassed most of us in terms of knowledge and low-level tasks, but logic, reasoning, and high-level, practical, utilitarian functionality.”
Game-changer
Derya Unutmaz, an AI-positive professor of immunology and biomedical scientist, goes a step further after having had early access to the model:
“I feel o3’s intelligence is at or near genius level!” he writes in a post also retweeted by OpenAI’s CEO Sam Altman.
“When I throw challenging clinical or medical questions at o3, its responses sound like they’re coming directly from a top subspecialist physicians: precise, thorough, confidently evidence-based, and remarkably professional, exactly what you’d expect from a genuine expert on the topic,” Unutmaz continues, calling it a “total game-changer” for science, medicine and likely many other fields.
Unutmaz also writes, though, that the model “never hallucinates”, which contradicts what OpenAI itself reports: o3 hallucinates about twice as much as its predecessor o1, and the company is not sure why. According to OpenAI, “more research is needed” to understand why hallucinations are getting worse as it scales up reasoning models.
Colin Fraser, a data scientist at OpenAI’s competitor Meta, points out one example: he gave the model a rather simple math question, which it failed.
When I tried the same prompt, o3 nailed it on the first try, so the problem seems to be a lack of reliability rather than a consistent failure, which Fraser also recognizes.
“At what point should it be reasonable to expect it not to fall for such gotchas? Ever?” he asks.
It seems plausible at least that a human math expert would nail it every time.
Startup prompts its way to novel antibodies
Biotech startup Profluent has just announced a new AI model that is able to design antibodies for 20 drug targets with $660 billion in historical sales, the California company reports.
The model is trained on over 3.4 billion protein sequences with the goal of devising new proteins that have not emerged through evolution and that could be used, for example, as treatments for diseases.
Further, Profluent is seeing that the results get even better as it scales up the compute and training data, mirroring the scaling laws observed in large language models (LLMs). This points to the possibility of even better results as training is ramped up.
“With continued scaling, we anticipate emergent capabilities that will sweep the antibody field,” the company writes.
Meta looking for partners to finance their AI flagship
The toll of AI investments can be felt even by a company that made $62.4 billion in profits last year.
Meta, the owner of Facebook, Instagram, and WhatsApp, has been seeking out financial partners to chip in on the training of their Llama series of AI models, The Information reports.
Other tech giants like Microsoft and Amazon were reportedly courted to participate in the so-called Llama Consortium as recently as the start of this year.
But Meta’s insistence on keeping the models open source made it harder to convince potential partners to contribute financially, according to two people familiar with the matter.
Meta released the first two models in its latest series, Llama 4, on April 5. On April 29 it will host LlamaCon, its first developer conference dedicated to generative AI.
Image of the Week
The robot Tien Kung Ultra won a half-marathon in Beijing after covering the 21 kilometers in about two hours and 40 minutes. Twenty robot companies participated in the race. [Source]
Exciting news out there?
A team of researchers from Lausanne University Hospital in Switzerland has developed a series of AI models to automate tuberculosis assessment from lung ultrasound images, which is useful, for instance, in areas without access to the necessary X-ray systems. The system showed promising results in detecting the disease, outperforming human experts by 9%. [Source]
Feedback?
Thanks for reading, and feel free to respond directly with suggestions for improvement, tips, and anything in between :)




