OpenAI has launched EVMbench, a new tool that measures the performance of artificial intelligence agents across a set of tests (benchmarks) on detecting, patching and exploiting vulnerabilities in smart contracts on Ethereum. The company developed EVMbench together with Paradigm, a firm that funds cryptocurrency network projects.
With EVMbench, AI agents on Ethereum are evaluated on three key tasks within the protocol.
First, they detect real vulnerabilities in open source Ethereum code, using data from public audits. They then rate the vulnerability’s risk and the audit rewards for the rest of the developers in the ecosystem.
Second, they propose patches to correct these vulnerabilities without breaking the operation of the protocol. “Agents modify vulnerable contracts and must preserve the intended functionality while eliminating exploitability, which is verified by automated testing and exploit checks,” says OpenAI.
Third, they simulate attacks to exploit these flaws and drain funds in controlled, sandboxed simulation environments. However, OpenAI clarifies that EVMbench “does not represent the full complexity of smart contract security in the real world.”
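The acceptance condition OpenAI describes for the patch task (functionality preserved, exploit no longer possible) can be sketched in a few lines of Python. This is only an illustration and not EVMbench's actual grading code: the `PatchResult` type, its field names and the `patch_accepted` function are hypothetical names chosen for this example.

```python
from dataclasses import dataclass


@dataclass
class PatchResult:
    """Hypothetical outcome of checking a patched contract (not an EVMbench type)."""
    tests_passed: bool       # automated tests still pass, so intended functionality is preserved
    exploit_succeeded: bool  # the known exploit still works against the patched contract


def patch_accepted(result: PatchResult) -> bool:
    """Accept a patch only if it keeps functionality AND removes exploitability."""
    return result.tests_passed and not result.exploit_succeeded
```

Under this reading, a patch that blocks the exploit but breaks the contract's tests is rejected, as is one that keeps the tests green but leaves the exploit working.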
EVMbench uses 120 real vulnerabilities drawn from 40 public audit competitions such as Code4rena. Performance tests conducted so far indicate that the GPT-5.3-Codex agent model achieved a 72.2% success rate in exploiting flaws. The earlier model, GPT-5, achieved only a 31.9% success rate.
In vulnerability detection and patching, however, the model's results are less encouraging.
“Performance is lower in detection and patching tasks. In the detection phase, agents sometimes stop after identifying a single issue rather than thoroughly auditing the codebase. In the patching phase, maintaining full functionality while eliminating subtle vulnerabilities remains a challenge.”
OpenAI, artificial intelligence company.
Why is EVMbench important?
According to the company, the importance of this agent auditing tool lies in the fact that smart contracts routinely secure $100 billion in crypto assets within open source protocols.
“As AI agents improve their skills in reading, writing, and executing code, it becomes increasingly important to measure their capabilities in economically relevant environments and to encourage the defensive use of AI systems to audit and strengthen deployed contracts,” the company said in its product announcement.
OpenAI's agents on Ethereum arrive at a time of rapid growth in autonomous agents across the cryptoasset ecosystem. As reported by CriptoNoticias, these agents can already interact with complex environments such as the Lightning Network. On this second layer, agents can manage liquidity channels and economic exchanges with other AIs.
Coinbase, for its part, launched Agentic Wallets, wallets that allow AI agents to operate on the Base network without paying fees. And Phantom, Solana’s most popular wallet, activated an MCP server that lets AI agents manage balances and operate autonomously.
According to data from Token Terminal, the Ethereum network reached 17.3 million weekly transactions. The surge in transactions on the network comes after the launch of ERC-8004 in August 2025. This is a standard that allows “discovering, choosing and interacting with agents across organizational boundaries without pre-existing trust” in open agent economies.
