nof1.ai's AI crypto trading experiment: Deepseek and Grok-4 can earn money (but it's not certain)

By: Anry Sergeev | 20.10.2025, 13:56

The platform nof1.ai has launched the Alpha Arena experiment, in which six top AI models each received $10,000 in virtual capital and went to trade on the Hyperliquid crypto exchange to see which of them is the real 'wolf of Wall Street'. The participants are GPT-5, Gemini 2.5 Pro, Grok-4, Claude Sonnet 4.5, Deepseek V3.1, and Qwen3 Max, and all of them trade on the same input data.


Screenshot of the nof1.ai site at the time of writing

The main takeaway from the results: AI can analyze data, but it still doesn't understand risk. Deepseek acted like a cold-blooded trader, while GPT-5 behaved like an overly clever student who forgot about market emotions. Deepseek was the favorite at the start; it has now been joined by Grok-4, which sharply regained ground and posted excellent results. Claude Sonnet 4.5 stayed stable, while GPT-5 and Gemini 2.5 Pro kept falling.

Table. The state of trading at the time of writing

AI Model                     Current Account Value   Profit / Loss   Strategy Type
Deepseek Chat V3.1           $13,499.67              +35 %           Aggressive scalping, short trades
Grok-4                       $13,023.46              +30 %           Trend approach without adaptation
Claude Sonnet 4.5            $12,318.24              +23 %           Cautious trading, minimum risk
Qwen3 Max                    $10,650.24              +6.5 %          Unstable behavior, frequent mistakes
BTC Buy & Hold (benchmark)   $10,359.13              +3.6 %          Passive 'buy and hold'
GPT-5                        $7,171.31               −28 %           Excessive volatility, chaotic entries
Gemini 2.5 Pro               $6,665.16               −33 %           Balanced risk, moderate activity
  • Aggressive scalping is when the AI doesn't wait for the 'ideal moment' but collects profit in crumbs, opening dozens of trades a day.
  • Trend approach is a strategy in which the bot tries to earn by moving in the same direction as the market: if the price rises, it buys; if the price falls, it sells. 'Without adaptation' means the algorithm does not react to sharp changes or trend reversals. As a result, its decisions come late, and it can keep buying after the market has already turned.
  • Cautious trading means that the AI avoids risk and makes few trades, focusing on preserving its balance rather than on quick profits. Such a strategy favors stability but rarely brings large gains.
  • Unstable behavior means that the algorithm acts without a clear strategy: it opens trades too early or reacts late to market changes. Because of this, the results are unpredictable, with small profits alternating with large losses.
  • Passive 'buy and hold' is a one-time purchase of an asset, in this case bitcoin, with no further trades. This is the control scenario that shows how well or poorly the AI models perform compared to simply holding the asset.
  • Excessive volatility means that the AI constantly opens and closes positions, reacting to the slightest price fluctuations. As a result, the strategy looks chaotic: the bot enters the market too often, without waiting for a signal to be confirmed, and creates more noise than profit.
  • Balanced risk is when the bot does not rush into every market move but does not sit on the sidelines either. It opens trades only on clear signals, holds positions for a moderate time, and aims for stable profit without sharp jumps.
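
For readers who want to check the Profit / Loss column themselves, here is a minimal Python sketch. It assumes that each percentage is simply the change in account value relative to the $10,000 starting capital, and that the BTC buy-and-hold row started from the same $10,000; the site's exact method is not described here. The names START_CAPITAL, account_values and pct_return are made up for this illustration.

```python
# Assumption: every participant (and the BTC benchmark) started with $10,000.
START_CAPITAL = 10_000.00

# Account values copied from the table in this article.
account_values = {
    "Deepseek Chat V3.1": 13_499.67,
    "Grok-4": 13_023.46,
    "Claude Sonnet 4.5": 12_318.24,
    "Qwen3 Max": 10_650.24,
    "BTC buy and hold (benchmark)": 10_359.13,
    "GPT-5": 7_171.31,
    "Gemini 2.5 Pro": 6_665.16,
}


def pct_return(value, start=START_CAPITAL):
    """Percentage gain or loss relative to the starting capital."""
    return (value - start) / start * 100.0


# Return of the passive control scenario, used as the yardstick.
benchmark = pct_return(account_values["BTC buy and hold (benchmark)"])

for name, value in account_values.items():
    r = pct_return(value)
    # Show each return and its distance from the benchmark in percentage points.
    print(f"{name:<30} {r:+6.1f} %   vs. benchmark: {r - benchmark:+6.1f} pts")
```

Under these assumptions the computed figures round to the ones in the table. The last column of the output shows how far each model sits above or below simply holding bitcoin.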

Why this is interesting

nof1.ai turns AI testing into a real trading arena where every mistake costs money, albeit virtual money. These are not abstract benchmarks but a test of logic, risk management, and composure.

The results showed that AI models can count but do not yet always understand the market. Deepseek acts like a disciplined trader, Grok like an intuitive strategist, and GPT-5 is prone to panic.

Most importantly: for the first time, artificial intelligence is not just passing a test but competing for profit. And it seems to be getting better and better at it.