AI Bots

Why AI Bots?

Human play-testing is slow, inconsistent, and doesn't scale across 5 game steps × 3 layouts. An AI bot provides:

Deterministic testing — same seed, same input, same outcome. Every level change produces a measurable difference in bot score.
Edge case detection — bots expose physics bugs (vertical loops, dead zones) that manual play might miss.
Rapid iteration — a bot can play 50 levels in the time a human plays one. Level balance tweaks get immediate feedback.
Benchmark baseline — bot score is the "minimum viable experience." If the bot can't clear a level, a human player probably can't either.

Bot Versions

Each bot version gets a number and a changelog entry. Improvements are measured by: average score, clear rate (levels completed), survival time (lives remaining), and bricks cleared %.

Version	Game	Strategy	Status
Breakout v1	Breakout	Reactive — tracks ball X within 10px dead zone, speed 200	✅ Live
Breakout v2	Breakout	Trajectory prediction with wall-bounce calculation	✅ Live
Breakout v3	Breakout	Alternating center zone — breaks ball cycling corridor on center hits	✅ Live
Breakout v4	Breakout	4-zone cycling — sweeps full field via paddle edge offsets	✅ Live (current)
Breakout v5	Breakout	Brick-scanning — experimental, outperformed by v4	📉 Reverted
Gun Fight v1	Gun Fight	Random-direction strafe + fire-toward-opponent within 450px range	✅ Live

Breakout Bot

v1 — Reactive Tracker

The original bot simply follows the ball's current X position. It has a 10px dead zone to prevent jitter and moves at speed 200. It works well for simple layouts but fails on multi-ball levels (steps 3-5), where it can't track multiple active balls simultaneously.

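updateBot(paddle) {
  // Horizontal gap between the ball and the paddle
  const diff = this.ball.x - paddle.x;
  if (Math.abs(diff) > 10) {
    // Outside the 10px dead zone: chase the ball at fixed speed 200
    paddle.body.setVelocityX(diff > 0 ? 200 : -200);
  } else {
    // Inside the dead zone: stop to avoid jitter
    paddle.body.setVelocityX(0);
  }
  // Keep the paddle inside the playfield (W is the game width, defined elsewhere)
  paddle.x = Phaser.Math.Clamp(paddle.x, 40, W - 40);
  // Release a ball that is stuck to the paddle
  if (this.stickyBall) this.releaseSticky();
}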

Limitations: No trajectory prediction. Bot reacts to the ball's current position, not where it will land. On fast angles where the ball bounces off a wall, the bot chases the ball's current X instead of moving to the predicted interception point. Results in ~3-5 missed returns per level on average.

v2 — Trajectory Predictor

Improves on v1 by computing where the ball will intersect the paddle's Y level, accounting for wall bounces. This means the bot moves to the interception point rather than chasing the ball.

predictLanding(ball) {
  // Predict the X where the ball crosses the paddle's Y level
  const PADDLE_Y = 438, LEFT = 6, RIGHT = 674; // playfield limits in px
  const DT = 0.016;                            // one frame at ~60fps
  let px = ball.x, py = ball.y;
  let vx = ball.body.velocity.x;
  const vy = ball.body.velocity.y;
  if (vy <= 0) return px; // ball moving up: no landing to predict yet

  // Step forward one frame at a time until the ball reaches paddle Y
  while (py < PADDLE_Y) {
    px += vx * DT;
    py += vy * DT;
    // Reflect off the side walls
    if (px < LEFT || px > RIGHT) { vx *= -1; px = px < LEFT ? LEFT : RIGHT; }
  }
  return px;
}

updateBot() {
  let tx = this.predictLanding(this.ball);
  let diff = tx - this.paddle.x;
  this.paddle.body.setVelocityX(Math.abs(diff) > 10 ? (diff > 0 ? 300 : -300) : 0);
  this.paddle.x = Phaser.Math.Clamp(this.paddle.x, 40, W - 40);
  if (this.stickyBall) this.releaseSticky();
}

Improvements over v1: Faster response (speed 300 vs 200), trajectory prediction eliminates chasing, wall-bounce awareness means the bot positions for the ball's actual landing point. Missed returns drop to ~1-2 per level.
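The frame-by-frame loop in predictLanding can also be written in closed form by "unfolding" the wall bounces. The sketch below is an alternative for illustration, not the bot's actual code: it assumes straight-line motion between the side walls, ignores bricks entirely, and reuses the constants 438, 6, and 674 from the listing above. The name predictLandingX and the bounds parameter are hypothetical.

```javascript
// Closed-form landing prediction (illustrative alternative to the stepping loop).
// bounds reuses the values from predictLanding; the parameter itself is an assumption.
function predictLandingX(x, y, vx, vy, bounds = { left: 6, right: 674, paddleY: 438 }) {
  if (vy <= 0) return x;                      // ball moving up: same fallback as the loop
  const t = (bounds.paddleY - y) / vy;        // seconds until the ball reaches paddle Y
  if (t <= 0) return x;                       // already at or below the paddle line
  const width = bounds.right - bounds.left;
  const period = 2 * width;                   // one full left-right-left traversal
  const raw = (x - bounds.left) + vx * t;     // X travelled as if there were no walls
  const m = ((raw % period) + period) % period; // wrap into [0, period), also for negatives
  const folded = m <= width ? m : period - m;   // mirror the second half back into the field
  return bounds.left + folded;
}
```

Because the loop advances in 0.016s steps and snaps to the wall on overshoot, its result can differ from this closed form by a few pixels, which is well inside the bot's 10px dead zone.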

v3 — Alternating Center Zone

Fixes a degenerate case exposed by v2's trajectory prediction. When the ball hit the paddle center zone, the 0° angle produced purely vertical motion. The min-vx guard (±60) created a narrow corridor that only hit the center column of bricks — cycling forever without spreading to other bricks.

// v3: center zone alternates direction every hit
var angles = [-65, -35, this._paddleHits % 2 === 0 ? 20 : -20, 35, 65];

Result: Even returns go +20°, odd returns go -20°. The ball naturally spreads across the full width within 3-4 returns.
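As a rough illustration of the alternating center zone, the sketch below picks the return angle for a paddle zone and converts it to a velocity. It assumes angles are measured from straight up (0° is vertical, negative is left) and that screen Y grows downward. The function names and the speed value used in the example are hypothetical, not taken from the game code.

```javascript
// Return angle in degrees for a paddle zone (0..4); zone 2 is the center.
function bounceAngle(zone, paddleHits) {
  const center = paddleHits % 2 === 0 ? 20 : -20; // v3: alternate on every paddle hit
  const angles = [-65, -35, center, 35, 65];
  return angles[zone];
}

// Convert an angle from vertical into a velocity vector.
// Assumption: screen coordinates, so "up" is negative Y.
function angleToVelocity(angleDeg, speed) {
  const rad = (angleDeg * Math.PI) / 180;
  return { vx: speed * Math.sin(rad), vy: -speed * Math.cos(rad) };
}

// Example: two consecutive center hits leave in opposite directions.
const first = angleToVelocity(bounceAngle(2, 0), 300);  // heads right
const second = angleToVelocity(bounceAngle(2, 1), 300); // heads left
```

With a ±20° center angle the horizontal component no longer collapses to zero, so the ball leaves the center column instead of cycling in it.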

Benchmark (10 trials × 20s on step-01): avg score 36.7, median 21.5, high 113, bricks 13.3, 100% survival. Median +43% over v2 (15→21.5).

v4 — 4-Zone Cycling (current)

The trajectory predictor lands every return at the paddle center, so the ball always hits paddle zone 2 (±20°) and stays in a narrow corridor. v4 adds an intentional lateral offset to the predicted target that cycles through the 4 non-center paddle zones, one per hit, repeating every 4 hits. This sweeps the full 680px brick field.

// v4: cycle through 4 wide zones — skip center
var zoneOffsets = [32, 16, -16, -32];
var zoneIdx = (this._paddleHits || 0) % 4;
tx += zoneOffsets[zoneIdx];

Benchmark (5 trials × 90s on step-05): avg score 53.6, median 64, high 68, bricks 16/50, 100% survival. 3.3x improvement over the no-offset baseline. Brick collisions scatter the ball chaotically, so the bot clears about 1 brick per 6s regardless of zone targeting; at that rate all 50 bricks take roughly 5 min.
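The offset cycle can be checked in isolation with a small pure function. This is a sketch under assumptions: offsetTarget is a hypothetical name, and the clamp limits 40 and 640 assume a playfield width W of 680 (the text only gives 680px as the brick field width).

```javascript
const ZONE_OFFSETS = [32, 16, -16, -32]; // same values as the v4 snippet

// Shift the predicted landing X by the offset for the current paddle hit.
// minX/maxX mirror the paddle clamp (40, W - 40) with W assumed to be 680.
function offsetTarget(tx, paddleHits, minX = 40, maxX = 640) {
  const idx = (paddleHits || 0) % ZONE_OFFSETS.length;
  return Math.min(maxX, Math.max(minX, tx + ZONE_OFFSETS[idx]));
}

// Targets for 8 consecutive hits on a ball predicted to land at X = 340.
const targets = [];
for (let hit = 0; hit < 8; hit++) targets.push(offsetTarget(340, hit));
// targets is [372, 356, 324, 308, 372, 356, 324, 308]
```

The sequence repeats after 4 hits, which is why the ball only ever leaves the paddle on 4 distinct angles.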

v5 — Brick-Scanning (experimental)

Attempted. Scanned remaining bricks per column, aimed at the densest cluster. Failed — the zone-to-column mapping is too approximate. Brick collisions and wall bounces make the trajectory too chaotic for a predetermined "best zone." The v4 blind cycle outperforms it.

Lesson: Rule-based brick targeting in Breakout is a hard AI problem. Brick collisions produce chaotic trajectory changes. To clear levels, you need reinforcement learning (DQN Atari style) or a full brick-collision trajectory simulator.

Future — Multi-ball + Priority Target (planned)

Planned. For steps 3-5 (power-ups, multi-ball), the bot must handle multiple active balls simultaneously. Strategy: track the most dangerous ball (closest to falling below the paddle), predict its landing, and catch it. Less urgent balls are ignored.
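One way to read "most dangerous" is "the falling ball that will reach the paddle line soonest". The sketch below implements that interpretation on plain objects. It is not planned code: the function name, the ball shape ({ y, vy }), and the default paddle Y of 438 (borrowed from predictLanding) are assumptions.

```javascript
// Pick the falling ball with the least time left before it reaches paddle Y.
// Returns null when no ball is currently falling.
function pickMostDangerous(balls, paddleY = 438) {
  let best = null;
  let bestTime = Infinity;
  for (const ball of balls) {
    if (ball.vy <= 0) continue;                 // rising balls are not an immediate threat
    const t = (paddleY - ball.y) / ball.vy;     // seconds until the ball reaches the paddle line
    if (t >= 0 && t < bestTime) {
      best = ball;
      bestTime = t;
    }
  }
  return best;
}

// Example with made-up positions: "b" is lower and arrives first.
const target = pickMostDangerous([
  { id: "a", y: 100, vy: 300 },  // 338px away at 300px/s
  { id: "b", y: 400, vy: 100 },  // 38px away at 100px/s
  { id: "c", y: 430, vy: -200 }, // moving up, ignored
]);
```

Ranking by time rather than by raw Y distance matters when a distant fast ball would arrive before a nearby slow one.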

Why the Bot Can't Clear Levels

From v2 onward, every bot version achieves 100% survival: it never loses a life (v1 averaged 2.7 of 3 lives). But none clears a single level (50 bricks) within the benchmark window. This isn't a bug; it's a physics constraint. Here's why.

Brick Collisions Are Chaotic

When the ball hits a brick, it bounces unpredictably. The bounce angle depends on: which edge of the brick it hit, the ball's trajectory at impact, and which adjacent bricks are already destroyed. Each brick hit scatters the ball slightly — and 50 bricks produce 50 trajectory changes. A hand-coded predictor can't account for this complexity because the state space (ball trajectory × remaining brick layout) is enormous.

Time Constraint

Metric	Step-01 (v3)	Step-05 (v4)
Bricks / sec	0.67	0.18
Bricks per trial	13.3 (20s)	16 (90s)
Time to clear 50 bricks	~75s	~280s
Level 1 clear rate	~50% in 90s	~0% in 90s

Step-01 clears faster because it uses continuous angle mapping (diff * 1.8) — every paddle hit produces a unique trajectory. Step-05 uses 5 discrete zones. The trajectory predictor lands every return at center, which always hits zone 2. Even with the v4 offset cycling, the ball only visits 4 angles before repeating. Step-01's ball visits a new angle every time.
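The rate and time-to-clear figures in the table follow directly from the benchmark numbers (13.3 bricks in 20s on step-01, 16 bricks in 90s on step-05). The sketch below reproduces the arithmetic; clearEstimate is a hypothetical helper, and it assumes the clear rate stays constant, which is optimistic because hits get rarer as bricks thin out.

```javascript
// Estimate bricks per second and time to clear a level from one benchmark result.
function clearEstimate(bricksCleared, trialSeconds, totalBricks = 50) {
  const rate = bricksCleared / trialSeconds;        // bricks per second
  return { rate, secondsToClear: totalBricks / rate };
}

const step01 = clearEstimate(13.3, 20); // rate about 0.67, about 75s to clear
const step05 = clearEstimate(16, 90);   // rate about 0.18, about 280s to clear
```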

The Speed Tunneling Problem

Attempting a 10x speed multiplier (?speed=10) would let the bot play 10 minutes of game time in 1 real minute — enough to clear levels. But at 10x physics speed, the ball moves 48px per frame while bricks are only 20px thick. The ball tunnels through bricks without collision detection firing. Even Phaser's maxSteps sub-stepping can't fix this — the delta per sub-step is still too large relative to brick thickness.
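The tunneling condition is simple arithmetic: if the ball travels farther in one physics step than a brick is thick, it can skip over the brick between two collision checks. The sketch below derives a base distance of 4.8px per frame from the 48px figure at 10x; the function names are hypothetical, and the check ignores the ball's own radius, so treat the threshold as approximate.

```javascript
const BASE_PX_PER_FRAME = 48 / 10; // derived from "48px per frame at 10x"
const BRICK_THICKNESS = 20;        // px, from the text

// Distance the ball covers in one physics step at a given speed multiplier.
function stepDistance(multiplier) {
  return BASE_PX_PER_FRAME * multiplier;
}

// True when a single step can carry the ball clean across a brick.
function mayTunnel(multiplier, thickness = BRICK_THICKNESS) {
  return stepDistance(multiplier) > thickness;
}

// Lowest whole-number multiplier at which tunneling becomes possible.
let firstRisky = 1;
while (!mayTunnel(firstRisky)) firstRisky++;
```

Under these assumptions the step distance stays under a brick's thickness at 4x (19.2px) and exceeds it from 5x (24px) upward.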

Rule-Based Targeting Is Fragile

The v5 experiment (brick-scanning) confirmed that a hand-coded zone-to-column mapping can't survive brick collisions. The ball's actual trajectory after hitting one brick is unpredictable — so a predetermined "best zone" doesn't reliably reach the intended column. The blind-cycling v4 outperformed smart targeting because it spreads evenly without depending on a fragile trajectory model.

The Real Solution: Reinforcement Learning

Breakout is a solved problem in RL. DQN (Deep Q-Network, 2013) achieved superhuman performance with 10M frames of training. A trained RL agent learns the chaotic brick-physics dynamics implicitly: it doesn't need a zone-to-column mapping because it's trained on millions of actual trajectories. The hand-coded bot hits its ceiling at ~32% brick coverage (16 of 50 bricks) per 90s. A trained RL agent should be able to clear far more in the same time, although that has not been measured for this game.

Building an RL training loop for this game is a separate project (state encoding, reward shaping, hyperparameter tuning, ~10M frame training run). The hand-coded bot is infrastructure for game testing — perfect survival is useful for that role even without level clearance.

Gun Fight Bot

v1 — Random Strafe + Fire

The bot moves left-right randomly within boundary zones, fires when the opponent is within 450px. Used for both ?bot (bot vs bot) and ?ai (human vs AI) modes. The AI variant adds Y-axis tracking of the player.

botAI() {
  // P1: strafe in left zone
  this.p1.body.setVelocityX((d > 0 ? 1 : -1) * SPD * 0.5);
  if (this.p1.x < 40 || this.p1.x > 300) this.p1.body.setVelocityX(...);
  if (Math.abs(d) < 450 && Math.random() < 0.2) this.fireBullet(...);

  // P2: strafe in right zone
  this.p2.body.setVelocityX((d > 0 ? 1 : -1) * SPD * 0.5);
  if (this.p2.x > 640 || this.p2.x < 380) this.p2.body.setVelocityX(...);
  if (Math.abs(d) < 450 && Math.random() < 0.2) this.fireBullet(...);
}

Limitations: Random strafe direction — no strategic positioning. Fire rate is random (20% per 160ms tick ≈ 1.25 bullets/sec average). No bullet-dodge logic. Future versions could add: strafe toward opponent center, aim ahead of opponent movement, dodge incoming bullets.
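The average fire rate quoted above comes from multiplying the per-tick fire probability by the number of ticks per second. A minimal sketch of that calculation, with a hypothetical function name:

```javascript
// Expected shots per second when each tick fires with a fixed probability.
function expectedFireRate(probPerTick, tickMs) {
  const ticksPerSecond = 1000 / tickMs;
  return probPerTick * ticksPerSecond;
}

const rate = expectedFireRate(0.2, 160); // 6.25 ticks/s × 0.2, about 1.25 bullets/sec
```

This is only the long-run average; because each tick is an independent random draw, short windows can contain bursts or gaps.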

How Bots Are Scored

Each bot run produces a standard set of metrics for comparison:

Metric	Breakout	Gun Fight
Primary	Score, Clear rate	Win rate (bot vs bot)
Secondary	Bricks cleared %, Lives remaining	Avg bullets fired, Hit accuracy
Run count	10 runs per version	25 matches per version
Bot vs	Level layouts	Same-version opponent
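The per-version summary columns (average, min, high, median) can be computed from raw trial scores as sketched below. The summarize function is a hypothetical helper, not part of scripts/benchmark-bot.sh, and the sample scores are made up for the example.

```javascript
// Summarize a list of trial scores into the columns used in the benchmark table.
function summarize(scores) {
  if (scores.length === 0) throw new Error("no trials to summarize");
  const sorted = [...scores].sort((a, b) => a - b); // numeric sort, input left untouched
  const mid = Math.floor(sorted.length / 2);
  const median = sorted.length % 2 === 1
    ? sorted[mid]                                   // odd count: middle value
    : (sorted[mid - 1] + sorted[mid]) / 2;          // even count: mean of the two middle values
  const avg = sorted.reduce((sum, v) => sum + v, 0) / sorted.length;
  return { avg, min: sorted[0], high: sorted[sorted.length - 1], median, trials: sorted.length };
}

// Made-up scores for illustration only.
const summary = summarize([10, 40, 20, 30]);
// summary is { avg: 25, min: 10, high: 40, median: 25, trials: 4 }
```

Reporting the median next to the average is useful here because a single lucky run can inflate the average of a small sample.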

Benchmark Results

Game	Version	Strategy	Avg	Min	High	Median	Max Score	Bricks	Levels	Lives	Trials	Stale
breakout	v1	Reactive Chasing (speed 200)	19.5	1	102	10	170	9.5	0	2.7	10	✅ No
breakout	v2	Trajectory Prediction	32.7	8	131	15	170	12.7	0	3	10	✅ No
breakout	v3 (step-01)	Alternating Center Zone	36.7	10	113	21.5	170	13.3	0	3	10	✅ No
breakout	v4 (step-05)	Lateral Offset + Per-Hit Alternation	53.6	22	68	64	170	16	0	3	5	✅ No
breakout	v5	Brick-Scanning + Trajectory Prediction	7.8	0	24	0	170	2.6	0	3	5	✅ No
gun-fight	v1	Random Strafe + Fire	Not yet benchmarked

Unless a row says otherwise, benchmarks run for 20s per trial on the step-01 layout; the step-05 row ran 5 trials × 90s. Scores vary significantly per run, so more trials are needed for a reliable comparison. Run bash scripts/benchmark-bot.sh to add new results.

Reading Max Score as a Diagnostic

Max score is a bug detector. In Breakout, every brick destroyed adds to score. If a bot finishes a trial with all 3 lives but a low max score, it means the bot survived but wasn't scoring — a stale state or logic bug. A healthy bot with 3 lives should push its ceiling on every run.

Current check: v2 survives at 100% (3/3 lives every trial) but its max was 54 (actual highScore: 131). v3's alternating center zone fix raised median 43% (15→21.5) with highScore 113 and no stale states. The corridor is broken — the remaining ceiling gap is now normal variance, not a stuck state.

For future multi-level steps: a bot with all lives at timeout that hasn't cleared every level indicates a layout navigation bug.

Running Bot Benchmarks

Benchmarks are automated via the Playwright test suite. To run a bot scoring session manually:

# Breakout bot mode (loads and plays automatically)
# URLs are quoted so the shell does not treat "?" as a glob character
open "https://aigamingdev.com/games/breakout/breakout-step-01.html?bot=true"

# Gun Fight bot mode (bot vs bot)
open "https://aigamingdev.com/games/gun-fight/gun-fight-01.html?bot"

# Gun Fight AI mode (human vs AI)
open "https://aigamingdev.com/games/gun-fight/gun-fight-01.html?ai"

Bot Improvement Log

Date	Game	Version	Change
2026-06-30	Breakout	v1	Initial reactive tracker: follows ball X, 10px dead zone, speed 200
2026-06-30	Breakout	v2	Trajectory prediction with wall-bounce calculation, speed 300
2026-07-01	Breakout	v3	Alternating center zone: ±20° on center hits to break ball cycling corridor. Applied to all breakout steps. v2 exposed degenerate 0° bounce physics.
2026-07-01	Breakout	v4	4-zone cycling: lateral offset sweeps through paddle edge zones (0,1,3,4) for full field coverage. 3.3x improvement over no-offset baseline. Applied to step-05.
2026-07-01	Breakout	v5 (exp)	Brick-scanning: attempted to aim at remaining brick clusters. Failed — zone-to-column mapping too approximate for chaotic brick physics. Reverted to v4.
2026-06-30	Gun Fight	v1	Initial random strafe + fire with 450px range