merge house

ai coders keep trying to drag messy pull requests into main. openclaw merge bouncer checks the diff, stops the risky ones at the door, and makes every pr prove it belongs inside.

OpenClaw Unboxed and Josh Davis

May 18, 2026

ai coding didn’t kill code review.

it moved the risky part closer to the merge button.

openai now has codex code review for github pull requests. the review pass reads the pr diff, follows repository guidance, and focuses on serious issues before human review. codex also reads guidance from AGENTS.md, which means the repo itself shapes how the reviewer thinks.

software work doesn’t stay on one machine anymore.

code runs somewhere else.

review happens from a phone.

github holds the pull request.

one click moves the change into main.

fine for small copy edits.

dangerous when the agent quietly touches login, payments, user data, secrets, migrations, deployment files, dependencies, or permissions.

the merge button is where confidence gets expensive

senior engineers read pull requests with scar tissue.

payment code moves, they slow down.

auth middleware changes, they ask why.

a lockfile update inside a ui task makes the whole diff feel suspicious.

new builders don’t always see those clues yet. they see the preview load, read a clean summary, and assume green checks mean the change is safe.

a working screen proves less than people think.

wired’s may 18 vibe-coding piece showed a normal version of this problem. a nontechnical builder used claude, github, supabase, and netlify, then exposed an api key in a public github repository before claude helped move the key somewhere safer.

nothing about that feels rare.

beginners reach production-shaped problems before they’ve built production-shaped judgment.

tools got easier.

consequences stayed real.

maintainers already feel the cleanup tax

rpcs3, the open-source playstation 3 emulator, tightened its contribution rules after poor ai-generated pull requests wasted maintainer time.

recent coverage says the project wants contributors to understand and own their code, even when ai helps. it also says rpcs3 had to revert multiple ai-generated pull requests that caused regressions, and submissions without ai-use disclosure may be closed without review.

that line matters.

ai-written code isn’t the core problem.

unreviewed code is.

open-source maintainers feel it when strangers drop bad pull requests into public repos. solo founders usually feel it later, after the app breaks somewhere quiet.

build a review packet before merge

start smaller than automation.

use one markdown packet.

plain english.

specific enough to catch obvious problems before they get expensive.

before merge, the packet should answer this:

change summary:
what changed in plain english

changed files:
which files changed

scope check:
whether the file list matches the original task

risk check:
login, payments, user data, secrets, database, deployment, dependencies, permissions

test evidence:
what ran, where the output is, what’s missing

human review order:
which files deserve inspection first

decision:
approve, revise, or block

that packet gives a beginner something usable.

technical builders get a gate they can harden.

consultants get a safer process for client apps built with ai coding tools.
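
the packet can also be sketched as a small python structure that renders to plain text. this is a minimal illustration, not a required format: ReviewPacket, ALLOWED_DECISIONS, and to_markdown are hypothetical names, and the sample values are made up.

```python
from dataclasses import dataclass

ALLOWED_DECISIONS = ("approve", "revise", "block")


@dataclass
class ReviewPacket:
    """Hypothetical container for the seven packet fields."""
    change_summary: str
    changed_files: list[str]
    scope_check: str
    risk_check: list[str]    # risk areas touched, e.g. ["dependencies"]
    test_evidence: str
    review_order: list[str]  # files a human should inspect first
    decision: str

    def to_markdown(self) -> str:
        # refuse to render a packet with an unknown decision
        if self.decision not in ALLOWED_DECISIONS:
            raise ValueError(f"decision must be one of {ALLOWED_DECISIONS}")

        def bullets(items: list[str]) -> str:
            return "\n".join(f"- {item}" for item in items) or "- none"

        sections = [
            ("change summary", self.change_summary),
            ("changed files", bullets(self.changed_files)),
            ("scope check", self.scope_check),
            ("risk check", bullets(self.risk_check)),
            ("test evidence", self.test_evidence),
            ("human review order", bullets(self.review_order)),
            ("decision", self.decision),
        ]
        return "\n\n".join(f"{title}:\n{body}" for title, body in sections) + "\n"


# made-up example: a pricing page task that also touched the lockfile
packet = ReviewPacket(
    change_summary="updated the pricing page headline",
    changed_files=["src/pages/pricing.tsx", "package-lock.json"],
    scope_check="lockfile change was not part of the task",
    risk_check=["dependencies"],
    test_evidence="no tests ran",
    review_order=["package-lock.json", "src/pages/pricing.tsx"],
    decision="revise",
)
print(packet.to_markdown())
```

forcing the decision into one of three values keeps the packet from ending on a vague "looks fine".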

suspicious files need extra friction

some files control more than the screen.

use this starter list:

.env
.env.local
.env.production
package.json
package-lock.json
pnpm-lock.yaml
yarn.lock
requirements.txt
pyproject.toml
dockerfile
docker-compose.yml
.github/workflows/*
auth/*
middleware/*
routes/api/*
server/*
database/*
migrations/*
prisma/schema.prisma
drizzle/*
supabase/*
stripe/*
billing/*
payments/*
railway.*
vercel.*
netlify.*
fly.*
render.*

a copy task that changes src/lib/auth.ts needs review.

button work that changes package-lock.json needs an explanation.

pricing page edits shouldn’t create a database migration unless the task called for one.

the file list tells you where the agent wandered.
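
a minimal sketch of that file-list check in python, assuming the changed paths are already available as a list of strings. RISK_PATTERNS is a subset of the starter list, and risky_files and path_suffixes are hypothetical names, not part of any tool. each pattern is tested against every trailing piece of the path, so auth/* also catches src/auth/session.ts, and paths are lowercased so a file named Dockerfile matches the dockerfile entry.

```python
from fnmatch import fnmatchcase

# subset of the starter list; extend with the remaining entries as needed
RISK_PATTERNS = [
    ".env", ".env.local", ".env.production",
    "package.json", "package-lock.json", "dockerfile",
    ".github/workflows/*", "auth/*", "migrations/*", "payments/*",
]


def path_suffixes(path: str) -> list[str]:
    """Return the path and each of its trailing pieces, lowercased."""
    parts = path.lower().replace("\\", "/").split("/")
    return ["/".join(parts[i:]) for i in range(len(parts))]


def risky_files(changed_files, patterns=RISK_PATTERNS):
    """Map each risky changed file to the patterns it matched."""
    hits = {}
    for path in changed_files:
        matched = [
            pattern for pattern in patterns
            if any(fnmatchcase(s, pattern) for s in path_suffixes(path))
        ]
        if matched:
            hits[path] = matched
    return hits


changed = [
    "src/components/Button.tsx",
    "package-lock.json",
    "src/auth/session.ts",
    "Dockerfile",
]
print(risky_files(changed))

# "auth.*" is an assumed extra pattern, not part of the starter list
print(risky_files(["src/lib/auth.ts"], RISK_PATTERNS + ["auth.*"]))
```

one caveat worth knowing: with the starter patterns as written, src/lib/auth.ts would not be flagged, because auth/* only matches files inside an auth directory. a filename pattern such as auth.* is one possible way to cover it.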

where openclaw fits

don’t make openclaw compete for the coding seat.

let codex, claude code, cursor, copilot, opencode, or a local model write the first pass.

give openclaw the control job.

openclaw gathers pull request metadata, routes the diff to a reviewer, compares changed files against a risk list, checks whether tests exist, and produces one packet a human can read.
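
the last step of that control job, turning the checks into approve, revise, or block, might look like the python sketch below. it assumes the risky files and the out-of-scope files have already been worked out elsewhere. GateInput, decide, and has_test_changes are hypothetical names, and the policy itself (block on .env files, revise on out-of-scope risk or missing tests) is an assumption for illustration, not openclaw's actual behavior.

```python
from dataclasses import dataclass


def has_test_changes(changed_files: list[str]) -> bool:
    """Rough heuristic: does the diff touch anything that looks like a test?"""
    for path in changed_files:
        parts = path.lower().split("/")
        name = parts[-1]
        if ".test." in name or ".spec." in name or name.startswith("test_"):
            return True
        if "tests" in parts[:-1] or "__tests__" in parts[:-1]:
            return True
    return False


@dataclass
class GateInput:
    risky_files: list[str]   # files that matched the risk list
    out_of_scope: list[str]  # files the original task did not call for
    tests_found: bool        # whether any test evidence exists


def decide(gate: GateInput) -> tuple[str, list[str]]:
    """Return (decision, reasons) under the assumed policy."""
    secrets = [f for f in gate.risky_files if f.split("/")[-1].startswith(".env")]
    if secrets:
        return "block", [f"secrets file changed: {f}" for f in secrets]

    reasons: list[str] = []
    for f in gate.risky_files:
        if f in gate.out_of_scope:
            reasons.append(f"risky file outside the task: {f}")
    if gate.risky_files and not gate.tests_found:
        reasons.append("risky files changed without test evidence")
    return ("revise", reasons) if reasons else ("approve", reasons)


# made-up example: a pricing page task that also touched the lockfile
changed = ["src/pages/pricing.tsx", "package-lock.json"]
gate = GateInput(
    risky_files=["package-lock.json"],
    out_of_scope=["package-lock.json"],
    tests_found=has_test_changes(changed),
)
decision, reasons = decide(gate)
print(decision, reasons)
```

returning the reasons alongside the decision means the human reading the packet sees why the gate stopped the pr, not just that it did.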

i built an extensive 60+ file repo for this below: