Vibe Coding Still Needs a Senior Engineer (For Now)

A colleague asked me to take a look at a small internal tool they'd built. It's a side utility, not a production product, written in the way most things get written in 2026 - opened the agentic IDE, described the feature, accepted the diff, ran it, iterated. The kind of code that didn't exist on Monday and was running by Friday. The kind of code most of us are shipping a version of every week.
I read it for a couple of hours and came out with twenty-eight findings. A handful were correctness bugs. Most of them were security issues. None of them were the kind of security issue the current AI safety discourse spends most of its time worrying about.
There was no prompt injection. No agent hijacking. No exfiltration via a poisoned MCP server. No supply-chain compromise. The model didn't go off the rails. The model did exactly what it was asked to do, competently, in idiomatic code that looks fine at a glance and only stops looking fine if someone with the right scar tissue reads it carefully.
The security holes were the boring ones. The ones that have been on the OWASP top ten in some form since 2003. And I think the pattern is worth writing down, because if you're shipping code with AI tools right now, the chances that you've got at least one of these in something you've built recently are higher than you'd like to think.
What "vibe coding" actually produces
I want to be careful here, because it's easy to slip into "AI bad, hand-coding good" framing and that's not the point I'm making. The code I was reading wasn't bad. The architecture was fine. The components were sensibly factored. The library choices were reasonable. If you'd handed me the same problem and a weekend, I'd probably have produced something structurally similar.
The issue is that AI coding tools are extremely good at producing the obvious thing. You ask for a user-management function and you get a user-management function. You ask for a CSV upload and you get a CSV upload. The code does the thing. It's the things around the thing - the questions a careful engineer would ask before writing the first line - that the tool doesn't volunteer.
So you get features. You get a lot of features, very fast. What you don't get, unless you ask explicitly, is the meta-layer where someone says "wait, this endpoint is reachable from the public internet and has no auth check."
Here's what that looked like in practice.
The serverless function that anyone could call
The app had a single serverless function for admin operations - create user, reset password, delete user. The kind of thing every app with role-based access has. It used the platform's privileged service key, which is the right pattern: keep the powerful credential server-side, expose a small surface, never put the key in the browser bundle.
The function had no authentication check.
Not "weak authentication". Not "the wrong authentication". Just none. Anyone who knew the URL - which is to say, anyone who'd ever opened DevTools on the admin page - could POST a payload and create themselves an admin account. Or reset the existing admin's password. Or delete every user.
The frontend, of course, only called the function from inside the admin-only UI. The frontend had a perfectly reasonable check that hid the admin button from non-admins. The frontend's belief in its own security was sincere and entirely irrelevant.
This is a textbook missing-authorization bug. It's been on OWASP's list for two decades. And it's the kind of bug the AI was never going to flag on its own, because at no point in the implementation did anyone explicitly ask "who is allowed to hit this endpoint?" The prompt was "build me a function that lets admins create users." The function does that. It also lets non-admins create users. It also lets people who aren't logged in create users. The prompt didn't say it shouldn't.
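The missing check needs no exotic machinery. Here's a minimal sketch in TypeScript - the names (authorizeAdmin, TokenVerifier) are mine, not from the tool I reviewed, and the verifier is a stub standing in for whatever your platform provides for server-side token verification (Supabase's supabase.auth.getUser, for instance):

```typescript
// A user as the backend sees it after verifying the session token.
type User = { id: string; role: "admin" | "warehouse" | "viewer" };

// Stand-in for the platform's server-side token verification.
// Stubbed here so the gate logic itself is testable in isolation.
type TokenVerifier = (token: string) => User | null;

// The gate the serverless function was missing: runs before any
// privileged operation, before the service key is touched.
function authorizeAdmin(
  authHeader: string | undefined,
  verify: TokenVerifier
): { ok: true; user: User } | { ok: false; status: 401 | 403 } {
  const token = authHeader?.startsWith("Bearer ")
    ? authHeader.slice("Bearer ".length)
    : undefined;
  if (!token) return { ok: false, status: 401 }; // not logged in at all
  const user = verify(token);
  if (!user) return { ok: false, status: 401 }; // bad or expired token
  if (user.role !== "admin") return { ok: false, status: 403 }; // wrong role
  return { ok: true, user };
}
```

The point isn't the twenty lines. The point is that they run server-side, as the first thing the function does, regardless of which buttons the frontend chose to show.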
The database that was protected on paper
The database backend supports row-level security policies - rules that say "user X can read row Y if conditions Z." That's the entire security model. The public API key is shipped in the JS bundle by design; without policies, the API key gives anyone on the internet full read/write access to every table.
The migration the AI wrote enabled RLS and added sensible policies. For the three tables the migration created.
The other five tables - the ones that already existed - were untouched. RLS may or may not have been on for them. The migration didn't enable it. The migration didn't even check. If you ran npm run db:push on a fresh database, you got an app where the new auth-related tables were correctly locked down and the actual business data tables were potentially wide open to anyone with a working internet connection.
The pattern here is subtle and worth unpacking. The AI was asked to add multi-user support. It added multi-user support. The multi-user-support code is correct. The implicit assumption it didn't surface was "and the rest of your database is presumably also protected." If your existing database wasn't already locked down - and there's no way for the model to know without being told - the new safe code coexisted happily with the old unsafe state.
You can't see this in a code review unless you go and check the database state directly. The repo will look fine. The PR will look fine. The bug is in the gap between what the code does and what the surrounding system was assumed to do.
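For what it's worth, the fix is small once you know to write it. A sketch of the migration step that was skipped, assuming a Postgres-backed BaaS like the one described - the table names and the `authenticated` role are illustrative, not from the actual project:

```sql
-- Enable RLS on the tables that predate the migration. With RLS on and no
-- policies, a table is deny-by-default, which is the safe failure mode.
alter table clients enable row level security;
alter table orders enable row level security;

-- Then add back the access the app actually needs, e.g.:
create policy "authenticated users can read clients"
  on clients for select
  to authenticated
  using (true);

-- And audit for anything still unprotected. relrowsecurity is the real
-- Postgres catalog flag; this lists public tables with RLS still off:
select c.relname
from pg_class c
join pg_namespace n on n.oid = c.relnamespace
where n.nspname = 'public'
  and c.relkind = 'r'
  and not c.relrowsecurity;
```

That last query is the part worth putting in CI, because it checks the database state directly rather than trusting the repo to reflect it.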
The UI gate that wasn't a gate
The admin pages were gated client-side, the way most React apps gate things: if (user.role !== 'admin') return <Redirect to="/" />. That's a UX feature. It tells non-admins they're not supposed to be here. It doesn't stop them.
A warehouse user could open DevTools, drop to the console, and call supabase.from('clients').delete() directly. The browser ships the API key. The user is authenticated. The frontend has done its honest best to hide the button, but the API doesn't know the button is supposed to be hidden. Without the RLS policies from the previous section actually being in place on the relevant tables, the call goes through.
If you ask the model "build me an admin panel", you get an admin panel. The admin-only routing is correctly conditional on role. The components correctly check user.role before rendering. Every visible part of the security model is in place. The non-visible part - "and the API enforces this independently" - isn't, because you didn't ask for it, because if you were the kind of engineer who knew to ask for it you probably weren't going to forget to write it yourself.
All the other things that aren't there
The same shape repeats:
The session token sits in localStorage (the platform's default). Any cross-site-scripting bug - including one in a transitive npm dependency you've never heard of - reads the token and impersonates the user. The deploy config has no Content-Security-Policy header. No X-Frame-Options. No HSTS. No clickjacking protection. The platform doesn't add these for you. The AI doesn't add these unless you ask, and nobody asks, because nobody knows to ask.
The password minimum is six characters. There's no audit log for admin actions - creating users, resetting passwords, deleting clients all happen with no record of who did what or when. There's no rate limiting on login. There's no de-duplication on the CSV import, so re-uploading the same file silently doubles every row.
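Of that list, the CSV duplication is the most mechanical to fix. A minimal sketch, assuming rows carry some natural key to de-duplicate on - the choice of email as that key is my assumption for illustration:

```typescript
// One imported row. The shape and the email-as-natural-key choice are
// illustrative assumptions, not the real tool's schema.
type Row = { email: string; name: string };

// Returns only the incoming rows not already present, so re-uploading the
// same file is a no-op instead of silently doubling every row.
function dedupeImport(existing: Row[], incoming: Row[]): Row[] {
  const seen = new Set(existing.map((r) => r.email.toLowerCase()));
  const fresh: Row[] = [];
  for (const row of incoming) {
    const key = row.email.toLowerCase();
    if (seen.has(key)) continue; // already in the store: skip it
    seen.add(key); // also catches duplicates within the file itself
    fresh.push(row);
  }
  return fresh;
}
```

(A unique constraint in the database is the sturdier version of the same idea, since it holds even when some future code path forgets to call this.)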
Each of these, individually, is a thing an experienced engineer doing a security review would catch in about ninety seconds. Together, they're maybe a day's worth of hardening work. None of them are exotic. None of them are AI-specific.
That's the part I want to land on.
The pattern, named
A lot of the AI-security conversation right now is about the new class of attacks. Prompt injection. Agentic exfiltration. Tool-call hijacking. Confused-deputy problems when the model is given more capabilities than the user intended. All real, all worth talking about, all things grith exists to address at the syscall level.
But there's a parallel problem that's at least as big and gets a fraction of the airtime. AI coding tools, today, write junior-level code at senior speed. The code itself is fluent. The library choices are sensible. The components are reasonably factored. What's missing is the layer a senior engineer adds without thinking about it - the muttered "and who's allowed to call this?" before the first line of the function, the reflex to check what the deploy config does and doesn't set, the instinct that a frontend role-check is not the same thing as a backend role-check. That layer isn't in the training data the way the happy-path code is, because it mostly exists in the back of people's heads.
The classic missing-auth bug, the missing-CSP bug, the missing-RLS bug, the missing-rate-limit bug - these have always existed. They are the bugs juniors have always shipped. What's changed is the throughput. A junior engineer writing an admin panel by hand from scratch in 2018 would absolutely have shipped some of these. They'd have shipped one or two of them, in one feature, every couple of weeks - and somewhere upstream of production, a senior would have read the PR and asked the awkward questions before merging it.
The AI-assisted version of that junior ships fifteen features in the same time, with the same blind spots, in production code that looks competent enough that nobody opens it for review. The vulnerabilities aren't worse. There are just more of them in flight, and the review step that used to catch them has quietly been skipped.
What I'd actually do about it
I'm not going to pretend the answer is "stop using AI tools." I use them every day. Half of grith is going to ship on time because of them.
What I'd suggest, having seen the inside of one of these reviews:
Get a senior pair of eyes on it. Sorry. That's the headline. If the thing you're shipping has real users, real data, or real money attached, an experienced engineer needs to read it before it leaves your laptop. Not as a code-style review, as a security review - with the specific question "what's the worst thing a hostile authenticated user can do here, and what's the worst thing an unauthenticated stranger can do?" A senior who has shipped a few production incidents will spot the missing-auth function in about thirty seconds. The model, today, will not.
Run a second AI pass as a reviewer, not a writer. This helps, and it's better than nothing, and you should do it, but it's a supplement to the human review rather than a replacement. After the feature is "done", open a fresh session - not the one that wrote the code - and give it the diff with a prompt that's purely "act as a security reviewer: who is allowed to call each endpoint, what credentials are reachable from the client, what happens if an attacker has a valid session for a low-privilege user, what's in the deploy config." A fresh session does this much better than the session that just wrote the code, because the writing session is committed to its own design and will defend it. But "fresh model" is not the same as "senior engineer." Use both.
Treat the BaaS configuration as code. If you're using a hosted backend with row-level security, the policies are the security model. They need to be in version control, they need to be reviewed, and "RLS is enabled" needs to be verifiable in the repo, not in someone's memory of what they clicked in the dashboard two months ago.
Assume the frontend is not a security boundary. It never was, but the AI-assisted version of you is producing a lot more frontends very fast, and each one carries an implicit "the backend will enforce this" that is your job to make true.
Read the deploy config. The netlify.toml, the vercel.json, the IAM policies, the Cloudflare rules. The AI doesn't generally touch these unless asked. They're where most of the missing defence-in-depth lives.
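As a concrete example of what reading the deploy config turns up: on Netlify, none of the headers from earlier exist until you write them down. Something like the following is a floor, not a policy - the CSP value in particular is a deliberately strict placeholder that a real app will need to loosen for its own scripts and API origins:

```toml
# Security headers the platform does not add for you.
[[headers]]
  for = "/*"
  [headers.values]
    Content-Security-Policy = "default-src 'self'"
    X-Frame-Options = "DENY"
    X-Content-Type-Options = "nosniff"
    Strict-Transport-Security = "max-age=31536000; includeSubDomains"
```

The Vercel, Cloudflare, and raw-nginx versions are different syntax for the same five minutes of work.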
None of this is novel. It's the same security hygiene that's been good practice for fifteen years. What's changed is that the volume of code that needs it applied has gone up by an order of magnitude, and the natural friction that used to slow down shipping - the friction that gave the careful version of you time to notice the missing auth check - has mostly gone away.
The honest closer
I'm writing this partly because the AI-coding-tool industry, including the bits of it I work in, has something of an incentive problem. The pitch is "ship features faster, fewer engineers, junior developers can build production apps now." That pitch is true at the surface and quietly false underneath. Features ship faster, yes. Production apps get built by people who couldn't have built them two years ago, yes. The thing that gets left out is the part where, between the working feature and the shippable feature, somebody with twenty years of scar tissue used to read the code and ask the questions a junior didn't know to ask.
That step hasn't gone away. It's just stopped being visible, because the model produces output that looks like it doesn't need it. Two hours of one senior engineer's afternoon turned up twenty-eight findings in code that the team had already deployed and was using daily. None of those findings were exotic. All of them are the kind of thing a senior catches by reflex and a model, today, does not.
I want to be honest about the "today" part. The models are getting better. The gap between "junior code at senior speed" and "senior code at senior speed" is narrowing month over month. It's possible - probable, even - that the kind of review I did last week will be largely automatable in eighteen months. I am not betting that it's automatable now. The evidence on my screen says otherwise.
Until then, the practical advice is uncomfortable but simple. If you're shipping AI-assisted code into anything that matters, you still need somebody who has been bitten by all of these mistakes before to read what the model wrote. Not because the model is bad. Because the model writes the same code an enthusiastic, capable, time-pressured junior writes - and that code has always needed a senior to look at it before it goes out.
The good news, if you're a senior engineer reading this: the job is not over. The job has changed shape. There is more code to read, it arrives faster, it looks more confident than the code juniors used to write, and your value is the bit you can't articulate easily - the muttered question, the raised eyebrow at the deploy config, the thirty-second scan of an endpoint that lets you say "wait, who's allowed to call this?" before the function ships.
That's still a job. For now, it's still ours.