You shipped fast. You shipped with vibes. But are you sure what you shipped is actually safe?
Vibe coding went from Andrej Karpathy's 2025 tweet to the dominant workflow almost overnight. 84% of developers worldwide now use or plan to use AI coding tools. 41% of all code shipped globally is AI-generated. The speed gains are real, and the business case is obvious.
The risk is just as real. It simply gets talked about less.
Here's the uncomfortable part. AI-generated code has roughly 1.7x as many issues as human-written code. 45% of AI-generated code samples fail security benchmarks across OWASP Top-10 categories. Only 3% of developers say they highly trust AI output without reviewing it. Yet 52% don't always review before deploying.
That gap between trust and behavior — that's where bugs live. Breaches live there too. So do bad user experiences that quietly kill retention.
This checklist exists to close that gap. Solo founder with a weekend MVP. Startup team shipping fast. Enterprise developer weaving AI into an existing pipeline. Doesn't matter. Run through this before you call anything production-ready.
A CodeRabbit analysis of 470 pull requests found AI-generated code carries 2.74x higher security vulnerability rates than human-written code. On Lovable, 170 out of 1,645 apps — roughly 10% — had critical row-level security flaws in the wild.
Then there's the Stanford study. Developers using AI tools wrote less secure code than developers who didn't. And felt more confident about it.
That's the quiet danger. Not the speed. The false sense of safety that comes with it.
None of this is an argument against AI-assisted development. It's an argument for knowing what you're actually shipping. Speed and safety can coexist, but not by accident.
Work through each section before pushing to production. Check every box you can confidently confirm. Any unchecked box is a conversation you need to have with your code.
Security is where vibe coding debt shows up fastest — and most painfully.
No hardcoded credentials. Search the codebase for API keys, database passwords, secret tokens, and environment variables committed directly into source. AI assistants do this as a shortcut more often than you'd expect.
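A first pass at that search can be scripted. The sketch below is a rough illustration rather than a replacement for a dedicated secret scanner: the two patterns are illustrative assumptions, and `find_suspected_secrets` is a hypothetical helper name.

```python
import re

# Illustrative patterns only (assumptions); real scanners ship far larger rule sets.
SECRET_PATTERNS = {
    "aws_access_key_id": re.compile(r"\bAKIA[0-9A-Z]{16}\b"),
    "hardcoded_assignment": re.compile(
        r"\b(api_key|apikey|secret|password|token)\b\s*[:=]\s*['\"][^'\"]{8,}['\"]",
        re.IGNORECASE,
    ),
}


def find_suspected_secrets(source: str) -> list[tuple[int, str]]:
    """Return (line_number, pattern_name) for each line that looks like a secret."""
    hits = []
    for line_number, line in enumerate(source.splitlines(), start=1):
        for name, pattern in SECRET_PATTERNS.items():
            if pattern.search(line):
                hits.append((line_number, name))
    return hits


sample = '''
import os
API_KEY = "sk_live_1234567890abcdef"
db_password = os.environ["DB_PASSWORD"]
'''
print(find_suspected_secrets(sample))  # → [(3, 'hardcoded_assignment')]
```

The line that reads from `os.environ` is not flagged, which is the pattern you want the codebase to converge on.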
Environment variables are properly scoped. Secrets in .env files shouldn't be exposed client-side. Critical in React/Next.js — the NEXT_PUBLIC_ prefix makes variables visible in the browser. Easy to miss. Painful to discover late.
Authentication isn't LLM-generated boilerplate. AI-generated auth logic often pulls from outdated training data and replicates insecure patterns. If you didn't write it yourself or reach for a battle-tested library — Auth.js, Clerk, Supabase Auth — get a human to review it line by line.
Row-level security is enabled on every database table. This was the root cause of the Lovable CVE-2025-48757 incident that exposed roughly 10% of apps on the platform. If you're running Supabase, Postgres, or any row-aware database layer, verify RLS policies exist and actually work.
User inputs are validated and sanitized. AI generates optimistic code. It assumes inputs are clean. They aren't always. Check for SQL injection vectors, XSS vulnerabilities, and unvalidated form fields.
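To make the SQL injection risk concrete, here is a small sketch using Python's built-in `sqlite3` module. The `users` table and both helper functions are hypothetical; the point is only the difference between splicing input into the SQL text and passing it as a parameter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, email TEXT NOT NULL)")
conn.executemany(
    "INSERT INTO users (email) VALUES (?)",
    [("alice@example.com",), ("bob@example.com",)],
)


def find_user_unsafe(email: str) -> list[tuple]:
    # Vulnerable: the input becomes part of the SQL text itself.
    query = f"SELECT id, email FROM users WHERE email = '{email}'"
    return conn.execute(query).fetchall()


def find_user_safe(email: str) -> list[tuple]:
    # Parameterized: the value is passed separately from the SQL text.
    return conn.execute(
        "SELECT id, email FROM users WHERE email = ?", (email,)
    ).fetchall()


malicious = "' OR '1'='1"
print(len(find_user_unsafe(malicious)))  # → 2, every row comes back
print(len(find_user_safe(malicious)))    # → 0, nothing matches the literal string
```

If a search of your codebase turns up f-strings or string concatenation feeding a query, treat it as an unchecked box.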
HTTPS is enforced everywhere. No mixed-content warnings. No HTTP fallbacks. Every API call, webhook, and redirect goes over TLS.
Rate limiting is on sensitive endpoints. Login, signup, password reset, any feature calling an external API. Without it, you're exposed to abuse — and runaway costs.
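For a sense of what rate limiting involves, the sketch below implements a simple in-memory sliding-window limiter. The class name, the limit of five attempts per minute, and the use of an IP address as the key are all assumptions for illustration; a multi-process deployment would need a shared store instead of a Python dictionary.

```python
import time
from collections import defaultdict, deque


class SlidingWindowLimiter:
    """Allow at most `limit` calls per `window_seconds` for each key."""

    def __init__(self, limit: int, window_seconds: float, clock=time.monotonic):
        self.limit = limit
        self.window_seconds = window_seconds
        self.clock = clock  # injectable so the limiter can be tested without sleeping
        self._calls = defaultdict(deque)  # key -> timestamps of recent allowed calls

    def allow(self, key: str) -> bool:
        now = self.clock()
        calls = self._calls[key]
        # Drop timestamps that have aged out of the window.
        while calls and now - calls[0] >= self.window_seconds:
            calls.popleft()
        if len(calls) >= self.limit:
            return False
        calls.append(now)
        return True


login_limiter = SlidingWindowLimiter(limit=5, window_seconds=60)
results = [login_limiter.allow("203.0.113.7") for _ in range(6)]
print(results)  # → [True, True, True, True, True, False]
```

In practice you would call `allow()` at the top of the login, signup, and password reset handlers and return an HTTP 429 response when it comes back `False`.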
Fast code and clean code aren't the same thing. This is where AI quietly runs up a tab you'll pay later.
Hunt down dead code from regeneration loops. Iterative prompting leaves a trail — orphaned functions, unused imports, duplicate logic that nobody cleaned up. Run a linter. ESLint, Pylint, RuboCop. Actually address the output.
Functions should do one thing. AI-generated functions have a habit of bloating into multi-responsibility monsters. Hard to test. Harder to debug. Anything creeping past ~50 lines probably needs to be broken up.
Rename the AI defaults. data, result, temp, handleClick1, handleClick2 — these are placeholders, not names. If you wouldn't use it in a code review, rename it now.
No magic numbers or strings. Values like 86400, "admin", or "type_3" buried in logic should be named constants. AI skips this almost every time. You'll forget what they mean faster than you think.
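A quick before-and-after, assuming for illustration that 86400 is a session lifetime in seconds and "admin" is a role name. The constant and function names are hypothetical.

```python
# Before: the reader has to guess what 86400 and "admin" mean.
def session_expired_unclear(age_seconds: int) -> bool:
    return age_seconds > 86400


# After: the intent is in the name, and the value lives in one place.
SECONDS_PER_DAY = 86_400
ADMIN_ROLE = "admin"


def session_expired(age_seconds: int) -> bool:
    return age_seconds > SECONDS_PER_DAY


def can_delete_users(role: str) -> bool:
    return role == ADMIN_ROLE
```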
Error handling should actually handle errors. Catch blocks that silently swallow exceptions, or console.log("error") and nothing else — those aren't error handlers. They're time bombs. Every error path needs context and a graceful exit.
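As a rough sketch of the difference, the handler below logs what failed and on which record, then re-raises a domain-specific error instead of swallowing the exception. `OrderLoadError` and `parse_order_total` are hypothetical names used only for this example.

```python
import logging

logger = logging.getLogger("orders")


class OrderLoadError(Exception):
    """Hypothetical domain error raised when an order cannot be used."""


def parse_order_total(raw: dict) -> float:
    try:
        return float(raw["total"])
    except (KeyError, TypeError, ValueError) as exc:
        # Record the failing record and the traceback, then surface a clear error.
        logger.error(
            "could not parse order total: order_id=%r", raw.get("id"), exc_info=True
        )
        raise OrderLoadError(f"order {raw.get('id')!r} has no usable total") from exc


print(parse_order_total({"id": 1, "total": "9.50"}))  # → 9.5
```

The caller can now decide on the graceful exit (a retry, a fallback, a user-facing message) with the original cause still attached via `raise ... from`.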
Audit your dependencies. Run npm audit, pip-audit, or equivalent. AI-generated package files love pinning to older versions and pulling in packages you don't need.
The 30-minute test. Could a new developer understand a module without access to the AI conversation that generated it? In 30 minutes? If the honest answer is no — add comments, add documentation, make it readable on its own.
74% of developers report increased productivity with AI coding tools. Productivity in development and performance in production are different things entirely.
Check your queries: AI writes working ones, not fast ones. Run your most frequent queries through EXPLAIN ANALYZE. Add indexes where the data tells you to. Don't guess.
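The workflow looks roughly like this. Note the substitution: the sketch uses SQLite's `EXPLAIN QUERY PLAN` as a stand-in for `EXPLAIN ANALYZE` so that it runs without a database server. The `orders` table and index name are hypothetical, and the exact plan wording varies by SQLite version.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (id INTEGER PRIMARY KEY, user_id INTEGER, total REAL)"
)
conn.executemany(
    "INSERT INTO orders (user_id, total) VALUES (?, ?)",
    [(i % 100, 10.0) for i in range(1000)],
)

QUERY = "SELECT id, total FROM orders WHERE user_id = ?"


def query_plan(sql: str, params: tuple) -> str:
    rows = conn.execute(f"EXPLAIN QUERY PLAN {sql}", params).fetchall()
    # The last column of each row is the human-readable plan step.
    return " | ".join(row[-1] for row in rows)


plan_before = query_plan(QUERY, (42,))  # expect a full table scan
conn.execute("CREATE INDEX idx_orders_user_id ON orders (user_id)")
plan_after = query_plan(QUERY, (42,))   # expect a search using the new index

print(plan_before)
print(plan_after)
```

The same loop applies on Postgres: read the plan, add the index the plan argues for, then read the plan again to confirm it is actually used.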
Find the N+1 problems before your users do. Classic AI anti-pattern: fetch a list, then loop through it, fetching related data one item at a time. Use joins. Use batch fetching. Don't let it ship as-is.
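Here is the anti-pattern next to one possible fix, sketched with `sqlite3` and a hypothetical `authors`/`posts` schema. Both functions return the same data; the first issues one query per author, the second issues one query in total.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE authors (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE posts (id INTEGER PRIMARY KEY, author_id INTEGER, title TEXT);
    INSERT INTO authors (id, name) VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO posts (author_id, title)
        VALUES (1, 'First'), (1, 'Second'), (2, 'Third');
""")


def titles_by_author_n_plus_one() -> dict[str, list[str]]:
    # One query for the list, then one more query per author.
    authors = conn.execute("SELECT id, name FROM authors ORDER BY id").fetchall()
    result = {}
    for author_id, name in authors:
        rows = conn.execute(
            "SELECT title FROM posts WHERE author_id = ? ORDER BY id", (author_id,)
        ).fetchall()
        result[name] = [title for (title,) in rows]
    return result


def titles_by_author_joined() -> dict[str, list[str]]:
    # A single LEFT JOIN fetches everything in one round trip.
    rows = conn.execute(
        "SELECT a.name, p.title FROM authors a "
        "LEFT JOIN posts p ON p.author_id = a.id ORDER BY a.id, p.id"
    ).fetchall()
    result = {}
    for name, title in rows:
        titles = result.setdefault(name, [])
        if title is not None:  # authors without posts keep an empty list
            titles.append(title)
    return result


print(titles_by_author_joined())
```

With two authors the difference is invisible. With two thousand, the first version makes two thousand and one round trips.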
API calls shouldn't fire on every render. AI generates naive data fetching. It hammers endpoints on every state change without thinking twice. Verify everything is cached, debounced, or otherwise controlled.
Optimize your images and assets. Uncompressed images, missing lazy loading, oversized bundles — AI doesn't think about Lighthouse scores. You have to.
Run a bundle analyzer. AI loves importing entire libraries for a single function. All of lodash for one _.get call. All of moment.js for a date format. Find it and cut it.
Load test before real users do. Even a basic run with k6 or Artillery will show you where the ceiling is. Finding out in production is a much worse experience.
96% of developers don't fully trust that AI-generated code is functionally correct. Tests are how you replace that trust with evidence.
Critical paths have unit tests. Authentication, payment processing, data mutations, and core business logic should all have tests that don't require a browser.
Integration tests cover key user flows. At minimum: sign up, log in, complete the primary action your product is built around, log out.
Tests were written or reviewed by a human. AI-generated tests frequently test the wrong thing, validating implementation rather than behavior, or passing trivially because they don't set up the right preconditions.
CI/CD runs tests on every push. Automated testing means nothing if it's not automatic. Verify your pipeline is configured and actually failing on failures.
Edge cases are explicitly tested. Empty states, nulls, maximum input lengths, concurrent requests, expired sessions. AI code often assumes the happy path.
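Explicit edge-case tests can be short. The sketch below uses Python's standard `unittest` module against a hypothetical `normalize_username` function; the 32-character limit is an assumption chosen for the example.

```python
import unittest

MAX_USERNAME_LENGTH = 32  # hypothetical limit for this example


def normalize_username(raw):
    if raw is None:
        raise ValueError("username is required")
    cleaned = raw.strip().lower()
    if not cleaned:
        raise ValueError("username must not be empty")
    if len(cleaned) > MAX_USERNAME_LENGTH:
        raise ValueError("username is too long")
    return cleaned


class NormalizeUsernameEdgeCases(unittest.TestCase):
    def test_none_is_rejected(self):
        with self.assertRaises(ValueError):
            normalize_username(None)

    def test_whitespace_only_is_rejected(self):
        with self.assertRaises(ValueError):
            normalize_username("   ")

    def test_exactly_max_length_is_accepted(self):
        name = "a" * MAX_USERNAME_LENGTH
        self.assertEqual(normalize_username(name), name)

    def test_one_over_max_length_is_rejected(self):
        with self.assertRaises(ValueError):
            normalize_username("a" * (MAX_USERNAME_LENGTH + 1))
```

Saved to a file, these run with `python -m unittest`. None of them exercises the happy path, which is the point.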
Speed-to-market shouldn't mean shipping experiences that exclude users or frustrate them.
Semantic HTML is used throughout. AI frequently generates <div> soup. Check for proper use of <button>, <nav>, <main>, <label>, and heading hierarchy.
All interactive elements are keyboard navigable. Tab through your entire application. Everything clickable should be reachable and activatable without a mouse.
Color contrast meets WCAG AA standards. Use a contrast checker on all text/background combinations. AI picks colors for aesthetics, not accessibility.
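If you would rather check contrast in a script than in a browser tool, the calculation is small. This sketch follows the WCAG 2.x relative luminance and contrast ratio definitions, with the AA thresholds of 4.5:1 for normal text and 3:1 for large text. The function names are hypothetical, and the parser only accepts six-digit hex colors.

```python
def _linearize(channel: int) -> float:
    """Convert an 8-bit sRGB channel to its linear value (WCAG 2.x formula)."""
    srgb = channel / 255
    return srgb / 12.92 if srgb <= 0.03928 else ((srgb + 0.055) / 1.055) ** 2.4


def relative_luminance(hex_color: str) -> float:
    hex_color = hex_color.lstrip("#")
    r, g, b = (int(hex_color[i:i + 2], 16) for i in (0, 2, 4))
    return 0.2126 * _linearize(r) + 0.7152 * _linearize(g) + 0.0722 * _linearize(b)


def contrast_ratio(foreground: str, background: str) -> float:
    lum_a = relative_luminance(foreground)
    lum_b = relative_luminance(background)
    lighter, darker = max(lum_a, lum_b), min(lum_a, lum_b)
    return (lighter + 0.05) / (darker + 0.05)


def meets_wcag_aa(foreground: str, background: str, large_text: bool = False) -> bool:
    threshold = 3.0 if large_text else 4.5
    return contrast_ratio(foreground, background) >= threshold


print(round(contrast_ratio("#000000", "#ffffff"), 2))  # → 21.0
print(meets_wcag_aa("#777777", "#ffffff"))             # → False
print(meets_wcag_aa("#767676", "#ffffff"))             # → True
```

The last two lines show how thin the margin can be: one step of gray is the difference between passing and failing on a white background.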
Forms have explicit labels. Every input field needs a <label> element, not just a placeholder, which disappears when a user types.
Error messages are human-readable. "Error 422" is not an error message. "Please enter a valid email address" is.
Loading and empty states are handled. Every async operation needs a loading indicator. Every list or data view needs an empty state. AI skips these without prompting.
Even the best code in the world fails silently if you can't see what it's doing.
Application logs are structured and searchable. JSON-formatted logs with consistent fields (timestamp, level, message, user ID, request ID) are vastly easier to query than raw text.
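One way to get there with Python's standard `logging` module is a small custom formatter. This is a sketch under assumptions: `JsonFormatter` is a hypothetical class, and `user_id` and `request_id` are expected to arrive through the `extra=` argument.

```python
import json
import logging
from datetime import datetime, timezone


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record: logging.LogRecord) -> str:
        payload = {
            "timestamp": datetime.fromtimestamp(
                record.created, tz=timezone.utc
            ).isoformat(),
            "level": record.levelname,
            "message": record.getMessage(),
            # Fields supplied via `extra=`; missing ones are logged as null.
            "user_id": getattr(record, "user_id", None),
            "request_id": getattr(record, "request_id", None),
        }
        return json.dumps(payload)


handler = logging.StreamHandler()
handler.setFormatter(JsonFormatter())
logger = logging.getLogger("app")
logger.addHandler(handler)
logger.setLevel(logging.INFO)

logger.info("checkout completed", extra={"user_id": 42, "request_id": "req-123"})
```

Every line now carries the same keys, so "show me all errors for request req-123" becomes a filter instead of a grep expedition.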
Error tracking is configured. Sentry, Datadog, Rollbar: pick one. You need to know when things break in production before your users tell you on social media.
Uptime monitoring is active. A simple external ping monitor (Better Uptime, Pingdom) provides a safety net that costs minutes to set up.
Database backups are automated and tested. Automated backups aren't enough; verify that you can actually restore from them.
Secrets are managed through a vault or CI/CD environment variables. Not in .env files checked into version control. Not in Slack DMs. Proper secrets management.
A rollback plan exists. Before every deploy, know how you'll revert if something breaks. Blue-green deployments, feature flags, or versioned releases all help here.
Count your checked boxes and score yourself:
| Score | Status |
|---|---|
| 50–55 checked | ✅ Production-ready. Ship with confidence. |
| 40–49 checked | ⚠️ Acceptable for beta/soft launch. Address gaps within two sprints. |
| 30–39 checked | 🔶 High risk. Significant issues likely. Block deployment until critical items (especially Section 1) are resolved. |
| Under 30 checked | 🚨 Do not ship. Return to the drawing board, prioritize security and testing before anything else. |
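If you track the audit in a script or a CI step, the bands above translate into a few comparisons. This is a minimal sketch: `release_status` is a hypothetical helper, and the maximum of 55 is taken from the top band of the table.

```python
MAX_CHECKED = 55  # upper bound taken from the scoring table


def release_status(checked: int) -> str:
    """Map a count of checked boxes to the status bands in the table."""
    if not 0 <= checked <= MAX_CHECKED:
        raise ValueError(f"checked must be between 0 and {MAX_CHECKED}")
    if checked >= 50:
        return "production-ready"
    if checked >= 40:
        return "beta / soft launch"
    if checked >= 30:
        return "high risk"
    return "do not ship"


print(release_status(44))  # → beta / soft launch
```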
Vibe coding isn't going away. The market sits at 7 billion today and is projected to hit 3 billion by 2027.
63% of vibe coding users are non-developers. The people building the next generation of software tools have less formal training in security and code quality than any generation before them.
That's not a criticism. It's just where we are. The tools are powerful. The speed is real. But there's a gap between "it works on my machine" and "it works safely at scale for real users" — and that gap is exactly what this checklist is for.
Run this audit before every major release. Automate what you can — linting, dependency audits, CI/CD tests. Review the rest by hand. Build the habit now. Catching a security flaw in review costs almost nothing. Catching it in a post-incident report costs a lot more than that.
Bookmark this page. Share it with your team, your co-founder, or your freelancer. PubGenius publishes practical resources like this to help builders ship better, faster, and with fewer regrets.
If this checklist helped you catch something before it became a production incident, we'd love to hear about it. Drop us a note, or share this post with someone who needs it.