Here is what happened to code review in 2026. AI writes the code. AI reviews the code. A human clicks approve. The PR merges. Everyone moves on. Nobody in that chain actually read the diff.
This is not a prediction. This is Tuesday.
46% of all code on GitHub is now AI-generated. In Java repositories, that number is 61%. Teams with high AI adoption merge 98% more pull requests than they did before. And the time spent reviewing those pull requests increased by 91%. Not because people are reviewing more carefully. Because there is more to review and the reviews take longer to skim.
The volume of AI-generated code is projected to outstrip human review capacity by 40% this year. The code is arriving faster than humans can read it. So they stopped reading it.
The Ceremony
Code review used to mean something. A senior developer would read your pull request line by line. They would catch the bug you missed. They would explain why your approach had a hidden performance problem. They would push back on the architecture. They would teach you something. The review was not a gate. It was a conversation.
Now it is a gate. And the gate is made of rubber.
The pull request arrives. It is 400 lines long. It was generated by an AI agent that restructured the database queries, added error handling, and wrote tests. The code looks clean. The tests pass. The linter is happy. CI is green. The reviewer has their own AI-generated PRs to ship. They are chasing the same tokens as the author. They skim the diff. They click approve. Everyone moves on.
Nobody read the code. The code was reviewed. Those are two different things and the industry is pretending they are the same.
Camp One: Reviews Still Matter
The review defenders have data on their side. 40 to 45 percent of AI-generated code contains security vulnerabilities. That is not a fringe finding. That is across multiple studies from Stanford, NYU, and Veracode. XSS failures hit 86% in Java code generated by AI. Design-level security flaws (authentication bypasses, insecure direct object references, broken session management) increased 153%.
AI-assisted developers produce three to four times more code but generate ten times more security issues. Over 10,000 new security findings per month from AI-generated code alone. And 43% of AI-generated code changes require production debugging even after passing QA and staging.
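Taken at face value, those two multipliers describe a rise in defect density, not just defect count. A back-of-the-envelope sketch, assuming the "three to four times more code" and "ten times more security issues" figures are measured against the same baseline (the numbers above do not say so explicitly):

```python
# Back-of-the-envelope: security issues per unit of code, relative to baseline.
# Assumption: both multipliers compare the same developers to the same baseline.

def relative_issue_density(code_multiplier: float, issue_multiplier: float) -> float:
    """Return issues per unit of code relative to the pre-AI baseline (1.0)."""
    if code_multiplier <= 0:
        raise ValueError("code_multiplier must be positive")
    return issue_multiplier / code_multiplier

# Four times the code is the optimistic end, three times the pessimistic end.
low = relative_issue_density(code_multiplier=4.0, issue_multiplier=10.0)
high = relative_issue_density(code_multiplier=3.0, issue_multiplier=10.0)
print(f"{low:.1f}x to {high:.1f}x the baseline issue density")
```

Under that assumption, each line of AI-assisted code carries roughly 2.5 to 3.3 times the security issues of a line written before.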
There was a prompt injection vulnerability in GitHub Copilot Chat, rated CVSS 9.6, that allowed attackers to exfiltrate AWS keys from private repositories through hidden instructions in PR comments. The code review process was supposed to catch that. It did not, because nobody was reading PR comments for prompt injection attacks. That threat did not exist when the review process was designed. These vulnerabilities are one force in a convergence the industry is not prepared for.
This camp will tell you that human review is the last line of defense against AI-generated vulnerabilities. That the 91% increase in review time is not a problem to optimize away but evidence that review needs to become more rigorous, not less. That the moment you remove the human from the loop entirely, you are shipping code that nobody on the team can vouch for.
Camp Two: Review Is Theater
The review skeptics have a different argument and it is harder to dismiss than the defenders would like.
Nobody actually reads 500-line PRs. They did not read them before AI, and they are definitely not reading them now that the volume has doubled. The rubber-stamp culture is not new. AI just made it visible by increasing the volume to the point where the pretense collapsed.
As one prominent essay on the death of code review put it: every engineering org has the same dirty secret. PRs sitting for days. Rubber-stamp approvals. Reviewers skimming 500-line diffs because they have their own work to do. Human-written code died in 2025. Human code review dies in 2026.
The argument is not that review does not matter. The argument is that line-by-line review of AI-generated code is the wrong checkpoint. The human should be upstream, authoring the spec and acceptance criteria, not downstream reading diffs they did not write and cannot fully contextualize at the speed they arrive.
This camp will tell you that the ceremony of code review is being preserved for political reasons, not engineering reasons. That LGTM was always the most common review comment, and that AI just made it impossible to keep pretending otherwise.
What Happens When Nobody Reads the Code
Amazon found out in March 2026. AI-assisted code changes deployed without proper review triggered outages that cost an estimated 6.3 million orders. Amazon initiated a 90-day code safety reset across 335 systems. GitHub itself logged 257 incidents between May 2025 and April 2026, roughly five per week, driven by the explosion of AI-generated code and agentic workflows.
These are not small companies with loose processes. These are the companies that built the tools generating the code. If they cannot keep up with review, nobody can.
The pattern is consistent. AI generates code faster than humans can review it. The backlog grows. The pressure to ship increases. Review becomes cursory. Bugs ship to production. Incidents happen. The response is always the same: we need to improve our review process. But the process is not the problem. The volume is the problem. And the volume is not going down.
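The claim that volume, not process, is the problem can be pictured as a simple queue. The sketch below is a toy model, not a measurement: the capacity of 100 PRs per week is a made-up number, and the arrival rate of 140 just applies the 40% overshoot mentioned earlier.

```python
# Toy model: a review queue where arrivals exceed capacity.
# Assumption: both rates are constant; real teams are noisier than this.

def backlog_over_time(arrivals_per_week: float,
                      capacity_per_week: float,
                      weeks: int) -> list[float]:
    """Return the unreviewed backlog at the end of each week."""
    backlog = 0.0
    history = []
    for _ in range(weeks):
        # The backlog cannot go negative: spare capacity is simply unused.
        backlog = max(0.0, backlog + arrivals_per_week - capacity_per_week)
        history.append(backlog)
    return history

# Hypothetical team: can review 100 PRs a week, receives 140.
print(backlog_over_time(arrivals_per_week=140, capacity_per_week=100, weeks=4))
```

As long as arrivals stay above capacity, the backlog grows every week no matter how the queue is managed. The only ways out are to review less per PR, which is what is happening, or to move the checkpoint somewhere the volume is lower.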
The Knowledge Transfer Problem
There is a quieter crisis underneath the security headlines. Code review was how knowledge transferred between engineers. A senior reviewing a junior's PR was not just catching bugs. They were teaching architecture. They were explaining why this pattern causes problems at scale. They were sharing context about the system that is not written down anywhere.
When AI writes the code and AI reviews it, that transfer stops. The senior does not read the junior's code because the junior did not write it. The junior does not learn from the review because the review is automated. The codebase grows in capability and shrinks in comprehension. More features, fewer people who understand how they work.
The optimistic data says juniors receiving AI feedback improved code quality 3.2 times faster, cutting onboarding from six months to eight weeks. The pessimistic interpretation is that they learned to satisfy the AI's criteria without understanding why those criteria exist. They optimized for the metric without learning the principle.
Gartner predicts 80% of engineers will need upskilling by 2027 specifically for AI collaboration. The codebase becomes legible to AI but opaque to the humans responsible for it. And when something breaks in a way the AI does not understand, the human who also does not understand it is the one on call.
The New Bottleneck
The bottleneck in software development used to be writing code. Then it was shipping code. Now it is understanding code.
AI can write a service in hours that would have taken weeks. AI can review the PR and catch the obvious issues. AI can generate tests that cover the happy path and most edge cases. What AI cannot do is tell you whether this service belongs in this system. Whether the architecture you are building will hold when usage doubles. Whether the trade-off you made today will become a production incident in six months.
Those judgments require understanding. Understanding requires reading. And nobody is reading.
The teams that figure this out will not go back to line-by-line reviews. That ship has sailed. They will move the human checkpoint upstream. Review the spec, not the diff. Define what the code should do, let AI write it, then validate the behavior, not the implementation. Test the system, not the syntax. Invest in integration tests and observability rather than code review for implementation correctness.
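One way to picture "validate the behavior, not the implementation" is a set of acceptance criteria written by a human before any code exists, then run against whatever the AI produces. The sketch below is illustrative only: the pagination task, the function names, and the criteria are all hypothetical, and `generated_paginate` merely stands in for AI-written code.

```python
# Hypothetical example: human-authored acceptance criteria for a pagination
# helper. The checks only call the function; they never inspect how it works.
from typing import Callable

def check_acceptance_criteria(paginate: Callable[[list, int], list]) -> list[str]:
    """Run behavioral checks against any implementation; return failed criteria."""
    failures = []
    items = list(range(10))
    pages = paginate(items, 3)
    if [x for page in pages for x in page] != items:
        failures.append("pages must contain every item exactly once, in order")
    if any(len(page) > 3 for page in pages):
        failures.append("no page may exceed page_size")
    if any(len(page) == 0 for page in pages):
        failures.append("no page may be empty")
    if paginate([], 3) != []:
        failures.append("empty input must yield no pages")
    return failures

def generated_paginate(items: list, page_size: int) -> list:
    """Stand-in for an AI-generated implementation."""
    return [items[i:i + page_size] for i in range(0, len(items), page_size)]

print(check_acceptance_criteria(generated_paginate))  # prints []
```

The human effort goes into the list of criteria, which is short enough to actually read. If the implementation is regenerated tomorrow, the same checks still apply.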
The teams that do not figure this out will ship faster and faster until something breaks that nobody on the team knows how to fix. Then they will have an incident review where the root cause is that nobody understood the system, and nobody will know what to do about that because the review process that was supposed to ensure understanding has been a rubber stamp for eighteen months.
So What Do You Do?
If you are a reviewer: stop pretending you read 500-line AI-generated diffs. You did not. Everyone knows you did not. Focus on what humans are still better at. Does this change make architectural sense? Does it introduce a pattern that will cause problems at scale? Does the test coverage match the risk? If you can answer those questions, you added value. If you cannot, you were a rubber stamp and the green checkmark meant nothing.
If you are a team lead: measure what you actually care about. If review time is your metric, you are measuring the ceremony, not the outcome. Measure production incidents. Measure security findings. Measure how quickly a new team member can understand a service. Those tell you whether your code is understood. PR approval time tells you nothing.
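As a rough sketch of what outcome metrics could look like, the snippet below computes two of them from a list of merged changes. The `Change` record, its field names, and the sample values are invented for illustration; a real team would pull these from its own deploy and security tooling.

```python
# Illustrative only: outcome metrics computed from hypothetical change records.
from dataclasses import dataclass

@dataclass
class Change:
    lines_changed: int
    caused_incident: bool    # did this change trigger a production incident?
    security_findings: int   # findings later attributed to this change

def outcome_metrics(changes: list[Change]) -> dict[str, float]:
    """Return the incident rate per change and security findings per 1,000 lines."""
    if not changes:
        raise ValueError("need at least one change")
    total_lines = sum(c.lines_changed for c in changes)
    incidents = sum(c.caused_incident for c in changes)
    findings = sum(c.security_findings for c in changes)
    return {
        "change_failure_rate": incidents / len(changes),
        "findings_per_kloc": 1000 * findings / total_lines if total_lines else 0.0,
    }

# Made-up sample data, not real measurements.
sample = [
    Change(400, False, 1),
    Change(120, True, 0),
    Change(480, False, 2),
    Change(1000, False, 1),
]
print(outcome_metrics(sample))
```

Neither number mentions how long a PR waited for approval, which is the point: both describe what happened after the code shipped.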
If you are a developer: understand what you ship. Not every line. That is not possible anymore and was barely possible before. But understand the architecture. Understand the data flow. Understand the failure modes. If you cannot explain what a service does without asking the AI to explain it to you, you do not understand it. And the person on call at 2am when it breaks should be someone who understands it.
Nobody reads your code anymore. The question is whether anyone understands it. Those are different things, and the gap between them is where the next generation of production incidents is being born.