When None of 150k Lines of Code Is Hand-Written: Behind the AI Coding Boom, Technology Is Betraying Itself
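When a developer proudly claims "zero hand-written code, 150,000 lines of TypeScript shipped in six weeks with nothing but Claude Code," the tech community should see more than an efficiency miracle. It should be alarmed: this mode of production is replacing what little grasp we still have of our systems' logic with auto-generated code that no one can trace.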
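Core argument: the speed myth of AI-assisted code generation is masking a dangerous trend. We are replacing once-legible infrastructure with "black-box code" that cannot be explained, maintained, or audited, and this will lead to the most severe technical-debt crisis of the next few years.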
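Last week a developer announced on Reddit that, working alone for six weeks, he had used Claude Code to produce 150,000 lines of TypeScript, 70 merged PRs, and 147 test files, "basically without hand-writing a single line of code." The post drew a flood of praise from the tech community: "this is the future," "AI gives a solo developer the capabilities of a full-stack team." But few people stopped to ask a more fundamental question: does this code really belong to us?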
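The question is not a philosophical one but an engineering one. When each line of code is no longer consciously decided, reasoned through, and written by a human, but "guessed" by a large language model from a probability distribution, the human developer has in effect given up the two things that matter most: a deep understanding of the code and control over its long-term evolution. We are trading transparency for speed, and future maintainability for present productivity. Those costs will surface gradually over the life of the project.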
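More troubling, this is not just one developer's choice. Similar practices are spreading across the whole industry. Developers have started discussing "install .md skills instead of install .sh scripts": describe the installation in natural language and let an LLM interpret and execute it. It sounds like a smarter interface, but it also means you can no longer predict what the system will do. A bash script is deterministic for a given environment; you can audit, modify, and understand it line by line. An LLM's reading of a natural-language instruction, by contrast, depends on training data, the context window, random sampling, and other factors, so the result is not reproducible and the behavior is not predictable. This is not progress. It is handing control to an agent you cannot truly understand.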
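The contrast between a fixed script and a sampled interpretation can be made concrete with a toy sketch in Python. It makes no claim about how any real LLM works: the step names, the `INTERPRETATIONS` table, and its weights are all invented for illustration, and weighted random choice merely stands in for the random-sampling factor mentioned above.

```python
from __future__ import annotations

import random

# Hypothetical install steps; a fixed script runs the same ones in the same order.
SCRIPT_STEPS = ["download", "verify checksum", "unpack", "link binary"]


def run_script() -> list[str]:
    """Deterministic: returns the same steps, in the same order, on every call."""
    return list(SCRIPT_STEPS)


# Toy stand-in for a model deciding how to read "install the tool".
# Options and weights are assumptions made up for this sketch.
INTERPRETATIONS = {
    "use system package manager": 0.6,
    "build from source": 0.3,
    "download prebuilt binary": 0.1,
}


def sample_interpretation(rng: random.Random) -> str:
    """Draw one interpretation with probability proportional to its weight."""
    options = list(INTERPRETATIONS)
    weights = list(INTERPRETATIONS.values())
    return rng.choices(options, weights=weights, k=1)[0]


def distinct_outcomes(num_runs: int) -> set[str]:
    """Collect the interpretations seen across runs that use different seeds."""
    return {sample_interpretation(random.Random(seed)) for seed in range(num_runs)}


if __name__ == "__main__":
    print(run_script() == run_script())   # the script path never varies
    print(sorted(distinct_outcomes(200)))  # the sampled path does
```

In this toy, two calls to `run_script()` always agree, while `distinct_outcomes(200)` contains more than one interpretation of the same instruction. Fixing the seed makes a single draw repeatable here; whether a hosted model offers an equivalent guarantee is outside what this sketch can show.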
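Supporters will counter that such worries are old-fashioned thinking. As long as the final result is correct, they argue, the process does not matter. But that misunderstands the nature of software engineering. What sets software engineering apart from "writing scripts" is precisely that we care not only about output but about how a system behaves over time: how it evolves, how it is debugged, how it is extended. When code comes from a black-box generator and every line carries probabilistic uncertainty, traditional debugging and refactoring techniques break down. An LLM-generated codebase is not organized internally by logic a human can follow; it is stitched together by statistical association. Finding a bug may no longer mean tracing back along a logical path. It may mean re-running the generator and hoping it produces a different version without the bug. That sounds like science fiction, but it is the future we are heading toward.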
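There is, of course, a more optimistic reading. Some argue that code generation tools simply amplify individual ability, letting developers focus on higher-level architecture instead of wrestling with implementation detail. In a similar vein, Karpathy argued in his recent Sequoia Ascent talk that the real value of LLMs lies not in speeding up existing work but in opening entirely new paradigms. His examples include menugen, a menu app that an LLM can take over completely with no classical code; install instructions written in natural language instead of a bash script, both cases where the classical version "perhaps shouldn't even exist"; and LLM knowledge bases that compute over arbitrary unstructured knowledge, something deterministic code could never reach.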
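But this is where a key disagreement lies. Karpathy himself acknowledges that LLM capability is unevenly distributed: the same model can coherently refactor a 100,000-line codebase and also tell you to "walk to the car wash to wash your car." He calls this pattern "jaggedness" and attributes it to the verifiability of a domain and to economics. What he leaves unaddressed is a more direct consequence: when a system shows superhuman ability on some tasks and makes elementary mistakes on others, how can you trust it to generate critical infrastructure? How can you be sure that those 150,000 lines do not hide a logical black hole written while the LLM's attention wandered?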
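More ironic still, the "vibe coding" wave is colliding head-on with another trend: AI security and vulnerability management. In the same week, the security community was discussing why the traditional CVE (Common Vulnerabilities and Exposures) system does not fit AI agents. CVE was designed to label deterministic vulnerabilities such as buffer overflows and SQL injection. Faced with new threats in AI agents, such as "skill-file attacks," "prompt injection," and "model behavior drift," its numbering scheme is of no use, and people have had to propose new systems such as AVE (AI vulnerability assessment). This points to one fact: as code production shifts from human hands to AI generation, the accompanying systems for security, auditing, and quality assurance have fallen completely out of step. We are using the tools of 2010 to manage the unpredictability produced by the code generators of 2026.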
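The debate over educational background is relevant here too. Paul Graham commented on X that the most successful founders he has observed tend to have a particular, idiosyncratic insight that comes from long immersion, not a generic "AI for X" idea. He warned those who want to drop out at 18 to start a company that, without time to accumulate insight from lived experience, it is hard to build a truly differentiated product. The same logic holds for code generation: how do you gain real insight into a system without writing code, without suffering through debugging, without understanding the underlying implementation? A developer who has never written code by hand is like a food critic who has never cooked. He may know whether the dish tastes good, but he will never understand why, much less improve the recipe when it matters.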
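To be clear, I am not dismissing the value of AI code generation wholesale. In many areas it does raise efficiency dramatically, especially for prototyping, glue code, and routine CRUD operations. The problem is that the community's discussion has tilted entirely to one side. Everyone talks about speed, output, and amplified individual productivity; few talk about maintainability, auditability, technical debt, and long-term cost. This lopsided conversation is manufacturing a dangerous consensus: that code generation is a "free lunch." It is not.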
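Every tool revolution in history has come with similar over-optimism, and with similar worries. When high-level languages replaced assembly, some said we were losing control of the machine. When frameworks and libraries spread widely, some said we were losing our understanding of system behavior. What those worries had in common was that humans still held the core logic and had only outsourced the repetitive labor. AI code generation is different: what it outsources is not repetitive labor but the thinking itself. You are no longer writing code; you are "describing" code. You are no longer an engineer; you have become a product manager, and your "development team" is a probability model.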
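I expect that within the next one to two years we will see the first wave of serious incidents caused by over-reliance on AI code generation. It might be a logic error in critical infrastructure that takes a service down, security vulnerabilities introduced in bulk, or a large codebase that becomes unmaintainable and has to be rewritten from scratch. Only then will the tech community begin to ask in earnest whether, in the pursuit of speed, we threw away too much that we should have kept.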
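Code has never been just code. It is a record of decisions, logic made concrete, the crystallization of human thought. When it no longer stands for those things and is merely the output of a probability model, we are replacing the engineering edifice we built with such care with the invisible, unpredictable, incomprehensible rubble of machine intelligence. Speed is tempting, but it should not become our excuse for giving up understanding.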
Sources
- Post by levelsio on X - https://nitter.net/levelsio/status/2058550078059475098#m
  - This kinda ties into the "you're competing with non-tech people now" from the tweet yday about the Indonesian girl getting to $800 MRR within a month
  - It's all about your idea plugged into the cultural zeitgeist and then your style of execution which is based on who you are as a person
  - Every little thing you experienced influences the choices you make when building a product too, small tiny details that you do different that are unmeasurable but turn out to be a big reason why users like your product over others
  - For me the easiest way to get more life experience always has been to just go travel, even better travel for loooong times, live in foreign places by yourself for months (maybe years), preferrably solo, something happens to you that changes you as a person
  - You wanna do this in your 20s/30s but you can do it any age, it's just that if you're not single anymore, your style of travel usually changes into more normie patterns but you can still do it
  - Go to places where few other people go, I always talk about China because so few people visit it, yet it's a world leader now in so many things, you'll learn so many things just being there
  - For me it started when I studied abroad in 2009 in Korea, it reset my mind and identity is such a fundamental way that everything that came after for me (like going nomad in 2013, building startups, becoming a perpetual immigrant away from my home country forever) can kinda be lead to that moment
  - Fly somewhere far for months, by yourself, if you can, and you'll get those life experiences that will change you forever, make you a better person and also help you make better products!
- How I shipped 150k LOC of infra in 6 weeks — no hand-written code, just Claude Code - https://www.reddit.com/r/vibecoding/comments/1tnsh9k/how_i_shipped_150k_loc_of_infra_in_6_weeks_no/
- Post by karpathy on X: Fireside chat at Sequoia Ascent 2026 from a ~week ago. Some highlights - https://nitter.net/karpathy/status/2049903821095354523#m
  - The first theme I tried to push on is that LLMs are about a lot more than just speeding up what existed before (e.g. coding). Three examples of new horizons:
  - 1. menugen: an app that can be fully engulfed by LLMs, with no classical code needed: input an image, output an image and an LLM can natively do the thing.
  - 2. install .md skills instead of install .sh scripts. Why create a complex Software 1.0 bash script for e.g. installing a piece of software if you can write the installation out in words and say "just show this to your LLM". The LLM is an advanced interpreter of English and can intelligently target installation to your setup, debug everything inline, etc.
  - 3. LLM knowledge bases as an example of something that was *impossible* with classical code because it's computation over unstructured data (knowledge) from arbitrary sources and in arbitrary formats, including simply text articles etc.
  - I pushed on these because in every new paradigm change, the obvious things are always in the realm of speeding up or somehow improving what existed, but here we have examples of functionality that either suddenly perhaps shouldn't even exist (1,2), or was fundamentally not possible before (3).
  - The second (ongoing) theme is trying to explain the pattern of jaggedness in LLMs. How it can be true that a single artifact will simultaneously 1) coherently refactor a 100,000-line code base *and* 2) tell you to walk to the car wash to wash your car. I previously wrote about the source of this as having to do with verifiability of a domain, here I expand on this as having to also do with economics because revenue/TAM dictates what the frontier labs choose to package into training data distributions during RL. You're either in the data distribution (on the rails of the RL circuits) and flying or you're off-roading in the jungle with a machete, in relative terms. Still not 100% satisfied with this, but it's an ongoing struggle to build an accurate model of LLM capabilities if you wish to practically take advantage of their power while avoiding their pitfalls, which brings me to...
  - Last theme is the agent-native economy. The decomposition of products and services into sensors, actuators and logic (split up across all of 1.0/2.0/3.0 computing paradigms), how we can make information maximally legible to LLMs, some words on the quickly emerging agentic engineering and its skill set, related hiring practices, etc., possibly even hints/dreams of fully neural computing handling the vast majority of computation with some help from (classical) CPU coprocessors.