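Stop Treating LLMs as Accelerators: They Are Creating Things That Never Existed Before
From "menugen" to "install .md" to a vibe-coded bot that has started winning payment disputes, LLMs are opening up a new category of functionality that classical software engineering could not reach on its own.
Core claim: the real value of LLMs is not that they make old things faster, but that they make previously impossible things possible.
While most people are still debating how LLMs help programmers write code faster or make customer support more efficient, a more fundamental shift has already begun. In this shift the LLM is no longer an accelerator but a creator: it produces features and products that could not have existed before. This is not incremental optimization. It is a jump to a different paradigm.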
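In a recent fireside chat at Sequoia Ascent 2026, Andrej Karpathy argued that LLMs are about much more than speeding up existing workflows such as coding. He gave three examples. The first is "menugen", an app that can be fully absorbed by an LLM: an image goes in, an image comes out, and no classical code is needed in between. The point is not that an existing thing got faster, but that the model can natively do the whole job. The second is installing with ".md" skills instead of ".sh" scripts. Rather than writing a complex bash script to install a piece of software, you write the installation out in plain words and show it to your LLM, which acts as an advanced interpreter of English, targets the installation to your setup, and debugs problems inline. That is more than an efficiency gain; it rethinks how a task is defined and executed in the first place. The third is the LLM knowledge base, which Karpathy describes as impossible with classical code because it requires computation over unstructured data from arbitrary sources and in arbitrary formats. In his framing, the first two are functionality that perhaps no longer needs to exist as separate software, and the third was fundamentally not possible before.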
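As a rough illustration of the "install .md" idea, the sketch below pairs a plain-English install note with a few facts about the local machine and assembles both into a prompt. Everything in it is an assumption made for illustration: the `INSTALL_MD` text, the tool name `exampletool`, and the helpers `describe_machine`, `build_install_prompt`, and `call_llm` are hypothetical and do not come from Karpathy's talk.

```python
import platform

# Hypothetical install note written in words instead of a bash script.
INSTALL_MD = """\
# Installing exampletool (hypothetical)
1. Make sure Python 3.10 or newer is available.
2. Create a virtual environment and install the package into it.
3. Run the tool's self-check and fix anything it reports.
"""


def describe_machine() -> dict:
    """Collect a few facts the model can use to target the install."""
    return {
        "os": platform.system(),
        "arch": platform.machine(),
        "python": platform.python_version(),
    }


def build_install_prompt(install_md: str, machine: dict) -> str:
    """Combine the English instructions with local machine details."""
    facts = "\n".join(f"- {key}: {value}" for key, value in sorted(machine.items()))
    return (
        "Follow these installation instructions on my machine.\n"
        "Adapt each step to the environment described below.\n\n"
        f"## Environment\n{facts}\n\n"
        f"## Instructions\n{install_md}"
    )


def call_llm(prompt: str) -> str:
    """Placeholder: a real setup would hand the prompt to an agent here."""
    raise NotImplementedError("wire this to the LLM or agent of your choice")


if __name__ == "__main__":
    print(build_install_prompt(INSTALL_MD, describe_machine()))
```

The design choice worth noticing is that the "installer" contains no branching per platform. The environment facts are passed along as context, and adapting the steps is left to the model, which is the part a bash script would otherwise have to hard-code.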
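What really convinced me that this shift is under way, though, is a different story. A founder vibe coded a dispute responder to handle chargebacks on Stripe. He used to ignore disputes, so he almost always lost them. Now, whenever a dispute comes in, his site receives a webhook notice from Stripe, collects evidence automatically, and generates a detailed PDF covering the user's details, signup date, and activity in the app, with more detail than he would ever have assembled by hand. He has just won a $1,199 dispute, which he describes as his first big win after a decade on Stripe. The AI did not make an existing dispute process faster. It made viable a function that had been neglected because of its cost and complexity. That is a clear example of an LLM creating a new possibility.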
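The flow described in the post (a dispute notice arrives, evidence is collected, a report is produced) can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the founder's actual code: the `UserRecord` type, the `users_by_charge` lookup, and `build_evidence` are hypothetical, the payload is assumed to follow the general shape of a Stripe `charge.dispute.created` event, and the output is plain text rather than a PDF.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class UserRecord:
    """Hypothetical app-side record; the field names are illustrative."""
    email: str
    signed_up: str  # ISO date string
    activity: list = field(default_factory=list)  # (date, description) pairs


def build_evidence(event: dict, users_by_charge: dict) -> Optional[str]:
    """Turn a dispute webhook payload into a plain-text evidence summary.

    Returns None for events that are not new disputes, or that reference
    a charge with no matching user record.
    """
    if event.get("type") != "charge.dispute.created":
        return None
    dispute = event["data"]["object"]
    user = users_by_charge.get(dispute["charge"])
    if user is None:
        return None

    amount = dispute["amount"] / 100  # assumes a two-decimal currency
    lines = [
        f"Dispute {dispute['id']} for {amount:.2f} {dispute['currency'].upper()}",
        f"Customer: {user.email}",
        f"Signed up: {user.signed_up}",
        f"Recorded actions: {len(user.activity)}",
    ]
    lines += [f"  {day}: {what}" for day, what in user.activity]
    return "\n".join(lines)


if __name__ == "__main__":
    # Made-up sample data, used only to show the output format.
    users = {
        "ch_demo": UserRecord(
            email="user@example.com",
            signed_up="2025-01-02",
            activity=[("2025-02-01", "generated 40 photos")],
        )
    }
    event = {
        "type": "charge.dispute.created",
        "data": {"object": {"id": "dp_demo", "charge": "ch_demo",
                            "amount": 119900, "currency": "usd"}},
    }
    print(build_evidence(event, users))
```

A real responder would also verify the webhook signature, render the summary to a PDF, and submit it through the payment provider. Those steps are left out here because they depend on details the post does not give.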
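There is, of course, a counterargument: can LLMs really create new functionality, or do they just repackage existing tasks? Some critics hold that the so-called new features are only variants of old ones. A dispute responder, on this view, is still automation, just done better. But this view misses a key point. Before LLMs, it was hard to imagine software that could take a natural-language description, gather the relevant evidence on its own, and produce a persuasive evidence document. That is not a change from slow to fast. It is a leap from impossible to possible.
On the other hand, LLM capability is "jagged": a model can be astonishing at one task and hopeless at another that looks simple. Karpathy raised this in the talk: how can a single model coherently refactor a 100,000-line code base and also tell you to walk to the car wash to wash your car? He attributes it to the intersection of verifiability and economics. Revenue and market size dictate what the frontier labs choose to package into the training data distribution during reinforcement learning, so lucrative domains such as code generation are on the rails while other domains are left off-road. He also says he is not yet fully satisfied with this explanation. The implication is that the ability of LLMs to create new functionality is uneven, and it is strongest in domains the training data already covers.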
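Even with these limits, there are encouraging signs. Another founder wrote in a post that his AI "status monitoring" dashboard spotted an unusually high Cloudflare bill on its own, which led to a refund from support. Nobody designed that check in advance; the AI found the problem while "patrolling" the project. It sounds like science fiction, but it is happening.
More exciting still, these capabilities are moving from experiment to production. A tips-and-tricks guide posted to r/StrixHalo describes an AMD Ryzen AI MAX+ 395 machine in production use, running a bot named Sahir, a local agent, a friend's tunnel, and ComfyUI. This is not a lab toy. It is a working productivity tool.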
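So the question is not whether LLMs can replace programmers or support staff, but whether we dare to imagine functionality that could not exist before. When a company no longer writes complex bash scripts and installs software from a paragraph of natural language, when dispute handling goes from "ignore it" to "start winning", when an AI can find problems on its own and get them fixed, what we are watching is not acceleration but creation.
None of this means everyone should abandon classical software engineering right away. As Karpathy points out, the jaggedness of LLM capability means we have to be careful about where we rely on it. Still, we are clearly entering a new period, one in which LLMs drive the creation of functionality. In that period the most valuable skill is not writing code but describing to an LLM what you want.
This is a change in thinking as much as in technology. We stop asking "how do we make this faster?" and start asking "why was this impossible before?" As more and more impossible things become possible, the sensible response is to embrace the new world.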
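Pushing this judgment one step further, what matters is not My full strix halo…, Fireside chat at Se…, or 🏆 For the first tim… in themselves, but the allocation logic they jointly expose. When Reddit and X push attention toward the same question in the same cycle, it usually means the topic is moving from in-group experience to a more widely shared public issue. That is also why this material deserves a long-form treatment: a short post only tells you that something is happening, while a long piece can put the background, the costs, the room for misjudgment, and the downstream effects on the same table. In other words, the real value of LLMs is not that they make old things faster, but that they make previously impossible things possible. This matters not because it looks new, but because it will redefine how readers should understand this kind of content from here on.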
Sources
- My full strix halo tips and tricks - https://www.reddit.com/r/StrixHalo/comments/1t2h7pp/my_full_strix_halo_tips_and_tricks/
- Fireside chat at Sequoia Ascent 2026 from a ~week ago. Some highlights:
- The first theme I tried to push on is that LLMs are about a lot more than just speeding up what existed before (e.g. coding). Three examples of new horizons:
- 1. menugen: an app that can be fully engulfed by LLMs, with no classical code needed: input an image, output an image and an LLM can natively do the thing.
- 2. install .md skills instead of install .sh scripts. Why create a complex Software 1.0 bash script for e.g. installing a piece of software if you can write the installation out in words and say "just show this to your LLM". The LLM is an advanced interpreter of English and can intelligently target installation to your setup, debug everything inline, etc.
- 3. LLM knowledge bases as an example of something that was *impossible* with classical code because it's computation over unstructured data (knowledge) from arbitrary sources and in arbitrary formats, including simply text articles etc.
- I pushed on these because in every new paradigm change, the obvious things are always in the realm of speeding up or somehow improving what existed, but here we have examples of functionality that either suddenly perhaps shouldn't even exist (1,2), or was fundamentally not possible before (3).
- The second (ongoing) theme is trying to explain the pattern of jaggedness in LLMs. How it can be true that a single artifact will simultaneously 1) coherently refactor a 100,000-line code base *and* 2) tell you to walk to the car wash to wash your car. I previously wrote about the source of this as having to do with verifiability of a domain, here I expand on this as having to also do with economics because revenue/TAM dictates what the frontier labs choose to package into training data distributions during RL. You're either in the data distribution (on the rails of the RL circuits) and flying or you're off-roading in the jungle with a machete, in relative terms. Still not 100% satisfied with this, but it's an ongoing struggle to build an accurate model of LLM capabilities if you wish to practically take advantage of their power while avoiding their pitfalls, which brings me to...
- Last theme is the agent-native economy. The decomposition of products and services into sensors, actuators and logic (split up across all of 1.0/2.0/3.0 computing paradigms), how we can make information maximally legible to LLMs, some words on the quickly emerging agentic engineering and its skill set, related hiring practices, etc., possibly even hints/dreams of fully neural computing handling the vast majority of computation with some help from (classical) CPU coprocessors. - https://nitter.net/karpathy/status/2049903821095354523#m
- 🏆 For the first time in a decade on @Stripe I've started winning disputes with my vibe coded dispute responder
- I used to ignore disputes so I almost always lost them, now I've started winning, this one is the first big dispute for $1,199 USD!
- Whenever a dispute comes in, my site gets a webhook notice from Stripe, it then starts collecting evidence and generates a PDF with entire user's details, when they signed up, and most importantly what they did in the app
- In this case the user used the app for months, generated thousands of photos then tried to get the money back from their bank
- The evidence has to be REALLY detailed, and REALLY good, which is why it's perfect to vibe code it, you can get quite detailed with different types of users and activity on your app, and put that all in the PDF
- I'm shocked because I again I never would win disputes before
- People in US especially abuse the [ chargeback ] or [ dispute ] en masse, unlike the rest of the world, it's easily built into their banking app next to every transaction, so it's one tap to get free stuff. And why not? You get free stuff!
- It's destructive for business owners like me on many levels, if I get over 1% disputes on my account, I risk getting shutdown permanently by Stripe, Visa and MasterCard, like permanently for life, not just my business but on my personal name too, it's ruthless
- Disputes are also super expensive for business owners: you don't just pay back the amount they disputed, for every dispute you pay $30, which you only get back if you win!
- But with AI we can now create our own tools to fight back against dispute abuse and finally win! 🎉 - https://nitter.net/levelsio/status/2049847252680614105#m