Why Most Multi-Agent Frameworks Fail at Scale — open-kraken’s Control Plane Architecture (Paper
Hi, I'm preparing to submit my first paper to cs.AI on arXiv and would really appreciate feedback from the community.
Title:
Agent Organization: A Scheduling, Coordination, and Governance Architecture for Large-Scale Agents
Most existing multi-agent frameworks focus heavily on prompting, tool use, or message passing, but they don’t really solve the system-level problems that appear once you scale to hundreds or thousands of heterogeneous agents. Scheduling, reliable coordination, governance, and failure recovery quickly become the real bottlenecks.
In this work, we treat a large-scale agent system as an executable organization and formally define the Agent Coordination Problem (ACP). Both theoretically and empirically, we show that three components form a minimal reliable architecture:
AEL (Authoritative Execution Ledger): provides a global, immutable record of execution state
CWS (Budget-Aware Cognitive Workload Scheduler): routes work across providers by trading off quality against cost within a budget
SEM (Shared Execution Memory): enables cross-agent knowledge sharing and reuse
Removing any one of them causes clear degradation in robustness and efficiency.
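To make the ledger idea a bit more concrete, here is a minimal sketch of an append-only, hash-chained execution log. This is not the open-kraken implementation: the names `LedgerEntry`, `ExecutionLedger`, `append`, and `verify` are hypothetical, and hash chaining is just one common way to make tampering detectable, assumed here for illustration.

```python
import hashlib
import json
from dataclasses import dataclass, field

GENESIS_HASH = "0" * 64  # assumed placeholder hash for the first entry


@dataclass(frozen=True)
class LedgerEntry:
    """One immutable record of something that happened in a run (hypothetical)."""
    index: int
    run_id: str
    event: str
    prev_hash: str
    entry_hash: str


def _digest(index, run_id, event, prev_hash):
    # Hash the entry's fields together with the previous entry's hash.
    payload = json.dumps([index, run_id, event, prev_hash]).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


@dataclass
class ExecutionLedger:
    """Append-only log: entries are only added at the end, never edited."""
    entries: list = field(default_factory=list)

    def append(self, run_id, event):
        index = len(self.entries)
        prev_hash = self.entries[-1].entry_hash if self.entries else GENESIS_HASH
        entry = LedgerEntry(index, run_id, event, prev_hash,
                            _digest(index, run_id, event, prev_hash))
        self.entries.append(entry)
        return entry

    def verify(self):
        # Recompute every hash; any edited or reordered entry breaks the chain.
        prev_hash = GENESIS_HASH
        for i, e in enumerate(self.entries):
            if e.index != i or e.prev_hash != prev_hash:
                return False
            if e.entry_hash != _digest(e.index, e.run_id, e.event, e.prev_hash):
                return False
            prev_hash = e.entry_hash
        return True
```

In this sketch, agents would call `append` for every state transition and any reader could call `verify` before trusting the history. A real distributed ledger would also need replication and consensus, which this example leaves out.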
On the implementation side (open-kraken), we ran the system at scale (1,200+ concurrent runs on a 32-node cluster) and saw strong robustness under 30% node failures, plus a 31.4% cost reduction through multi-provider routing. We also validated the architecture on embodied robotics (cloud–edge nested organization) and a real-world logistics network case study.
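For readers wondering what quality-cost routing across providers might look like in its simplest form, here is a toy sketch. It is an illustration under assumptions, not the CWS algorithm from the paper: `Provider`, `route`, the quality scores, and the cost figures are all hypothetical, and the rule shown (cheapest provider that meets a quality floor and fits the remaining budget) is just one plausible policy.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass(frozen=True)
class Provider:
    name: str
    quality: float  # assumed quality score in [0, 1]
    cost: float     # assumed cost per task, arbitrary units


def route(providers: List[Provider], min_quality: float,
          remaining_budget: float) -> Optional[Provider]:
    """Pick the cheapest provider that meets the quality floor and the budget.

    Returns None when no provider qualifies, so the caller can decide
    whether to queue, degrade, or reject the task.
    """
    eligible = [
        p for p in providers
        if p.quality >= min_quality and p.cost <= remaining_budget
    ]
    if not eligible:
        return None
    # Cheapest first; among equal costs, prefer the higher quality score.
    return min(eligible, key=lambda p: (p.cost, -p.quality))


if __name__ == "__main__":
    pool = [
        Provider("large-model", quality=0.95, cost=8.0),
        Provider("mid-model", quality=0.90, cost=5.0),
        Provider("small-model", quality=0.70, cost=1.0),
    ]
    choice = route(pool, min_quality=0.8, remaining_budget=10.0)
    print(choice.name if choice else "no eligible provider")
```

The point of the sketch is only that easy tasks with a low quality floor fall through to cheap providers, which is where savings from multi-provider routing would come from.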
The English PDF is now available here:
https://zenodo.org/records/19676306
Full open-source code: https://github.com/open-kraken/open-kraken
I’d love any feedback — especially on the theory, architecture, or evaluation.
Also, if anyone here is eligible to endorse cs.AI submissions, I would really appreciate the help:
https://arxiv.org/auth/endorse?x=9FL6QT
Code: 9FL6QT
Thank you!