On May 29, 2026, TechCrunch reported that South Korea–US chip startup XCENA raised $135 million in a Series B round at a $570 million valuation to commercialize its MX1 CXL computational memory chip. The funding will help bring the company's near-memory compute hardware, which offloads AI inference workloads from GPUs and CPUs, into production by 2027.
This article aggregates reporting from 3 news sources. The TL;DR is AI-generated from original reporting. Race to AGI's analysis provides editorial context on implications for AGI development.
XCENA’s bet is that AI’s next bottleneck is not more FLOPS but memory bandwidth and locality. By pushing compute directly into a DRAM module attached over CXL, its MX1 chip targets the “glue code” around LLMs (preprocessing, KV-cache management, and data shuffling) that still burns CPU cycles and power in today’s clusters. If the company’s claims hold, inference loads that currently require 10 servers could run on one memory-centric node, which would materially change the economics of serving large models at scale.
From an AGI perspective, this matters because frontier models are increasingly limited by system-level constraints, not just parameter counts. Cheaper, denser inference makes it easier to deploy powerful models pervasively, including in agents that run persistently over large context windows. It also diversifies the hardware stack beyond GPU-only designs, which could erode Nvidia’s leverage over the entire value chain. For cloud providers and big labs, memory-centric architectures like MX1 offer a plausible path to keep scaling inference usage even as GPU prices and power draw climb.