The company’s Memory Machine software uses CXL to reduce the time GPUs sit idle while waiting for data to load into memory.
Impressive results
The demo used FlexGen, a high-throughput generation engine, and the OPT-66B large language model.