23 Jun 2025
DeepSeek's Nano-vLLM cuts vLLM's memory use by 5x with a dynamic KV cache, runs on CPU via NumPy, and adds speculative prefill. It targets edge AI developers.
31 May 2025
DeepSeek's R1 is a lightweight AI model that runs on a single GPU.