Articles tagged: vLLM

4 articles

Local models

Run a vLLM Server on HF Jobs in One Command

Learn how to launch a vLLM inference server on Hugging Face Jobs with a single command. This guide covers setup, configuration, and practi...

Jun 26, 2026 · 6 min
AI agents

3 Agents. 3 LLMs. 1 Aging GPU: Engineering Parallel Inference on Bare Metal

Learn how to run three AI agents with separate LLMs simultaneously on a single outdated GPU. This article covers bare-metal parallel infer...

Jun 25, 2026 · 7 min
Guides

Drilling Into AI’s Financial Sustainability

Explore the hidden costs of AI development and deployment, from hardware to energy. Learn practical strategies for budgeting, optimizing m...

Jun 17, 2026 · 6 min
Guides

DeepSeek Sharpens Its Reasoning: DeepSeek-R1, an Affordable Rival to OpenAI’s o1

DeepSeek-R1 brings advanced reasoning capabilities at a fraction of the cost of OpenAI’s o1. Learn how this open-source model matches o1 i...

Jun 15, 2026 · 6 min