Live
Black Hat USADark ReadingBlack Hat AsiaAI BusinessWhy TSMC grew four times faster than its foundry rivals in 2025 — price hikes, vertical integration, and commanding technology lead pay dividendstomshardware.comThe Complete DevSecOps Engineer Career Guide: From Pipeline Security to Platform Architect in 2026DEV CommunityOpenAI’s $1M API Credits, Holos’ Agentic Web, and Xpertbench’s Expert TasksDEV CommunitySemantic matching in graph space without matrix computation and hallucinations and no GPUdiscuss.huggingface.coWhy We Built 5 Products on FastAPI + Next.js (and Would Do It Again)DEV CommunityHow We Run 5 Live SaaS Products on $35/Month in InfrastructureDEV CommunityOur Email Provider Banned Us Overnight -- Here's What We LearnedDEV CommunityThe AI Stack: A Practical Guide to Building Your Own Intelligent ApplicationsDEV Community🚀 Day 29 of My Automation Journey – Arrays (Full Guide + Tricky Questions)DEV CommunityThe Real Size of AI Frameworks: A Wake-Up CallDEV CommunityInside OmegaLessWrong AIGoogle quietly releases an offline-first AI dictation app on iOSTechCrunch AIBlack Hat USADark ReadingBlack Hat AsiaAI BusinessWhy TSMC grew four times faster than its foundry rivals in 2025 — price hikes, vertical integration, and commanding technology lead pay dividendstomshardware.comThe Complete DevSecOps Engineer Career Guide: From Pipeline Security to Platform Architect in 2026DEV CommunityOpenAI’s $1M API Credits, Holos’ Agentic Web, and Xpertbench’s Expert TasksDEV CommunitySemantic matching in graph space without matrix computation and hallucinations and no GPUdiscuss.huggingface.coWhy We Built 5 Products on FastAPI + Next.js (and Would Do It Again)DEV CommunityHow We Run 5 Live SaaS Products on $35/Month in InfrastructureDEV CommunityOur Email Provider Banned Us Overnight -- Here's What We LearnedDEV CommunityThe AI Stack: A Practical Guide to Building Your Own Intelligent ApplicationsDEV Community🚀 Day 29 of My Automation Journey – Arrays (Full Guide + Tricky Questions)DEV CommunityThe Real Size of AI Frameworks: A Wake-Up CallDEV CommunityInside OmegaLessWrong AIGoogle quietly releases an offline-first AI dictation app on iOSTechCrunch AI
AI NEWS HUBbyEIGENVECTOREigenvector

Lemonade by AMD: a fast and open source local LLM server using GPU and NPU

Hacker NewsApril 2, 20262 min read1 views
Source Quiz

Comments

Refreshingly fast

on GPUs and NPUs

Open source. Private. Ready in minutes on any PC.

Chat

What can I do with 128 GB of unified RAM?

Load up models like gpt-oss-120b or Qwen-Coder-Next for advanced tool use.

What should I tune first?

You can use --no-mmap to speed up load times and increase context size to 64 or more.

Image Generation

A pitcher of lemonade in the style of a renaissance painting

Speech

Hello, I am your AI assistant. What can I do for you today?

Open Source

Built by the local AI community for every PC.

Lemonade exists because local AI should be free, open, fast, and private.

Ecosystem

Works with great apps.

Lemonade is integrated in many apps and works out-of-box with hundreds more thanks to the OpenAI API standard.

Tech Specs

Built for practical local AI workflows.

Everything from install to runtime is optimized for fast setup, broad compatibility, and local-first execution.

Native C++ Backend

Lightweight service that is only 2MB.

One Minute Install

Simple installer that sets up the stack automatically.

OpenAI API Compatible

Works with hundreds of apps out-of-box and integrates in minutes.

Auto-configures for your hardware

Configures dependencies for your GPU and NPU.

Multi-engine compatibility

Works with llama.cpp, Ryzen AI SW, FastFlowLM, and more.

Multiple Models at Once

Run more than one model at the same time.

Cross-platform

A consistent experience across Windows, Linux, and macOS (beta).

Built-in app

A GUI that lets you download, try, and switch models quickly.

Unified API

One local service for every modality.

Point your app at Lemonade and get chat, vision, image gen, transcription, speech gen, and more with standard APIs.

POST /api/v1/chat/completions

``

Latest Release

Always improving.

Track the newest improvements and highlights from the Lemonade release stream.

Original source

Hacker News

https://lemonade-server.ai
Was this article helpful?

Sign in to highlight and annotate this article

AI
Ask AI about this article
Powered by Eigenvector · full article context loaded
Ready

Conversation starters

Ask anything about this article…

Daily AI Digest

Get the top 5 AI stories delivered to your inbox every morning.

More about

open source

Knowledge Map

Knowledge Map
TopicsEntitiesSource
Lemonade by…open sourceHacker News

Connected Articles — Knowledge Graph

This article is connected to other articles through shared AI topics and tags.

Knowledge Graph100 articles · 211 connections
Scroll to zoom · drag to pan · click to open

Discussion

Sign in to join the discussion

No comments yet — be the first to share your thoughts!

More in Models