
APEX MoE quantized models boosted: 33% faster inference, plus TurboQuant (14% speedup in prompt processing)

Reddit r/LocalLLaMA, by /u/mudler_it (https://www.reddit.com/user/mudler_it), April 1, 2026

I've just released APEX (Adaptive Precision for EXpert Models): a novel MoE quantization technique that outperforms Unsloth Dynamic 2.0 on accuracy while being 2x smaller for MoE architectures. Benchmarked on Qwen3.5-35B-A3B, but the method applies to any MoE model.

Half the size of Q8. Perplexity comparable to F16. Works with stock llama.cpp, no patches needed. Open source (of course!), from the github.com/mudler/LocalAI team!

https://preview.redd.it/uv2bnfheymsg1.jpg?width=1632&format=pjpg&auto=webp&s=3eca979e8f9ca6b75d206eecdf29308b74aed530

Perplexity by itself doesn't tell the full story; KL divergence shows what perplexity misses:

https://preview.redd.it/jn9ua2ksymsg1.jpg?width=1617&format=pjpg&auto=webp&s=7df969308e10aa6b6d31098c92fca1c14bb42a40

Tiers for every GPU:

- I-Quality: 21.3 GB
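
To illustrate the point above about perplexity versus KL divergence, here is a minimal sketch. The distributions are made-up toy numbers, not APEX benchmark data, and the function names are hypothetical. It shows that a quantized model can keep exactly the same probability on the correct token (so perplexity does not move) while still reshuffling the rest of the distribution, which only KL divergence against the reference model picks up.

```python
import math

def perplexity(dists, targets):
    """exp(mean negative log-likelihood) of the target token at each position."""
    nll = -sum(math.log(p[t]) for p, t in zip(dists, targets)) / len(targets)
    return math.exp(nll)

def mean_kl(ref_dists, test_dists):
    """Mean KL(ref || test) over positions, in nats."""
    total = 0.0
    for p, q in zip(ref_dists, test_dists):
        total += sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return total / len(ref_dists)

# Toy next-token distributions over a 3-token vocabulary (assumed values).
targets = [0, 0]                                  # correct token at each position
ref   = [[0.7, 0.20, 0.10], [0.6, 0.30, 0.10]]    # stand-in for the F16 reference
quant = [[0.7, 0.05, 0.25], [0.6, 0.05, 0.35]]    # stand-in for a quantized model

ppl_ref = perplexity(ref, targets)
ppl_quant = perplexity(quant, targets)
kl_quant = mean_kl(ref, quant)

print(f"perplexity ref={ppl_ref:.4f} quant={ppl_quant:.4f}")
print(f"mean KL(ref || quant) = {kl_quant:.4f} nats")
```

Both perplexities come out identical because perplexity only looks at the probability of the target token, while the KL value is clearly above zero because the mass on the other tokens has shifted.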

Could not retrieve the full article text.
