Starlette 1.0 skill
<p><strong>Research:</strong> <a href="https://github.com/simonw/research/tree/main/starlette-1-skill#readme">Starlette 1.0 skill</a></p> <p>See <a href="https://simonwillison.net/2026/Mar/22/starlette/">Experimenting with Starlette 1.0 with Claude skills</a>.</p> <p>Tags: <a href="https://simonwillison.net/tags/starlette">starlette</a></p>
Research
Starlette 1.0 skill — Starlette 1.0 Skill offers a concise guide for building robust web applications with Starlette, a lightweight ASGI framework. The accompanying demo showcases a task management app featuring projects, tasks, comments, and labels, illustrating Starlette's flexibility in handling routing, templating (Jinja2), async database operations (aiosqlite), and real-time updates.
Simon Willison Blog
https://simonwillison.net/2026/Mar/23/starlette-1-skill/#atom-everythingSign in to highlight and annotate this article

Conversation starters
Daily AI Digest
Get the top 5 AI stories delivered to your inbox every morning.
More about
clauderesearchgithub
Help running Qwen3-Coder-Next TurboQuant (TQ3) model
I found a TQ3-quantized version of Qwen3-Coder-Next here: https://huggingface.co/edwardyoon79/Qwen3-Coder-Next-TQ3_0 According to the page, this model requires a compatible inference engine that supports TurboQuant. It also provides a command, but it doesn’t clearly specify which version or fork of llama.cpp should be used (or maybe I missed it). llama-server I’ve tried the following llama.cpp forks that claim to support TQ3, but none of them worked for me: https://github.com/TheTom/llama-cpp-turboquant https://github.com/turbo-tan/llama.cpp-tq3 https://github.com/drdotdot/llama.cpp-turbo3-tq3 If anyone has successfully run this model, I’d really appreciate it if you could share how you did it. submitted by /u/UnluckyTeam3478 [link] [comments]
Knowledge Map
Connected Articles — Knowledge Graph
This article is connected to other articles through shared AI topics and tags.
More in Models

Is Turboquant really a game changer?
I am currently utilizing qwen3.5 and Gemma 4 model. Realized Gemma 4 requires 2x ram for same context length. As far as I understand, what turbo quant gives is quantizing kv cache into about 4 bit and minimize the loses But Q8 still not lose the context that much so isn't kv cache ram for qwen 3.5 q8 and Gemma 4 truboquant is the same? Is turboquant also applicable in qwen's cache architecture? because as far as I know they didn't tested it in qwen3.5 style kv cache in their paper. Just curious, I started to learn local LLM recently submitted by /u/Interesting-Print366 [link] [comments]



Discussion
Sign in to join the discussion
No comments yet — be the first to share your thoughts!