v0.20.1: Revert "enable flash attention for gemma4 (#15296)" (#15311)
This reverts commit c8e0878.
What's Changed
- bench: add prompt calibration, context size flag, and NumCtx reporting by @dhiltgen in #15158
- model/parsers: fix gemma4 arg parsing when quoted strings contain `"` by @drifkin in #15254
- ggml: skip cublasGemmBatchedEx during graph reservation by @jessegross in #15301
- gemma4: enable flash attention by @dhiltgen in #15296
- ggml: fix ROCm build for cublasGemmBatchedEx reserve wrapper by @jessegross in #15305
- model/parsers: rework gemma4 tool call handling by @drifkin in #15306
Full Changelog: v0.20.0...v0.20.1-rc2
