b8629
sycl : fix llama_kv_cache hang when kv_cache is huge: 5GB (#21283)

macOS/iOS:
- macOS Apple Silicon (arm64)
- macOS Intel (x64)
- iOS XCFramework

Linux:
- Ubuntu x64 (CPU)
- Ubuntu arm64 (CPU)
- Ubuntu s390x (CPU)
- Ubuntu x64 (Vulkan)
- Ubuntu arm64 (Vulkan)
- Ubuntu x64 (ROCm 7.2)
- Ubuntu x64 (OpenVINO)

Windows:
- Windows x64 (CPU)
- Windows arm64 (CPU)
- Windows x64 (CUDA 12) - CUDA 12.4 DLLs
- Windows x64 (CUDA 13) - CUDA 13.1 DLLs
- Windows x64 (Vulkan)
- Windows x64 (SYCL)
- Windows x64 (HIP)

openEuler:
- openEuler x86 (310p)
- openEuler x86 (910b, ACL Graph)
- openEuler aarch64 (310p)
- openEuler aarch64 (910b, ACL Graph)
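For scale, a KV cache in the gigabytes is easy to reach at long contexts, since it grows linearly with both layer count and context length. A minimal back-of-the-envelope sketch in C++; the formula is the standard per-layer K/V estimate, and the model numbers are illustrative (chosen to land near the 5 GB mentioned above), not taken from the fix itself:

```cpp
#include <cstdint>
#include <cstdio>

int main() {
    // One K and one V tensor per layer, each n_ctx x n_embd_kv elements.
    // Real allocations also depend on type_k/type_v quantization and padding.
    const uint64_t n_layer   = 40;     // illustrative layer count
    const uint64_t n_ctx     = 65536;  // 64k context
    const uint64_t n_embd_kv = 512;    // per-layer KV width (n_head_kv * head_dim)
    const uint64_t bytes_el  = 2;      // f16 cache elements

    const uint64_t kv_bytes = 2 /* K and V */ * n_layer * n_ctx * n_embd_kv * bytes_el;
    printf("estimated KV cache: %.2f GiB\n", kv_bytes / (1024.0 * 1024.0 * 1024.0));
    return 0;
}
```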

More about llama
Gemma 4 E4B + E2B Uncensored (Aggressive) — GGUF + K_P Quants (Multimodal: Vision, Video, Audio)
My first Gemma 4 uncensors are out. Two models are dropping today: the E4B (4B) and the E2B (2B). Both are Aggressive variants, and both are fully multimodal. Aggressive means no refusals; I don't do any personality changes or alterations. It's the ORIGINAL Google release, just uncensored.

Gemma 4 E4B (4B): https://huggingface.co/HauhauCS/Gemma-4-E4B-Uncensored-HauhauCS-Aggressive
Gemma 4 E2B (2B): https://huggingface.co/HauhauCS/Gemma-4-E2B-Uncensored-HauhauCS-Aggressive

0/465 refusals* on both. Fully unlocked with zero capability loss. These are natively multimodal, so text, image, video, and audio are all in one model. The mmproj file is included for vision/audio support.

What's included:
- E4B: Q8_K_P, Q6_K_P, Q5_K_P, Q5_K_M, Q4_K_P, Q4_K_M, IQ4_XS, Q3_K_P, Q3_K_M, IQ3_M, Q2_K_P + mmproj
- E2B: Q8_K_P, Q6_K_P, Q5_K_P,
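For anyone wiring these quants up programmatically rather than through a frontend, here is a minimal sketch of loading a GGUF with the llama.cpp C API. It assumes current function names (llama_model_load_from_file, llama_init_from_model); the filename is hypothetical, and multimodal input goes through the separate mmproj file via llama.cpp's mtmd tooling, which is not shown:

```cpp
#include "llama.h"
#include <cstdio>

int main() {
    llama_backend_init();

    llama_model_params mparams = llama_model_default_params();
    mparams.n_gpu_layers = 99; // assumption: offload as many layers as fit

    // hypothetical local filename for one of the quants listed above
    llama_model * model = llama_model_load_from_file(
        "Gemma-4-E2B-Uncensored-Q4_K_M.gguf", mparams);
    if (!model) {
        fprintf(stderr, "failed to load model\n");
        return 1;
    }

    llama_context_params cparams = llama_context_default_params();
    cparams.n_ctx = 8192; // assumption: context budget

    llama_context * ctx = llama_init_from_model(model, cparams);
    // ... tokenize, llama_decode, sample ...

    llama_free(ctx);
    llama_model_free(model);
    llama_backend_free();
    return 0;
}
```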

I Built a Vision-Based Desktop Agent That Navigates by Screenshot. Here's What Actually Works.
DOM-based automation requires you to reverse-engineer someone else's frontend and pray they don't change it. They always change it.

Last month, I spent a couple of weeks attempting to build a testing framework for an app that includes a web app, a Slack app, and connections to multiple external sources, which required testing interface elements on external web interfaces. I managed to vibe-engineer a Playwright-based test suite that "sort of™" worked. Until it didn't. One of the external sites had updated its dashboard. Not a redesign, just a CSS class rename on a table component. Three automations targeting that table stopped working simultaneously. A
b8640
tests : add unit test coverage for llama_tensor_get_type (#20112)

- Add unit test coverage for llama_tensor_get_type
- Fix merge conflicts, add more schemas
- clang formatter changes
- Trailing whitespace
- Update name
- Start rebase
- Updating files with upstream changes prior to rebase
- Changes needed from rebase
- Update attn_qkv schema, change throw behaviour
- Fix merge conflicts
- White space
- Update with latest changes to state counters
- Revert accidental personal CLAUDE.md changes
- Change quotation mark
- Reuse metadata.name since we have it
- Move test-only stuff out of llama-quant.cpp
- Hide the regex functionality back in llama-quant.cpp, use a unique pointer to a new struct 'compiled_tensor_type_patterns' which contains the patterns
- cont : inital deslop guidelines
- Cleanup based on review comments
- Continue
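The commit log above sketches the final shape: the regexes stay hidden in llama-quant.cpp behind a unique_ptr to a 'compiled_tensor_type_patterns' struct that holds the compiled patterns. A hedged reconstruction of what such a table might look like; the names, enum, and layout here are guesses based on the commit messages, not the actual llama.cpp internals:

```cpp
#include <cstdio>
#include <memory>
#include <regex>
#include <string>
#include <utility>
#include <vector>

// Stand-in for ggml's real tensor-type enum.
enum mock_ggml_type { TYPE_Q4_K, TYPE_Q6_K, TYPE_Q8_0 };

// Pre-compiled name patterns, each mapped to an override type.
struct compiled_tensor_type_patterns {
    std::vector<std::pair<std::regex, mock_ggml_type>> patterns;
};

// First matching pattern wins; otherwise fall back to the default type.
static mock_ggml_type tensor_get_type(
        const compiled_tensor_type_patterns & ctp,
        const std::string & name,
        mock_ggml_type fallback) {
    for (const auto & [re, type] : ctp.patterns) {
        if (std::regex_search(name, re)) {
            return type;
        }
    }
    return fallback;
}

int main() {
    auto ctp = std::make_unique<compiled_tensor_type_patterns>();
    ctp->patterns.emplace_back(std::regex("attn_qkv"),          TYPE_Q6_K);
    ctp->patterns.emplace_back(std::regex("ffn_(up|down|gate)"), TYPE_Q4_K);

    // Matches the attn_qkv pattern, so the override (TYPE_Q6_K) applies.
    printf("%d\n", tensor_get_type(*ctp, "blk.0.attn_qkv.weight", TYPE_Q8_0));
    return 0;
}
```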
More in Models

Maybe a party-pooper but: A dozen 120B models later, and GPTOSS-120B is still king
Never consumes the entire context walking in place. Never fails at tool calling. Never runs slow regardless of the back-end. Never misses a piece of context in its entire window. Never slows down no matter how long the prompt is. As much as I despise OpenAI, I believe they've done something exceptional with that model. This is the Toyota Tacoma of open models, and I see myself using it for 500K more miles. submitted by /u/ParaboloidalCrest

One of the most sensible reasons I can think of to have an LLM downloaded on my cell phone would be emergency advice.
It seems like in every conversation about derestricted models, everyone treats you like a pervert. The fact is, you can be sensible and a pervert 😂. submitted by /u/RedParaglider

