Mark Zuckerberg, ‘Frustrated’ by Llama 4, Assembles Meta ‘Superintelligence’ Team - retailwire.com

A Quick Note on Gemma 4 Image Settings in Llama.cpp
In my last post, I mentioned using --image-min-tokens to increase the quality of image responses from Qwen3.5. I went to load Gemma 4 the same way and hit an error:

[58175] srv process_chun: processing image...
[58175] encoding image slice...
[58175] image slice encoded in 7490 ms
[58175] decoding image batch 1/2, n_tokens_batch = 2048
[58175] /Users/socg/llama.cpp-b8639/src/llama-context.cpp:1597: GGML_ASSERT((cparams.causal_attn || cparams.n_ubatch >= n_tokens_all) && "non-causal attention requires n_ubatch >= n_tokens") failed
[58175] WARNING: Using native backtrace. Set GGML_BACKTRACE_LLDB for more info.
[58175] WARNING: GGML_BACKTRACE_LLDB may cause native MacOS Terminal.app to crash.
[58175] See: https://github.com/ggml-org/llama.cpp/pull/17869
[58175] 0 libggml-base.0.9.11.dylib 0
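The assertion says the image slice is decoded with non-causal attention, which requires the micro-batch size (n_ubatch) to be at least as large as the whole image token batch. One possible workaround (my assumption, not confirmed by the post) is to raise llama.cpp's batch and micro-batch sizes so they cover the image slice reported in the log:

```shell
# Sketch only: the model filename and the 4096 value are placeholders.
# Set -ub (--ubatch-size) to at least the n_tokens_batch shown in the log,
# and keep -b (--batch-size) >= -ub.
llama-server -m gemma-4.gguf --image-min-tokens 1024 -b 4096 -ub 4096
```

Whether this resolves the assert for Gemma 4 specifically may depend on the build; the linked PR is the authoritative discussion.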
v4.3
Changes
- ik_llama.cpp support: Add ik_llama.cpp as a new backend: new textgen-portable-ik portable builds, new --ik flag for full installs. ik_llama.cpp is a fork by the author of the imatrix quants, including support for new quant types, significantly more accurate KV cache quantization (via Hadamard KV cache rotation, enabled by default), and optimizations for MoE models and CPU inference.
- API: Add echo + logprobs for /v1/completions. The completions endpoint now supports the echo and logprobs parameters, returning token-level log probabilities for both prompt and generated tokens. Token IDs are also included in the output via a new top_logprobs_ids field.
- Further optimize my custom gradio fork, saving up to 50 ms per UI event (button click, etc).
- Transformers: Autodetect torch_dtype fr
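A minimal sketch of what a request using the new echo/logprobs parameters might look like. The prompt, token limit, and logprobs count here are illustrative; only the endpoint path and the echo, logprobs, and top_logprobs_ids names come from the changelog.

```python
import json

# Build a /v1/completions request body exercising the new parameters.
payload = {
    "prompt": "The capital of France is",
    "max_tokens": 4,
    "echo": True,     # include prompt tokens (and their logprobs) in the response
    "logprobs": 3,    # return top-3 log probabilities per token
}
body = json.dumps(payload)

# POST this body to your local server's /v1/completions endpoint (URL and
# port depend on your install). The response's logprobs object should now
# cover both prompt and generated tokens, with matching token IDs in the
# new top_logprobs_ids field.
print(body)
```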

Google Gemma 4: Everything Developers Need to Know
Google dropped Gemma 4 on April 2, 2026: a full generational jump in what open models can do at their parameter range, and the first time in the Gemma family's history that one ships under Apache 2.0, meaning commercial use without permission-seeking. Some context: since Gemma's first generation, developers have downloaded the models over 400 million times and built more than 100,000 variants.

Four Models, One Family
Gemma 4 is a family of four, each aimed at a different point in the hardware spectrum.
- E2B: Effective 2 billion active parameters. Runs on smartphones, Raspberry Pi, Jetson Orin Nano. 128K context window. Handles images, video, and audio. Built for battery and memory efficiency.
- E4B: Effective 4 billion active parameters. Same hardware targets, higher reasoning quality. About
More in Models

Tracking the emergence of linguistic structure in self-supervised models learning from speech
arXiv:2604.02043v1 Announce Type: cross Abstract: Self-supervised speech models learn effective representations of spoken language, which have been shown to reflect various aspects of linguistic structure. But when does such structure emerge in model training? We study the encoding of a wide range of linguistic structures, across layers and intermediate checkpoints of six Wav2Vec2 and HuBERT models trained on spoken Dutch. We find that different levels of linguistic structure show notably distinct layerwise patterns as well as learning trajectories, which can partially be explained by differences in their degree of abstraction from the acoustic signal and the timescale at which information from the input is integrated. Moreover, we find that the level at which pre-training objectives are d

My most common research advice: do quick sanity checks
Written quickly as part of the Inkhaven Residency. At a high level, the research feedback I give to more junior research collaborators can often fall into one of three categories:
- Doing quick sanity checks
- Saying precisely what you want to say
- Asking why one more time
In each case, I think the advice can be taken to an extreme I no longer endorse. Accordingly, I've tried to spell out the degree to which you should implement the advice, as well as what "taking it too far" might look like. This piece covers doing quick sanity checks, which is the most common advice I give to junior researchers. I'll cover the other two pieces of advice in a subsequent piece. Research is hard (almost by definition) and people are often wrong. Every researcher has wasted countless hours

Fast dynamical similarity analysis
arXiv:2511.22828v2 Announce Type: replace-cross Abstract: Understanding how nonlinear dynamical systems (e.g., artificial neural networks and neural circuits) process information requires comparing their underlying dynamics at scale, across diverse architectures and large neural recordings. While many similarity metrics exist, current approaches fall short for large-scale comparisons. Geometric methods are computationally efficient but fail to capture governing dynamics, limiting their accuracy. In contrast, traditional dynamical similarity methods are faithful to system dynamics but are often computationally prohibitive. We bridge this gap by combining the efficiency of geometric approaches with the fidelity of dynamical methods. We introduce fast dynamical similarity analysis (fastDSA),
