The Real Reason Gemini 3.1 Could Eventually Replace Your Keyboard - Geeky Gadgets
Could not retrieve the full article text.

More about
gemini
Google Gemma 4: Everything Developers Need to Know
Google dropped Gemma 4 on April 2, 2026: a full generational jump in what open models can do at their parameter range, and the first time in the Gemma family's history that one ships under Apache 2.0, meaning commercial use without permission-seeking. Some context: since Gemma's first generation, developers have downloaded the models over 400 million times and built more than 100,000 variants.
Four Models, One Family
Gemma 4 is a family of four, each aimed at a different point in the hardware spectrum.
E2B: Effective 2 billion active parameters. Runs on smartphones, Raspberry Pi, Jetson Orin Nano. 128K context window. Handles images, video, and audio. Built for battery and memory efficiency.
E4B: Effective 4 billion active parameters. Same hardware targets, higher reasoning quality. About

Gemma 4 is efficient with thinking tokens, but it will also happily reason for 10+ minutes if you prompt it to do so.
Tested both 26b and 31b in AI Studio. The task I gave them was to crack a cypher. The top closed-source models can crack this cypher at max thinking parameters, and Kimi 2.5 Thinking and Deepseek 3.2 are the only open-source models to crack it without tool use. (Of course, with the closed models you can't rule out 'secret' tool use on the backend.) When I first asked these models to crack the cypher, they thought for a short amount of time and then both hallucinated false 'translations' of it. I added this to my prompt: "Spare no effort to solve this, the stakes are high. Increase your thinking length to maximum in order to solve it. Double check and verify your results to rule out hallucination of an incorrect response." I did not expect dramatic results (we all laugh at p
More in Models

Tracking the emergence of linguistic structure in self-supervised models learning from speech
arXiv:2604.02043v1 Announce Type: cross Abstract: Self-supervised speech models learn effective representations of spoken language, which have been shown to reflect various aspects of linguistic structure. But when does such structure emerge in model training? We study the encoding of a wide range of linguistic structures, across layers and intermediate checkpoints of six Wav2Vec2 and HuBERT models trained on spoken Dutch. We find that different levels of linguistic structure show notably distinct layerwise patterns as well as learning trajectories, which can partially be explained by differences in their degree of abstraction from the acoustic signal and the timescale at which information from the input is integrated. Moreover, we find that the level at which pre-training objectives are d

My most common research advice: do quick sanity checks
Written quickly as part of the Inkhaven Residency. At a high level, research feedback I give to more junior research collaborators often can fall into one of three categories: doing quick sanity checks, saying precisely what you want to say, and asking why one more time. In each case, I think the advice can be taken to an extreme I no longer endorse. Accordingly, I’ve tried to spell out the degree to which you should implement the advice, as well as what “taking it too far” might look like. This piece covers doing quick sanity checks, which is the most common advice I give to junior researchers. I’ll cover the other two pieces of advice in a subsequent piece.
Doing quick sanity checks
Research is hard (almost by definition) and people are often wrong. Every researcher has wasted countless hours

Fast dynamical similarity analysis
arXiv:2511.22828v2 Announce Type: replace-cross Abstract: Understanding how nonlinear dynamical systems (e.g., artificial neural networks and neural circuits) process information requires comparing their underlying dynamics at scale, across diverse architectures and large neural recordings. While many similarity metrics exist, current approaches fall short for large-scale comparisons. Geometric methods are computationally efficient but fail to capture governing dynamics, limiting their accuracy. In contrast, traditional dynamical similarity methods are faithful to system dynamics but are often computationally prohibitive. We bridge this gap by combining the efficiency of geometric approaches with the fidelity of dynamical methods. We introduce fast dynamical similarity analysis (fastDSA),

Combining Masked Language Modeling and Cross-Modal Contrastive Learning for Prosody-Aware TTS
arXiv:2604.01247v1 Announce Type: cross Abstract: We investigate multi-stage pretraining for prosody modeling in diffusion-based TTS. A speaker-conditioned dual-stream encoder is trained with masked language modeling followed by SigLIP-style cross-modal contrastive learning using mixed-phoneme batches, with an additional same-phoneme refinement stage studied separately. We evaluate intrinsic text-audio retrieval and downstream synthesis in Grad-TTS and a latent diffusion TTS system. The two-stage curriculum (MLM + mixed-phoneme contrastive learning) achieves the best overall synthesis quality in terms of intelligibility, speaker similarity, and perceptual measures. Although same-phoneme refinement improves prosodic retrieval, it reduces phoneme discrimination and degrades synthesis. These

