Google Introduces Gemma 4: Lightweight AI Model Brings Powerful Developer Tools to Mobile and Cloud - Techgenyz

More about: model
daVinci-LLM-3B
https://huggingface.co/SII-GAIR-NLP/davinci-llm-model

Overview: daVinci-LLM-3B is a 3B-parameter base language model presented in "daVinci-LLM: Towards the Science of Pretraining". The project aims to make the pretraining process a transparent and reproducible scientific endeavor: it releases not only the final weights but also training trajectories, intermediate checkpoints, data-processing decisions, and 200+ ablation studies covering data quality, mixture design, training dynamics, and evaluation validity.

GitHub: GAIR-NLP/daVinci-LLM · Paper: arXiv:2603.27164 · Dataset: davinci-llm-data

The model follows a two-stage curriculum over ~8T tokens:
- Stage 1 (6T tokens): broad pretraining over diverse web-scale corpora.
- Stage 2 (2T tokens): structured QA and reasoning-heavy data to amplify math
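The two-stage curriculum can be pictured as a simple token-budget schedule. This is a hypothetical sketch of that idea only; the function name and the notion of a per-token stage lookup are illustrative, not part of the daVinci-LLM release:

```python
def stage_for_token(t, stage1_tokens=6e12, total_tokens=8e12):
    """Illustrative sketch: map a global token index to its curriculum
    stage, mirroring the described 6T broad-web + 2T QA/reasoning split.
    Not the project's actual data loader."""
    if t < stage1_tokens:
        return "stage1_web"          # diverse web-scale corpora
    elif t < total_tokens:
        return "stage2_reasoning"    # structured QA / reasoning-heavy data
    raise ValueError("token index beyond the ~8T training budget")
```

The point of the split is that the reasoning-heavy mixture is concentrated late in training, rather than diluted uniformly across all 8T tokens.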

Attention Is All You Need, But All You Can't Afford | Hybrid Attention
Repo: https://codeberg.org/JohannaJuntos/Sisyphus

I've been building a small Rust-focused language model from scratch in PyTorch. Not a finetune: byte-level, trained from random init on a Rust-heavy corpus assembled in this repo.

The run:
- 25.6M parameters
- 512 context length
- 173.5M-byte corpus
- 30k training steps
- Single RTX 4060 Ti 8GB
- Final train loss 0.5834 / val loss 0.8217 / perplexity 2.15
- Inference: 286.6 tok/s with HybridAttention + KV cache (51.47x vs full attention)

Architecture: a byte-level GPT-style decoder:
- Vocab size 256 (bytes)
- 8 layers, 8 heads, 512 embedding dim
- Learned positional embeddings
- Tied embedding / LM head weights

The attention block is not standard full attention. Each layer uses HybridAttention, combining:
- Local windowed causal attention
- A GRU-like recurrent state
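The local-window-plus-recurrent-state combination can be sketched in a few lines. This is a toy illustration of the general technique, assuming a simple dot-product window and a scalar update gate; it is not the Sisyphus repo's implementation, and the gating choice here is my own simplification:

```python
import numpy as np

def hybrid_attention(x, w=4):
    """Toy sketch of hybrid attention: each position attends over a
    local causal window of width w, while tokens that fall out of the
    window are folded into a GRU-like running state. x: (T, d) array."""
    T, d = x.shape
    out = np.zeros_like(x)
    state = np.zeros(d)                  # recurrent summary of the distant past
    for t in range(T):
        lo = max(0, t - w + 1)
        window = x[lo:t + 1]                     # local causal window, shape (k, d)
        scores = window @ x[t] / np.sqrt(d)      # dot-product attention scores
        probs = np.exp(scores - scores.max())
        probs /= probs.sum()
        local = probs @ window                   # windowed attention output
        if t - w >= 0:
            # GRU-like gated update: as x[t-w] leaves the window,
            # blend it into the long-range state (illustrative gate)
            z = 1.0 / (1.0 + np.exp(-(x[t - w] @ x[t]) / d))
            state = (1 - z) * state + z * x[t - w]
        out[t] = local + state
    return out
```

Because each step touches only w window tokens plus a fixed-size state, cost per token is O(w·d) instead of O(T·d), which is where a large speedup over full attention at long contexts would come from.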

d318 is almost always suppressive in Qwen-2.5-3B emotion vectors: built an emotion-vector steering pipeline; positive steering collapses to a single "preschool teacher" register regardless of emotion
On lower-weight models, behavior appears to converge to either highly sycophantic or neutral, with no real in-between, though existentialism did seem to be somewhat present. In heatmaps and visualizations, the cosine similarities between emotions look coherent with what you'd expect, and there are really interesting dimensional dominances: in Qwen-2.5-3B, d318 is almost always the greatest in magnitude and almost always suppressive. This could be interesting for interpretability research. Vector merging also appears to lead to model incoherence if you merge many vectors without normalizing their combined influence to some maximum. Built an automated emotion-vector pipeline on top of Anthropic's emotional vector research. It makes the detection and correction of unwanted behavior
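The normalization point (merging many steering vectors without capping their combined magnitude produces incoherence) can be sketched directly. This is a hypothetical illustration of the capping idea only; the function name and the choice of an L2 cap are mine, not the pipeline's actual API:

```python
import math

def merge_steering_vectors(vectors, weights, max_norm=1.0):
    """Illustrative sketch: combine several emotion steering vectors as a
    weighted sum, then rescale so the merged vector's L2 norm never
    exceeds max_norm. Without this cap, stacking many vectors can push
    activations off-distribution and the model's output turns incoherent."""
    dim = len(vectors[0])
    merged = [0.0] * dim
    for vec, w in zip(vectors, weights):
        for i in range(dim):
            merged[i] += w * vec[i]
    norm = math.sqrt(sum(v * v for v in merged))
    if norm > max_norm:
        merged = [v * max_norm / norm for v in merged]
    return merged
```

The cap acts like a budget on total steering influence: individual emotion directions still mix, but their summed push on the residual stream stays bounded.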
More in Models


I built an open-source LLM security scanner that runs in <5ms with zero dependencies
I've been building AI features for a while and kept running into the same problem: prompt-injection attacks are getting more sophisticated, but most solutions either require an external API call (adding latency) or are too heavyweight to drop into an existing project. So I built @ny-squared/guard, a zero-dependency, fully offline LLM security SDK.

What it does: scans user inputs before they hit your LLM and blocks:
- 🛡️ Prompt injection: "Ignore all previous instructions and..."
- 🔒 Jailbreak attempts: DAN, roleplay bypasses, override patterns
- 🙈 PII leakage: emails, phone numbers, SSNs, credit cards
- ☣️ Toxic content: harmful inputs flagged before reaching your model

Works with any LLM provider (OpenAI, Anthropic, Google, etc.).

The problem with existing solutions: most LLM security tools
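The pre-flight scanning approach described above can be sketched as a small pattern-based gate. This is a minimal illustration of the general technique, not the @ny-squared/guard library or its API; the pattern lists and function names here are hypothetical:

```python
import re

# Hypothetical pattern lists for illustration; a real scanner would be
# far more extensive and harder to bypass with paraphrasing.
INJECTION_PATTERNS = [
    re.compile(r"ignore\s+(all\s+)?previous\s+instructions", re.IGNORECASE),
    re.compile(r"\byou\s+are\s+now\s+DAN\b", re.IGNORECASE),
]
PII_PATTERNS = [
    re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),       # SSN-shaped number
    re.compile(r"[\w.+-]+@[\w-]+\.[\w.]+"),     # email-shaped string
]

def scan(text):
    """Check user input before it reaches the LLM.
    Returns (allowed, reasons): allowed is False if any category fired."""
    reasons = []
    if any(p.search(text) for p in INJECTION_PATTERNS):
        reasons.append("prompt_injection")
    if any(p.search(text) for p in PII_PATTERNS):
        reasons.append("pii")
    return (not reasons, reasons)
```

Since it is pure regex over the input string, a gate like this runs in microseconds with no network call, which is the latency property the post is after; the trade-off is that pattern matching alone catches only known phrasings.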

