Scale AI cuts more contractors as it shifts toward more specialized AI training - Business Insider

Fine-tuned Gemma 4 E4B for structured JSON extraction from regulatory docs - 75% to 94% accuracy, notebook + 432 examples included
Gemma 4 dropped this week so I fine-tuned E4B for a specific task: extracting structured JSON (doc type, obligations, key fields) from technical and regulatory documents.

Results on held-out test set:
- doc_type accuracy: 75% base → 94% fine-tuned
- Hallucinated obligations: 1.25/doc → 0.59/doc
- JSON validity: 100%
- Field coverage: 100%

Setup:
- QLoRA 4-bit, LoRA r=16, alpha=16, Unsloth + TRL
- 432 training examples across 8 doc types
- 5 epochs on a single L4, ~10 min training time
- Final train loss 1.04, eval loss 1.12

The whole thing is open: notebook, dataset, serve.py for FastAPI inference. https://github.com/spriyads-vault/gemma4-docparse

Some things I learned the ha…
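The post reports three metrics: doc_type accuracy, hallucinated obligations per doc, and JSON validity. A minimal sketch of how such a scorer could work, assuming each reference is a dict with "doc_type" and "obligations" keys (field names are an assumption, not the repo's actual schema):

```python
import json

def score_extractions(outputs, references):
    """Score raw model outputs (JSON strings) against reference dicts.

    Mirrors the post's metrics (doc_type accuracy, JSON validity,
    hallucinated obligations per doc); the "doc_type"/"obligations"
    field names are assumed for illustration.
    """
    valid = 0
    type_hits = 0
    hallucinated = 0
    for raw, ref in zip(outputs, references):
        try:
            pred = json.loads(raw)
        except json.JSONDecodeError:
            continue  # invalid JSON counts against validity only
        valid += 1
        if pred.get("doc_type") == ref["doc_type"]:
            type_hits += 1
        # any predicted obligation absent from the reference is a hallucination
        hallucinated += sum(1 for ob in pred.get("obligations", [])
                            if ob not in ref["obligations"])
    n = len(references)
    return {
        "json_validity": valid / n,
        "doc_type_accuracy": type_hits / n,
        "hallucinated_per_doc": hallucinated / n,
    }
```

Keeping the scorer separate from the training loop makes it easy to run on both the base and fine-tuned checkpoints for the before/after comparison.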

Building a Decentralized Mesh Network in Rust — Lessons from the Global South
The Problem

2.6 billion people lack reliable internet access. When disasters strike, infrastructure fails, or communities are remote, traditional communication breaks down precisely when coordination is most critical.

I'm a cybersecurity student in Nairobi, Kenya. I've seen what happens when communities lose connectivity: families can't check on each other after floods, rescue teams can't coordinate, and activists can't organize safely. So I built GhostWire: a decentralized, censorship-resistant mesh communication platform that works without any central servers.

What Is GhostWire?

GhostWire is a peer-to-peer encrypted communication platform written in Rust. Instead of connecting to a server, devices connect directly to each other. Messages hop from node to node through whatever path is a…
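The core idea, messages hopping node to node until they reach everyone in range, can be sketched as a flooding simulation. This is a toy model of store-and-forward mesh delivery, not GhostWire's actual routing protocol:

```python
from collections import deque

def flood(topology, source, ttl=8):
    """Simulate flooding a message through a mesh.

    `topology` maps each node id to the set of neighbours it can reach
    directly. Every node relays a message it has not seen before, until
    the hop budget (ttl) runs out. Returns the set of nodes reached.
    """
    seen = {source}
    frontier = deque([(source, ttl)])
    while frontier:
        node, hops = frontier.popleft()
        if hops == 0:
            continue  # hop budget exhausted on this path
        for peer in topology[node]:
            if peer not in seen:
                seen.add(peer)  # peer receives the message once
                frontier.append((peer, hops - 1))
    return seen
```

Real mesh stacks add deduplication by message id, encryption, and smarter routing than blind flooding, but the reachability behaviour is the same: delivery works over whatever path currently exists.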

Caveman Claude: The Token-Cutting Skill That's Changing AI Workflows
TL;DR: A creative Claude Code custom skill forces Claude to respond in ultra-compressed "caveman speak", stripping out filler words, pleasantries, and verbose explanations. The result? Responses that use significantly fewer tokens while still conveying the essential information. It's quirky, it's effective, and developers are using it to slash API costs and speed up their AI pipelines.

The Problem With AI That Talks Too Much

If you've spent any real time working with Claude through the API or Claude Code, you've noticed something: the mode…
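The article doesn't reproduce the skill file itself. As a rough illustration only: Claude Code custom skills live in a SKILL.md with YAML frontmatter, and a compression skill in this spirit might look like the following (the name and wording below are invented, not the actual skill):

```markdown
---
name: caveman-speak
description: Answer in ultra-compressed caveman speak to cut token use
---

When this skill is active, strip all filler:

- No greetings, apologies, or hedging phrases.
- Short declarative fragments: "Bug in loop. Index off by one. Fix: start at 0."
- Do not restate the question; answer only.
- Keep code blocks intact; compress prose only.
```

Since instructions like these are counted as input tokens once but shape every response, the savings compound across a long session.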
More in Models

I wrote a fused MoE dispatch kernel in pure Triton that beats Megablocks on Mixtral and DeepSeek at inference batch sizes
Been working on custom Triton kernels for LLM inference for a while. My latest project: a fused MoE dispatch pipeline that handles the full forward pass in 5 kernel launches instead of 24+ in the naive approach.

Results on Mixtral-8x7B (A100):

Tokens   vs PyTorch   vs Megablocks
32       4.9x         131%
128      5.8x         124%
512      6.5x         89%

At 32 and 128 tokens (where most inference serving actually happens), it's faster than Stanford's CUDA-optimized Megablocks. At 512+, Megablocks pulls ahead with its hand-tuned block-sparse matmul.

The key trick is fusing the gate+up projection so both GEMMs share the same input tile from L2 cache, and the SiLU activation happens in registers without ever hitting global memory. Saves ~470MB of memory traffic per forward pass on Mixtral.

Also tested on DeepSeek-V3 (256 experts) and…
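The gate+up projection being fused is the SwiGLU feed-forward used by Mixtral's experts: numerically, silu(x @ W_gate) * (x @ W_up), then a down projection. A plain NumPy sketch of the math the kernel fuses (shapes illustrative, no Triton here):

```python
import numpy as np

def silu(x):
    # SiLU / swish activation: x * sigmoid(x)
    return x / (1.0 + np.exp(-x))

def swiglu_expert(x, w_gate, w_up, w_down):
    """One MoE expert's feed-forward, spelled out step by step.

    A fused kernel would compute x @ w_gate and x @ w_up from the same
    input tile and apply SiLU in registers; NumPy materialises each
    intermediate in memory, which is exactly the traffic fusion avoids.
    """
    gate = silu(x @ w_gate)  # gating branch, activated
    up = x @ w_up            # linear branch, shares the input x
    return (gate * up) @ w_down
```

Both GEMMs consume the same `x`, which is why keeping that tile resident in L2 and applying SiLU before anything is written back saves a full activation-sized round trip to global memory per expert.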


