Artificial Intelligence engines are sucking up global water supplies - The Tanzania Times
I wrote a fused MoE dispatch kernel in pure Triton that beats Megablocks on Mixtral and DeepSeek at inference batch sizes

Been working on custom Triton kernels for LLM inference for a while. My latest project: a fused MoE dispatch pipeline that handles the full forward pass in 5 kernel launches instead of 24+ in the naive approach.

Results on Mixtral-8x7B (A100):

Tokens    vs PyTorch    vs Megablocks
32        4.9x          131%
128       5.8x          124%
512       6.5x          89%

At 32 and 128 tokens (where most inference serving actually happens), it's faster than Stanford's CUDA-optimized Megablocks. At 512+ Megablocks pulls ahead with its hand-tuned block-sparse matmul.

The key trick is fusing the gate+up projection so both GEMMs share the same input tile from L2 cache, and the SiLU activation happens in registers without ever hitting global memory. Saves ~470MB of memory traffic per forward pass on Mixtral.

Also tested on DeepSeek-V3 (256 experts) and
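The gate+up fusion described above can be illustrated on the CPU. The following is a minimal pure-Python sketch, not the author's Triton kernel: the function names, the `tile` parameter, and the toy weights are all hypothetical, and it assumes the usual gated MLP form `silu(x @ w_gate) * (x @ w_up)`. It only shows the access pattern, where each input tile is read once and feeds both accumulators, and the activation is applied to the accumulators before anything is written out.

```python
import math

def silu(v):
    # SiLU (swish) activation: v * sigmoid(v)
    return v / (1.0 + math.exp(-v))

def gated_mlp_unfused(x, w_gate, w_up):
    # Reference: two separate matrix-vector products, each re-reading x,
    # with both intermediate vectors materialized before the activation.
    n_out = len(w_gate[0])
    gate = [sum(x[i] * w_gate[i][j] for i in range(len(x))) for j in range(n_out)]
    up = [sum(x[i] * w_up[i][j] for i in range(len(x))) for j in range(n_out)]
    return [silu(g) * u for g, u in zip(gate, up)]

def gated_mlp_fused(x, w_gate, w_up, tile=2):
    # Fused sketch: walk x tile by tile; each tile is loaded once and
    # accumulated into BOTH the gate and the up result.
    n_out = len(w_gate[0])
    gate_acc = [0.0] * n_out
    up_acc = [0.0] * n_out
    for start in range(0, len(x), tile):
        x_tile = x[start:start + tile]  # single load shared by both products
        for k, xv in enumerate(x_tile):
            row = start + k
            for j in range(n_out):
                gate_acc[j] += xv * w_gate[row][j]
                up_acc[j] += xv * w_up[row][j]
    # Activation and gating applied directly to the accumulators
    # (the stand-in here for "in registers").
    return [silu(g) * u for g, u in zip(gate_acc, up_acc)]

# Toy data (hypothetical): 3 input features, 2 output features.
x = [1.0, -2.0, 0.5]
w_gate = [[0.1, 0.2], [0.3, -0.4], [0.5, 0.6]]
w_up = [[-0.7, 0.8], [0.9, 1.0], [1.1, -1.2]]

fused = gated_mlp_fused(x, w_gate, w_up)
reference = gated_mlp_unfused(x, w_gate, w_up)
```

Both functions compute the same values; the difference is only how often the input is read and whether the gate and up intermediates exist as separate buffers. On a GPU those are the reads and writes that fusion is meant to avoid.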