I Built a Tiny Computer Inside a Transformer
By compiling a simple program directly into transformer weights.

More about: transformer
Your AI Is Not Thinking. It's Multiplying Numbers. Let Me Show You Exactly How.
Everyone's talking about AI like it's magic. I work with it daily. It's not. Here's what's actually happening inside.

I've fine-tuned LLMs. I've published research on them. I've built systems around them. And the single most honest thing I can tell you about large language models is this: at the bottom, it's matrix multiplication. That's it. Not intelligence. Not reasoning. Not understanding. Matrices of floating point numbers being multiplied together, billions of times per second.

But here's the uncomfortable part. That doesn't mean nothing interesting is happening. Let me break this down without the hype, without the doomsaying, and without the marketing.

What a "Model" Actually Is

Forget the word "model." It carries too much baggage. What you're actually dealing with is a file. A very
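The teaser's core claim — that a transformer forward pass reduces to matrix multiplications plus a few cheap elementwise functions — can be made concrete in a few lines. This is a minimal illustrative sketch, not any real model: the shapes, random weights, and single attention head are assumptions chosen for brevity.

```python
import numpy as np

# One "transformer block" stripped to its essence. Every heavy step
# below is a matrix multiplication; the rest is elementwise math.
rng = np.random.default_rng(0)
d, seq = 16, 4                       # embedding size, tokens in context
x = rng.standard_normal((seq, d))    # token embeddings (illustrative)
Wq, Wk, Wv = (rng.standard_normal((d, d)) for _ in range(3))
W1 = rng.standard_normal((d, 4 * d))
W2 = rng.standard_normal((4 * d, d))

def softmax(z):
    z = z - z.max(axis=-1, keepdims=True)  # numerical stability
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

# Self-attention: three matmuls produce Q, K, V; two more mix tokens.
Q, K, V = x @ Wq, x @ Wk, x @ Wv
attn = softmax(Q @ K.T / np.sqrt(d)) @ V

# Feed-forward: two matmuls around a nonlinearity (ReLU here).
out = np.maximum(attn @ W1, 0) @ W2
```

Scale the matrices up by several orders of magnitude, stack a few dozen such blocks, and you have the "billions of multiplications per second" the author describes.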

Low-Rank Compression of Pretrained Models via Randomized Subspace Iteration
arXiv:2604.02659v1 Announce Type: cross Abstract: The massive scale of pretrained models has made efficient compression essential for practical deployment. Low-rank decomposition based on the singular value decomposition (SVD) provides a principled approach for model reduction, but its exact computation is expensive for large weight matrices. Randomized alternatives such as randomized SVD (RSVD) improve efficiency, yet they can suffer from poor approximation quality when the singular value spectrum decays slowly, a regime commonly observed in modern pretrained models. In this work, we address this limitation from both theoretical and empirical perspectives. First, we establish a connection between low-rank approximation error and predictive performance by analyzing softmax perturbations, s
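The randomized subspace iteration the abstract builds on can be sketched compactly: project the weight matrix onto a random subspace, refine that subspace with a few power iterations (this is what helps when the singular spectrum decays slowly), then take an exact SVD of the small projected matrix. A minimal NumPy sketch, with illustrative shapes and parameter choices that are assumptions, not the paper's settings:

```python
import numpy as np

def rsvd_subspace_iteration(W, rank, n_iter=4, oversample=10, seed=0):
    """Randomized SVD refined by subspace (power) iteration.

    Extra iterations sharpen the captured subspace when singular
    values decay slowly, the regime the abstract highlights.
    """
    rng = np.random.default_rng(seed)
    m, n = W.shape
    k = min(rank + oversample, min(m, n))
    Q = rng.standard_normal((n, k))          # random test matrix
    Q, _ = np.linalg.qr(W @ Q)               # initial range estimate
    for _ in range(n_iter):                  # subspace iteration
        Q, _ = np.linalg.qr(W.T @ Q)
        Q, _ = np.linalg.qr(W @ Q)
    B = Q.T @ W                              # small k-by-n matrix
    Ub, s, Vt = np.linalg.svd(B, full_matrices=False)
    return (Q @ Ub)[:, :rank], s[:rank], Vt[:rank]

# Compress an (illustrative) weight matrix into two thin factors.
W = np.random.default_rng(1).standard_normal((512, 256))
U, s, Vt = rsvd_subspace_iteration(W, rank=32)
A, B = U * s, Vt                             # W is approximated by A @ B
err = np.linalg.norm(W - A @ B) / np.linalg.norm(W)
```

The payoff for compression is in the factor shapes: storing `A` and `B` costs `(512 + 256) * 32` numbers instead of `512 * 256`, and the SVD is only ever computed on the small `k`-by-`n` matrix `B`.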

Linguistic Frameworks Go Toe-to-Toe at Neuro-Symbolic Language Modeling
arXiv:2112.07874v2 Announce Type: cross Abstract: We examine the extent to which, in principle, linguistic graph representations can complement and improve neural language modeling. With an ensemble setup consisting of a pretrained Transformer and ground-truth graphs from one of 7 different formalisms, we find that, overall, semantic constituency structures are most useful to language modeling performance -- outpacing syntactic constituency structures as well as syntactic and semantic dependency structures. Further, effects vary greatly depending on part-of-speech class. In sum, our findings point to promising tendencies in neuro-symbolic language modeling and invite future research quantifying the design choices made by different formalisms.
More in Models

Goal-Conditioned Neural ODEs with Guaranteed Safety and Stability for Learning-Based All-Pairs Motion Planning
arXiv:2604.02821v1 Announce Type: new Abstract: This paper presents a learning-based approach for all-pairs motion planning, where the initial and goal states are allowed to be arbitrary points in a safe set. We construct smooth goal-conditioned neural ordinary differential equations (neural ODEs) via bi-Lipschitz diffeomorphisms. Theoretical results show that the proposed model can provide guarantees of global exponential stability and safety (safe set forward invariance) regardless of goal location. Moreover, explicit bounds on convergence rate, tracking error, and vector field magnitude are established. Our approach admits a tractable learning implementation using bi-Lipschitz neural networks and can incorporate demonstration data. We illustrate the effectiveness of the proposed method

Learning Structured Robot Policies from Vision-Language Models via Synthetic Neuro-Symbolic Supervision
arXiv:2604.02812v1 Announce Type: new Abstract: Vision-language models (VLMs) have recently demonstrated strong capabilities in mapping multimodal observations to robot behaviors. However, most current approaches rely on end-to-end visuomotor policies that remain opaque and difficult to analyze, limiting their use in safety-critical robotic applications. In contrast, classical robotic systems often rely on structured policy representations that provide interpretability, modularity, and reactive execution. This work investigates how foundation models can be specialized to generate structured robot policies grounded in multimodal perception, bridging high-dimensional learning and symbolic control. We propose a neuro-symbolic approach in which a VLM synthesizes executable Behavior Tree polici


