A “scientific sandbox” lets researchers explore the evolution of vision systems
The AI-powered tool could inform the design of better sensors and cameras for robots or autonomous vehicles.
Why did humans evolve the eyes we have today?
While scientists can’t go back in time to study the environmental pressures that shaped the evolution of the diverse vision systems that exist in nature, a new computational framework developed by MIT researchers allows them to explore this evolution in artificial intelligence agents.
The framework they developed, in which embodied AI agents evolve eyes and learn to see over many generations, is like a “scientific sandbox” that allows researchers to recreate different evolutionary trees. The user does this by changing the structure of the world and the tasks AI agents complete, such as finding food or telling objects apart.
This allows them to study why one animal may have evolved simple, light-sensitive patches as eyes, while another has complex, camera-type eyes.
The researchers’ experiments with this framework showcase how tasks drove eye evolution in the agents. For instance, they found that navigation tasks often led to the evolution of compound eyes with many individual units, like the eyes of insects and crustaceans.
On the other hand, if agents focused on object discrimination, they were more likely to evolve camera-type eyes with irises and retinas.
This framework could enable scientists to probe “what-if” questions about vision systems that are difficult to study experimentally. It could also guide the design of novel sensors and cameras for robots, drones, and wearable devices that balance performance with real-world constraints like energy efficiency and manufacturability.
“While we can never go back and figure out every detail of how evolution took place, in this work we’ve created an environment where we can, in a sense, recreate evolution and probe the environment in all these different ways. This method of doing science opens the door to a lot of possibilities,” says Kushagra Tiwary, a graduate student at the MIT Media Lab and co-lead author of a paper on this research.
He is joined on the paper by co-lead author and fellow graduate student Aaron Young; graduate student Tzofi Klinghoffer; former postdoc Akshat Dave, who is now an assistant professor at Stony Brook University; Tomaso Poggio, the Eugene McDermott Professor in the Department of Brain and Cognitive Sciences, an investigator in the McGovern Institute, and co-director of the Center for Brains, Minds, and Machines; co-senior authors Brian Cheung, a postdoc in the Center for Brains, Minds, and Machines and an incoming assistant professor at the University of California San Francisco; and Ramesh Raskar, associate professor of media arts and sciences and leader of the Camera Culture Group at MIT; as well as others at Rice University and Lund University. The research appears today in Science Advances.
Building a scientific sandbox
The paper began as a conversation among the researchers about discovering new vision systems that could be useful in different fields, like robotics. To test their “what-if” questions, the researchers decided to use AI to explore the many evolutionary possibilities.
“What-if questions inspired me to study science when I was growing up. With AI, we have a unique opportunity to create these embodied agents that allow us to ask the kinds of questions that would usually be impossible to answer,” Tiwary says.
To build this evolutionary sandbox, the researchers took all the elements of a camera, like the sensors, lenses, apertures, and processors, and converted them into parameters that an embodied AI agent could learn.
They used those building blocks as the starting point for an algorithmic learning mechanism an agent would use as it evolved eyes over time.
“We couldn’t simulate the entire universe atom-by-atom. It was challenging to determine which ingredients we needed, which ingredients we didn’t need, and how to allocate resources over those different elements,” Cheung says.
In their framework, this evolutionary algorithm can choose which elements to evolve based on the constraints of the environment and the task of the agent.
Each environment has a single task, such as navigation, food identification, or prey tracking, designed to mimic real visual tasks animals must overcome to survive. The agents start with a single photoreceptor that looks out at the world and an associated neural network model that processes visual information.
Then, over each agent’s lifetime, it is trained using reinforcement learning, a trial-and-error technique where the agent is rewarded for accomplishing the goal of its task. The environment also incorporates constraints, like a certain number of pixels for an agent’s visual sensors.
“These constraints drive the design process, the same way we have physical constraints in our world, like the physics of light, that have driven the design of our own eyes,” Tiwary says.
Over many generations, agents evolve different elements of vision systems that maximize rewards.
Their framework uses a genetic encoding mechanism to computationally mimic evolution, where individual genes mutate to control an agent’s development.
For instance, morphological genes capture how the agent views the environment and control eye placement; optical genes determine how the eye interacts with light and dictate the number of photoreceptors; and neural genes control the learning capacity of the agents.
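A minimal sketch of that three-part genetic encoding might look like the following, where each gene group mutates independently between generations. The gene names, mutation rates, and step sizes here are assumptions for illustration, not the paper's actual encoding.

```python
import random

random.seed(42)

# Hypothetical genome with the three gene groups described above.
genome = {
    "morphological": {"eye_placement_deg": 0.0},   # how/where the eye views the world
    "optical":       {"num_photoreceptors": 1},    # how the eye interacts with light
    "neural":        {"hidden_units": 16},         # the agent's learning capacity
}

def mutate(genome, rate=0.5):
    """Return a child genome with each gene group independently perturbed."""
    child = {group: dict(params) for group, params in genome.items()}
    if random.random() < rate:
        child["morphological"]["eye_placement_deg"] += random.gauss(0, 10)
    if random.random() < rate:
        # Photoreceptor count changes in whole units and never drops below one.
        child["optical"]["num_photoreceptors"] = max(
            1, child["optical"]["num_photoreceptors"] + random.choice((-1, 1)))
    if random.random() < rate:
        child["neural"]["hidden_units"] = max(
            1, child["neural"]["hidden_units"] + random.choice((-4, 4)))
    return child

child = mutate(genome)
print(sorted(child) == sorted(genome))  # same gene groups, possibly new values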
Testing hypotheses
When the researchers set up experiments in this framework, they found that tasks had a major influence on the vision systems the agents evolved.
For instance, agents that were focused on navigation tasks developed eyes designed to maximize spatial awareness through low-resolution sensing, while agents tasked with detecting objects developed eyes focused more on frontal acuity, rather than peripheral vision.
Another experiment indicated that a bigger brain isn’t always better when it comes to processing visual information. Only so much visual information can go into the system at a time, based on physical constraints like the number of photoreceptors in the eyes.
“At some point a bigger brain doesn’t help the agents at all, and in nature that would be a waste of resources,” Cheung says.
In the future, the researchers want to use this simulator to explore the best vision systems for specific applications, which could help scientists develop task-specific sensors and cameras. They also want to integrate large language models (LLMs) into their framework to make it easier for users to ask “what-if” questions and study additional possibilities.
“There’s a real benefit that comes from asking questions in a more imaginative way. I hope this inspires others to create larger frameworks, where instead of focusing on narrow questions that cover a specific area, they are looking to answer questions with a much wider scope,” Cheung says.
This work was supported, in part, by the Center for Brains, Minds, and Machines and the Defense Advanced Research Projects Agency (DARPA) Mathematics for the Discovery of Algorithms and Architectures (DIAL) program.
MIT ML News
https://news.mit.edu/2025/scientific-sandbox-lets-researchers-explore-evolution-vision-systems-1217