DeepEye: A Steerable Self-driving Data Agent System
arXiv:2603.28889v1 Announce Type: new
Abstract: Large Language Models (LLMs) have revolutionized natural language interaction with data. The "holy grail" of data analytics is to build autonomous Data Agents that can self-drive complex data analysis workflows. However, current implementations are still limited to linear "ChatBI" systems. These systems struggle with joint analysis across heterogeneous data sources (e.g., databases, documents, and data files) and often encounter "context explosion" in complex and iterative data analysis workflows. To address these challenges, we present DeepEye, a production-ready data agent system that adopts a workflow-centric architecture to ensure scalability and trustworthiness. DeepEye introduces a Unified Multimodal Orchestration protocol, enabling seamless integration of structured and unstructured data sources. To mitigate hallucinations, it employs Hierarchical Reasoning with context isolation, decomposing complex intents into autonomous AgentNodes and deterministic ToolNodes. Furthermore, DeepEye incorporates a database-inspired Workflow Engine (comprising a Compiler, Validator, Optimizer, and Executor) that guarantees structural correctness and accelerates execution via runtime topological optimization. In this demonstration, we showcase DeepEye's ability to orchestrate complex workflows to generate diverse multimodal outputs -- including Data Videos, Dashboards, and Analytical Reports -- highlighting its advantages in transparent execution, automated optimization, and human-in-the-loop reliability.
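The workflow-centric design described above -- deterministic ToolNodes and autonomous AgentNodes assembled into a graph, executed in topological order with each node's context isolated to its declared inputs -- can be sketched minimally as follows. Note that this is an illustrative assumption, not DeepEye's actual API: the abstract does not expose the system's interfaces, so all names here (`Node`, `topo_execute`, the example workflow) are hypothetical.

```python
from graphlib import TopologicalSorter
from dataclasses import dataclass, field
from typing import Callable, Dict, List

# Hypothetical sketch of a workflow DAG; names and interfaces are
# illustrative, not DeepEye's actual API.

@dataclass
class Node:
    name: str
    run: Callable[[Dict[str, object]], object]  # receives only upstream outputs
    deps: List[str] = field(default_factory=list)

def topo_execute(nodes: Dict[str, Node]) -> Dict[str, object]:
    """Execute a workflow DAG in topological order.

    Each node sees only the outputs of its declared dependencies
    (a simple form of context isolation), rather than a single
    ever-growing shared conversation context.
    """
    graph = {name: set(node.deps) for name, node in nodes.items()}
    results: Dict[str, object] = {}
    for name in TopologicalSorter(graph).static_order():
        ctx = {d: results[d] for d in nodes[name].deps}  # isolated context
        results[name] = nodes[name].run(ctx)
    return results

# Example: a deterministic ToolNode (data fetch) feeding an AgentNode
# (here a stand-in summarizer, where an LLM call would go).
workflow = {
    "fetch_sales": Node("fetch_sales", lambda ctx: [120, 95, 143]),
    "summarize": Node(
        "summarize",
        lambda ctx: f"total={sum(ctx['fetch_sales'])}",
        deps=["fetch_sales"],
    ),
}
print(topo_execute(workflow)["summarize"])  # -> total=358
```

In a fuller engine, the compile/validate/optimize stages the abstract mentions would operate on this graph before execution (e.g., rejecting cycles, reordering or parallelizing independent branches); the topological pass shown here corresponds only to the Executor step.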
Comments: SIGMOD Demo (2026)
Subjects: Databases (cs.DB)
Cite as: arXiv:2603.28889 [cs.DB]
(or arXiv:2603.28889v1 [cs.DB] for this version)
https://doi.org/10.48550/arXiv.2603.28889
arXiv-issued DOI via DataCite
Related DOI: https://doi.org/10.1145/3788853.3801612
Submission history
From: Boyan Li [v1] Mon, 30 Mar 2026 18:14:28 UTC (4,751 KB)