Live
🔥 OpenBMB/ChatDevGitHub Trending🔥 microsoft/agent-lightningGitHub Trending🔥 apache/supersetGitHub Trending🔥 shanraisshan/claude-code-best-practiceGitHub TrendingA-SelecT: Automatic Timestep Selection for Diffusion Transformer Representation LearningarXivGUIDE: Resolving Domain Bias in GUI Agents through Real-Time Web Video Retrieval and Plug-and-Play AnnotationarXivSommelier: Scalable Open Multi-turn Audio Pre-processing for Full-duplex Speech Language ModelsarXivCANGuard: A Spatio-Temporal CNN-GRU-Attention Hybrid Architecture for Intrusion Detection in In-Vehicle CAN NetworksarXivDesignWeaver: Dimensional Scaffolding for Text-to-Image Product DesignarXivA Lightweight, Transferable, and Self-Adaptive Framework for Intelligent DC Arc-Fault Detection in Photovoltaic SystemsarXivConsistency Amplifies: How Behavioral Variance Shapes Agent AccuracyarXivStabilizing Rubric Integration Training via Decoupled Advantage NormalizationarXivSemi-Automated Knowledge Engineering and Process Mapping for Total Airport ManagementarXivAIRA_2: Overcoming Bottlenecks in AI Research AgentsarXivBeSafe-Bench: Unveiling Behavioral Safety Risks of Situated Agents in Functional EnvironmentsarXiv🔥 OpenBMB/ChatDevGitHub Trending🔥 microsoft/agent-lightningGitHub Trending🔥 apache/supersetGitHub Trending🔥 shanraisshan/claude-code-best-practiceGitHub TrendingA-SelecT: Automatic Timestep Selection for Diffusion Transformer Representation LearningarXivGUIDE: Resolving Domain Bias in GUI Agents through Real-Time Web Video Retrieval and Plug-and-Play AnnotationarXivSommelier: Scalable Open Multi-turn Audio Pre-processing for Full-duplex Speech Language ModelsarXivCANGuard: A Spatio-Temporal CNN-GRU-Attention Hybrid Architecture for Intrusion Detection in In-Vehicle CAN NetworksarXivDesignWeaver: Dimensional Scaffolding for Text-to-Image Product DesignarXivA Lightweight, Transferable, and Self-Adaptive Framework for Intelligent DC Arc-Fault Detection in Photovoltaic SystemsarXivConsistency Amplifies: How Behavioral Variance Shapes Agent AccuracyarXivStabilizing Rubric Integration Training via Decoupled Advantage NormalizationarXivSemi-Automated Knowledge Engineering and Process Mapping for Total Airport ManagementarXivAIRA_2: Overcoming Bottlenecks in AI Research AgentsarXivBeSafe-Bench: Unveiling Behavioral Safety Risks of Situated Agents in Functional EnvironmentsarXiv

Search AI News

Find articles across all categories and topics

Filter:
Filter by tag

935 results

Sort by:
🔥 microsoft/agent-lightning
Open Source AIFresh

🔥 microsoft/agent-lightning

The absolute trainer to light up AI agents. — Trending on GitHub today with 349 new stars.

GitHub Trending
5m5about 5 hours ago
🔥 OpenBMB/ChatDev
Open Source AIFresh

🔥 OpenBMB/ChatDev

ChatDev 2.0: Dev All through LLM-powered Multi-Agent Collaboration — Trending on GitHub today with 255 new stars.

GitHub Trending
11m5about 5 hours ago
🔥 apache/superset
Open Source AIFresh

🔥 apache/superset

Apache Superset is a Data Visualization and Data Exploration Platform — Trending on GitHub today with 430 new stars.

GitHub Trending
4m1about 6 hours ago
🔥 shanraisshan/claude-code-best-practice
Open Source AIFresh

🔥 shanraisshan/claude-code-best-practice

practice made claude perfect — Trending on GitHub today with 1050 new stars.

GitHub Trending
14m1about 6 hours ago
Consistency Amplifies: How Behavioral Variance Shapes Agent Accuracy
Research PapersRecent

Consistency Amplifies: How Behavioral Variance Shapes Agent Accuracy

arXiv:2603.25764v1 Announce Type: cross Abstract: As LLM-based agents are deployed in production systems, understanding their behavioral consistency (whether they produce similar action sequences when given identical tasks) becomes critical for reliability. We study consistency in the context of SWE-bench, a challenging software engineering benchmark requiring complex, multi-step reasoning. Comparing Claude~4.5~Sonnet, GPT-5, and Llama-3.1-70B across 50 runs each (10 tasks $\times$ 5 runs), we find that across models, higher consistency aligns with higher accuracy: Claude achieves the lowest v — Aman Mehta

arXiv
10mabout 13 hours ago
GUIDE: Resolving Domain Bias in GUI Agents through Real-Time Web Video Retrieval and Plug-and-Play Annotation
Research PapersRecent

GUIDE: Resolving Domain Bias in GUI Agents through Real-Time Web Video Retrieval and Plug-and-Play Annotation

arXiv:2603.26266v1 Announce Type: new Abstract: Large vision-language models have endowed GUI agents with strong general capabilities for interface understanding and interaction. However, due to insufficient exposure to domain-specific software operation data during training, these agents exhibit significant domain bias - they lack familiarity with the specific operation workflows (planning) and UI element layouts (grounding) of particular applications, limiting their real-world task performance. In this paper, we present GUIDE (GUI Unbiasing via Instructional-Video Driven Expertise), a traini — Rui Xie, Zhi Gao, Chenrui Shi, Zirui Shang, Lu Chen, Qing Li

arXiv
10mabout 13 hours ago
CANGuard: A Spatio-Temporal CNN-GRU-Attention Hybrid Architecture for Intrusion Detection in In-Vehicle CAN Networks
Research PapersRecent

CANGuard: A Spatio-Temporal CNN-GRU-Attention Hybrid Architecture for Intrusion Detection in In-Vehicle CAN Networks

arXiv:2603.25763v1 Announce Type: cross Abstract: The Internet of Vehicles (IoV) has become an essential component of smart transportation systems, enabling seamless interaction among vehicles and infrastructure. In recent years, it has played a progressively significant role in enhancing mobility, safety, and transportation efficiency. However, this connectivity introduces severe security vulnerabilities, particularly Denial-of-Service (DoS) and spoofing attacks targeting the Controller Area Network (CAN) bus, which could severely inhibit communication between the critical components of a veh — Rakib Hossain Sajib, Md. Rokon Mia, Prodip Kumar Sarker, Abdullah Al Noman, Md Arifur Rahman

arXiv
10mabout 13 hours ago
Sommelier: Scalable Open Multi-turn Audio Pre-processing for Full-duplex Speech Language Models
Research PapersRecent

Sommelier: Scalable Open Multi-turn Audio Pre-processing for Full-duplex Speech Language Models

arXiv:2603.25750v1 Announce Type: cross Abstract: As the paradigm of AI shifts from text-based LLMs to Speech Language Models (SLMs), there is a growing demand for full-duplex systems capable of real-time, natural human-computer interaction. However, the development of such models is constrained by the scarcity of high-quality, multi-speaker conversational data, as existing large-scale resources are predominantly single-speaker or limited in volume. Addressing the complex dynamics of natural dialogue, such as overlapping and back-channeling remains a challenge, with standard processing pipelin — Kyudan Jung, Jihwan Kim, Soyoon Kim, Jeongoon Kim, Jaegul Choo, Cheonbok Park

arXiv
10mabout 13 hours ago
Stabilizing Rubric Integration Training via Decoupled Advantage Normalization
Research PapersRecent

Stabilizing Rubric Integration Training via Decoupled Advantage Normalization

arXiv:2603.26535v1 Announce Type: new Abstract: We propose Process-Aware Policy Optimization (PAPO), a method that integrates process-level evaluation into Group Relative Policy Optimization (GRPO) through decoupled advantage normalization, to address two limitations of existing reward designs. Outcome reward models (ORM) evaluate only final-answer correctness, treating all correct responses identically regardless of reasoning quality, and gradually lose the advantage signal as groups become uniformly correct. Process reward models (PRM) offer richer supervision, but directly using PRM scores — Zelin Tan, Zhouliang Yu, Bohan Lin, Zijie Geng, Hejia Geng, Yudong Zhang, Mulei Zhang, Yang Chen, Shuyue Hu, Zhenfei Yin, Chen Zhang, Lei Bai

arXiv
10mabout 13 hours ago
A Lightweight, Transferable, and Self-Adaptive Framework for Intelligent DC Arc-Fault Detection in Photovoltaic Systems
Research PapersRecent

A Lightweight, Transferable, and Self-Adaptive Framework for Intelligent DC Arc-Fault Detection in Photovoltaic Systems

arXiv:2603.25749v1 Announce Type: cross Abstract: Arc-fault circuit interrupters (AFCIs) are essential for mitigating fire hazards in residential photovoltaic (PV) systems, yet achieving reliable DC arc-fault detection under real-world conditions remains challenging. Spectral interference from inverter switching, hardware heterogeneity, operating-condition drift, and environmental noise collectively compromise conventional AFCI solutions. This paper proposes a lightweight, transferable, and self-adaptive learning-driven framework (LD-framework) for intelligent DC arc-fault detection. At the de — Xiaoke Yang, Long Gao, Haoyu He, Hanyuan Hang, Qi Liu, Shuai Zhao, Qiantu Tuo, Rui Li

arXiv
10mabout 13 hours ago
ETA-VLA: Efficient Token Adaptation via Temporal Fusion and Intra-LLM Sparsification for Vision-Language-Action Models
Research PapersRecent

ETA-VLA: Efficient Token Adaptation via Temporal Fusion and Intra-LLM Sparsification for Vision-Language-Action Models

arXiv:2603.25766v1 Announce Type: cross Abstract: The integration of Vision-Language-Action (VLA) models into autonomous driving systems offers a unified framework for interpreting complex scenes and executing control commands. However, the necessity to incorporate historical multi-view frames for accurate temporal reasoning imposes a severe computational burden, primarily driven by the quadratic complexity of self-attention mechanisms in Large Language Models (LLMs). To alleviate this bottleneck, we propose ETA-VLA, an Efficient Token Adaptation framework for VLA models. ETA-VLA processes the — Yiru Wang, Anqing Jiang, Shuo Wang, Yuwen Heng, Zichong Gu, Hao Sun

arXiv
10mabout 13 hours ago
DesignWeaver: Dimensional Scaffolding for Text-to-Image Product Design
Research PapersRecent

DesignWeaver: Dimensional Scaffolding for Text-to-Image Product Design

arXiv:2502.09867v2 Announce Type: cross Abstract: Generative AI has enabled novice designers to quickly create professional-looking visual representations for product concepts. However, novices have limited domain knowledge that could constrain their ability to write prompts that effectively explore a product design space. To understand how experts explore and communicate about design spaces, we conducted a formative study with 12 experienced product designers and found that experts -- and their less-versed clients -- often use visual references to guide co-design discussions rather than writt — Sirui Tao, Ivan Liang, Cindy Peng, Zhiqing Wang, Srishti Palani, Steven P. Dow

arXiv
10mabout 13 hours ago
Semi-Automated Knowledge Engineering and Process Mapping for Total Airport Management
Research PapersRecent

Semi-Automated Knowledge Engineering and Process Mapping for Total Airport Management

arXiv:2603.26076v1 Announce Type: new Abstract: Documentation of airport operations is inherently complex due to extensive technical terminology, rigorous regulations, proprietary regional information, and fragmented communication across multiple stakeholders. The resulting data silos and semantic inconsistencies present a significant impediment to the Total Airport Management (TAM) initiative. This paper presents a methodological framework for constructing a domain-grounded, machine-readable Knowledge Graph (KG) through a dual-stage fusion of symbolic Knowledge Engineering (KE) and generative — Darryl Teo, Adharsha Sam, Chuan Shen Marcus Koh, Rakesh Nagi, Nuno Antunes Ribeiro

arXiv
10mabout 13 hours ago
AIRA_2: Overcoming Bottlenecks in AI Research Agents
Research PapersRecent

AIRA_2: Overcoming Bottlenecks in AI Research Agents

arXiv:2603.26499v1 Announce Type: new Abstract: Existing research has identified three structural performance bottlenecks in AI research agents: (1) synchronous single-GPU execution constrains sample throughput, limiting the benefit of search; (2) a generalization gap where validation-based selection causes performance to degrade over extended search horizons; and (3) the limited capability of fixed, single-turn LLM operators imposes a ceiling on search performance. We introduce AIRA$_2$, which addresses these bottlenecks through three architectural choices: an asynchronous multi-GPU worker po — Karen Hambardzumyan, Nicolas Baldwin, Edan Toledo, Rishi Hazra, Michael Kuchnik, Bassel Al Omari, Thomas Simon Foster, Anton Protopopov, Jean-Christophe Gagnon-Audet, Ishita Mediratta, Kelvin Niu, Michael Shvartsman, Alisia Lupidi, Alexis Audran-Reiss, Parth Pathak, Tatiana Shavrina, Despoina Magka, Hela Momand, Derek Dunfield, Nicola Cancedda, Pontus Stenetorp, Carole-Jean Wu, Jakob Nicolaus Foerster, Yoram Bachrach, Martin Josifoski

arXiv
10mabout 13 hours ago
A-SelecT: Automatic Timestep Selection for Diffusion Transformer Representation Learning
Research PapersRecent

A-SelecT: Automatic Timestep Selection for Diffusion Transformer Representation Learning

arXiv:2603.25758v1 Announce Type: cross Abstract: Diffusion models have significantly reshaped the field of generative artificial intelligence and are now increasingly explored for their capacity in discriminative representation learning. Diffusion Transformer (DiT) has recently gained attention as a promising alternative to conventional U-Net-based diffusion models, demonstrating a promising avenue for downstream discriminative tasks via generative pre-training. However, its current training efficiency and representational capacity remain largely constrained due to the inadequate timestep sea — Changyu Liu, James Chenhao Liang, Wenhao Yang, Yiming Cui, Jinghao Yang, Tianyang Wang, Qifan Wang, Dongfang Liu, Cheng Han

arXiv
10mabout 13 hours ago
BeSafe-Bench: Unveiling Behavioral Safety Risks of Situated Agents in Functional Environments
Research PapersRecent

BeSafe-Bench: Unveiling Behavioral Safety Risks of Situated Agents in Functional Environments

arXiv:2603.25747v1 Announce Type: new Abstract: The rapid evolution of Large Multimodal Models (LMMs) has enabled agents to perform complex digital and physical tasks, yet their deployment as autonomous decision-makers introduces substantial unintentional behavioral safety risks. However, the absence of a comprehensive safety benchmark remains a major bottleneck, as existing evaluations rely on low-fidelity environments, simulated APIs, or narrowly scoped tasks. To address this gap, we present BeSafe-Bench (BSB), a benchmark for exposing behavioral safety risks of situated agents in functional — Yuxuan Li, Yi Lin, Peng Wang, Shiming Liu, Xuetao Wei

arXiv
10mabout 13 hours ago
Page 1 of 59