Berkeley Lab: DL4SCI 2026 to Spotlight Discovery Through Agentic AI, Foundation Models - HPCwire
Source: HPCwire, via Google News: https://news.google.com/rss/articles/CBMixgFBVV95cUxPUm1ydjBxdEZVS2JqbGhrdndEZzNaNlo1R2k3RzRnMWtxN1V3UXdvUDkwNmRiaGZSOXV4M3RiZGFVdS01ZU9FWnFEZGN2VThUTUxXTldsRGx0MXdqZGV2MnFtTGVnakVzWHdPbWNKMHNhOFp5UjNQUDlJNjJGa09ZN2U5Yy01TUU4T3N6blp1WmlmUVhhczZZN05UY2JFTzkwWXhkT1hlbEhjUWJQakRZSFRjekpfN0JLMjlobnNIYzIxS1ZuTWc?oc=5
StoryScope: Investigating idiosyncrasies in AI fiction
arXiv:2604.03136v1 Announce Type: new Abstract: As AI-generated fiction becomes increasingly prevalent, questions of authorship and originality are becoming central to how written work is evaluated. While most existing work in this space focuses on identifying surface-level signatures of AI writing, we ask instead whether AI-generated stories can be distinguished from human ones without relying on stylistic signals, focusing on discourse-level narrative choices such as character agency and chronological discontinuity. We propose StoryScope, a pipeline that automatically induces a fine-grained, interpretable feature space of discourse-level narrative features across 10 dimensions. We apply StoryScope to a parallel corpus of 10,272 writing prompts, each written by a human author and five LLM
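StoryScope's core idea is an induced, interpretable feature space of discourse-level narrative features. As a purely illustrative sketch of that kind of pipeline (the feature names below echo the abstract, but the proxies and all code are invented, not the authors' method), one can score each story on discourse features and compare the mean profiles of a human corpus and an LLM corpus:

```python
# Hypothetical sketch, not StoryScope itself: crude per-story proxies for
# two of the discourse-level features the abstract names, plus a corpus-level
# comparison of mean feature profiles.

def narrative_features(story: str) -> dict:
    """Toy stand-ins for discourse features such as character agency and
    chronological discontinuity (the proxies are illustrative only)."""
    sentences = [s for s in story.split(".") if s.strip()]
    return {
        # crude proxy: fraction of sentences with a first-person agency marker
        "character_agency": sum("I " in s for s in sentences) / len(sentences),
        # crude proxy: fraction of sentences with an explicit time jump
        "chronological_discontinuity": sum(
            any(m in s.lower() for m in ("later", "earlier", "years ago"))
            for s in sentences
        ) / len(sentences),
    }

def feature_gap(human_stories, llm_stories):
    """Mean per-feature difference between two corpora (human minus LLM)."""
    def mean_profile(stories):
        profiles = [narrative_features(s) for s in stories]
        return {k: sum(p[k] for p in profiles) / len(profiles)
                for k in profiles[0]}
    h, m = mean_profile(human_stories), mean_profile(llm_stories)
    return {k: h[k] - m[k] for k in h}
```

In the paper's setting such features would be induced automatically across 10 dimensions rather than hand-written; the sketch only shows the shape of the comparison.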

Domain-Adapted Retrieval for In-Context Annotation of Pedagogical Dialogue Acts
arXiv:2604.03127v1 Announce Type: new Abstract: Automated annotation of pedagogical dialogue is a high-stakes task where LLMs often fail without sufficient domain grounding. We present a domain-adapted RAG pipeline for tutoring move annotation. Rather than fine-tuning the generative model, we adapt retrieval by fine-tuning a lightweight embedding model on tutoring corpora and indexing dialogues at the utterance level to retrieve labeled few-shot demonstrations. Evaluated across two real tutoring dialogue datasets (TalkMoves and Eedi) and three LLM backbones (GPT-5.2, Claude Sonnet 4.6, Qwen3-32b), our best configuration achieves Cohen's $\kappa$ of 0.526-0.580 on TalkMoves and 0.659-0.743 on Eedi, substantially outperforming no-retrieval baselines ($\kappa = 0.275$-$0.413$ and $0.160$-$0.4
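The headline numbers here are Cohen's kappa scores. As a quick refresher (standard definition, not anything specific to this paper), kappa measures agreement between two label sequences corrected for the agreement expected by chance:

```python
def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa: observed agreement minus chance agreement,
    normalized by the maximum possible improvement over chance."""
    assert len(labels_a) == len(labels_b)
    n = len(labels_a)
    categories = set(labels_a) | set(labels_b)
    # fraction of items on which the two annotators agree
    observed = sum(a == b for a, b in zip(labels_a, labels_b)) / n
    # chance agreement from each annotator's marginal label frequencies
    expected = sum(
        (labels_a.count(c) / n) * (labels_b.count(c) / n) for c in categories
    )
    return (observed - expected) / (1 - expected)
```

A kappa of 0.526-0.743, as reported above, is thus moderate-to-substantial agreement with the gold labels, well beyond chance.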

Multi-Aspect Knowledge Distillation for Language Model with Low-rank Factorization
arXiv:2604.03110v1 Announce Type: new Abstract: Knowledge distillation is an effective technique for pre-trained language model compression. However, existing methods only focus on the knowledge distribution among layers, which may cause the loss of fine-grained information in the alignment process. To address this issue, we introduce the Multi-aspect Knowledge Distillation (MaKD) method, which mimics the self-attention and feed-forward modules in greater depth to capture rich language knowledge information at different aspects. Experimental results demonstrate that MaKD can achieve competitive performance compared with various strong baselines with the same storage parameter budget. In addition, our method also performs well in distilling auto-regressive architecture models.
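MaKD's code is not reproduced here, but the two ingredients the abstract names — standard logit distillation plus finer-grained alignment of intermediate self-attention and feed-forward outputs — can be sketched generically (toy list-based stand-in, assuming the usual Hinton-style temperature-scaled KL term; not the authors' implementation):

```python
import math

def softmax(logits, temperature=1.0):
    exps = [math.exp(l / temperature) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def kd_loss(student_logits, teacher_logits, temperature=2.0):
    """Standard distillation term: KL(teacher || student) on
    temperature-softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q))
    return kl * temperature ** 2

def module_alignment_loss(student_out, teacher_out):
    """MSE between intermediate module outputs (e.g. self-attention or
    feed-forward activations) -- the kind of fine-grained alignment the
    abstract argues pure layer-level matching misses (illustrative)."""
    return sum((s - t) ** 2
               for s, t in zip(student_out, teacher_out)) / len(student_out)
```

A full training objective would combine both terms (plus the task loss) with weighting coefficients, which the abstract does not specify.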
More in Models

Verbalizing LLMs' assumptions to explain and control sycophancy
arXiv:2604.03058v1 Announce Type: new Abstract: LLMs can be socially sycophantic, affirming users when they ask questions like "am I in the wrong?" rather than providing genuine assessment. We hypothesize that this behavior arises from incorrect assumptions about the user, like underestimating how often users are seeking information over reassurance. We present Verbalized Assumptions, a framework for eliciting these assumptions from LLMs. Verbalized Assumptions provide insight into LLM sycophancy, delusion, and other safety issues, e.g., the top bigram in LLMs' assumptions on social sycophancy datasets is ``seeking validation.'' We provide evidence for a causal link between Verbalized Assumptions and sycophantic model behavior: our assumption probes (linear probes trained on internal repre
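The "assumption probes" mentioned at the end are linear probes trained on internal representations. A generic sketch of that technique (plain logistic regression on fixed feature vectors; the probe targets, data, and hyperparameters are illustrative, not the authors'):

```python
import math
import random

def train_linear_probe(reps, labels, lr=0.1, epochs=200, seed=0):
    """Logistic-regression probe on frozen hidden representations:
    learn a weight vector predicting a binary property (e.g. whether
    the model assumes the user is 'seeking validation')."""
    rng = random.Random(seed)
    dim = len(reps[0])
    w = [rng.gauss(0, 0.01) for _ in range(dim)]
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(reps, labels):
            z = sum(wi * xi for wi, xi in zip(w, x)) + b
            p = 1 / (1 + math.exp(-z))
            g = p - y  # gradient of log loss w.r.t. the logit
            w = [wi - lr * g * xi for wi, xi in zip(w, x)]
            b -= lr * g
    return w, b

def probe_predict(w, b, x):
    z = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 / (1 + math.exp(-z)) > 0.5
```

The causal claim in the abstract would additionally require intervening along the probe direction and observing a change in sycophantic behavior, which this sketch does not cover.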
