Models model language model training announce safety code generation

LLMON: An LLM-native Markup Language to Leverage Structure and Semantics at the LLM Interface

arXiv cs.PLby Michael Hind, Basel Shbita, Bo Wu, Farhan Ahmed, Chad DeLuca, Nathan Fulton, David Cox, Dan GutfreundApril 1, 20262 min read0 views

Source Quiz

arXiv:2603.22519v2 Announce Type: replace-cross Abstract: Textual Large Language Models (LLMs) provide a simple and familiar interface: a string of text is used for both input and output. However, the information conveyed to an LLM often has a richer structure and semantics, which is not conveyed in a string. For example, most prompts contain both instructions ("Summarize this paper into a paragraph") and data (the paper to summarize), but these are usually not distinguished when passed to the model. This can lead to model confusion and security risks, such as prompt injection attacks. This work addresses this shortcoming by introducing an LLM-native mark-up language, LLMON (LLM Object Notation, pronounced "Lemon"), that enables the structure and semantic metadata of the text to be communi

View PDF HTML (experimental)

Abstract:Textual Large Language Models (LLMs) provide a simple and familiar interface: a string of text is used for both input and output. However, the information conveyed to an LLM often has a richer structure and semantics, which is not conveyed in a string. For example, most prompts contain both instructions ("Summarize this paper into a paragraph") and data (the paper to summarize), but these are usually not distinguished when passed to the model. This can lead to model confusion and security risks, such as prompt injection attacks. This work addresses this shortcoming by introducing an LLM-native mark-up language, LLMON (LLM Object Notation, pronounced "Lemon"), that enables the structure and semantic metadata of the text to be communicated in a natural way to an LLM. This information can then be used during model training, model prompting, and inference implementation, leading to improvements in model accuracy, safety, and security. This is analogous to how programming language types can be used for many purposes, such as static checking, code generation, dynamic checking, and IDE highlighting. We discuss the general design requirements of an LLM-native markup language, introduce the LLMON markup language and show how it meets these design requirements, describe how the information contained in a LLMON artifact can benefit model training and inference implementation, and provide some preliminary empirical evidence of its value for both of these use cases. We also discuss broader issues and research opportunities that are enabled with an LLM-native approach.

Comments: 28 pages

Subjects:

Software Engineering (cs.SE); Artificial Intelligence (cs.AI); Programming Languages (cs.PL)

Cite as: arXiv:2603.22519 [cs.SE]

(or arXiv:2603.22519v2 [cs.SE] for this version)

https://doi.org/10.48550/arXiv.2603.22519

arXiv-issued DOI via DataCite

Submission history

From: Michael Hind [view email] [v1] Mon, 23 Mar 2026 19:27:35 UTC (137 KB) [v2] Mon, 30 Mar 2026 20:39:09 UTC (135 KB)

Original source

arXiv cs.PL

https://arxiv.org/abs/2603.22519

Was this article helpful?

Ask AI about this article

Ready

Conversation starters

Ask anything about this article…

Daily AI Digest

Get the top 5 AI stories delivered to your inbox every morning.

More about

modellanguage modeltraining

ModelsLive

Intel Arc B70 Benchmarks/Comparison to Nvidia RTX 4070 Super

Good day everyone! You may remember me from such posts as Getting An Intel Arc B70 Running For LLM Inference on a Dell Poweredge R730XD . Maybe not. Probably not... Anyway, I've had this card for about a week now, I ordered it on launch day and have been beating my head against a wall with drivers and other issues until finally getting it running properly! Since then, I've realized there's a significant lack of people actually testing this card and getting some real benchmarks out into the community. Something something be the change you want to see in the world, something something... So I've done some testing, and this certainly won't be the last of my tests and benchmarks, but it'll certainly be the first. I know what is on the community's mind. I hear you ask "How does the new Intel ca

Reddit r/LocalLLaMA

16m44 minutes ago

ReleasesFresh

The Spaceballs sequel will be released in April next year

There's finally a release date for the Spaceballs sequel — but before you get too excited, it's a whole year away. As first reported by Deadline , Amazon MGM Studios announced on Friday night that the upcoming Spaceballs movie will hit theaters on April 23, 2027, right around the 40th anniversary of the first film. Several members of the original cast will be reprising their roles, according to Deadline , including Mel Brooks, Rick Moranis, Bill Pullman, George Wynder and Daphne Zuniga. Spaceballs: The Release Date. April 23, 2027. pic.twitter.com/5Xv0BKmf7C — Amazon MGM Studios (@AmazonMGMStudio) April 4, 2026 Whispers of a potential Spaceballs 2 go back a couple of years, but Brooks officially confirmed in an extremely on-brand announcement video last summer that the movie is actually ha