Built a zero-allocation, header-only C++ Qwen tokenizer that is nearly 20x faster than OpenAI's tiktoken
I'm into HPC and static, zero-allocation, zero-dependency C++ software. I was studying how BPE tokenizers work, so I decided to build this project: a hardcoded Qwen tokenizer for LLM developers. I know that the entire tokenization phase of LLM inference is worth less than 2% of total time, so it's practically negligible, but I just "love" this kind of programming; it's an educational project for me to learn and build some intuition. Surprisingly, after combining multiple different optimization techniques, it scored really high numbers in benchmarks. I thought it was a fluke at first, so I tried different tests, and so far it completely holds up. On a 12-thread Ryzen 5 3600 desktop CPU, over a 1 GB English text corpus:

- My Frokenizer: 1009 MB/s
- OpenAI tiktoken: ~50 MB/s
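The post doesn't include the author's C++ source, but the BPE merging it hardcodes can be sketched in a few lines. Below is a minimal, naive Python illustration of the greedy merge loop at the heart of a tiktoken-style BPE encoder; `toy_ranks` is a toy stand-in for a real vocabulary like Qwen's, and nothing here reflects the actual Frokenizer implementation or its optimizations.

```python
# A naive BPE merge loop: repeatedly merge the adjacent pair of byte
# sequences with the lowest (earliest-learned) rank until no learned
# pair remains. Real encoders like tiktoken do this far more efficiently.

def bpe_encode(token_bytes, ranks):
    """Split token_bytes into merged pieces according to a rank table."""
    parts = [bytes([b]) for b in token_bytes]
    while len(parts) > 1:
        best_pair, best_rank = None, None
        for i in range(len(parts) - 1):
            rank = ranks.get(parts[i] + parts[i + 1])
            if rank is not None and (best_rank is None or rank < best_rank):
                best_pair, best_rank = i, rank
        if best_pair is None:
            break  # no adjacent pair exists in the vocabulary
        i = best_pair
        parts = parts[:i] + [parts[i] + parts[i + 1]] + parts[i + 2:]
    return parts

# Toy vocabulary: lower rank means the pair was merged earlier in training.
toy_ranks = {b"lo": 0, b"ow": 1, b"low": 2}
print(bpe_encode(b"lower", toy_ranks))  # [b'low', b'e', b'r']
```

The quadratic scan above is exactly the kind of hot loop where zero-allocation, cache-friendly C++ pays off, which is presumably where the project's speedup comes from.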
Read on Reddit r/LocalLLaMA →
https://www.reddit.com/r/LocalLLaMA/comments/1sblae4/built_a_zero_allocation_header_only_c_qwen/

[Benchmark] Altered Riddles: Can LLMs ignore what they've memorised?
In the past year you may have encountered the following prompt: "The surgeon, who is the boy's father, says, 'I cannot operate on this boy—he's my son!' Who is the surgeon to the boy?" If you give this prompt to an LLM right now, you will probably still receive "The mother" as an answer, even though the text explicitly states that the surgeon is the boy's father. This is probably because the prompt is an alteration of a very common "riddle" to which the answer is, in fact, the mother: "A man and his son are in a terrible accident and are rushed to the hospital in critical condition. The doctor looks at the boy and exclaims, 'I can't operate on this boy; he's my son!' How could this be?" Working on this failure mode, I initially decided to create a small dataset of altered

An Empirical Study of Testing Practices in Open Source AI Agent Frameworks and Agentic Applications
arXiv:2509.19185v3 Announce Type: replace Abstract: Foundation model (FM)-based AI agents are rapidly gaining adoption across diverse domains, but their inherent non-determinism and non-reproducibility pose testing and quality assurance challenges. While recent benchmarks provide task-level evaluations, there is limited understanding of how developers verify the internal correctness of these agents during development. To address this gap, we conduct the first large-scale empirical study of testing practices in the AI agent ecosystem, analyzing 39 open-source agent frameworks and 439 agentic applications. We identify ten distinct testing patterns and find that novel, agent-specific methods like DeepEval are seldom used (around 1%), while traditional patterns like negative and membership tes
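The abstract names "negative" and "membership" testing as the traditional patterns developers fall back on. As a hedged illustration only (the paper's actual examples are not in this excerpt, and `check_reply` plus the hard-coded `agent_reply` are hypothetical stand-ins for a real agent call), those two patterns applied to an agent's text output might look like this:

```python
# Two traditional testing patterns applied to an agent reply:
# membership testing (a required element is present in the output) and
# negative testing (a forbidden element is absent from the output).

def check_reply(reply: str) -> dict:
    """Run both checks on one reply and report the results."""
    return {
        # Membership test: the reply should mention using the search tool.
        "mentions_tool": "search" in reply.lower(),
        # Negative test: the reply must never leak an API key prefix.
        "no_api_key_leak": "sk-" not in reply,
    }

agent_reply = "I used the search tool and found 3 results."
result = check_reply(agent_reply)
assert result["mentions_tool"] and result["no_api_key_leak"]
```

Checks like these stay deterministic even when the agent itself is not, which may be why the study finds them so much more common than agent-specific evaluators.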
More in Models

Failure Mechanisms and Risk Estimation for Legged Robot Locomotion on Granular Slopes
arXiv:2603.06928v2 Announce Type: replace Abstract: Locomotion on granular slopes such as sand dunes remains a fundamental challenge for legged robots due to reduced shear strength and gravity-induced anisotropic yielding of granular media. Using a hexapedal robot on a tiltable granular bed, we systematically measure locomotion speed together with slope-dependent normal and shear granular resistive forces. While normal penetration resistance remains nearly unchanged with inclination, shear resistance decreases substantially as slope angle increases. Guided by these measurements, we develop a simple robot-terrain interaction model that predicts anchoring timing, step length, and resulting robot speed, as functions of terrain strength and slope angle. The model reveals that slope-induced per

qwen3.5 vs gemma4 vs cloud llms in python turtle
I have found Python turtle to be a pretty good test for a model. All of these models received the same prompt: "write a python turtle program that draws a cat". You can actually see the similarity between Gemma's and Gemini Pro's outputs: they share the same color palette and minimalist approach in terms of detail. I have a 16 GB VRAM GPU, so I couldn't test the bigger versions of Qwen and Gemma without quantisation.

- gemma_4_31B_it_UD_IQ3_XXS.gguf
- Qwen3_5_9B_Q8_0.gguf
- Qwen_3_5_27B_Opus_Distilled_Q4_K_S.gguf
- deepseek from web browser with reasoning
- claude sonnet 4.6 extended
- gemini pro from web browser with thinking

submitted by /u/SirKvil
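Judging whether the drawing looks like a cat is manual, but a crude automatic sanity check is possible: parse each model's output and verify it is valid Python that actually calls turtle drawing methods. This is a hedged sketch under that assumption; `DRAW_CALLS` is an arbitrary subset of the turtle API chosen here, and `sample` is a toy stand-in, not any model's real output.

```python
import ast

# Crude pre-filter for generated turtle programs: valid Python syntax
# plus at least one call to a known turtle drawing method. Says nothing
# about whether the resulting drawing resembles a cat.

DRAW_CALLS = {"forward", "circle", "goto", "begin_fill", "end_fill"}

def uses_turtle(source: str) -> bool:
    try:
        tree = ast.parse(source)
    except SyntaxError:
        return False  # model produced invalid Python
    # Collect method names from attribute-style calls like t.circle(50).
    calls = {
        node.func.attr
        for node in ast.walk(tree)
        if isinstance(node, ast.Call) and isinstance(node.func, ast.Attribute)
    }
    return bool(calls & DRAW_CALLS)

sample = "import turtle\nt = turtle.Turtle()\nt.circle(50)\nt.forward(10)"
print(uses_turtle(sample))  # True
```

Checking the source statically avoids opening a Tk window for every candidate program, which also makes the filter usable on a headless machine.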


