Automatic Identification of Parallelizable Loops Using Transformer-Based Source Code Representations
Abstract: Automatic parallelization remains a challenging problem in software engineering, particularly in identifying code regions where loops can be safely executed in parallel on modern multi-core architectures. Traditional static analysis techniques, such as dependence analysis and polyhedral models, often struggle with irregular or dynamically structured code. In this work, we propose a Transformer-based approach to classify the parallelization potential of source code, focusing on distinguishing independent (parallelizable) loops from undefined ones. We adopt DistilBERT to process source code sequences using subword tokenization, enabling the model to capture contextual syntactic and semantic patterns without handcrafted features. The approach is evaluated on a balanced dataset combining synthetically generated loops and manually annotated real-world code, using 10-fold cross-validation and multiple performance metrics. Results show consistently high performance, with mean accuracy above 99% and low false positive rates, demonstrating robustness and reliability. Compared to prior token-based methods, the proposed approach simplifies preprocessing while improving generalization and maintaining computational efficiency. These findings highlight the potential of lightweight Transformer models for practical identification of parallelization opportunities at the loop level.
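As a rough illustration of the loop-level distinction the abstract refers to, the sketch below contrasts a loop whose iterations are independent with one that carries a value from one iteration to the next. It is not taken from the paper: the function names (`square_all`, `prefix_sum`, `square_all_parallel`) are hypothetical, and treating a loop-carried dependence as an example of a loop that is not safely parallelizable is an assumption made here for illustration, not the paper's definition of its second class.

```python
from concurrent.futures import ThreadPoolExecutor


def square_all(xs):
    """Independent loop: iteration i reads xs[i] and writes out[i] only."""
    out = [0] * len(xs)
    for i in range(len(xs)):
        out[i] = xs[i] * xs[i]
    return out


def prefix_sum(xs):
    """Loop-carried dependence: iteration i needs the running total
    produced by iteration i - 1, so the iterations cannot simply be
    executed in any order."""
    out = [0] * len(xs)
    running = 0
    for i in range(len(xs)):
        running += xs[i]
        out[i] = running
    return out


def square_all_parallel(xs, workers=4):
    """Same result as square_all, with iterations spread across a thread
    pool. This is safe only because no iteration reads another
    iteration's output. (Hypothetical helper; the worker count is
    arbitrary.)"""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # pool.map preserves input order in its results.
        return list(pool.map(lambda x: x * x, xs))


if __name__ == "__main__":
    data = [1, 2, 3, 4]
    print(square_all(data))
    print(square_all_parallel(data))
    print(prefix_sum(data))
```

A classifier of the kind described in the abstract would receive the source text of loops like these and predict a label; the threads here only show why the first loop tolerates reordering, not a performance gain.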
Comments: 28 pages, 12 figures
Subjects: Software Engineering (cs.SE); Artificial Intelligence (cs.AI)
Cite as: arXiv:2603.30040 [cs.SE] (or arXiv:2603.30040v1 [cs.SE] for this version)
https://doi.org/10.48550/arXiv.2603.30040
arXiv-issued DOI via DataCite (pending registration)
Submission history
From: Izavan Dos Santos Correia
[v1] Tue, 31 Mar 2026 17:54:27 UTC (1,440 KB)