Monitor Insider Trading Without Parsing SEC XML — Form 4 Data as Clean JSON
SEC Form 4 filings are one of the most useful public datasets for tracking what company insiders (CEOs, directors, 10% owners) are doing with their stock. When a CEO buys 50,000 shares of their own company, that's a signal.
The problem: getting this data from SEC EDGAR is genuinely painful.
The EDGAR Problem
EDGAR serves filings as nested XML/SGML documents. There's no proper REST API for structured Form 4 data. Here's what you're dealing with:
- CIK-based lookups: You need to map ticker symbols to CIK numbers (EDGAR's internal ID system)
- Full-text search returns everything: Searching for "AAPL" returns all 100+ filing types (10-Ks, 8-Ks, proxies), not just insider trades
- XML parsing: Form 4 filings use XBRL/XML with nested schemas. The actual transaction data is buried several levels deep inside nested elements
- No pagination or filtering: EDGAR's XBRL feeds dump everything. You build the filtering logic yourself
Most developers spend 2-3 days just getting the XML parsing right before they can extract a single transaction.
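To give a sense of the parsing involved, here is a minimal sketch using Python's standard `xml.etree.ElementTree`. The XML fragment is a simplified, hand-written approximation of a Form 4 ownership document; its element names are an assumption and should be checked against a real filing. `text_at` and `parse_form4` are hypothetical helper names, not part of any library.

```python
import xml.etree.ElementTree as ET

# Simplified, hand-written fragment (assumption): real filings carry many more
# fields, footnotes, and a separate table for derivative transactions.
SAMPLE_FORM4 = """
<ownershipDocument>
  <issuer>
    <issuerCik>0000320193</issuerCik>
    <issuerTradingSymbol>AAPL</issuerTradingSymbol>
  </issuer>
  <nonDerivativeTable>
    <nonDerivativeTransaction>
      <securityTitle><value>Common Stock</value></securityTitle>
      <transactionDate><value>2026-03-25</value></transactionDate>
      <transactionCoding>
        <transactionCode>P</transactionCode>
      </transactionCoding>
      <transactionAmounts>
        <transactionShares><value>50000</value></transactionShares>
        <transactionPricePerShare><value>178.50</value></transactionPricePerShare>
      </transactionAmounts>
    </nonDerivativeTransaction>
  </nonDerivativeTable>
</ownershipDocument>
"""


def text_at(node, path):
    """Return the stripped text at `path`, or None if missing or empty."""
    found = node.find(path)
    if found is None or found.text is None:
        return None
    return found.text.strip()


def parse_form4(xml_text):
    """Flatten non-derivative transactions into plain dicts."""
    root = ET.fromstring(xml_text.strip())
    ticker = text_at(root, "issuer/issuerTradingSymbol")
    rows = []
    for tx in root.iter("nonDerivativeTransaction"):
        shares = text_at(tx, "transactionAmounts/transactionShares/value")
        price = text_at(tx, "transactionAmounts/transactionPricePerShare/value")
        rows.append({
            "ticker": ticker,
            "security": text_at(tx, "securityTitle/value"),
            "date": text_at(tx, "transactionDate/value"),
            "code": text_at(tx, "transactionCoding/transactionCode"),
            "shares": float(shares) if shares else None,
            "pricePerShare": float(price) if price else None,
        })
    return rows


transactions = parse_form4(SAMPLE_FORM4)
print(transactions)
```

Even this toy version needs a null-safe lookup for every field, and it ignores amendments, footnotes, and derivative transactions entirely.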
Skip the XML
I built an API that does all the EDGAR parsing and returns clean JSON. Here's what a Form 4 query looks like:
    curl "https://your-api/sec/insider-trades?ticker=AAPL&limit=5"

    {
      "success": true,
      "total": 1542,
      "filings": [
        {
          "accession": "0000950123-26-001456",
          "filedDate": "2026-03-31",
          "periodOfReport": "2026-03-27",
          "insider": { "name": "Cook Timothy D", "cik": "0001234567" },
          "company": { "name": "APPLE INC", "cik": "0000320193" }
        }
      ]
    }
Get full transaction details for any filing:
    curl "https://your-api/sec/insider-trades/filing/0000950123-26-001456"

    {
      "issuer": { "cik": "0000320193", "name": "APPLE INC", "ticker": "AAPL" },
      "owner": {
        "name": "Cook Timothy D",
        "isDirector": true,
        "isOfficer": true,
        "title": "Chief Executive Officer"
      },
      "transactions": [
        {
          "security": "Common Stock",
          "date": "2026-03-25",
          "code": "P",
          "codeLabel": "Purchase",
          "shares": 50000,
          "pricePerShare": 178.50,
          "sharesAfter": 3500000,
          "ownership": "direct"
        }
      ]
    }
Search is ticker-based (no CIK mapping needed), results are pre-filtered to Form 4 only, and transactions come back parsed into flat JSON.
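Because the detail response is flat, turning it into typed objects takes only a few lines. The sketch below is one possible way to do that, assuming the field names shown in the sample response above; the `Transaction` class and `load_transactions` function are hypothetical names introduced for illustration.

```python
import json
from dataclasses import dataclass

# Trimmed copy of the sample detail response shown in this article.
DETAIL_JSON = """
{
  "issuer": {"cik": "0000320193", "name": "APPLE INC", "ticker": "AAPL"},
  "owner": {"name": "Cook Timothy D", "title": "Chief Executive Officer"},
  "transactions": [
    {"security": "Common Stock", "date": "2026-03-25", "code": "P",
     "shares": 50000, "pricePerShare": 178.50, "ownership": "direct"}
  ]
}
"""


@dataclass(frozen=True)
class Transaction:
    security: str
    date: str
    code: str
    shares: float
    price_per_share: float

    @property
    def value(self):
        """Dollar value of the transaction (shares times price)."""
        return self.shares * self.price_per_share


def load_transactions(payload):
    """Build Transaction objects from a parsed detail response."""
    return [
        Transaction(
            security=tx["security"],
            date=tx["date"],
            code=tx["code"],
            shares=float(tx["shares"]),
            price_per_share=float(tx["pricePerShare"]),
        )
        for tx in payload.get("transactions", [])
    ]


detail = json.loads(DETAIL_JSON)
parsed = load_transactions(detail)
for tx in parsed:
    print(f"{detail['issuer']['ticker']} {tx.code} {tx.shares:,.0f} shares, ${tx.value:,.0f}")
```

A real response may omit or null out `pricePerShare` for some transaction types, so production code would likely need to handle missing values.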
Building an Insider Trading Monitor
Here's a Python script that checks for insider purchases above $100K:
    import requests
    from datetime import datetime, timedelta

    API_URL = "https://your-api/sec/insider-trades"
    API_KEY = "your-api-key"
    WATCHLIST = ["AAPL", "MSFT", "GOOGL", "AMZN", "TSLA"]

    headers = {"X-Api-Key": API_KEY}
    yesterday = (datetime.now() - timedelta(days=1)).strftime("%Y-%m-%d")

    for ticker in WATCHLIST:
        # List Form 4 filings for this ticker since yesterday
        resp = requests.get(
            API_URL,
            params={"ticker": ticker, "startDate": yesterday},
            headers=headers,
            timeout=10,
        )
        resp.raise_for_status()  # fail loudly on HTTP errors
        data = resp.json()

        for filing in data.get("filings", []):
            # Get full transaction details
            detail_resp = requests.get(
                f"{API_URL}/filing/{filing['accession']}",
                headers=headers,
                timeout=10,
            )
            detail_resp.raise_for_status()
            detail = detail_resp.json()

            for tx in detail.get("transactions", []):
                if tx["code"] == "P":  # Purchase
                    value = tx["shares"] * tx["pricePerShare"]
                    if value > 100_000:
                        print(
                            f"🚨 {detail['owner']['name']} "
                            f"({detail['owner'].get('title', 'Insider')}) "
                            f"bought {tx['shares']:,} shares of {ticker} "
                            f"at ${tx['pricePerShare']:.2f} "
                            f"(${value:,.0f} total)"
                        )
Run this on a daily cron and you've got an insider trading alert system. The code is ~30 lines because all the hard work (XML parsing, CIK resolution, filing filtering) is handled by the API.
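One thing a daily cron run may need is a way to avoid alerting twice on the same filing, for example when date windows overlap between runs. The following is a minimal sketch of that idea, assuming accession numbers are unique per filing and that a local JSON file is acceptable storage; `load_seen`, `filter_new`, and `save_seen` are hypothetical helper names.

```python
import json
from pathlib import Path


def load_seen(path):
    """Return the set of accession numbers already alerted on."""
    file = Path(path)
    if not file.exists():
        return set()
    return set(json.loads(file.read_text()))


def filter_new(filings, seen):
    """Keep filings not yet in `seen`, preserving order, and record them in `seen`."""
    fresh = []
    for filing in filings:
        accession = filing["accession"]
        if accession not in seen:
            seen.add(accession)
            fresh.append(filing)
    return fresh


def save_seen(path, seen):
    """Persist the set of seen accession numbers as a sorted JSON list."""
    Path(path).write_text(json.dumps(sorted(seen)))
```

In the monitor script, this would mean calling `load_seen` once at startup, passing each ticker's `filings` list through `filter_new` before fetching details, and calling `save_seen` at the end of the run.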
Transaction Codes Explained
Form 4 transactions use single-letter codes:
| Code | Meaning | Signal |
|------|---------|--------|
| P | Open market purchase | Bullish: insider buying with own money |
| S | Open market sale | Could be planned (10b5-1) or discretionary |
| A | Grant/award | Compensation, not a market signal |
| M | Option exercise | Converting options to shares |
| F | Tax withholding | Automatic, not discretionary |
| G | Gift | Estate planning, not a trading signal |
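The most interesting transactions are P (purchases) and S (sales), since these usually represent discretionary decisions by insiders. Sales deserve more caution than purchases because some are pre-scheduled under a 10b5-1 plan.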
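The code table translates naturally into a lookup and a filter. This is a small sketch under the assumption that each transaction is a dict with a `code` key, as in the sample response; `CODE_LABELS`, `DISCRETIONARY_CODES`, and `split_by_signal` are illustrative names only.

```python
# Labels copied from the transaction code table in this article.
CODE_LABELS = {
    "P": "Open market purchase",
    "S": "Open market sale",
    "A": "Grant/award",
    "M": "Option exercise",
    "F": "Tax withholding",
    "G": "Gift",
}

# Codes the article treats as potentially discretionary.
DISCRETIONARY_CODES = {"P", "S"}


def split_by_signal(transactions):
    """Split transactions into (discretionary, other) lists by their code."""
    discretionary, other = [], []
    for tx in transactions:
        if tx.get("code") in DISCRETIONARY_CODES:
            discretionary.append(tx)
        else:
            other.append(tx)
    return discretionary, other


sample = [{"code": "P"}, {"code": "A"}, {"code": "S"}, {"code": "F"}]
signals, noise = split_by_signal(sample)
for tx in signals:
    print(tx["code"], CODE_LABELS.get(tx["code"], "Unknown code"))
```

Codes outside the six listed here fall into the second list and print as "Unknown code" if looked up, which keeps the filter from failing on codes the table does not cover.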
What You'd Build Without This
For context, here's what the DIY version looks like:
- Build a CIK-to-ticker mapping table (SEC provides a bulk file, ~13,000 companies)
- Write an EDGAR full-text search query parser
- Build XML/SGML parsers for Form 4 documents (two different schemas depending on filing date)
- Handle XBRL footnotes, amendments (Form 4/A), and derivative transactions
- Implement rate limiting (SEC throttles to 10 req/sec with a required User-Agent header)
- Build storage and deduplication logic
That's a week of work minimum, plus ongoing maintenance when EDGAR changes their schema.
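Two of the steps above, the ticker-to-CIK mapping and the rate limiting, can be sketched briefly. The sample data assumes the bulk file is a JSON object whose entries carry `cik_str`, `ticker`, and `title` keys; that shape should be verified against the actual file. `build_ticker_to_cik`, `Throttle`, and the `SEC_HEADERS` contact string are hypothetical placeholders.

```python
import time

# Placeholder identity (assumption): replace with your own name and contact.
SEC_HEADERS = {"User-Agent": "Example Research admin@example.com"}

# Assumed shape of the SEC bulk ticker file, trimmed to two entries.
SAMPLE_TICKER_FILE = {
    "0": {"cik_str": 320193, "ticker": "AAPL", "title": "Apple Inc."},
    "1": {"cik_str": 789019, "ticker": "MSFT", "title": "MICROSOFT CORP"},
}


def build_ticker_to_cik(company_tickers):
    """Map upper-case ticker to a 10-digit, zero-padded CIK string."""
    return {
        entry["ticker"].upper(): str(entry["cik_str"]).zfill(10)
        for entry in company_tickers.values()
    }


class Throttle:
    """Space out calls to wait() so they are at least 1/max_per_second apart."""

    def __init__(self, max_per_second, clock=time.monotonic, sleep=time.sleep):
        self.min_interval = 1.0 / max_per_second
        self._clock = clock
        self._sleep = sleep
        self._last = None  # time of the previous call, if any

    def wait(self):
        now = self._clock()
        if self._last is not None:
            remaining = self.min_interval - (now - self._last)
            if remaining > 0:
                self._sleep(remaining)
                now = self._clock()
        self._last = now


ticker_to_cik = build_ticker_to_cik(SAMPLE_TICKER_FILE)
print(ticker_to_cik["AAPL"])
```

A fetch loop would call `Throttle(10).wait()` before every request and send `SEC_HEADERS` with each one. The clock and sleep functions are injectable so the spacing logic can be tested without real delays.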
I vibe coded this for my own trading research. It's on RapidAPI — free tier if you want to poke around.
DEV Community
https://dev.to/lulzasaur/monitor-insider-trading-without-parsing-sec-xml-form-4-data-as-clean-json-3ome
