I built a faster alternative to cp and rsync — here's how it works

DEV Community · by krit.k83 (ΚρητικόςIGB) · April 5, 2026 · 3 min read

I'm a systems engineer. I spend a lot of time copying files — backups to USB drives, transfers to NAS boxes, moving data between servers over SSH. And I kept running into the same frustrations:

  • cp -r is painfully slow on HDDs when you have tens of thousands of small files

  • rsync is powerful but complex, and still slow for bulk copies

  • scp and SFTP top out at 1-2 MB/s on transfers that should be much faster

  • No tool tells you upfront if the destination even has enough space

So I built fast-copy — a Python CLI that copies files at maximum sequential disk speed.

The core idea

When you run cp -r, files are read in directory order, which is essentially random on disk. Every file seek on an HDD costs 5-10 ms. Multiply that by 60,000 files, at one seek per file, and you're spending 5 to 10 minutes just on head movement.

fast-copy does something different: it resolves the physical disk offset of every file before copying. On Linux it uses the FIEMAP ioctl, on macOS an fcntl call, and on Windows an FSCTL control code. Then it sorts files by block position and reads them sequentially.
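
To make the idea concrete, here is a minimal Linux-only sketch of offset-sorted reading. It is not fast-copy's actual code: the function names are hypothetical, it looks only at the first extent of each file, and it falls back to inode order when the filesystem does not answer FIEMAP (an assumption of mine, not something the tool is known to do).

```python
import fcntl
import os
import struct

# Linux-only constants from <linux/fs.h> and <linux/fiemap.h>.
FS_IOC_FIEMAP = 0xC020660B
FIEMAP_HEADER = struct.Struct("=QQIIII")    # start, length, flags, mapped, count, reserved
FIEMAP_EXTENT = struct.Struct("=QQQ2QI3I")  # logical, physical, length, reserved, flags, reserved


def first_physical_offset(path):
    """Return the byte offset of the file's first extent, or None if unknown."""
    buf = bytearray(FIEMAP_HEADER.size + FIEMAP_EXTENT.size)
    # Map the whole file, but leave room for only one extent in the reply.
    FIEMAP_HEADER.pack_into(buf, 0, 0, 0xFFFFFFFFFFFFFFFF, 0, 0, 1, 0)
    try:
        fd = os.open(path, os.O_RDONLY)
        try:
            fcntl.ioctl(fd, FS_IOC_FIEMAP, buf)  # kernel fills buf in place
        finally:
            os.close(fd)
    except OSError:
        return None  # filesystem without FIEMAP support, or unreadable file
    mapped_extents = FIEMAP_HEADER.unpack_from(buf, 0)[3]
    if mapped_extents == 0:
        return None  # empty or fully sparse file
    return FIEMAP_EXTENT.unpack_from(buf, FIEMAP_HEADER.size)[1]


def sort_by_disk_position(paths):
    """Order paths by physical offset; files with no offset go last, by inode."""
    def key(path):
        offset = first_physical_offset(path)
        if offset is None:
            return (1, os.stat(path).st_ino)
        return (0, offset)
    return sorted(paths, key=key)
```

Reading the files in the order returned by sort_by_disk_position keeps the head moving mostly in one direction. Heavily fragmented files would need more than the first extent to be placed well.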

That alone makes a big difference. But there's more.

Deduplication

Many directories have duplicate files — node_modules across projects, cached downloads, backup copies. fast-copy hashes every file with xxHash-128 (or SHA-256 as fallback), copies each unique file once, and creates hard links for duplicates.
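
A rough sketch of the hash-then-hard-link step, under stated assumptions: it uses SHA-256 from the standard library (the fallback mentioned above) rather than xxHash-128, and file_digest and copy_with_dedup are illustrative names, not fast-copy's API.

```python
import hashlib
import os
import shutil


def file_digest(path, chunk_size=1 << 20):
    """Hash a file in 1 MiB chunks so large files never sit fully in memory."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            h.update(chunk)
    return h.hexdigest()


def copy_with_dedup(src_root, dst_root):
    """Copy a tree, hard-linking files whose content was already copied.

    Returns a (copied, linked) tuple of file counts.
    """
    first_copy = {}  # digest -> destination path of the first copy
    copied = linked = 0
    for dirpath, _dirnames, filenames in os.walk(src_root):
        rel = os.path.relpath(dirpath, src_root)
        os.makedirs(os.path.join(dst_root, rel), exist_ok=True)
        for name in sorted(filenames):
            src = os.path.join(dirpath, name)
            dst = os.path.join(dst_root, rel, name)
            digest = file_digest(src)
            if digest in first_copy:
                os.link(first_copy[digest], dst)  # duplicate: no data written
                linked += 1
            else:
                shutil.copy2(src, dst)
                first_copy[digest] = dst
                copied += 1
    return copied, linked
```

Hard links only work within one filesystem and are not available on FAT or exFAT volumes, so a real tool needs a plain-copy fallback for those destinations.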

In my test with 92K files, over half were duplicates — saving 379 MB and a lot of I/O time.

It also keeps a SQLite database of hashes, so repeated copies to the same destination skip files that were already copied in previous runs.
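
The article does not show the database layout, so the following is only a guess at how such a hash cache could look. The table name, columns, and function names are all hypothetical.

```python
import sqlite3


def open_hash_db(path):
    """Open (or create) the hash database. The schema here is hypothetical."""
    db = sqlite3.connect(path)
    db.execute(
        "CREATE TABLE IF NOT EXISTS copied ("
        " digest TEXT PRIMARY KEY,"
        " dest_path TEXT NOT NULL)"
    )
    return db


def lookup(db, digest):
    """Return the destination path recorded for this digest, or None."""
    row = db.execute(
        "SELECT dest_path FROM copied WHERE digest = ?", (digest,)
    ).fetchone()
    return row[0] if row else None


def record(db, digest, dest_path):
    """Remember that content with this digest now exists at dest_path."""
    with db:  # commits on success, rolls back on error
        db.execute(
            "INSERT OR IGNORE INTO copied (digest, dest_path) VALUES (?, ?)",
            (digest, dest_path),
        )
```

Before each copy, a lookup hit means the content is already at the destination and the file can be skipped or hard-linked. A careful implementation would also confirm that the recorded path still exists, since the destination may have changed between runs.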

SSH tar streaming

This is the part I'm most proud of. Instead of using SFTP (which has significant protocol overhead), fast-copy streams files as chunked ~100 MB tar batches over raw SSH channels.

The remote side runs tar xf - and files land directly on disk — no temp files, no SFTP overhead. This even works on servers that have SFTP disabled, like some Synology NAS configurations.
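
The sending side can be sketched with the standard tarfile module. This is an illustration of the technique, not fast-copy's code: make_batches and stream_batch are hypothetical names, and sink stands for any writable binary stream.

```python
import os
import tarfile

BATCH_BYTES = 100 * 1024 * 1024  # roughly the ~100 MB batch size from the article


def make_batches(paths, limit=BATCH_BYTES):
    """Group paths into lists whose total size stays at or under the limit.

    A single file larger than the limit gets a batch of its own.
    """
    batch, total = [], 0
    for path in paths:
        size = os.path.getsize(path)
        if batch and total + size > limit:
            yield batch
            batch, total = [], 0
        batch.append(path)
        total += size
    if batch:
        yield batch


def stream_batch(paths, root, sink):
    """Write one batch as an uncompressed tar stream into a writable sink.

    Mode "w|" never seeks, so the sink can be a pipe or an SSH channel.
    """
    with tarfile.open(fileobj=sink, mode="w|") as tar:
        for path in paths:
            tar.add(path, arcname=os.path.relpath(path, root), recursive=False)
```

With paramiko, the sink would plausibly be the stdin object returned by SSHClient.exec_command for a remote tar xf - command, with the write side of the channel shut down after each batch so that the remote tar sees end of input. That wiring is my assumption about how the pieces fit together; the article does not show it.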

Three modes are supported:

  • Local → Remote

  • Remote → Local

  • Remote → Remote (relay through your machine)

Real benchmarks

Local copy — 92K files to USB:

  • 44,718 unique files copied + 47,146 hard-linked

  • 509.8 MB written, 378.9 MB saved by dedup

  • 17.9 seconds, 28.5 MB/s

  • All files verified after copy

Remote to local — 92K files over LAN:

  • 509.8 MB downloaded in 14 minutes

  • 46,951 duplicates detected, saving 378.5 MB of transfer

  • 3x faster than SFTP

Getting started

The simplest way — just run the Python script:

python fast_copy.py /source /destination

Or download a standalone binary (no Python needed) from the Releases page — available for Linux, macOS, and Windows.

For SSH transfers, install paramiko:

pip install paramiko

For faster hashing:

pip install xxhash

Links

I'd love to hear feedback — especially from anyone dealing with large file transfers or backup workflows. What tools are you currently using? What's missing from them?
