
Why LeBron James Shouldn’t Drive Your Recommendations: The Intuition Behind the Jaccard Coefficient

Neo4j Blog, by Corydon Baylor, January 28, 2026

Do LeBron James and I have a lot of friends in common? Instagram might tell you that we do. After all, many of the people who follow me on Instagram also follow LeBron. So logically, someone who follows LeBron might also be interested in following me.

Swiss botanist Paul Jaccard and his Jaccard Coefficient would beg to differ. Not an avid Lakers fan, Jaccard was instead into studying different plant species in the Alps and Jura Mountains in the early 1900s. He wanted to answer a very basic question: “How similar are these places in terms of the plants that grow there?”

But simply counting plants wasn’t enough. One mountain might have more species overall because it’s larger, lower, or easier to reach—not because it’s meaningfully different. Jaccard realized that raw totals blurred the comparison he actually cared about. What mattered wasn’t how many species lived in each place, but how much their ecosystems overlapped.

And from that rather mundane question, he derived an incredibly impactful algorithm. Jaccard realized that he could get a good sense of how similar the ecologies of two mountains were by counting the number of species they had in common (the intersection) and dividing it by the number of distinct species found in either location (the union). That led to this now-famous equation:

J(A, B) = |A ∩ B| / |A ∪ B|

And with that simple insight, he discovered a stable way to compare sets—and earned himself a permanent spot in statistics textbooks.
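As a minimal sketch of that intersection-over-union idea, the function below computes the coefficient for two Python sets. The species labels are hypothetical placeholders rather than Jaccard's actual field data, and returning 0.0 for two empty sets is a convention assumed here, not something the article specifies.

```python
def jaccard(a: set, b: set) -> float:
    """Return |a ∩ b| / |a ∪ b| for two sets."""
    union = a | b
    if not union:
        # Assumed convention: two empty sets have similarity 0.0.
        return 0.0
    return len(a & b) / len(union)


# Hypothetical species lists for two sites (placeholder labels only).
site_one = {"species_1", "species_2", "species_3", "species_4", "species_5"}
site_two = {"species_2", "species_4", "species_6"}

# 2 shared species out of 6 distinct species overall.
similarity = jaccard(site_one, site_two)
print(similarity)
```

Note that the larger site does not score higher simply for having more species. Only the overlap relative to the combined list matters.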

So what do we mean by “quality” relationships? Let’s return to our LeBron James example. On Instagram, if you follow LeBron, what’s the likelihood you’re friends with one of his followers?

Almost zero. He’s just too popular.

If Instagram recommended people to you simply because you both followed LeBron, your feed would be flooded with millions of strangers. We need a way to modulate the influence of someone like LeBron James in this network—a way to keep his popularity from overpowering the signal. Enter our now-familiar friend, the Jaccard Coefficient. Consider the graph below, where each node represents a person, and each edge represents a connection. We would like to know: who am I actually similar to, based on who I’m connected to?

Think back to that algorithm:

  • The numerator is the number of neighbors A and B have in common (the intersection).

  • The denominator is the total number of distinct neighbors that either A or B has (the union), with shared neighbors counted only once.
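Written over neighbor sets, the two bullets above amount to the ratio below. The notation N(A), meaning the set of nodes directly connected to A, is introduced here for illustration and does not appear in the original article.

```latex
J(A, B) = \frac{|N(A) \cap N(B)|}{|N(A) \cup N(B)|}
```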

So, for our example, Ted and I have an intersection of one: LeBron. Next comes the union: everyone that either of us is connected to. That includes Ted, LeBron James, me, and Ted’s extra connection, for a total of four. One shared neighbor divided by four total neighbors gives a Jaccard similarity of 1/4, or 0.25.

Now for LeBron James. He and I have an intersection of one again (Ted). But the union is much larger this time: me, Ted, LeBron himself (he is one of my neighbors), and LeBron’s five other connections, for a total of eight. One divided by eight gives a similarity of 1/8, or 0.125.
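The two calculations above can be checked with a short sketch. It assumes the graph is exactly as the text describes: I am connected to Ted and LeBron, Ted has one extra connection, and LeBron has five others. The labels "mystery" and "fan_1" through "fan_5" are hypothetical stand-ins for the unnamed nodes.

```python
# Adjacency sets for the example graph (undirected connections).
# "mystery" and fan_1..fan_5 are placeholder names for unnamed nodes.
fans = {f"fan_{i}" for i in range(1, 6)}
neighbors = {
    "me": {"ted", "lebron"},
    "ted": {"me", "lebron", "mystery"},
    "lebron": {"me", "ted"} | fans,
}


def jaccard_neighbors(graph: dict, a: str, b: str) -> float:
    """Jaccard similarity of two nodes based on their neighbor sets."""
    na, nb = graph[a], graph[b]
    union = na | nb
    return len(na & nb) / len(union) if union else 0.0


me_ted = jaccard_neighbors(neighbors, "me", "ted")        # 1 shared / 4 total
me_lebron = jaccard_neighbors(neighbors, "me", "lebron")  # 1 shared / 8 total
print(me_ted, me_lebron)
```

Both pairs share exactly one neighbor, so a raw count of common connections cannot tell them apart. The difference comes entirely from the size of the union.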

So, when Instagram decides to recommend a new friend to me, it will naturally favor that mystery person Ted is friends with over one of LeBron’s many followers, because my similarity to Ted (0.25) is twice my similarity to LeBron (0.125).

This is why people reach for Jaccard in the first place. It keeps popularity from overpowering everything else. You don’t want recommendations built on big outliers like LeBron James. In the same vein, an online retailer wouldn’t want to recommend a random product to a teacher just because a ‘super-shopper’ who buys everything under the sun also happened to buy pen and paper. That overlap doesn’t mean much. Instead, the retailer wants to find other shoppers with similar profiles and recommend products that are more relevant, such as whiteboard markers.
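The retailer scenario can be sketched the same way. The baskets below are invented for illustration, and picking the single most similar shopper and recommending what they bought is a simplified strategy assumed here, not a description of any real recommender.

```python
def jaccard(a: set, b: set) -> float:
    """Return |a ∩ b| / |a ∪ b|, or 0.0 if both sets are empty."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


teacher = {"pen", "paper"}

# Hypothetical purchase histories (illustrative data only).
baskets = {
    "super_shopper": {"pen", "paper", "tv", "blender", "tent", "drill", "sofa", "bike"},
    "fellow_teacher": {"pen", "paper", "whiteboard markers"},
}

for name, items in baskets.items():
    # Raw overlap is 2 for both shoppers; only Jaccard separates them.
    print(name, len(teacher & items), round(jaccard(teacher, items), 2))

# Recommend what the most similar shopper bought that the teacher has not.
best = max(baskets, key=lambda name: jaccard(teacher, baskets[name]))
recommend = baskets[best] - teacher
print(best, recommend)
```

Ranking by raw overlap would treat the super-shopper and the fellow teacher as equally good matches. Dividing by the union penalizes the super-shopper's long, mostly irrelevant purchase list.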

Jaccard helps businesses recommend products to customers with similar profiles. As much as it saddens me to say, I am much more similar to Ted than LeBron James.

Wherever your data lives, Neo4j Graph Analytics makes it easy to put these ideas into practice. In fact, we even have a follow-along blog for using the Jaccard Coefficient to power better recommendations! With Graph Analytics for Snowflake and Graph Intelligence for Microsoft Fabric, you can deploy algorithms like the Jaccard Coefficient directly on your existing data to power better recommendations, segmentation, and decision-making—without ETL or infrastructure overhead.
