
Applied NLP with LLMs: Beyond Black-Box Monoliths

Explosion AI Blog, by Ines Montani, October 9, 2024

In this talk, Ines shows some practical solutions for using the latest state-of-the-art models in real-world applications and distilling their knowledge into smaller and faster components.

Resources

A practical guide to human-in-the-loop distillation

https://explosion.ai/blog/human-in-the-loop-distillation

This blog post presents practical solutions for using the latest state-of-the-art models in real-world applications and distilling their knowledge into smaller and faster components that you can run and maintain in-house.

Applied NLP Thinking: How to Translate Problems into Solutions

https://explosion.ai/blog/applied-nlp-thinking

This blog post discusses some of the biggest challenges for applied NLP and translating business problems into machine learning solutions, including the distinction between utility and accuracy.

How S&P Global is making markets more transparent with NLP, spaCy and Prodigy

https://explosion.ai/blog/sp-global-commodities

A case study on S&P Global’s efficient information extraction pipelines for real-time commodities trading insights in a high-security environment using human-in-the-loop distillation.

How GitLab uses spaCy to analyze support tickets and empower their community

https://explosion.ai/blog/gitlab-support-insights

A case study on GitLab’s large-scale NLP pipelines for extracting actionable insights from support tickets and usage questions.

Using LLMs for human-in-the-loop distillation in Prodigy

https://prodi.gy/docs/large-language-models

Prodigy comes with preconfigured workflows for using LLMs to speed up and automate annotation and create datasets for distilling large generative models into more accurate, smaller, faster and fully private task-specific components.

Transcript

  • Ines Montani, Explosion

  • LLMs (Falcon, Mixtral, GPT-4): good contextual results

  • Close the gap between prototype and production: how to avoid the prototype plateau?

📝 standardize inputs and outputs
📈 start with evaluation
🔮 assess utility, not just accuracy
🛠 work on data iteratively
💬 consider structure and ambiguity of natural language
explosion.ai/blog/applied-nlp-thinking
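As one way to picture "standardize inputs and outputs" and "start with evaluation", here is a minimal sketch that is not taken from the talk. It assumes every component, whether an LLM prompt or a trained model, returns the same kind of labeled span, so a single scoring function can compare them. The `Span` class, the `prf` function, and the example offsets are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Span:
    """One extracted entity: character offsets plus a label (hypothetical format)."""
    start: int
    end: int
    label: str


def prf(gold: set[Span], predicted: set[Span]) -> tuple[float, float, float]:
    """Exact-match precision, recall and F-score over two sets of spans."""
    true_pos = len(gold & predicted)
    precision = true_pos / len(predicted) if predicted else 0.0
    recall = true_pos / len(gold) if gold else 0.0
    if precision + recall == 0:
        return 0.0, 0.0, 0.0
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Made-up example text: "Whisk the eggs in a bowl"
gold = {Span(10, 14, "INGREDIENT"), Span(20, 24, "EQUIPMENT")}
# A prediction that finds "eggs" but mislabels "Whisk" and misses "bowl"
predicted = {Span(10, 14, "INGREDIENT"), Span(0, 5, "INGREDIENT")}

precision, recall, f_score = prf(gold, predicted)
print(f"P={precision:.2f} R={recall:.2f} F={f_score:.2f}")
```

Because both systems emit the same span format, the same held-out examples can score an LLM baseline and a smaller task-specific model side by side.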

  • Human in the loop (LLM): explosion.ai/blog/human-in-the-loop-distillation

  • Case Study: PyData NYC (spacy.fyi/pydata-nyc)

Model size: 400mb. Words/second: 2k+. Data dev time: 8hr.
• extracting dishes, ingredients and equipment from r/cooking Reddit posts
• used LLM during annotation
• 20× inference time speedup
• beat few-shot LLM baseline of 0.74 with task-specific model
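The "used LLM during annotation" step can be sketched as a loop in which a model proposes labels and a person accepts, corrects, or skips them. The corrected examples become training data for the smaller model. This is an illustrative sketch only, not the Prodigy workflow or the code behind the case study. `llm_suggest` is a keyword lookup standing in for a real LLM call, `human_review` stands in for an annotator, and every name here is hypothetical.

```python
from typing import Callable, Optional

# Hypothetical stand-in for an LLM: a keyword lookup that, like a real model,
# is sometimes wrong ("whisk" is equipment, not an ingredient).
KEYWORDS = {"eggs": "INGREDIENT", "bowl": "EQUIPMENT", "whisk": "INGREDIENT"}

Labels = dict[str, str]


def llm_suggest(text: str) -> Labels:
    """Return {word: label} suggestions for the words the 'model' recognizes."""
    words = [w.strip(".,").lower() for w in text.split()]
    return {w: KEYWORDS[w] for w in words if w in KEYWORDS}


def human_review(text: str, suggestion: Labels) -> Optional[Labels]:
    """Stand-in for the annotator: fixes a known error, skips empty examples."""
    if not suggestion:
        return None  # nothing to annotate, so skip this example
    corrected = dict(suggestion)
    if "whisk" in corrected:
        corrected["whisk"] = "EQUIPMENT"
    return corrected


def build_dataset(
    texts: list[str],
    suggest: Callable[[str], Labels],
    review: Callable[[str, Labels], Optional[Labels]],
) -> list[dict]:
    """Collect human-corrected suggestions as training examples."""
    dataset = []
    for text in texts:
        corrected = review(text, suggest(text))
        if corrected is not None:
            dataset.append({"text": text, "labels": corrected})
    return dataset


texts = ["Whisk the eggs in a bowl.", "Thanks for the tip!"]
dataset = build_dataset(texts, llm_suggest, human_review)
print(dataset)
```

The reviewer only fixes what the suggestion got wrong, which is where the time saving over annotating from scratch would come from.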

  • Case Study: S&P Global (explosion.ai/blog/sp-global-commodities)

F-score: 99%. Model size: 6mb. Words/second: 16k+.
• real-time commodities trading insights by extracting structured attributes
• high-security environment
• used LLM during annotation
• 10× data development speedup with humans and model in the loop
• 8 market pipelines in production

  • break down larger problems, make problem easier, reassess dependencies

  • Case Study: GitLab (explosion.ai/blog/gitlab-support-insights)

1 year of support tickets. 6× speedup.
• extract actionable insights from support tickets and usage questions
• high-security environment
• easy to adapt to new scenarios and business questions
• separated general-purpose features from product-specific logic
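To illustrate what separating general-purpose features from product-specific logic can mean in code, the sketch below keeps a reusable extraction step apart from a small rule table that encodes one product's categories. It is a toy example under stated assumptions and not GitLab's pipeline. The function names, the rule table, and the ticket text are all invented for illustration.

```python
import re


def extract_features(ticket: str) -> dict:
    """General-purpose step: nothing here depends on a particular product."""
    return {
        "tokens": re.findall(r"[a-z0-9.]+", ticket.lower()),
        "versions": re.findall(r"\b\d+\.\d+(?:\.\d+)?\b", ticket),
    }


# Product-specific logic lives in data (hypothetical categories and keywords),
# so a new business question means editing rules, not the extraction code.
PRODUCT_RULES = {
    "ci": {"pipeline", "runner"},
    "auth": {"login", "sso"},
}


def categorize(features: dict, rules: dict[str, set[str]]) -> list[str]:
    """Return every category whose keywords appear among the ticket's tokens."""
    tokens = set(features["tokens"])
    return sorted(cat for cat, keywords in rules.items() if tokens & keywords)


ticket = "Runner fails after upgrade to 16.4.1, pipeline stuck"
features = extract_features(ticket)
print(features["versions"], categorize(features, PRODUCT_RULES))
```

Swapping in a different rule table reuses `extract_features` unchanged, which is the property the slide describes as easy adaptation to new scenarios.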

  • Summary: Applied NLP & Gen AI

Stay ambitious. Don’t compromise on best practices, efficiency and privacy.
Iterate. The right tooling and mindset gets you past the “prototype plateau”.
Reason and refactor. The key to success lies in your data and may surprise you!
