How to integrate VS Code with Ollama for local AI assistance
The post How to integrate VS Code with Ollama for local AI assistance appeared first on The New Stack.
If you’re starting your journey as a programmer and want to jump-start that process, you might be interested in taking advantage of AI to make the process of getting up to speed a bit simpler. After all, coding can be a tough business to break into, and every advantage you can give yourself should be considered.
Before I continue, I will say this: use AI to help you learn the language that you’re interested in and not as a substitute for actually learning the language. Consider this an assistant, not a replacement for skill.
When I need to turn to AI, I always go for locally installed options, for a couple of reasons. First, running models on my own hardware means I'm not leaning on power-hungry cloud data centers. Second, I don't have to worry that a third party is going to get a glimpse of my queries, so privacy is actually possible.
To that end, I depend on Ollama as my local AI tool of choice. Ollama is easy to use, flexible, and reliable.
If your IDE of choice is Visual Studio Code, you’re in luck, as you can integrate it with a locally installed instance of Ollama.
I’m going to show you how this is done.
What you’ll need
To make this work, you’ll need a desktop running Linux, macOS, or Windows. I’ll demonstrate the process on an Ubuntu-based Linux distribution (Pop!_OS). If you’re using either macOS or Windows, the only things you’ll need to change are the installations of Ollama and VS Code. Fortunately, in both instances, it’s just a matter of downloading the binary installer for each tool, double-clicking the downloaded file, and walking through the setup process.
On Linux, it’s a bit different.
Let me show you.
Installing Ollama
The first thing we’ll do is install Ollama. If you’re using macOS or Windows, download the .dmg for Mac or the .exe for Windows, double-click the file, and you’re off.
On Linux, open a terminal window and issue the command:
curl -fsSL https://ollama.com/install.sh | sh
You’ll be prompted for your sudo password before the installation begins.
After the installation is complete, you’ll then need to pull a specific LLM for Ollama. On macOS and Windows, open the Ollama GUI, go to the query field, click the downward-pointing arrow, type codellama, and click the entry to install the model.
On Linux, open a terminal app and pull the necessary LLM with:
ollama pull codellama
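Before moving on, it's worth confirming that the CLI is on your PATH and that the model actually downloaded. A quick check from the terminal looks like this (the guard is just so the snippet degrades gracefully on a machine without Ollama):

```shell
# Verify the Ollama CLI is installed and list locally available models.
# `ollama list` should show codellama once the pull above has finished.
if command -v ollama >/dev/null 2>&1; then
  ollama --version
  ollama list
else
  echo "ollama not found on PATH"
fi
```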
Install VS Code
Next, you’ll need to install VS Code.
The same thing holds true: with macOS or Windows, download the VS Code executable binary for your OS of choice, double-click the downloaded file, and walk through the installation wizard.
On Linux, you’ll also need to download the installer for your distribution of choice (.deb for Debian-based distributions, .rpm for Fedora-based distributions, or the Snap package).
To install VS Code on Linux, change into the directory housing the installer file you downloaded. Install the app with one of the following commands:
- For Ubuntu-based distributions: sudo dpkg -i code*.deb
- For Fedora-based distributions: sudo rpm -i code*.rpm
- For Snap packages: sudo snap install code --classic
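Whichever package format you used, you can confirm the install from the terminal (this assumes the `code` launcher was added to your PATH, which the .deb, .rpm, and Snap packages all handle for you):

```shell
# Print the installed VS Code version, commit hash, and architecture.
if command -v code >/dev/null 2>&1; then
  code --version
else
  echo "VS Code 'code' CLI not found on PATH"
fi
```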
You now have the two primary pieces to get you started.
Setting up VS Code
The next step is to set up VS Code to work with Ollama. To do that, you’ll need to install an extension called Continue.
For that, hit Ctrl+P (on macOS, that’s Cmd+P).
In the resulting field, type:
ext install continue.continue
In the resulting page (Figure 1), click Install.
Figure 1: Installing the necessary extension on VS Code is simple.
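If you prefer working from the terminal, the same extension can be installed non-interactively with the `code` CLI, using the same `continue.continue` extension ID shown above:

```shell
# Install the Continue extension via the VS Code command line.
if command -v code >/dev/null 2>&1; then
  code --install-extension continue.continue
else
  echo "VS Code 'code' CLI not found on PATH"
fi
```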
Once the extension is installed, click on the Continue icon in the left sidebar. In the resulting window, click the Select Model drop-down and click Add Chat model (Figure 2).
Figure 2: You have to add a model before you can continue.
In the resulting window, select Ollama from the provider drop-down (Figure 3).
Figure 3: You can select from any one of the available models, but we’re going with Ollama.
Next, make sure to select Local from the tabs and then click the terminal icon to the right of each command. This will open the built-in terminal, where you’ll then need to hit Enter on your keyboard to execute the command (Figure 4).
Figure 4: This is where the meat of the configuration takes place.
When the first command (the Chat model command) completes, do the same for the second command (the Autocomplete model) and the third (the Embeddings model). This will take some time, so be patient. When each step is complete, you’ll see a green check by it.
After that’s completed, click Connect.
If you click the Continue extension, you should now see a new chat window that is connected to your locally installed instance of Ollama (Figure 5).
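If the chat window doesn't respond, check that the Ollama server itself is answering. Ollama exposes an HTTP API on port 11434 by default, so a quick curl (guarded here with a reachability check) tells you whether the backend is up:

```shell
# Probe the local Ollama HTTP API (default port 11434).
# /api/tags returns the installed models; if it answers, Continue can connect too.
if curl -fsS http://localhost:11434/api/tags >/dev/null 2>&1; then
  curl -s http://localhost:11434/api/tags
else
  echo "Ollama server not reachable on localhost:11434"
fi
```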
You are all set up and ready to rock.