Reframe from three modes to two worlds
Restructures section 01 from "web chat / in-editor / agentic" into "web chat vs. tools that live with your code," with the autocomplete / in-project chat / agentic spectrum as a sub-structure of the latter. Inline edits are reduced to a historical note tied to the 2023 instruction-tuned LLM era. - Rename 01-three-modes -> 01-two-worlds and 03-in-editor-workflow -> 03-autocomplete; section 03 narrows to autocomplete (ghost text habits, the autocomplete-your-verification trap) - Section 04 reframes in-project chat as the default venue, web chat as a special-case venue; adds "Carrying context across sessions" covering dev-log.md, CLAUDE.md, .cursorrules - Section 05 reworks intro to contrast against in-project chat instead of "editor extension"; tightens prose and removes em-dashes - Update cross-references and tool-mode language in 02, 06, 07, and the root README to match the new framing - Swap the CRDT example in section 04 for finite-volume methods, fitting the CHEG audience - Minor typo/wording fixes Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
This commit is contained in:
parent
5780cdf097
commit
d2ca02bd90
10 changed files with 308 additions and 270 deletions
|
|
@ -2,14 +2,14 @@
|
|||
|
||||
## Key idea
|
||||
|
||||
You do not have to use a frontier cloud model to use AI in your work. A "local" model runs entirely on your own hardware: no API, no per-token cost, no data leaving the machine. Local models are not a fourth *mode* on top of chat, editor, and agent — they cut across all three. The same workflow patterns apply; what changes is the tool that hosts the model and what you give up (and gain) by running it yourself.
|
||||
You do not have to use a frontier cloud model to use AI in your work. A "local" model runs entirely on your own hardware: no API, no per-token cost, no data leaving the machine. Local models cut across every workflow we've covered — web chat, autocomplete, in-project chat, and agentic — rather than being a separate mode. The same workflow patterns apply; what changes is the tool that hosts the model and what you give up (and gain) by running it yourself.
|
||||
|
||||
This section is about local models as a *user* of AI coding tools. If you want to understand how local models work under the hood, train your own, or build the infrastructure around them, see the [llm-workshop](https://lem.che.udel.edu/git/furst/llm-workshop).
|
||||
|
||||
## Key goals
|
||||
|
||||
- Understand why you might prefer a local model to a cloud model
|
||||
- Recognize which tools in each of the three modes support local models
|
||||
- Recognize which tools across the autocomplete/chat/agent spectrum support local models
|
||||
- Calibrate expectations about capability and latency relative to frontier cloud models
|
||||
- Identify the situations where local is the right choice and where cloud still wins
|
||||
|
||||
|
|
@ -47,11 +47,11 @@ A rough sense of what runs comfortably where, as of early 2026:
|
|||
If you took the time to fill out the spec table in [computing-setup section 01](https://lem.che.udel.edu/git/furst/computing-setup/src/branch/main/01-know-your-machine/), you already know what tier you're in.
|
||||
|
||||
|
||||
## Local models across the three modes
|
||||
## Local models across the workflow
|
||||
|
||||
The three-mode framing from [section 01](../01-three-modes/) still applies — what changes is the host.
|
||||
The framing from [section 01](../01-two-worlds/) still applies — what changes is the host. Below, we walk through where local models fit in each kind of work.
|
||||
|
||||
### Local in *chat* mode
|
||||
### Local in *web-chat* style
|
||||
|
||||
You can have a private, local ChatGPT-style experience entirely on your laptop.
|
||||
|
||||
|
|
@ -62,15 +62,15 @@ You can have a private, local ChatGPT-style experience entirely on your laptop.
|
|||
| **Open WebUI** | A self-hosted web UI (like ChatGPT) that talks to Ollama or any OpenAI-compatible backend. Good if you want a familiar chat experience or want to share access on a LAN. |
|
||||
| **Jan**, **GPT4All** | Other desktop chat apps with similar goals. |
|
||||
|
||||
The Ollama-powered backends in particular are useful well beyond chat — most of the editor and agentic tools below can connect to an Ollama endpoint, which means setting up Ollama once unlocks every mode.
|
||||
The Ollama-powered backends in particular are useful well beyond chat — most of the in-editor and agentic tools below can connect to an Ollama endpoint, which means setting up Ollama once unlocks every other use case.
|
||||
|
||||
### Local in *editor* mode
|
||||
### Local for autocomplete and in-project chat
|
||||
|
||||
Several VS Code extensions support local models. Notably, **GitHub Copilot, Microsoft Copilot, and the Claude extension do not** — they require their vendor's cloud service. If you want a local model in your editor, you need a different extension.
|
||||
Several VS Code extensions support local models for autocomplete and side-panel chat. Notably, **GitHub Copilot, Microsoft Copilot, and the Claude (legacy) extension do not** — they require their vendor's cloud service. If you want a local model in your editor, you need a different extension.
|
||||
|
||||
| Extension | Notes |
|
||||
|---|---|
|
||||
| **Continue.dev** | Open-source, the flagship local-friendly extension. Works with Ollama, LM Studio, llama.cpp, and many cloud providers. Supports autocomplete, inline edit, and a chat panel. The first tool to try. |
|
||||
| **Continue.dev** | Open-source, the flagship local-friendly extension. Works with Ollama, LM Studio, llama.cpp, and many cloud providers. Supports autocomplete and a chat panel. The first tool to try. |
|
||||
| **Cody** (Sourcegraph) | Has a "local context" mode and can use local models via Ollama. Also has a strong cloud product. |
|
||||
| **Llama Coder** | Ollama-focused, autocomplete-first. Lightweight. |
|
||||
| **Tabby** | A self-hosted code completion server. Heavier setup but good for shared use within a team or lab. |
|
||||
|
|
|
|||
Loading…
Add table
Add a link
Reference in a new issue