
Yollama

by mwa ST4

Use AI from the comfort of Sublime Text to auto-complete code or chat with LLMs.

Details

  • 1.2.1
  • gitlab.com
  • 6 days ago
  • 2 days ago
  • 1 year ago

Installs

  • Total 489
  • Win 127
  • Mac 221
  • Linux 141
  • Daily installs by platform (Windows / Mac / Linux), Feb 14 to Mar 31: chart data omitted.

Readme

Source
gitlab.com

Yollama

Query LLM models from the comfort of Sublime Text, through a local instance of Ollama.

Setup

  1. Install Ollama and run it.
  2. In a terminal, type ollama pull llama3.1 and then ollama pull codegemma:2b-code. This may take a while; the sketch after these steps shows one way to check that both models ended up available.
  3. From Sublime Text's command palette, type Package Control: Install Package
  4. Type Yollama and press Enter
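
Once both pulls finish, you can optionally check that your local Ollama instance is reachable and that the models are listed. The sketch below uses Ollama's /api/tags endpoint (which lists pulled models) and the default local port; it is independent of Yollama itself.

import json
import urllib.request

# Ollama's default local API; /api/tags lists the models that have been pulled.
with urllib.request.urlopen("http://127.0.0.1:11434/api/tags") as response:
    pulled = [m["name"] for m in json.loads(response.read()).get("models", [])]

print("Pulled models:", ", ".join(pulled) or "(none)")

# Yollama's default settings expect a llama3.1 and a codegemma:2b-code model.
for needed in ("llama3.1", "codegemma:2b-code"):
    found = any(name.startswith(needed) for name in pulled)
    print(needed, "found" if found else "MISSING")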

Use

Ask anything

Open the command palette and type Yollama: Ask anything, then type your question for the LLM and press Enter. This simply prompts the model for a (hopefully) interesting chat.
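
Under the hood this is a single prompt/response round trip against Ollama's generate endpoint. The following is a rough approximation of that request in Python, not Yollama's actual code; the endpoint, payload fields and response field come from Ollama's public API, and the model name matches the default configuration shown below.

import json
import urllib.request

def ask(prompt, model="llama3.1", url="http://127.0.0.1:11434/api/generate"):
    # stream=False asks Ollama for one JSON object instead of a stream of chunks.
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    request = urllib.request.Request(url, data=payload,
                                     headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]

print(ask("In one sentence, what is a Sublime Text plugin?"))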

Ask about file or selection

This will append the file or selection to your prompt; a rough sketch of how that text can be read is shown after the steps below.

  1. Open a file with Sublime Text
  2. (Optional) Select a single region of text within the file
  3. In the command palette, type Yollama: Ask about file or selection, then type a question about your file or selection
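
For reference, this is roughly how a Sublime Text plugin can read the single selected region, falling back to the whole buffer, using Sublime's standard view API. It is a sketch of the general technique; the command name and the exact way Yollama assembles its prompt are hypothetical.

import sublime
import sublime_plugin

class AskAboutSelectionSketchCommand(sublime_plugin.TextCommand):
    # Hypothetical command, for illustration only; not part of Yollama.
    def run(self, edit, question=""):
        regions = [r for r in self.view.sel() if not r.empty()]
        if len(regions) == 1:
            context = self.view.substr(regions[0])  # the selected region
        else:
            context = self.view.substr(sublime.Region(0, self.view.size()))  # the whole file
        prompt = question + "\n\n" + context  # question first, then the file or selection
        print(prompt)  # a real command would send this to Ollama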

Autocomplete

  1. Open a file with Sublime Text
  2. Place the cursor where you want code completion to be inserted
  3. (Optional) Select a single region of text
  4. In the command palette, type Yollama: Autocomplete, then press Enter

This operates in two different modes:

  • Codegen: when one or more characters are selected in the file (not just a cursor), Yollama sends the selection as the prompt and expects the model to generate the most likely completion that comes after the selected text.
  • Infill: when there is no selection in the file (just a cursor), Yollama will use the infill model along with the prompt template to get a fill-in-the-middle completion at the cursor position. The prefix is the text from the beginning of the file down to the cursor, and the suffix is the text from the cursor to the end of the file.

In either case, the mode in use is displayed in the status bar while waiting for Ollama's response.
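
As a rough sketch of how the two modes differ, assuming the default settings shown in the Configure section (the exact prompt assembly inside Yollama may differ):

def build_autocomplete_request(text, sel_start, sel_end):
    """Return a (model, prompt) pair for Ollama's generate endpoint."""
    if sel_start != sel_end:
        # Codegen mode: the selection itself is the prompt; the model is expected
        # to continue it with the most likely code.
        return "codegemma:2b-code", text[sel_start:sel_end]
    # Infill mode: split the buffer at the cursor and fill the FIM template.
    template = "<|fim_prefix|>{prefix}<|fim_suffix|>{suffix}<|fim_middle|>"
    prefix, suffix = text[:sel_start], text[sel_start:]
    return "codegemma:2b-code", template.format(prefix=prefix, suffix=suffix)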

Configure

Open the settings from the command palette with Yollama: Settings, or from the settings menu.

The default configuration should look like:

{
    "url": "http://127.0.0.1:11434/api/generate",
    "model": "llama3.1",
    "codegen_model": "codegemma:2b-code",
    "infill_model": "codegemma:2b-code",
    "infill_prompt_template": "<|fim_prefix|>{prefix}<|fim_suffix|>{suffix}<|fim_middle|>"
}

  • url: An HTTP(S) URL to your running instance of Ollama. The default works for a standard Ollama installation; customize it if you run Ollama on a different host, or if you use an Ollama-compatible API from other software.
  • model: The model used for Ask anything and Ask about file or selection.
  • codegen_model: The model used for Yollama: Autocomplete in codegen mode. This must be a model that completes a prompt with the most likely code to follow it.
  • infill_model: The model used for Yollama: Autocomplete in infill mode. This must be a model that supports fill-in-the-middle (infill/FIM) completion.
  • infill_prompt_template: The template used for fill-in-the-middle completion. This varies between FIM models; the default supports CodeGemma. Use <PRE> {prefix} <SUF>{suffix} <MID> for CodeLlama models. The {prefix} and {suffix} placeholders are replaced with the beginning and end of the file, split at the cursor position (see the worked example below).
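
To make the substitution concrete, here is what the two templates mentioned above produce for a made-up prefix and suffix (purely illustrative values):

prefix = "def add(a, b):\n    return "
suffix = "\n"

codegemma = "<|fim_prefix|>{prefix}<|fim_suffix|>{suffix}<|fim_middle|>"
codellama = "<PRE> {prefix} <SUF>{suffix} <MID>"

print(codegemma.format(prefix=prefix, suffix=suffix))
# <|fim_prefix|>def add(a, b):
#     return <|fim_suffix|>
# <|fim_middle|>

print(codellama.format(prefix=prefix, suffix=suffix))
# <PRE> def add(a, b):
#     return  <SUF>
#  <MID>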

Non-features

  • Proactively searching for code completions: not implemented, as IMHO this is distracting.
  • Querying remote (non-Llama/Ollama) models: Yollama prefers to run locally and not send your code to random private companies.
  • Other shiny features: I try to keep it simple! (easier to maintain, easier to review, easier to configure)
  • Concurrency: Yollama does not expect you to send multiple queries at the same time.

About

If you're curious, you can also read how this project started.