
Yollama

by mwa ST4

Use AI from the comfort of Sublime Text to auto-complete code or chat with LLMs.

Details

  • 1.2.1
  • gitlab.com
  • 6 days ago
  • 2 days ago
  • 1 year ago

Installs

  • Total 489
  • Win 127
  • Mac 221
  • Linux 141
  • Daily installs by platform (Windows / Mac / Linux), Feb 14 to Mar 31: chart data omitted.

Readme

Source
gitlab.com

Yollama

Query LLM models from the comfort of Sublime Text, through a local instance of Ollama.

Setup

  1. Install Ollama and run it.
  2. In a terminal, type ollama pull llama3.1 and then ollama pull codegemma:2b-code. This may take a while; the sketch after these steps shows one way to check that both models ended up available.
  3. From Sublime Text's command palette, type Package Control: Install Package
  4. Type Yollama and press Enter
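
Once both pulls finish, you can optionally check that your local Ollama instance is reachable and that the models are listed. The sketch below uses Ollama's /api/tags endpoint (which lists pulled models) and the default local port; it is independent of Yollama itself.

import json
import urllib.request

# Ollama's default local API; /api/tags lists the models that have been pulled.
with urllib.request.urlopen("http://127.0.0.1:11434/api/tags") as response:
    pulled = [m["name"] for m in json.loads(response.read()).get("models", [])]

print("Pulled models:", ", ".join(pulled) or "(none)")

# Yollama's default settings expect a llama3.1 and a codegemma:2b-code model.
for needed in ("llama3.1", "codegemma:2b-code"):
    found = any(name.startswith(needed) for name in pulled)
    print(needed, "found" if found else "MISSING")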

Use

Ask anything

Open the command palette and type Yollama: Ask anything, then type your question for the LLM and press Enter. This simply prompts the model for a (hopefully) interesting chat.
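
Under the hood this is a single prompt/response round trip against Ollama's generate endpoint. The following is a rough approximation of that request in Python, not Yollama's actual code; the endpoint, payload fields and response field come from Ollama's public API, and the model name matches the default configuration shown below.

import json
import urllib.request

def ask(prompt, model="llama3.1", url="http://127.0.0.1:11434/api/generate"):
    # stream=False asks Ollama for one JSON object instead of a stream of chunks.
    payload = json.dumps({"model": model, "prompt": prompt, "stream": False}).encode()
    request = urllib.request.Request(url, data=payload,
                                     headers={"Content-Type": "application/json"})
    with urllib.request.urlopen(request) as response:
        return json.loads(response.read())["response"]

print(ask("In one sentence, what is a Sublime Text plugin?"))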

Ask about file or selection

This will append the file or selection to your prompt; a rough sketch of how that text can be read is shown after the steps below.

  1. Open a file with Sublime Text
  2. (Optional) Select a single region of text within the file
  3. In the command palette, type Yollama: Ask about file or selection, then type a question about your file or selection
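
For reference, this is roughly how a Sublime Text plugin can read the single selected region, falling back to the whole buffer, using Sublime's standard view API. It is a sketch of the general technique; the command name and the exact way Yollama assembles its prompt are hypothetical.

import sublime
import sublime_plugin

class AskAboutSelectionSketchCommand(sublime_plugin.TextCommand):
    # Hypothetical command, for illustration only; not part of Yollama.
    def run(self, edit, question=""):
        regions = [r for r in self.view.sel() if not r.empty()]
        if len(regions) == 1:
            context = self.view.substr(regions[0])  # the selected region
        else:
            context = self.view.substr(sublime.Region(0, self.view.size()))  # the whole file
        prompt = question + "\n\n" + context  # question first, then the file or selection
        print(prompt)  # a real command would send this to Ollama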

Autocomplete

  1. Open a file with Sublime Text
  2. Place the cursor where you want code completion to be inserted
  3. (Optional) Select a single region of text
  4. In the command palette, type Yollama: Autocomplete, then press Enter

This operates in two different modes:

  • Codegen: when one or more characters are selected in the file (not just a cursor), Yollama sends the selection as the prompt and expects the model to generate the most likely completion that comes after the selected text.
  • Infill: when there is no selection in the file (just a cursor), Yollama will use the infill model along with the prompt template to get a fill-in-the-middle completion at the cursor position. The prefix is the text from the beginning of the file down to the cursor, and the suffix is the text from the cursor to the end of the file.

In either case, the mode in use is displayed in the status bar while waiting for Ollama's response.
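
As a rough sketch of how the two modes differ, assuming the default settings shown in the Configure section (the exact prompt assembly inside Yollama may differ):

def build_autocomplete_request(text, sel_start, sel_end):
    """Return a (model, prompt) pair for Ollama's generate endpoint."""
    if sel_start != sel_end:
        # Codegen mode: the selection itself is the prompt; the model is expected
        # to continue it with the most likely code.
        return "codegemma:2b-code", text[sel_start:sel_end]
    # Infill mode: split the buffer at the cursor and fill the FIM template.
    template = "<|fim_prefix|>{prefix}<|fim_suffix|>{suffix}<|fim_middle|>"
    prefix, suffix = text[:sel_start], text[sel_start:]
    return "codegemma:2b-code", template.format(prefix=prefix, suffix=suffix)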

Configure

Open the settings from the command palette with Yollama: Settings, or from the settings menu.

The default configuration should look like:

{
    "url": "http://127.0.0.1:11434/api/generate",
    "model": "llama3.1",
    "codegen_model": "codegemma:2b-code",
    "infill_model": "codegemma:2b-code",
    "infill_prompt_template": "<|fim_prefix|>{prefix}<|fim_suffix|>{suffix}<|fim_middle|>"
}

  • url: An HTTP(S) URL to your running instance of Ollama. The default works for a standard Ollama installation; customize it if you run Ollama on a different host, or if you use an Ollama-compatible API from other software.
  • model: The model used for Ask anything and Ask about file or selection.
  • codegen_model: The model used for Yollama: Autocomplete in codegen mode. This must be a model that completes a prompt with the most likely code to follow it.
  • infill_model: The model used for Yollama: Autocomplete in infill mode. This must be a model that supports fill-in-the-middle (infill/FIM) completion.
  • infill_prompt_template: The template used for fill-in-the-middle completion. This varies between FIM models; the default supports CodeGemma. Use <PRE> {prefix} <SUF>{suffix} <MID> for CodeLlama models. The {prefix} and {suffix} placeholders are replaced with the beginning and end of the file, split at the cursor position (see the worked example below).
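
To make the substitution concrete, here is what the two templates mentioned above produce for a made-up prefix and suffix (purely illustrative values):

prefix = "def add(a, b):\n    return "
suffix = "\n"

codegemma = "<|fim_prefix|>{prefix}<|fim_suffix|>{suffix}<|fim_middle|>"
codellama = "<PRE> {prefix} <SUF>{suffix} <MID>"

print(codegemma.format(prefix=prefix, suffix=suffix))
# <|fim_prefix|>def add(a, b):
#     return <|fim_suffix|>
# <|fim_middle|>

print(codellama.format(prefix=prefix, suffix=suffix))
# <PRE> def add(a, b):
#     return  <SUF>
#  <MID>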

Non-features

  • Proactively searching for code completions: not implemented, as IMHO this is distracting.
  • Querying remote (non-Llama/Ollama) models: Yollama prefers to run locally and not send your code to random private companies.
  • Other shiny features: I try to keep it simple! (easier to maintain, easier to review, easier to configure)
  • Concurrency: Yollama does not expect you to send multiple queries at the same time.

About

If you're curious, you can also read how this project started.