NeoVim AI Coding Assistant w/Avante

I wanted to try more integrated AI coding assistance with NeoVim. In the past I tried various LLMs by asking them to generate full projects and seeing how they did, but I've realized that is too much to ask and too open-ended, leaving the LLM to make a lot of assumptions. No different than a Product Manager giving vague requirements.

I generated a simple app with Gemini 3 with the prompt: "create an express.js project that has user registration and profile editing". It created a typical one-file index.js app with no breakdown into functional areas in separate files.
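For reference, a single-file app of that shape crams everything into index.js: middleware, an in-memory user store, and every route handler in one place. A rough sketch of that shape (not the actual generated code):

const express = require('express');
const bcrypt = require('bcryptjs');

const app = express();
app.use(express.json());

const users = []; // in-memory "database" living in the same file as everything else

// registration route inline with the rest of the app
app.post('/register', async (req, res) => {
  const { username, email, password } = req.body;
  if (!username || !email || !password) {
    return res.status(400).json({ error: 'All fields are required' });
  }
  const passwordHash = await bcrypt.hash(password, 10);
  const user = { id: users.length + 1, username, email, passwordHash };
  users.push(user);
  res.status(201).json({ message: 'User registered', userId: user.id });
});

// profile editing, error handling, etc. all continue in this same file
app.listen(3000, () => console.log('listening on 3000'));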

So for the first dive into contextual AI assistance I loaded Avante (similar to Cursor) into my NeoVim, which was pretty straightforward. My Avante setup for LazyVim is at https://gitlab.com/geoffcorey/dotfiles/-/blob/master/.config/nvim/lua/plugins/avante.lua. I used the config from Avante's README, which used claude-sonnet-4-20250514.

Then I brought up the index.js file and opened Avante (aa). It asked for my Anthropic API key, so I bought $5 of credit, generated an API key, and put it in. In Avante I asked, "generate tests for /register route", and off it went: it made a task list of 8 things to plow through and chugged on that for about 5 minutes until it was done. I ran the tests and they all passed. It also separated code out of index.js into a new file, app.js. You can see all the code generation at https://gitlab.com/geoffcorey/simple
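Splitting app.js out of index.js is the usual pattern for making an Express app testable: app.js exports the configured app, index.js just starts the server, and the tests load the app directly so no real port is opened. Roughly (a hedged sketch; the actual generated code is in the repo above, and the test file name here is made up):

// app.js — builds and exports the Express app without listening on a port
const express = require('express');
const app = express();
app.use(express.json());
// ...routes (/register, /login, etc.) registered here...
module.exports = app;

// index.js — the only file that actually starts the server
const app = require('./app');
app.listen(3000, () => console.log('listening on 3000'));

// register.test.js — Jest + supertest style; supertest drives the app
// in-process, so the tests never bind a real port
const request = require('supertest');
const app = require('./app');

test('POST /register creates a user', async () => {
  const res = await request(app)
    .post('/register')
    .send({ username: 'alice', email: 'alice@example.com', password: 'secret' });
  expect(res.statusCode).toBe(201);
});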

Total cost of test generation was $1.04.

Next I switched to the Claude 4.5 model (claude-sonnet-4-5-20250929) and asked, "Add post /logout route to invalidate the JWT token." Claude added the route and additional tests for about $1.60.
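Since JWTs are stateless, "invalidating" one on logout usually means keeping a server-side blacklist and checking it in the auth middleware, which matches the "Token blacklist behavior" tests in the output further down. A hedged sketch of that approach (names and details assumed, not Claude's exact output):

const jwt = require('jsonwebtoken');

const tokenBlacklist = new Set(); // in-memory; fine for a demo app, Redis or similar in production

// auth middleware: reject missing, blacklisted, or invalid tokens
// (assumes `app` is the Express app from app.js and JWT_SECRET is set)
function authenticateToken(req, res, next) {
  const header = req.headers.authorization || '';
  const token = header.startsWith('Bearer ') ? header.slice(7) : null;
  if (!token) return res.status(401).json({ error: 'No token provided' });
  if (tokenBlacklist.has(token)) return res.status(401).json({ error: 'Token has been invalidated' });
  jwt.verify(token, process.env.JWT_SECRET, (err, payload) => {
    if (err) return res.status(403).json({ error: 'Invalid token' });
    req.user = payload;
    req.token = token;
    next();
  });
}

// logout: blacklist the presented token so any later request with it is rejected
app.post('/logout', authenticateToken, (req, res) => {
  tokenBlacklist.add(req.token);
  res.json({ message: 'Logged out successfully' });
});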

I have llama.cpp set up locally, and Avante is also configured to use that option, but my video card isn't up to snuff to handle it, so it drops back to CPU and takes forever. A new video card is on the way, so maybe I can try out local models then. Stay tuned.

Update 12/24/2025

New 9070 card installed. I used llama.cpp to offload to the AMD card and serve up the Qwen3 Coder model. Served the model via:

llama-server --model qwen3-code-reasoning-4b.Q4_K_M.gguf
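llama.cpp's server exposes an OpenAI-compatible API on port 8080 by default, which is what the llamacpp provider in the Avante config below points at. A quick way to confirm the endpoint is up before wiring it into the editor (a sketch using Node's built-in fetch; llama.cpp generally ignores the model name and answers with whatever model it has loaded):

// check-llamacpp.mjs — run with: node check-llamacpp.mjs (Node 18+)
const res = await fetch('http://localhost:8080/v1/chat/completions', {
  method: 'POST',
  headers: { 'Content-Type': 'application/json' },
  body: JSON.stringify({
    model: 'Qwen3-Code-Reasoning',
    messages: [{ role: 'user', content: 'Reply with one word: ready?' }],
    max_tokens: 16,
  }),
});
const data = await res.json();
console.log(data.choices[0].message.content);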

Then I set up LazyVim/Avante in .config/nvim/lua/plugins/avante.lua:

return {
  {
    "yetone/avante.nvim",
    -- if you want to build from source then do `make BUILD_FROM_SOURCE=true`
    -- ⚠️ must add this setting! ! !
    build = vim.fn.has("win32") ~= 0 and "powershell -ExecutionPolicy Bypass -File Build.ps1 -BuildFromSource false"
      or "make",
    event = "VeryLazy",
    version = false, -- Never set this value to "*"! Never!
    ---@module 'avante'
    ---@type avante.Config
    opts = {
      -- add any opts here
      -- this file can contain specific instructions for your project
      instructions_file = "avante.md",
      -- default provider
      provider = "llamacpp",
      providers = {
        llamacpp = {
          __inherited_from = "openai", -- Treat it as an OpenAI-compatible server
          endpoint = "http://localhost:8080/v1", -- Must match your server's host and port
          api_key_name = "no-key-required", -- Llama.cpp server typically doesn't need a key
          model = "Qwen3-Code-Reasoning", -- A descriptive name for the model you are running
          -- Optional: add extra request parameters
          extra_request_body = {
            temperature = 0.7,
            -- ... other parameters like top_k, top_p, etc.
          },
        },
        ollama = {
          model = "codellama", -- Replace with the specific model you pulled (e.g., mistral, llama2, etc.)
          endpoint = "http://localhost:11434", -- Ollama's default local address
          is_env_set = function()
            return true
          end,
          extra_request_body = {
            max_tokens = 8192,
            temperature = 0.5,
          },
          --You can adjust other settings like request_timeout, etc.
        },
        claude = {
          endpoint = "https://api.anthropic.com",
          model = "claude-sonnet-4-5-20250929",
          timeout = 30000, -- Timeout in milliseconds
          extra_request_body = {
            temperature = 0.75,
            max_tokens = 20480,
          },
        },
      },
    },
    dependencies = {
      "nvim-lua/plenary.nvim",
      "MunifTanjim/nui.nvim",
      --- The below dependencies are optional,
      "nvim-mini/mini.pick", -- for file_selector provider mini.pick
      "nvim-telescope/telescope.nvim", -- for file_selector provider telescope
      "hrsh7th/nvim-cmp", -- autocompletion for avante commands and mentions
      "ibhagwan/fzf-lua", -- for file_selector provider fzf
      "stevearc/dressing.nvim", -- for input provider dressing
      "folke/snacks.nvim", -- for input provider snacks
      "nvim-tree/nvim-web-devicons", -- or echasnovski/mini.icons
      "zbirenbaum/copilot.lua", -- for providers='copilot'
      {
        -- support for image pasting
        "HakonHarnes/img-clip.nvim",
        event = "VeryLazy",
        opts = {
          -- recommended settings
          default = {
            embed_image_as_base64 = false,
            prompt_for_file_name = false,
            drag_and_drop = {
              insert_mode = true,
            },
            -- required for Windows users
            use_absolute_path = true,
          },
        },
      },
      {
        -- Make sure to set this up properly if you have lazy=true
        "MeanderingProgrammer/render-markdown.nvim",
        opts = {
          file_types = { "markdown", "Avante" },
        },
        ft = { "markdown", "Avante" },
      },
    },
  },
}

I gave it the same prompt as the one in my Gemini-CLI article and it was very performant, cranking out code right away. However, the quality of the code was not great and it had a lot of errors, unlike Gemini CLI. It would take heavy editing to fix the issues, whereas the Gemini-generated code needed only minimal changes.

Update: 12/27/2025

Since I upgraded my computer with an AMD 9070, I decided to try llama.cpp to see how it would do refactoring existing code.

Started llama-cli:

llama-cli -hf unsloth/Qwen3-Coder-30B-A3B-Instruct-GGUF:Q4_K_XL --jinja -ngl 99 --threads -1 --ctx-size 32768 --temp 0.7 --min-p 0.0 --top-p 0.80 --top-k 20 --repeat-penalty 1.05

I added the test files and package.json and asked it to "Change testing framework from Jest to Node.js native test framework." It did a nice job. I tried integrating with Avante but had some difficulty.
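The mechanical part of that conversion is mostly swapping Jest's globals for explicit node:test imports and Jest's expect for node:assert, something like this (a hedged before/after sketch, not the exact diff in the commits below):

// Before (Jest): describe/test/expect are globals provided by the test runner
//
//   test('should login with valid credentials', async () => {
//     const res = await request(app).post('/login').send({ email: 'a@b.c', password: 'secret' });
//     expect(res.statusCode).toBe(200);
//     expect(res.body.token).toBeDefined();
//   });

// After (node:test): explicit imports, node:assert assertions, run with `node --test`
const { describe, test } = require('node:test');
const assert = require('node:assert');
const request = require('supertest');
const app = require('../app'); // path assumed; the tests were moved to their own directory

describe('POST /login', () => {
  test('should login with valid credentials', async () => {
    const res = await request(app)
      .post('/login')
      .send({ email: 'a@b.c', password: 'secret' });
    assert.strictEqual(res.statusCode, 200);
    assert.ok(res.body.token);
  });
});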

The new code is in these two commits, as I forgot to commit the new tests after moving them to a new directory:

Using locall LLM (89d5e9b8) · Commits · Geoff Corey / simple · GitLab
llama.cpp rewritten tests (80658e6e) · Commits · Geoff Corey / simple · GitLab
 $ node --test
▶ POST /login
  ▶ Successful login
    ✔ should login with valid credentials and return JWT token (98.477053ms)
    ✔ should return a valid JWT token that can be decoded (86.093453ms)
    ✔ should set token expiration to 1 hour (84.156964ms)
    ✔ should allow multiple logins for the same user (1139.832059ms)
    ✔ should include correct user ID in token payload (83.738578ms)
  ✔ Successful login (1492.909895ms)
  ▶ Invalid credentials
    ✔ should return 401 for non-existent email (1.430516ms)
    ✔ should return 401 for incorrect password (83.362375ms)
    ✔ should not reveal whether email exists or password is wrong (84.03863ms)
    ✔ should handle case-sensitive email (42.770058ms)
  ✔ Invalid credentials (211.855854ms)
  ▶ Validation errors
    ✔ should return 400 when email is missing (1.062438ms)
    ✔ should return 400 when password is missing (0.844469ms)
    ✔ should return error when both fields are missing (0.841772ms)
    ✔ should handle empty string email (0.753774ms)
    ✔ should handle empty string password (1.455005ms)
  ✔ Validation errors (5.123873ms)
  ▶ Integration with registration
    ✔ should successfully register and login a user (83.179901ms)
    ✔ should login multiple registered users independently (166.878327ms)
    ✔ should not allow login before registration (0.884875ms)
  ✔ Integration with registration (251.068068ms)
  ▶ Token usage
    ✔ should be able to access protected route with valid token (83.930578ms)
  ✔ Token usage (84.000534ms)
SyntaxError: Unexpected token 'i', "invalid json" is not valid JSON
    at JSON.parse (<anonymous>)
    at createStrictSyntaxError (/home/gcorey/src/bs/node_modules/body-parser/lib/types/json.js:116:10)
    at parse (/home/gcorey/src/bs/node_modules/body-parser/lib/types/json.js:68:15)
    at /home/gcorey/src/bs/node_modules/body-parser/lib/read.js:163:18
    at AsyncResource.runInAsyncScope (node:async_hooks:214:14)
    at invokeCallback (/home/gcorey/src/bs/node_modules/raw-body/index.js:238:16)
    at done (/home/gcorey/src/bs/node_modules/raw-body/index.js:227:7)
    at IncomingMessage.onEnd (/home/gcorey/src/bs/node_modules/raw-body/index.js:287:7)
    at IncomingMessage.emit (node:events:508:28)
    at endReadableNT (node:internal/streams/readable:1701:12)
  ▶ Error handling
    ✔ should handle malformed JSON (1.740157ms)
    ✔ should return JSON content-type (0.783904ms)
  ✔ Error handling (2.609928ms)
  ▶ Password security
    ✔ should handle special characters in password during login (82.73939ms)
    ✔ should handle unicode characters in password during login (82.926071ms)
    ✔ should reject password that differs by one character (82.841225ms)
  ✔ Password security (248.638282ms)
✔ POST /login (2296.599528ms)
▶ POST /logout
  ▶ Successful logout
    ✔ should logout successfully with valid token (101.062265ms)
    ✔ should invalidate the token after logout (86.735179ms)
    ✔ should not allow using invalidated token for any protected route (86.892964ms)
    ✔ should allow user to login again after logout (1135.288498ms)
    ✔ should handle multiple logouts for different users (169.772825ms)
    ✔ should return JSON content-type (83.651641ms)
  ✔ Successful logout (1664.084988ms)
  ▶ Logout without authentication
    ✔ should return 401 when no token is provided (0.997326ms)
    ✔ should return 401 when token is empty (1.506881ms)
    ✔ should return 403 when token is invalid (0.884655ms)
    ✔ should return 401 when Authorization header is malformed (0.795714ms)
  ✔ Logout without authentication (4.3873ms)
  ▶ Token blacklist behavior
    ✔ should add token to blacklist on logout (85.171767ms)
    ✔ should not allow logout with already invalidated token (83.864707ms)
    ✔ should maintain separate blacklist entries for different tokens (1138.312156ms)
  ✔ Token blacklist behavior (1307.479425ms)
  ▶ Integration flow
    ✔ should complete full register -> login -> access -> logout -> deny flow (84.948869ms)
  ✔ Integration flow (85.021628ms)
✔ POST /logout (3061.836644ms)
▶ POST /register
  ▶ Successful registration
    ✔ should register a new user with valid data (55.16912ms)
    ✔ should hash the password correctly (83.973143ms)
    ✔ should assign sequential user IDs (85.388966ms)
    ✔ should set default values for bio and location (42.37121ms)
    ✔ should set registeredAt timestamp (42.329527ms)
  ✔ Successful registration (309.888843ms)
  ▶ Validation errors
    ✔ should return 400 when username is missing (1.622775ms)
    ✔ should return 400 when email is missing (1.123144ms)
    ✔ should return 400 when password is missing (1.294332ms)
    ✔ should return 400 when all fields are missing (1.320781ms)
    ✔ should return 400 when username is empty string (0.993141ms)
    ✔ should return 400 when email is empty string (0.831038ms)
    ✔ should return 400 when password is empty string (0.854469ms)
  ✔ Validation errors (8.308809ms)
  ▶ Duplicate email handling
    ✔ should return 409 when email already exists (42.368809ms)
    ✔ should allow same username with different email (82.748911ms)
    ✔ should be case sensitive for email comparison (83.924618ms)
  ✔ Duplicate email handling (209.176233ms)
SyntaxError: Unexpected token 'i', "invalid json" is not valid JSON
    at JSON.parse (<anonymous>)
    at createStrictSyntaxError (/home/gcorey/src/bs/node_modules/body-parser/lib/types/json.js:116:10)
    at parse (/home/gcorey/src/bs/node_modules/body-parser/lib/types/json.js:68:15)
    at /home/gcorey/src/bs/node_modules/body-parser/lib/read.js:163:18
    at AsyncResource.runInAsyncScope (node:async_hooks:214:14)
    at invokeCallback (/home/gcorey/src/bs/node_modules/raw-body/index.js:238:16)
    at done (/home/gcorey/src/bs/node_modules/raw-body/index.js:227:7)
    at IncomingMessage.onEnd (/home/gcorey/src/bs/node_modules/raw-body/index.js:287:7)
    at IncomingMessage.emit (node:events:508:28)
    at endReadableNT (node:internal/streams/readable:1701:12)
  ▶ Error handling
    ✔ should handle malformed JSON (2.028495ms)
    ✔ should handle content-type application/json (40.804292ms)
    ✔ should handle additional fields gracefully (41.358326ms)
  ✔ Error handling (84.320887ms)
  ▶ Password security
    ✔ should not store password in plain text (45.998188ms)
    ✔ should use different salts for same passwords (82.759619ms)
    ✔ should handle special characters in password (81.620544ms)
    ✔ should handle unicode characters in password (81.731923ms)
  ✔ Password security (292.26365ms)
  ▶ Response format
    ✔ should return proper JSON content-type (41.169376ms)
    ✔ should return consistent response structure (41.092821ms)
  ✔ Response format (82.367613ms)
✔ POST /register (986.71206ms)
ℹ tests 61
ℹ suites 20
ℹ pass 61
ℹ fail 0
ℹ cancelled 0
ℹ skipped 0
ℹ todo 0
ℹ duration_ms 3168.626951