LocalLLM_VisualCodeTest

JavaScript benchmarking of local LLMs using llama.cpp / KoboldCpp

View the current results: this is the output from the current prompts.

Visual LLM Benchmark

License: MIT

Ready to unleash the power of your local Large Language Models? 🔥

This project provides a powerful and flexible Python suite to systematically benchmark multiple .gguf language models running locally via the fantastic KoboldCpp backend. Pit your models against each other using your custom prompts (especially geared towards JavaScript generation in this setup!) and see how they perform head-to-head on your hardware!

Stop guessing, start measuring! 📊

✨ Features ✨

How it Works

  1. Configure: Set your paths and KoboldCpp settings in config.py (see Setup & Configuration below).
  2. Discover: The script scans your specified directories for .gguf models (within size limits) and .md prompt files.
  3. Launch & Loop:
    • For each model found:
      • It launches a dedicated KoboldCpp instance with the specified arguments and the current model.
      • It waits for the KoboldCpp server and model to be fully loaded and ready.
      • For each prompt found:
        • It checks if results already exist. If so, it skips.
        • It reads the prompt content and applies any model-specific filters.
        • It constructs the API payload, applying model-specific parameters.
        • It sends the generation request to the KoboldCpp API.
        • If the request times out, it attempts a fallback check to retrieve partial results.
        • It saves the generated text (plus timing info) to a unique .md file in the results directory.
      • It shuts down the KoboldCpp instance for the current model.
      • It waits briefly before starting the next model.
  4. Extract (Optional): Run extract_html.py to scan the results folder and pull out any complete HTML blocks into .html files for easy browser viewing.
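The launch-and-loop flow above can be sketched roughly as follows. This is an illustrative outline, not the project's actual code: the helper names (result_path, wait_for_server), the port, the readiness check against the /api/v1/model endpoint, and the fixed max_length are all assumptions, and only the KoboldCpp backend is shown. The real logic lives in run_benchmarks.py and reads its settings from config.py.

```python
import json
import subprocess
import time
import urllib.error
import urllib.request
from pathlib import Path

API_URL = "http://localhost:5001/api/v1/generate"  # must match --port


def result_path(results_dir: Path, model: Path, prompt: Path) -> Path:
    """One unique .md output file per (model, prompt) pair."""
    return results_dir / f"{model.stem}__{prompt.stem}.md"


def wait_for_server(base: str = "http://localhost:5001", timeout: int = 300) -> bool:
    """Poll the KoboldCpp API until the model is loaded (or we give up)."""
    deadline = time.time() + timeout
    while time.time() < deadline:
        try:
            urllib.request.urlopen(f"{base}/api/v1/model", timeout=5)
            return True
        except (urllib.error.URLError, OSError):
            time.sleep(5)
    return False


def run_benchmarks(model_dir: Path, prompt_dir: Path, results_dir: Path) -> None:
    for model in sorted(model_dir.glob("*.gguf")):
        # Launch a dedicated KoboldCpp instance for this model.
        proc = subprocess.Popen(
            ["python", "koboldcpp.py", str(model), "--port", "5001"]
        )
        try:
            if not wait_for_server():
                continue  # model never came up; move on
            for prompt in sorted(prompt_dir.glob("*.md")):
                out = result_path(results_dir, model, prompt)
                if out.exists():
                    continue  # results already exist; skip this prompt
                payload = {"prompt": prompt.read_text(), "max_length": 2048}
                req = urllib.request.Request(
                    API_URL,
                    data=json.dumps(payload).encode(),
                    headers={"Content-Type": "application/json"},
                )
                with urllib.request.urlopen(req, timeout=900) as resp:
                    text = json.load(resp)["results"][0]["text"]
                out.write_text(text)
        finally:
            proc.terminate()  # shut down this model's KoboldCpp instance
            proc.wait()
            time.sleep(5)     # brief pause before the next model
```

The important structural point is the try/finally: whatever happens mid-run, the current model's KoboldCpp process is always terminated before the next one starts.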

Getting Started

Prerequisites

  • Python 3 installed and on your PATH.
  • A working KoboldCpp checkout (koboldcpp.py) and/or a llama.cpp server build.
  • One or more .gguf model files and .md prompt files.

Setup & Configuration

  1. Clone the Repository:
    git clone https://github.com/electricazimuth/LocalLLM_VisualCodeTest.git # Replace with your repo URL
    cd LocalLLM_VisualCodeTest
    
  2. ❗ Configure config.py ❗: Open config.py in a text editor and carefully update the following paths and settings near the top of the file to match your setup:
    • KOBOLDCPP_SCRIPT: Absolute path to your koboldcpp.py script.
    • MODEL_DIR: Absolute path to the directory containing your .gguf models.
    • PROMPT_DIR: Absolute path to the directory containing your .md prompt files.
    • RESULTS_DIR: Path where the benchmark results (.md files) will be saved.
    • KOBOLDCPP_ARGS: Crucial! Adjust these arguments for your hardware and KoboldCpp setup.
      • Pay special attention to --usecublas (or --useclblast, etc.) and GPU layer settings (--gpulayers). Ensure the --port matches the API_URL.
      • Tip: Start with conservative settings (e.g., fewer GPU layers) and increase if stable.
    • MAX_SIZE_BYTES / MIN_SIZE_BYTES: Filter models by file size if needed.
    • API_PAYLOAD_TEMPLATE: Modify default generation parameters (temperature, top_p, max_length, etc.) if desired.
    • SERVER_STARTUP_WAIT: Increase if your models take longer to load.
    • PRIMARY_API_TIMEOUT: Increase if you expect very long generation times.
  3. Prepare Your Models & Prompts: Ensure your .gguf files are in the MODEL_DIR and your .md prompt files are in the PROMPT_DIR.
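For orientation, the settings listed above might look something like the sketch below. Every path and value here is a placeholder to replace with your own setup; the option names come from the list above, but the exact shape of the real config.py may differ.

```python
# Illustrative config.py -- all paths and numbers are placeholders.

KOBOLDCPP_SCRIPT = "/home/me/koboldcpp/koboldcpp.py"  # absolute path
MODEL_DIR = "/home/me/models"    # folder of .gguf files
PROMPT_DIR = "/home/me/prompts"  # folder of .md prompt files
RESULTS_DIR = "results"          # benchmark output (.md files)

KOBOLDCPP_ARGS = [
    "--usecublas",        # or --useclblast, etc., for your hardware
    "--gpulayers", "35",  # start conservative, raise while stable
    "--port", "5001",     # must match the port in API_URL
]
API_URL = "http://localhost:5001/api/v1/generate"

MIN_SIZE_BYTES = 1 * 1024**3   # skip models under 1 GB
MAX_SIZE_BYTES = 20 * 1024**3  # skip models over 20 GB

API_PAYLOAD_TEMPLATE = {
    "temperature": 0.7,
    "top_p": 0.9,
    "max_length": 2048,
}

SERVER_STARTUP_WAIT = 120   # seconds to allow for model loading
PRIMARY_API_TIMEOUT = 900   # seconds before the partial-result fallback
```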

Usage

Running the Benchmarks

  1. Navigate to the project directory in your terminal.
  2. Execute the main script, passing the backend to use (either "llamacpp" or "koboldcpp"):
    python run_benchmarks.py --backend llamacpp
    
  3. For long runs, it’s highly recommended to use nohup (on Linux/macOS) to prevent the process from stopping if you close the terminal:
    nohup python run_benchmarks.py --backend llamacpp > runbench.log 2>&1 &
    

    This will run the script in the background and log all output to runbench.log. You can monitor the log using tail -f runbench.log.

  4. Watch the console (or log file) for progress updates! The script will print which model and prompt it’s currently processing, timings, and any errors.

Extracting HTML Results

  1. After the benchmarks have generated some .md files, confirm that your results directory exists and contains them.
  2. Make sure extract_html.py is configured correctly (the SOURCE_FOLDER_NAME should match your RESULTS_DIR name, default is “results”).
  3. Run the extraction script from the project root directory:
    python extract_html.py
    
  4. Check your results directory – you should now see corresponding .html files for any markdown files that contained valid <!DOCTYPE html>...</html> blocks. Open them in your browser!
  5. To generate a static viewer, use static_viewer.php.
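The extraction step can be approximated like this. The regex and function names are illustrative assumptions, not necessarily what extract_html.py actually does; it simply pulls the first complete <!DOCTYPE html>...</html> block out of each result file, as described above.

```python
# Sketch of the HTML-extraction pass over the results folder.
import re
from pathlib import Path
from typing import Optional

SOURCE_FOLDER_NAME = "results"  # should match your RESULTS_DIR name

# Match a full HTML document, non-greedily, across newlines.
HTML_BLOCK = re.compile(r"<!DOCTYPE html>.*?</html>", re.DOTALL | re.IGNORECASE)


def extract_html(text: str) -> Optional[str]:
    """Return the first complete HTML document found in the markdown, if any."""
    match = HTML_BLOCK.search(text)
    return match.group(0) if match else None


def main() -> None:
    for md in Path(SOURCE_FOLDER_NAME).glob("*.md"):
        html = extract_html(md.read_text(encoding="utf-8"))
        if html:
            # Write foo.html next to foo.md for easy browser viewing.
            md.with_suffix(".html").write_text(html, encoding="utf-8")


if __name__ == "__main__":
    main()
```

Markdown files without a complete HTML block are simply left alone, which is why only some results gain a matching .html file.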

📊 Results Interpretation

🛠️ Customization & Filtering


Happy Benchmarking! 🎉