Capability · Framework — inference

ctransformers

ctransformers by Ravindra Marella was one of the first Python wrappers around GGML inference, covering a variety of model families (Llama, Falcon, MPT, GPT-2, StarCoder) with a simple `AutoModelForCausalLM` API and LangChain compatibility. Development has slowed in favour of llama-cpp-python, but ctransformers is still useful for older GGML models and minimal-dependency environments.

Framework facts

Category
inference
Language
Python / C++
License
MIT
Repository
https://github.com/marella/ctransformers

Install

pip install ctransformers

Quickstart

from ctransformers import AutoModelForCausalLM

# Weights are downloaded from the Hugging Face Hub on first use.
llm = AutoModelForCausalLM.from_pretrained(
    'TheBloke/Llama-2-7B-Chat-GGML',
    model_type='llama',  # which GGML architecture to load
    gpu_layers=50,       # offload layers to GPU; needs `pip install ctransformers[cuda]`
)
print(llm('Hello, '))
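The same call interface also supports token streaming and common sampling parameters. A minimal sketch (the model repo and parameter values are illustrative, and the weights are fetched from the Hugging Face Hub on first use):

```python
from ctransformers import AutoModelForCausalLM

# Illustrative GGML checkpoint; downloaded on first use.
llm = AutoModelForCausalLM.from_pretrained(
    'TheBloke/Llama-2-7B-Chat-GGML',
    model_type='llama',
)

# With stream=True the call yields text chunks as they are generated
# instead of returning a single string at the end.
for chunk in llm(
    'Explain GGML quantization in one sentence.',
    max_new_tokens=64,   # cap response length
    temperature=0.7,     # sampling temperature
    stream=True,
):
    print(chunk, end='', flush=True)
```

Streaming is useful in chat-style UIs where you want tokens to appear as they are produced rather than after the full completion.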

Alternatives

  • llama-cpp-python — actively maintained
  • Ollama — high-level
  • mlc-llm — cross-platform

Frequently asked questions

Should I start a new project with ctransformers?

Usually no — llama-cpp-python is more actively developed and has broader model support via GGUF. Stick with ctransformers if you already rely on its LangChain integration or very old GGML weights.
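If the LangChain integration is what keeps you on ctransformers, the wrapper is exposed through LangChain's community package. A sketch assuming `langchain-community` is installed alongside ctransformers (the model repo and config values are illustrative):

```python
# Assumes `pip install langchain-community ctransformers`.
from langchain_community.llms import CTransformers

llm = CTransformers(
    model='marella/gpt-2-ggml',  # illustrative GGML repo on the Hub
    model_type='gpt2',
    config={'max_new_tokens': 64, 'temperature': 0.7},
)
print(llm.invoke('AI is going to'))
```

This makes the model usable anywhere LangChain expects an LLM, e.g. inside chains or agents.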

Does it support GGUF?

Partial — GGUF support was added late and only for some model types (notably Llama), and it lags behind llama.cpp. For broad, current GGUF coverage, use llama-cpp-python; otherwise stick to GGML-format weights with ctransformers.

Sources

  1. ctransformers GitHub — accessed 2026-04-20
  2. ctransformers PyPI — accessed 2026-04-20