Capability · Framework — inference

ctransformers

ctransformers by Ravindra Marella was one of the first Python wrappers around GGML inference, covering a variety of model families (Llama, Falcon, MPT, GPT-2, StarCoder) with a simple `AutoModelForCausalLM` API and LangChain compatibility. Development has slowed in favour of llama-cpp-python, but ctransformers is still useful for older GGML models and minimal-dependency environments.

Framework facts

Category
inference
Language
Python / C++
License
MIT
Repository
https://github.com/marella/ctransformers

Install

pip install ctransformers

Quickstart

from ctransformers import AutoModelForCausalLM

# Weights are downloaded from the Hugging Face Hub on first use.
llm = AutoModelForCausalLM.from_pretrained(
    'TheBloke/Llama-2-7B-Chat-GGML',
    model_type='llama',  # which GGML architecture to load
    gpu_layers=50,       # offload layers to GPU; needs `pip install ctransformers[cuda]`
)
print(llm('Hello, '))
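The same call interface also supports token streaming and common sampling parameters. A minimal sketch (the model repo and parameter values are illustrative, and the weights are fetched from the Hugging Face Hub on first use):

```python
from ctransformers import AutoModelForCausalLM

# Illustrative GGML checkpoint; downloaded on first use.
llm = AutoModelForCausalLM.from_pretrained(
    'TheBloke/Llama-2-7B-Chat-GGML',
    model_type='llama',
)

# With stream=True the call yields text chunks as they are generated
# instead of returning a single string at the end.
for chunk in llm(
    'Explain GGML quantization in one sentence.',
    max_new_tokens=64,   # cap response length
    temperature=0.7,     # sampling temperature
    stream=True,
):
    print(chunk, end='', flush=True)
```

Streaming is useful in chat-style UIs where you want tokens to appear as they are produced rather than after the full completion.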

Alternatives

  • llama-cpp-python — actively maintained
  • Ollama — high-level
  • mlc-llm — cross-platform

Frequently asked questions

Should I start a new project with ctransformers?

Usually no — llama-cpp-python is more actively developed and has broader model support via GGUF. Stick with ctransformers if you already rely on its LangChain integration or very old GGML weights.
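If the LangChain integration is what keeps you on ctransformers, the wrapper is exposed through LangChain's community package. A sketch assuming `langchain-community` is installed alongside ctransformers (the model repo and config values are illustrative):

```python
# Assumes `pip install langchain-community ctransformers`.
from langchain_community.llms import CTransformers

llm = CTransformers(
    model='marella/gpt-2-ggml',  # illustrative GGML repo on the Hub
    model_type='gpt2',
    config={'max_new_tokens': 64, 'temperature': 0.7},
)
print(llm.invoke('AI is going to'))
```

This makes the model usable anywhere LangChain expects an LLM, e.g. inside chains or agents.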

Does it support GGUF?

Partial — GGUF support was added late and only for some model types (notably Llama), and it lags behind llama.cpp. For broad, current GGUF coverage, use llama-cpp-python; otherwise stick to GGML-format weights with ctransformers.

Sources

  1. ctransformers GitHub — accessed 2026-04-20
  2. ctransformers PyPI — accessed 2026-04-20