Fine-Tune an Open-Source LLM on Custom Data (Low-Cost Walkthrough)

Large Language Models (LLMs) like GPT-4, Claude, and Gemini are incredibly powerful, but what if you want your model to sound like your brand, understand your documents, or perform your specialised tasks?

That’s where fine-tuning comes in. Fine-tuning lets you train an existing LLM on your own data, giving it new domain knowledge or a unique conversational style, without starting from scratch. And the best part? You can do it on a budget, thanks to open-source models like Llama 3, Mistral, Falcon, and Gemma, combined with lightweight techniques like LoRA and QLoRA.

In this step-by-step guide, you’ll learn how to fine-tune an open-source LLM on your own dataset — affordably and effectively.


Why Fine-Tune Instead of Starting Fresh?

Training an LLM from scratch can cost millions in GPU hours and data collection. Fine-tuning, on the other hand, is like giving an already-trained model a “masterclass” in your specific domain.

For example:

  • A healthcare startup can fine-tune an LLM on medical research papers.
  • A SaaS company can fine-tune it on customer support tickets.
  • A marketing team can fine-tune it on brand tone and writing style.

The result is highly targeted performance at a fraction of the cost and time of training from scratch.



Step 1. Choose Your Base Model

You’ll want an open-source model that fits your hardware and target use case. Here are a few great starting points:

Model            Size      Ideal Use Case
Llama 3 (Meta)   8B–70B    General-purpose reasoning
Mistral 7B       7B        Fast, efficient, and multilingual
Gemma (Google)   2B–7B     Lightweight and easy to deploy
Falcon 7B        7B        Open-weight, strong for instruction tuning

For this walkthrough, we’ll use Mistral 7B, as it’s small, powerful, and freely available on Hugging Face.


Step 2. Prepare Your Custom Dataset

Fine-tuning data is typically in JSONL (JSON Lines) format, where each line contains an instruction, input, and output.

Example:

{"instruction": "Summarize the following article.", "input": "The Indian space program achieved...", "output": "ISRO successfully completed..."}
{"instruction": "Answer customer query about billing", "input": "How can I update my payment method?", "output": "You can update it under Account > Billing Settings."}

👉 Keep your data clean, concise, and aligned with the task you’re optimising for (e.g., summarisation, Q&A, dialogue).
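
Before training, it’s also worth a quick sanity check that every line parses and contains the expected keys. A minimal sketch, assuming the file is saved as custom_data.jsonl:

import json

# Verify each line is valid JSON with the three expected fields.
with open("custom_data.jsonl") as f:
    for i, line in enumerate(f, 1):
        record = json.loads(line)  # raises on malformed JSON
        missing = {"instruction", "input", "output"} - record.keys()
        assert not missing, f"line {i} is missing {missing}"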



Step 3. Environment Setup

Install the required dependencies:

pip install transformers datasets peft bitsandbytes accelerate
  • transformers: Core model interface
  • datasets: For loading your dataset
  • peft: For efficient fine-tuning using LoRA
  • bitsandbytes: Enables quantised training (saves GPU memory)
  • accelerate: Handles distributed/optimised training
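
These all sit on top of PyTorch, so make sure you have a CUDA-enabled build installed (pytorch.org has the right command for your setup). A quick check that your GPU is visible:

import torch

print(torch.cuda.is_available())      # should print True on a CUDA machine
print(torch.cuda.get_device_name(0))  # e.g. "NVIDIA GeForce RTX 3090"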

Step 4. Load the Model and Tokeniser

You can load the base model from Hugging Face easily:

from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig

model_name = "mistralai/Mistral-7B-v0.1"
tokenizer = AutoTokenizer.from_pretrained(model_name)

# Load the weights in 8-bit precision via bitsandbytes to cut GPU memory use.
model = AutoModelForCausalLM.from_pretrained(
    model_name,
    quantization_config=BitsAndBytesConfig(load_in_8bit=True),
    device_map="auto"
)

This loads the weights in 8-bit precision, significantly reducing GPU memory usage.


Step 5. Apply LoRA for Efficient Fine-Tuning

Instead of updating all of the model’s parameters, LoRA (Low-Rank Adaptation) freezes the base weights and trains a small set of low-rank matrices injected into selected layers, drastically reducing cost and time.

from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

# Recommended when training on top of quantised (8-bit/4-bit) weights:
# casts norm layers for stability and prepares the model for training.
model = prepare_model_for_kbit_training(model)

lora_config = LoraConfig(
    r=16,                                 # rank of the low-rank update matrices
    lora_alpha=32,                        # scaling factor applied to the updates
    target_modules=["q_proj", "v_proj"],  # attach LoRA to the attention query/value projections
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM"
)

model = get_peft_model(model, lora_config)
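
To see how little of the model you’re actually training, PEFT can report the trainable parameter count:

# With LoRA on q_proj/v_proj, this is typically well under 1% of all parameters.
model.print_trainable_parameters()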

Step 6. Load Your Dataset

Load your prepared JSONL dataset:

from datasets import load_dataset

dataset = load_dataset("json", data_files="custom_data.jsonl")
train_data = dataset["train"].shuffle(seed=42)
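
The Trainer in the next step expects tokenised input_ids rather than raw JSON fields, so flatten each record into a single prompt string and tokenise it. This is a minimal sketch: the prompt template and the max_length of 512 are assumptions to adapt to your data.

def format_and_tokenize(example):
    # Flatten instruction/input/output into one prompt string (template is an assumption).
    prompt = (
        f"### Instruction:\n{example['instruction']}\n\n"
        f"### Input:\n{example['input']}\n\n"
        f"### Response:\n{example['output']}"
    )
    return tokenizer(prompt, truncation=True, max_length=512)

tokenizer.pad_token = tokenizer.eos_token  # Mistral's tokenizer has no pad token by default
train_data = train_data.map(format_and_tokenize, remove_columns=train_data.column_names)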

Step 7. Fine-Tune the Model

Now, it’s time to train:

from transformers import TrainingArguments, Trainer, DataCollatorForLanguageModeling

training_args = TrainingArguments(
    output_dir="./mistral-finetuned",
    per_device_train_batch_size=4,
    gradient_accumulation_steps=4,   # effective batch size of 16
    learning_rate=2e-4,
    num_train_epochs=3,
    fp16=True,
    logging_steps=50,
    save_total_limit=2
)

trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_data,
    # mlm=False gives standard causal-LM labels (predict the next token).
    data_collator=DataCollatorForLanguageModeling(tokenizer, mlm=False)
)

trainer.train()

This setup combines mixed precision (FP16), gradient accumulation (an effective batch size of 4 × 4 = 16), and LoRA to keep training fast and affordable, even on a single GPU like an RTX 3090.
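
If you want a local copy before deciding whether to publish, you can save just the LoRA adapter, which is usually only tens of megabytes:

# save_pretrained on a PEFT model writes only the adapter weights, not the 7B base.
model.save_pretrained("./mistral-finetuned-adapter")
tokenizer.save_pretrained("./mistral-finetuned-adapter")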


Step 8. Test Your Fine-Tuned Model

After training, you can test how well it performs:

prompt = "Explain the refund process in simple terms."
inputs = tokenizer(prompt, return_tensors="pt").to("cuda")
outputs = model.generate(**inputs, max_new_tokens=150)
print(tokenizer.decode(outputs[0], skip_special_tokens=True))

Compare outputs before and after fine-tuning. You’ll notice how your model becomes more domain-specific and natural for your use case.


Step 9. Optimise for Cost

Fine-tuning doesn’t need expensive infrastructure.
Here are a few low-cost strategies:

  • Use LoRA/QLoRA instead of full fine-tuning.
  • Run on Google Colab Pro, Lambda Cloud, or RunPod.
  • Use 4-bit quantisation (load_in_4bit=True via BitsAndBytesConfig) for smaller GPUs (see the sketch below).
  • Fine-tune fewer epochs with high-quality data.

Even with a $20–$50 GPU rental, you can achieve excellent specialised results.
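
To illustrate the 4-bit option above, here is how Step 4’s loading code changes for QLoRA-style training; the nf4 quantisation type and bfloat16 compute dtype are common defaults, not requirements:

import torch
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",              # "normal float 4", the QLoRA paper's default
    bnb_4bit_compute_dtype=torch.bfloat16   # compute in bf16 while storing weights in 4-bit
)

model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-v0.1",
    quantization_config=bnb_config,
    device_map="auto"
)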


Step 10. Save and Deploy Your Model

Once trained, you can push your model to Hugging Face or run it locally:

# On a PEFT model, push_to_hub uploads the small LoRA adapter, not the full 7B weights.
model.push_to_hub("your-username/mistral-custom-finetuned")
tokenizer.push_to_hub("your-username/mistral-custom-finetuned")

Or, integrate it with LangChain or a FastAPI endpoint to build your custom AI app, as in the sketch below.
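
As one example, here is a minimal FastAPI serving sketch. The repo name is the hypothetical one used above, and AutoPeftModelForCausalLM (from peft) re-attaches the pushed LoRA adapter to its base model at load time:

from fastapi import FastAPI
from pydantic import BaseModel
from peft import AutoPeftModelForCausalLM
from transformers import AutoTokenizer

model_id = "your-username/mistral-custom-finetuned"  # hypothetical repo from above
tokenizer = AutoTokenizer.from_pretrained(model_id)
# Downloads the adapter, resolves its base model, and loads both together.
model = AutoPeftModelForCausalLM.from_pretrained(model_id, device_map="auto")

app = FastAPI()

class Query(BaseModel):
    prompt: str

@app.post("/generate")
def generate(query: Query):
    inputs = tokenizer(query.prompt, return_tensors="pt").to(model.device)
    outputs = model.generate(**inputs, max_new_tokens=150)
    return {"response": tokenizer.decode(outputs[0], skip_special_tokens=True)}

Run it with uvicorn main:app (assuming the file is saved as main.py) and POST a JSON body like {"prompt": "Explain the refund process."} to /generate.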


Final Thoughts

You’ve just fine-tuned an open-source LLM on your own data, without massive costs or a data centre! With LoRA and tools like Hugging Face Transformers, customising AI for your business or project has never been easier. Whether you’re building a medical assistant, a legal chatbot, or a brand voice generator, fine-tuning puts true personalisation within reach.

