Soramai · Docs

Fine-tuning.

LoRA fine-tuning on open base models. Pick a tier, point at a dataset, set a step count, and watch the run live. Soramai handles GPU provisioning, monitoring, and shutdown.

Fine-tuning tiers

Each tier is a (base model, GPU) pairing tuned for a quality / cost trade-off.

Tier	Base model	GPU	Use when
Beginner	Qwen 2.5 7B	A100 80GB	Prototypes, single-task fine-tunes, fast iteration
Pro	Qwen 2.5 14B / Mistral 7B	A100 80GB	Production workloads with moderate quality demands
Elite	Llama 3.1 70B / Qwen 2.5 72B	H100 80GB	Highest-quality tier; only worthwhile with 100K+ examples
Image · FLUX	FLUX.1 dev	H100 80GB	Best-in-class image LoRA fine-tuning for product photography, illustration, character likeness
Image · SDXL	Stable Diffusion XL	A100 80GB	Faster + cheaper image LoRAs when FLUX's quality isn't required

Live tier availability is shown on the fine-tuning page. A tier may be temporarily disabled if the operator has taken it down for maintenance.

LoRA hyperparameters

Soramai ships with sensible defaults. Most users never change them, but the knobs are exposed for advanced runs.

Step count

How many optimiser steps to run. 100 is a quick recipe-check. 500 is the standard production setting. 1000+ is for high-data regimes (10K+ rows).

Rank (r)

LoRA matrix rank. Default 16. A higher rank gives the adapter more capacity but uses more VRAM. Most tasks don’t benefit beyond 32.
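
To make the capacity/VRAM trade-off concrete, here is a minimal sketch of how the trainable parameter count grows with rank. It assumes the standard LoRA layout (two low-rank factors per adapted weight matrix); the 4096 x 4096 matrix size is a hypothetical example, not a figure from any Soramai tier.

```python
def lora_params(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters LoRA adds to one d_out x d_in weight matrix.

    Standard LoRA learns two factors: A with shape (r, d_in) and
    B with shape (d_out, r), so the count grows linearly with r.
    """
    return r * (d_in + d_out)


# Hypothetical 4096 x 4096 projection, for illustration only.
for r in (8, 16, 32, 64):
    print(f"r={r:>2}: {lora_params(4096, 4096, r):,} parameters per matrix")
```

Doubling the rank doubles the adapter parameters (and the optimiser state that goes with them), which is why larger ranks cost more VRAM.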

Alpha

Scaling factor on the LoRA delta. Default 32 (= 2× r). Standard choice; rarely needs changing.
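
As a rough illustration of what the scaling factor does, the sketch below applies a LoRA update using the common convention W' = W + (alpha / r) * B A. That convention comes from the original LoRA formulation; whether Soramai's trainer uses exactly this form is an assumption. The tiny matrices are made up for the example.

```python
def apply_lora(W, A, B, alpha: float, r: int):
    """Return W + (alpha / r) * (B @ A) using plain nested lists.

    W: d_out x d_in base weights
    A: r x d_in, B: d_out x r (the low-rank LoRA factors)
    """
    scale = alpha / r
    return [
        [
            W[i][j] + scale * sum(B[i][k] * A[k][j] for k in range(r))
            for j in range(len(W[0]))
        ]
        for i in range(len(W))
    ]


# With the defaults (alpha=32, r=16) the delta is multiplied by 2.
print(32 / 16)

# Toy rank-1 example with made-up numbers.
W = [[1.0, 0.0], [0.0, 1.0]]
A = [[0.5, 0.5]]
B = [[1.0], [0.0]]
print(apply_lora(W, A, B, alpha=2.0, r=1))
```

Keeping alpha at 2× r means the effective scale stays at 2 when you change the rank, which is one reason the default rarely needs touching.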

Learning rate

Default 2.0e-4 with a cosine schedule and a 120-step warm-up. Use a lower rate (5.0e-5) for very small datasets and a higher rate (5.0e-4) for very large ones.
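
The schedule described above can be sketched as follows. This assumes a linear warm-up to the peak rate followed by cosine decay to zero; the exact warm-up shape and the final learning rate are not stated in these docs, so treat both as assumptions. The function name is illustrative.

```python
import math


def learning_rate(step: int, total_steps: int,
                  peak_lr: float = 2.0e-4, warmup_steps: int = 120) -> float:
    """Linear warm-up to peak_lr, then cosine decay to zero (assumed shape)."""
    if step < warmup_steps:
        # Ramp linearly so the last warm-up step reaches peak_lr.
        return peak_lr * (step + 1) / warmup_steps
    # Fraction of the post-warm-up phase that has elapsed, clamped to 1.
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * min(1.0, progress)))


# Sample the curve for a 500-step run.
for step in (0, 60, 119, 120, 310, 499):
    print(step, f"{learning_rate(step, total_steps=500):.2e}")
```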

Batch size

Per-device batch size. Default 32. The upper bound is set by the base model and the GPU's VRAM; Soramai picks the largest batch that fits.
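
Step count and batch size together determine how many times the run sees your dataset. The sketch below is a back-of-the-envelope calculation that assumes a single GPU and no gradient accumulation, neither of which is confirmed by these docs.

```python
def passes_over_data(steps: int, batch_size: int, rows: int) -> float:
    """Approximate number of epochs: examples consumed / dataset size.

    Assumes one device and no gradient accumulation (an assumption,
    not a documented Soramai behaviour).
    """
    return steps * batch_size / rows


# 500 steps at the default batch of 32 over a 1,000-row dataset.
print(passes_over_data(steps=500, batch_size=32, rows=1000))  # 16.0
```

If the number of passes looks very high for a small dataset, a lower step count is usually the first thing to try.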

Sequence length

Maximum tokens per row, fixed at 2,048. Longer examples are truncated; the truncation count is reported in the fine-tuning log.
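
To check your own data against the limit before uploading, something like the following sketch can help. It assumes rows are already tokenised into lists of token ids and that truncation drops tokens from the end; which end Soramai actually truncates is not stated here, so that is an assumption.

```python
MAX_SEQ_LEN = 2048  # fixed sequence length from the docs


def truncate_rows(rows, max_len: int = MAX_SEQ_LEN):
    """Clip each tokenised row to max_len and count how many were clipped."""
    clipped = 0
    out = []
    for tokens in rows:
        if len(tokens) > max_len:
            clipped += 1
            tokens = tokens[:max_len]  # assumed: keep the first max_len tokens
        out.append(tokens)
    return out, clipped


# Two toy rows: one short, one over the limit.
rows = [list(range(10)), list(range(3000))]
out, clipped = truncate_rows(rows)
print(clipped, [len(r) for r in out])  # 1 [10, 2048]
```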

Live monitoring

Every fine-tuning run is observable in real time. Logs stream from the pod via R2 with incremental byte-range fetching.

What you see

· Live log tail (step, loss, lr, grad norm)
· Coin-burn meter (live per-second)
· Total elapsed time
· Current phase (Starting → Preparing → Running → Saving)
· Final adapter size + metrics on completion

What you don't see

· Pod IP / SSH access (Soramai-only)
· Container logs from non-fine-tuning processes
· Hardware temperature / fan speed
· Cross-user fine-tuning queue

You can safely close the browser tab during a fine-tuning run. The worker keeps running, logs persist to R2, and billing continues against your wallet. When you reopen the fine-tuning page, the live log resumes from where you left off via byte-range fetch, with no re-download of the full file.
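
The resume behaviour relies on standard HTTP range requests. The sketch below shows the general technique: remember how many bytes you have already read and ask only for the rest. It is not Soramai's client code; the URL, the `LogTail` class, and the helper names are hypothetical.

```python
import urllib.error
import urllib.request


def range_header(offset: int) -> dict:
    """Ask the server for everything from `offset` to the end of the file."""
    return {"Range": f"bytes={offset}-"}


def fetch_new_bytes(url: str, offset: int) -> bytes:
    """Fetch only the bytes appended since `offset`."""
    req = urllib.request.Request(url, headers=range_header(offset))
    try:
        with urllib.request.urlopen(req) as resp:
            body = resp.read()
            # 206 = partial content; 200 = server ignored the Range header.
            return body if resp.status == 206 else body[offset:]
    except urllib.error.HTTPError as err:
        if err.code == 416:  # range not satisfiable: nothing new yet
            return b""
        raise


class LogTail:
    """Tracks the read offset so each poll returns only new log text."""

    def __init__(self, url: str, fetch=fetch_new_bytes):
        self.url = url
        self.offset = 0
        self.fetch = fetch

    def poll(self) -> str:
        chunk = self.fetch(self.url, self.offset)
        self.offset += len(chunk)
        return chunk.decode("utf-8", errors="replace")
```

Persisting `offset` (for example in browser storage or on disk) is what allows a reopened session to pick up where it stopped instead of re-reading the whole log.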

Refund policy for platform faults

If a fine-tuning run fails because of something Soramai broke, you get the coins back automatically. No support ticket required.

· Pod terminated unexpectedly (e.g. RunPod eviction): full refund of the run’s billed time, issued by the training-reaper cron within ~5 minutes.
· Adapter validation failed (the worker reported success but uploaded a malformed or empty adapter): full refund. The failed adapter is preserved for audit but never billed.
· Worker failed to start within the early-failure window (no logs, no markers, no progress): full refund. The pod is force-terminated and you’re not charged.
· Time ceiling reached (6h): the run is killed and the elapsed time is refunded. We absorb the cost; you can resume from the latest checkpoint if one was saved.

Not covered: runs that completed but produced results you didn’t like. Quality is your call; you pay for compute, not outcomes.

Refunds appear in your wallet history as a credit row with the originating job id. The audit trail is permanent.

Cost estimation

Worked numbers for typical runs. Live estimates are also shown on the fine-tuning page before you confirm.

Recipe	Approx. time	Approx. cost
Beginner · 100 steps · 200 rows	3 min	8 coins (~$0.08)
Beginner · 500 steps · 1000 rows	6 min	22 coins (~$0.22)
Pro · 500 steps · 1000 rows	8 min	34 coins (~$0.34)
Elite · 500 steps · 5000 rows	22 min	180 coins (~$1.80)
FLUX · 1000 steps · 30 images	14 min	120 coins (~$1.20)
SDXL · 1500 steps · 50 images	11 min	78 coins (~$0.78)

Times and costs are rough. Exact numbers depend on dataset characteristics (sequence length distribution, batch packing efficiency) and live GPU pricing. The fine-tuning page always shows the current estimate before you commit a coin.
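
For quick budgeting, the table above implies a conversion of roughly 1 coin to $0.01 (for example, 8 coins is listed as ~$0.08). The sketch below uses that implied rate; it is inferred from the table, not an official price, so check the Pricing page for current values.

```python
COIN_USD = 0.01  # implied by the table above; not an official rate


def coins_to_usd(coins: float) -> float:
    """Convert a coin amount to approximate US dollars."""
    return round(coins * COIN_USD, 2)


# (minutes, coins) copied from the recipe table.
recipes = {
    "Beginner · 500 steps · 1000 rows": (6, 22),
    "Pro · 500 steps · 1000 rows": (8, 34),
    "Elite · 500 steps · 5000 rows": (22, 180),
}

for name, (minutes, coins) in recipes.items():
    per_min = coins / minutes  # rough average, includes start-up overhead
    print(f"{name}: ~${coins_to_usd(coins):.2f}, ~{per_min:.1f} coins/min")
```

The per-minute figures are averages over the whole run, so they should be read as ballpark numbers, not billing rates.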

Related docs

Once your fine-tuning is done, move on to deployment and inference.

Datasets →

JSONL format, image-dataset format, validation, AI generation.

Inference & Deploy →

Playground, Deploy API, request/response schemas, streaming.

Pricing →

Coin packs, per-tier rates, worked examples.