Soramai · Docs
Fine-tuning.
LoRA fine-tuning on open base models. Pick a tier, point at a dataset, set a step count, and watch the run live. Soramai handles GPU provisioning, monitoring, and shutdown.
Fine-tuning tiers
Each tier is a (base model, GPU) pairing tuned for a quality / cost trade-off.
| Tier | Base model | GPU | Use when |
|---|---|---|---|
| Beginner | Qwen 2.5 7B | A100 80GB | Prototypes, single-task fine-tunes, fast iteration |
| Pro | Qwen 2.5 14B / Mistral 7B | A100 80GB | Production workloads with moderate quality demands |
| Elite | Llama 3.1 70B / Qwen 2.5 72B | H100 80GB | Highest-quality tier; only worthwhile with 100K+ examples |
| Image · FLUX | FLUX.1 dev | H100 80GB | Best-in-class image LoRA fine-tuning for product photography, illustration, character likeness |
| Image · SDXL | Stable Diffusion XL | A100 80GB | Faster + cheaper image LoRAs when FLUX's quality isn't required |
Live tier availability is shown on the fine-tuning page. A tier may be temporarily disabled if the operator has taken it down for maintenance.
LoRA hyperparameters
Soramai ships with sensible defaults. Most users never change these — but the knobs are exposed for advanced runs.
Step count
How many optimiser steps to run. 100 is a quick recipe-check. 500 is the standard production setting. 1000+ is for high-data regimes (10K+ rows).
Rank (r)
LoRA matrix rank. Default 16. A higher rank adds adapter capacity but uses more VRAM. Most tasks don’t benefit beyond 32.
Alpha
Scaling factor on the LoRA delta. Default 32 (= 2× r). Standard choice; rarely needs changing.
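To make the rank and alpha defaults concrete, here is a minimal sketch of the arithmetic in the standard LoRA formulation, where the update to a weight matrix is `(alpha / r) * B @ A`. Whether Soramai applies exactly this scaling is an assumption; the function names and the 4096 × 4096 layer shape are hypothetical and chosen only for illustration.

```python
def lora_scale(alpha: float, r: int) -> float:
    """Multiplier applied to the low-rank update in standard LoRA (alpha / r)."""
    if r <= 0:
        raise ValueError("rank must be positive")
    return alpha / r


def lora_param_count(d_in: int, d_out: int, r: int) -> int:
    """Trainable parameters for one adapted matrix: A is (r x d_in), B is (d_out x r)."""
    return r * (d_in + d_out)


# Documented defaults: r = 16, alpha = 32.
scale = lora_scale(alpha=32, r=16)

# Hypothetical 4096 x 4096 projection, used only to show how rank drives size.
params_r16 = lora_param_count(4096, 4096, r=16)
params_r32 = lora_param_count(4096, 4096, r=32)
```

Doubling the rank doubles the trainable parameters per adapted matrix, which is why higher ranks cost more VRAM.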
Learning rate
Default 2.0e-4 with a cosine schedule + 120-step warm-up. Lower for very small datasets (5.0e-5); higher for massive datasets (5.0e-4).
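The schedule described above can be sketched as follows. This is an illustration under stated assumptions, not Soramai's implementation: it assumes a linear warm-up, a cosine decay to zero, and that the warm-up is clamped to the run length when the run is shorter than 120 steps. The function name `lr_at` is hypothetical.

```python
import math


def lr_at(step: int, total_steps: int,
          peak_lr: float = 2.0e-4, warmup_steps: int = 120) -> float:
    """Learning rate at a 0-indexed optimiser step.

    Assumptions (not confirmed by the docs): linear warm-up, cosine decay
    to zero, warm-up clamped to total_steps for very short runs.
    """
    if total_steps <= 0:
        raise ValueError("total_steps must be positive")
    warmup = min(warmup_steps, total_steps)
    if step < warmup:
        # Linear ramp that reaches peak_lr on the last warm-up step.
        return peak_lr * (step + 1) / warmup
    decay_steps = max(1, total_steps - warmup)
    progress = min(1.0, (step - warmup) / decay_steps)
    return 0.5 * peak_lr * (1.0 + math.cos(math.pi * progress))
```

Under these assumptions, a 500-step run reaches the peak rate at step 119 and decays to zero by step 500.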
Batch size
Per-device batch size. Default 32. The upper bound is set by the base model and GPU VRAM; Soramai picks the largest batch that fits.
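Step count and batch size together determine how many passes the run makes over your dataset. The sketch below is rough arithmetic only: it assumes a single device, no gradient accumulation, and one dataset row per batch slot (no sequence packing), none of which the docs confirm. The function name is hypothetical.

```python
def approx_epochs(steps: int, rows: int, batch_size: int = 32) -> float:
    """Approximate passes over the dataset: steps * batch_size / rows."""
    if rows <= 0:
        raise ValueError("rows must be positive")
    return steps * batch_size / rows


# 500 steps over 1,000 rows at the default batch size of 32.
epochs = approx_epochs(steps=500, rows=1000)
```

If the estimate comes out very high for a small dataset, a lower step count may be enough.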
Sequence length
Maximum tokens per row, fixed at 2,048. Longer examples are truncated; the truncation count is reported in the fine-tuning log.
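To check ahead of time how many rows would be cut, you can count over-length rows locally. The sketch below assumes rows are already tokenised into lists of token ids and that truncation keeps the first 2,048 tokens; the truncation side and the helper name are assumptions, and your token counts will only match the log if you use the base model's tokenizer.

```python
MAX_SEQ_LEN = 2048  # fixed sequence length from the docs


def truncate_rows(token_rows, max_len=MAX_SEQ_LEN):
    """Return (truncated_rows, number_of_rows_that_were_cut)."""
    truncated_count = 0
    out = []
    for ids in token_rows:
        if len(ids) > max_len:
            truncated_count += 1
            out.append(list(ids[:max_len]))  # assumed: keep the leading tokens
        else:
            out.append(list(ids))
    return out, truncated_count


rows = [[1] * 10, [2] * 3000]
clipped, n_truncated = truncate_rows(rows)
```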
Live monitoring
Every fine-tuning run is observable in real time. Logs stream from the pod via R2 with incremental byte-range fetching.
What you see
- Live log tail (step, loss, lr, grad norm)
- Coin-burn meter (live per-second)
- Total elapsed time
- Current phase (Starting → Preparing → Running → Saving)
- Final adapter size + metrics on completion
What you don't see
- Pod IP / SSH access (Soramai-only)
- Container logs from non-fine-tuning processes
- Hardware temperature / fan speed
- Cross-user fine-tuning queue
You can safely close the browser tab during a fine-tuning run. The worker keeps running; logs persist to R2; billing continues against your wallet. When you reopen the fine-tuning page, the live log resumes from where you left off via byte-range fetch — no re-download of the full file.
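The resume behaviour can be pictured as a client that remembers how many bytes it has already read and asks only for the rest. The sketch below illustrates the general technique with a standard HTTP `Range: bytes=N-` header; the `LogTail` class and the injected `fetch` callable are hypothetical, and nothing here reflects Soramai's actual client or endpoints.

```python
def range_header(offset: int) -> dict:
    """HTTP header asking for everything from byte `offset` onward."""
    if offset < 0:
        raise ValueError("offset must be non-negative")
    return {"Range": f"bytes={offset}-"}


class LogTail:
    """Tracks the read offset so each poll fetches only new bytes."""

    def __init__(self, fetch):
        # fetch(offset) -> bytes; in a real client this would issue a GET
        # with range_header(offset) and return the response body, treating
        # "range not satisfiable" (no new data) as empty bytes.
        self._fetch = fetch
        self.offset = 0

    def poll(self) -> bytes:
        chunk = self._fetch(self.offset)
        self.offset += len(chunk)
        return chunk


# In-memory stand-in for the remote log object, for demonstration only.
store = bytearray(b"step 1\n")
tail = LogTail(lambda offset: bytes(store[offset:]))
first = tail.poll()
store += b"step 2\n"
second = tail.poll()
third = tail.poll()
```

Persisting `offset` across page loads is what lets a reopened tab continue without downloading the whole file again.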
Refund policy for platform faults
If a fine-tuning run fails because of something Soramai broke, you get the coins back automatically. No support ticket required.
- Pod terminated unexpectedly (e.g. RunPod eviction) — full refund of the run’s billed time. Issued by the training-reaper cron within ~5 minutes.
- Adapter validation failed (worker reported success but uploaded a malformed/empty adapter) — full refund. The failed adapter is preserved for audit but never billed.
- Worker failed to start within the early-failure window (no logs, no markers, no progress) — full refund. The pod is force-terminated and you’re not charged.
- Time-ceiling reached (6h) — the run is killed and the elapsed time refunded. We absorb the cost; you can resume from the latest checkpoint if one was saved.
- Not covered: runs that completed but produced results you didn’t like. Quality is your call; you only pay for compute, not outcomes.
Refunds appear in your wallet history as a credit row with the originating job id. The audit trail is permanent.
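The policy above reduces to a small decision table. The sketch below restates it in code purely as a reading aid; the outcome names and the `refund_coins` function are hypothetical and do not correspond to any Soramai API.

```python
from enum import Enum


class Outcome(Enum):
    POD_TERMINATED = "pod_terminated"    # e.g. provider eviction
    ADAPTER_INVALID = "adapter_invalid"  # malformed or empty adapter
    FAILED_TO_START = "failed_to_start"  # no logs, markers, or progress
    TIME_CEILING = "time_ceiling"        # 6h limit reached
    COMPLETED = "completed"              # finished normally


# Platform faults listed in the policy; each refunds the billed time in full.
REFUNDABLE = {
    Outcome.POD_TERMINATED,
    Outcome.ADAPTER_INVALID,
    Outcome.FAILED_TO_START,
    Outcome.TIME_CEILING,
}


def refund_coins(outcome: Outcome, billed_coins: int) -> int:
    """Coins credited back for a run, per the policy list above."""
    if billed_coins < 0:
        raise ValueError("billed_coins must be non-negative")
    return billed_coins if outcome in REFUNDABLE else 0
```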
Cost estimation
Worked numbers for typical runs. Live estimates are also shown on the fine-tuning page before you confirm.
| Recipe | Approx. time | Approx. cost |
|---|---|---|
| Beginner · 100 steps · 200 rows | 3 min | 8 coins (~$0.08) |
| Beginner · 500 steps · 1000 rows | 6 min | 22 coins (~$0.22) |
| Pro · 500 steps · 1000 rows | 8 min | 34 coins (~$0.34) |
| Elite · 500 steps · 5000 rows | 22 min | 180 coins (~$1.80) |
| FLUX · 1000 steps · 30 images | 14 min | 120 coins (~$1.20) |
| SDXL · 1500 steps · 50 images | 11 min | 78 coins (~$0.78) |
Times and costs are rough. Exact numbers depend on dataset characteristics (sequence length distribution, batch packing efficiency) and live GPU pricing. The fine-tuning page always shows the current estimate before you commit a coin.
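The table implies roughly 100 coins per US dollar (8 coins ≈ $0.08). A small sketch of the conversion and the implied burn rate per recipe follows; the exchange rate is inferred from the table rather than stated by Soramai, and the helper names are hypothetical.

```python
COINS_PER_USD = 100  # inferred from the table above, not an official rate


def coins_to_usd(coins: float) -> float:
    """Approximate dollar value of a coin amount."""
    return round(coins / COINS_PER_USD, 2)


def coins_per_minute(minutes: float, coins: float) -> float:
    """Average burn rate implied by a table row."""
    if minutes <= 0:
        raise ValueError("minutes must be positive")
    return coins / minutes


# (approx. minutes, approx. coins) copied from the table above.
RECIPES = {
    "Beginner · 100 steps · 200 rows": (3, 8),
    "Beginner · 500 steps · 1000 rows": (6, 22),
    "Pro · 500 steps · 1000 rows": (8, 34),
    "Elite · 500 steps · 5000 rows": (22, 180),
    "FLUX · 1000 steps · 30 images": (14, 120),
    "SDXL · 1500 steps · 50 images": (11, 78),
}

for name, (minutes, coins) in RECIPES.items():
    rate = coins_per_minute(minutes, coins)
    print(f"{name}: ~${coins_to_usd(coins):.2f}, ~{rate:.1f} coins/min")
```

The per-minute rate differs between rows, so treat these figures as ballpark values and rely on the live estimate before confirming.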
Related docs
Once your fine-tuning run is done, move on to deployment and inference.