# How to Train an AI Model on Your Own Face: The Complete LoRA Self-Portrait Guide
Imagine generating AI portraits of yourself in any art style, any setting, any outfit, and having them actually look like you every single time. Not a vague approximation of your face. Not a "this kind of looks like me if I squint" situation. Actually you.
That is what a trained LoRA model does. And in 2026, the process has gotten dramatically easier than it was even a year ago. Whether you want to create fantasy portraits, professional headshots in different styles, or just experiment with putting your face into wild AI-generated scenarios, training a LoRA on your own face is the way to do it.
This guide covers everything from taking the right photos to choosing between local and cloud training, step-by-step workflows, and the mistakes that will ruin your results if you are not careful.
## What Is a LoRA and Why Does It Work So Well for Faces?
LoRA stands for Low-Rank Adaptation. In plain language, it is a lightweight training method that teaches an existing AI image model to understand a new concept, like your face, without retraining the entire model from scratch. Think of it like adding a small plugin to a massive AI brain that says "hey, when someone says 'ohwx person,' generate this specific face."
The reason LoRA works so well for face consistency is that it focuses the training on the specific visual features that make you look like you. Your jawline, your eye shape, the way your nose sits, your skin tone, your hairline. The base model already knows how to generate realistic faces. The LoRA just teaches it what your face looks like.
A trained face LoRA is typically only 20-150 MB in size, compared to the full model which can be 6+ GB. That small file contains everything the AI needs to reliably reproduce your likeness in any style or setting the base model supports.
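The size difference follows directly from the low-rank math: instead of storing a full weight update for each adapted layer, LoRA stores two thin matrices whose product approximates that update. A toy NumPy sketch makes the bookkeeping concrete (the tiny dimensions here are for illustration only; real model layers are thousands of units wide):

```python
import numpy as np

d, r = 8, 2  # toy layer width and LoRA rank; real layers use d in the thousands
rng = np.random.default_rng(0)

W = rng.normal(size=(d, d))          # frozen base-model weight, never modified
A = rng.normal(size=(r, d)) * 0.01   # trained low-rank factor
B = np.zeros((d, r))                 # B starts at zero, so training begins exactly at W
alpha = 1.0                          # the "network alpha" scaling factor

# The adapted weight used at generation time
W_adapted = W + (alpha / r) * (B @ A)

# Only A and B go into the LoRA file: 2*d*r numbers instead of d*d
print(W.size, A.size + B.size)  # 64 vs 32 in this toy case
```

Because only `A` and `B` are saved, the file stays small even when the adaptation is applied across many layers, which is why face LoRAs weigh megabytes rather than gigabytes.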
## Step 1: Take the Right Photos (This Is Where Most People Fail)
Your training photos are the single most important factor in the quality of your LoRA. Bad photos in, bad model out. No amount of clever training settings will fix a bad dataset.
You need between 15 and 50 photos. The sweet spot for most face LoRAs is around 20-30 images. More is not always better because too many similar photos can cause overfitting, where the model memorizes specific images instead of learning your general appearance.
Photo requirements for a good face LoRA dataset:
- About 1/3 to 1/2 of the images should be close-up face shots; the rest should be a mix of half-body and full-body shots.
- Include front-facing, three-quarter view, and profile angles.
- Use different lighting conditions (indoor, outdoor, natural light, artificial light).
- Wear different clothing and use different backgrounds.
- Include different facial expressions: neutral, smiling, serious, laughing.
- Make sure your face is clearly visible and in focus in every shot.
What to avoid:

- Photos where your face is obscured by sunglasses, masks, or heavy shadows.
- Group photos (crop them first if you must use them).
- Blurry or low-resolution images.
- Heavily filtered or edited photos; the model needs to learn your real appearance.
The best approach is to dedicate 15 minutes to a mini photo session. Stand in front of a plain wall, use your phone camera, and take photos at multiple angles. Then go outside and take a few more in natural light. Then take a few in different outfits. That simple routine will give you a much better dataset than pulling random selfies from your camera roll.
## Step 2: Prepare Your Dataset
Once you have your photos, you need to prepare them for training. Crop each image so your face and upper body are the main focus. Resize all images to a consistent resolution. For SDXL-based models, use 1024x1024. For FLUX models, 1024x1024 also works well, though some trainers support 768x1152 for portrait-oriented images.
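If you have Python and the Pillow library installed, the crop-and-resize step can be scripted rather than done by hand. A minimal sketch (the folder names are placeholders; swap in your own paths):

```python
from pathlib import Path

from PIL import Image  # pip install Pillow


def prepare_image(src: Path, dst: Path, size: int = 1024) -> None:
    """Center-crop to a square, then resize to size x size for training."""
    img = Image.open(src).convert("RGB")
    w, h = img.size
    side = min(w, h)
    left, top = (w - side) // 2, (h - side) // 2
    img = img.crop((left, top, left + side, top + side))
    img = img.resize((size, size), Image.LANCZOS)
    dst.parent.mkdir(parents=True, exist_ok=True)
    img.save(dst, quality=95)


if __name__ == "__main__":
    # Placeholder folders: raw phone photos in, training dataset out
    for path in Path("raw_photos").glob("*.jpg"):
        prepare_image(path, Path("dataset") / path.name)
```

One caveat: a center crop can cut off a face that sits near the edge of the frame, so spot-check the output folder before training.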
Next comes captioning. Every image needs a text caption that describes what is in it. This tells the model which parts of the image are "you" and which are just background or clothing. A good caption looks something like this: `ohwx woman, smiling, wearing a blue sweater, outdoors, natural light`.
The trigger word (like "ohwx") is a unique token that the model will associate with your face. Use something uncommon that will not conflict with words the model already knows. Many people use ohwx or a made-up word like zxjface; be careful with sks, which some models associate with the SKS rifle.
You can caption images manually, or use auto-captioning tools built into platforms like Civitai's trainer or Kohya_ss. Auto-captioning is good enough for most cases, but you should review the captions and make sure your trigger word is included.
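The trigger-word check is easy to handle mechanically. This sketch (the trigger word and description are placeholders) writes one `.txt` caption file per image, prefixing the trigger word whenever it is missing:

```python
from pathlib import Path

TRIGGER = "ohwx"  # your chosen trigger word


def write_caption(image_path: Path, description: str) -> Path:
    """Write a caption file next to the image, ensuring the trigger word leads."""
    if not description.startswith(TRIGGER):
        description = f"{TRIGGER} {description}"
    txt = image_path.with_suffix(".txt")  # photo_01.jpg -> photo_01.txt
    txt.write_text(description, encoding="utf-8")
    return txt


if __name__ == "__main__":
    Path("dataset").mkdir(exist_ok=True)  # placeholder dataset folder
    write_caption(Path("dataset/photo_01.jpg"), "woman, smiling, outdoors, natural light")
```

Run it over auto-generated captions as a final pass, and you can be sure every file in the dataset carries the trigger word.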
## Step 3: Choose Your Training Method
In 2026, you have three main options for training your face LoRA, each with different tradeoffs:
| Method | Difficulty | Cost | Privacy | Time |
|---|---|---|---|---|
| Civitai Cloud Trainer | Beginner-friendly | 500+ Buzz (~$5) | Cloud (shared infra) | 30-60 min |
| Kohya_ss (Local) | Intermediate | Free (needs GPU) | Fully private | 60-90 min |
| Cloud GPU (RunPod/Vast.ai) | Intermediate | $1-3 per run | Cloud (your instance) | 30-45 min |
### Option A: Civitai Cloud Trainer (Easiest)
If you want the simplest possible experience, the Civitai on-site trainer is your best bet. You do not need a powerful computer or any technical knowledge beyond uploading files and clicking buttons.
Step 1: Log in to Civitai and open the on-site LoRA trainer.
Step 2: Upload your images (you can drag and drop a zip file with images and caption .txt files, or just upload loose images).
Step 3: Select your base model. For photorealistic faces, choose an SDXL base like Juggernaut XL or RealVisXL. For FLUX-based training, select FLUX.1-dev if available.
Step 4: Set your trigger word and let the auto-captioner handle the rest.
Step 5: Start training. It typically takes 30-60 minutes depending on your dataset size and the queue.
The Civitai trainer costs Buzz (their in-app currency), starting at about 500 Buzz for SD 1.5 and SDXL models. You earn Buzz through community activity or can purchase it directly.
### Option B: Kohya_ss Local Training (Most Private)
If you do not want your face photos on anyone else's servers, local training with Kohya_ss is the way to go. You need a GPU with at least 12 GB of VRAM (an RTX 3060 12GB is the minimum comfortable setup, though 8 GB cards can manage FLUX training with GGUF-quantized weights).
Install Kohya_ss from the GitHub repository by following the setup instructions for your operating system. The tool has a browser-based GUI that makes configuration much less intimidating than raw command-line training.
Recommended Kohya_ss settings for face LoRA (SDXL):
- Network Rank: 16 (good balance of quality and file size)
- Network Alpha: 8 (half of rank)
- Learning Rate: 1e-4 with cosine scheduler
- Batch Size: 1
- Epochs: 10-20, saving checkpoints every 5 epochs
- Resolution: 1024x1024
- Mixed Precision: fp16
- Flip augmentation: enabled (for symmetry)
- Min SNR Gamma: 5 (for training stability)
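For reference, these settings map onto the command line of kohya's sd-scripts roughly as follows. Flag names can shift between releases, and the model path, dataset folder, and output name below are placeholders, so treat this as a sketch to check against your installed version rather than a copy-paste command:

```bash
accelerate launch sdxl_train_network.py \
  --pretrained_model_name_or_path ./models/juggernaut_xl.safetensors \
  --train_data_dir ./dataset \
  --output_name my_face_lora \
  --network_module networks.lora \
  --network_dim 16 --network_alpha 8 \
  --learning_rate 1e-4 --lr_scheduler cosine \
  --train_batch_size 1 --max_train_epochs 20 --save_every_n_epochs 5 \
  --resolution 1024,1024 --mixed_precision fp16 \
  --flip_aug --min_snr_gamma 5
```

The Kohya_ss GUI generates an equivalent invocation for you; seeing the flags spelled out just makes it easier to reproduce a run or move it to a cloud instance.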
The most important habit in local training is saving a checkpoint every 5 epochs and reviewing each one. Training too long causes overfitting, where the model can only reproduce your training photos instead of generating new images of you. Pick the checkpoint from just before overfitting starts; you will recognize overfitting when generated images start looking exactly like one of your training photos instead of creative variations.
### Option C: Cloud GPU Training (Best of Both Worlds)
Cloud GPU services like RunPod and Vast.ai let you rent powerful hardware (A100 or RTX 4090 instances) for $0.50-$1.50 per hour. A typical training run costs $1-3 total. You get the speed of high-end hardware without buying it, and your training data exists on your rented instance, not on a shared platform.
The workflow is essentially the same as local Kohya_ss training, just running on rented hardware. Many community members have created one-click notebooks for RunPod that set up the entire environment automatically.
## Privacy: Should You Care About Where Your Face Data Goes?
This is a genuinely important consideration, and the answer depends on your comfort level. When you train on a cloud platform like Civitai, your face photos are uploaded to shared infrastructure. While reputable platforms have privacy policies that prohibit using your training data for other purposes, the photos do temporarily exist on their servers.
Local training with Kohya_ss keeps everything on your own machine. Your photos never leave your computer. For people who are privacy-conscious or training models for clients, this is the safest option.
Cloud GPU services (RunPod, Vast.ai) fall in the middle. You are renting dedicated hardware, so your data is not on a "shared" platform in the same way, but it does exist on rented infrastructure during training. Deleting your instance after training removes the data.
My recommendation: if you are just making fun portraits of yourself, the privacy risk on reputable platforms is low. If you are training models of other people (with their permission), or for commercial work, local training is the responsible choice.
## Using Your Trained LoRA: The Fun Part
Once your LoRA is trained, you load it into your preferred image generation tool (ComfyUI, AUTOMATIC1111, Forge, or the Civitai generator) alongside the same base model you trained on. Then you use your trigger word in prompts to activate the LoRA.
A prompt like `ohwx woman, oil painting portrait, renaissance style, golden frame, dramatic lighting` would generate a Renaissance-style painting of your face. Change it to `ohwx woman, cyberpunk cityscape, neon lights, rain, cinematic` and you get a completely different vibe with the same face.
Start with a LoRA weight of 0.7-0.8. Going to 1.0 can cause the model to overpower the style of the image. Too low (below 0.5) and the face might not look enough like you. Finding the sweet spot is part of the experimentation.
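In AUTOMATIC1111-style UIs the weight is set inline with a `<lora:name:weight>` tag, so sweeping for the sweet spot is just a matter of generating prompt variants and comparing the results. A tiny helper (the LoRA filename here is a placeholder for whatever you named your trained file):

```python
def lora_prompt(base_prompt: str, lora_name: str, weight: float = 0.8) -> str:
    """Append an AUTOMATIC1111-style LoRA tag at the given weight."""
    return f"{base_prompt}, <lora:{lora_name}:{weight}>"


# One prompt per candidate weight: render each and compare likeness vs. style
for w in (0.5, 0.6, 0.7, 0.8, 0.9, 1.0):
    print(lora_prompt("ohwx woman, oil painting portrait, dramatic lighting",
                      "my_face_lora", w))
```

In ComfyUI the same sweep is done by varying the strength input on the LoRA loader node instead of editing the prompt text.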
## Common Mistakes and How to Fix Them
Problem: Generated images look like one specific training photo. This is overfitting. You trained too long or had too few unique images. Solution: use an earlier checkpoint and add more diverse photos to your dataset.
Problem: Face looks right from one angle but weird from others. Your training data lacked angle diversity. Solution: retrain with more profile and three-quarter view photos.
Problem: Model gets your hair color right but eye color wrong. Your captions may be confusing the model. Solution: make sure every caption accurately describes your eye and hair color, and that the trigger word is consistently placed.
Problem: Face looks great but skin texture is plastic or uncanny. You may be using too high a learning rate. Solution: try lowering the learning rate to 5e-5 and training for more epochs instead.
Problem: The LoRA only works well with specific prompts. Your training images were too similar in style. Solution: include photos with different lighting, backgrounds, and contexts so the model learns your face independently of any specific setting.
Training a face LoRA is genuinely one of the most satisfying things you can do with AI image generation. There is something magical about seeing your own face rendered in completely impossible scenarios with perfect consistency. And with the tools available in 2026, the barrier to entry has never been lower. Take 30 photos, pick a training method, and give it a shot. The worst that happens is you train a weird-looking version of yourself, and honestly, that is pretty funny too.