sdxl

This is a model that can be used to generate and modify images based on text prompts. It is a Latent Diffusion Model that uses two fixed, pretrained text encoders (OpenCLIP-ViT/G and CLIP-ViT/L).

Input

  • prompt: Input prompt
  • negative_prompt: Input negative prompt
  • image: Input image for img2img or inpaint mode
  • mask: Input mask for inpaint mode. Black areas will be preserved, white areas will be inpainted.
  • width: Width of output image
  • height: Height of output image
  • scheduler: Which scheduler to use
  • num_inference_steps: Number of denoising steps (minimum: 1; maximum: 500)
  • guidance_scale: Scale for classifier-free guidance (minimum: 1; maximum: 50)
  • prompt_strength: Prompt strength when using img2img / inpaint. 1.0 corresponds to full destruction of information in image (maximum: 1)
  • seed: Random seed. Leave blank to randomize the seed
  • refine: Which refine style to use
  • high_noise_frac: For expert_ensemble_refiner, the fraction of noise to use (maximum: 1)
  • refine_steps: For base_image_refiner, the number of steps to refine; defaults to num_inference_steps
Hold at least 25,000 wRAI to use this model
The Multi AI platform is completely free, but most models are only accessible to wRAI token holders. If you have any questions, feel free to ask in our Telegram chat
Input

guidance_scale: 7.5
height: 0
high_noise_frac: 0.8
num_inference_steps: 50
num_outputs: 1
prompt: A studio photo of a rainbow coloured cat
prompt_strength: 0.8
refine: expert_ensemble_refiner
refine_steps: 0
scheduler: KarrasDPM
seed: 326447
width: 1024
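As an illustration, the example input above can be submitted as a JSON payload. The endpoint URL and bearer token below are placeholders, not the platform's documented API; only the field names and values are taken from this page.

```python
import requests

# Input values copied from the example above.
payload = {
    "guidance_scale": 7.5,
    "height": 0,  # 0 in the example above; presumably the model default
    "high_noise_frac": 0.8,
    "num_inference_steps": 50,
    "num_outputs": 1,
    "prompt": "A studio photo of a rainbow coloured cat",
    "prompt_strength": 0.8,
    "refine": "expert_ensemble_refiner",
    "refine_steps": 0,
    "scheduler": "KarrasDPM",
    "seed": 326447,
    "width": 1024,
}

# Hypothetical endpoint and auth header; check the platform docs for the real ones.
resp = requests.post(
    "https://api.example.com/models/sdxl/predictions",
    json={"input": payload},
    headers={"Authorization": "Bearer YOUR_API_TOKEN"},
    timeout=300,
)
resp.raise_for_status()
print(resp.json())
```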

Readme

Text-to-image generation

To generate images, enter a prompt and run the model.
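For reference, here is a minimal text-to-image sketch using the open-source diffusers library against the public SDXL base weights. This is an illustrative stand-in for the hosted model, not its actual implementation; the model ID and device are assumptions, but the call arguments map one-to-one to the inputs listed above.

```python
import torch
from diffusers import StableDiffusionXLPipeline

# Public SDXL base weights; the hosted model's configuration may differ.
pipe = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
).to("cuda")

generator = torch.Generator("cuda").manual_seed(326447)  # the seed input
image = pipe(
    prompt="A studio photo of a rainbow coloured cat",
    width=1024,
    height=1024,
    num_inference_steps=50,  # denoising steps
    guidance_scale=7.5,      # classifier-free guidance scale
    generator=generator,
).images[0]
image.save("output.png")
```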

Image in-painting

SDXL supports in-painting, which lets you "fill in" parts of an existing image with generated content.

  • Enter a prompt for the in-painted pixels
  • Select an input image in the image field
  • In the mask field, select a black-and-white mask image of the same shape as the input image. All white pixels will be in-painted according to the prompt, while black pixels will be preserved.
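The same three steps can be sketched with diffusers, assuming the public SDXL weights; the prompt and file names here are invented for illustration.

```python
import torch
from diffusers import StableDiffusionXLInpaintPipeline
from diffusers.utils import load_image

pipe = StableDiffusionXLInpaintPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
).to("cuda")

init_image = load_image("input.png")  # the image field
mask_image = load_image("mask.png")   # white pixels are repainted, black are kept

result = pipe(
    prompt="a stained glass window",  # describes only the in-painted pixels
    image=init_image,
    mask_image=mask_image,
).images[0]
result.save("inpainted.png")
```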

Image-to-image generation

Image-to-image lets you start with an input image and transform it "towards" a prompt. For example, you can transform a child's drawing of a castle into a photorealistic castle.

  • Enter a prompt that describes what you want the output image to look like
  • Select an input image in the image field
  • The prompt_strength field changes how strongly the prompt is applied to the input image
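As a sketch of the same workflow in diffusers (again assuming the public SDXL weights; the drawing file and prompt are made-up examples):

```python
import torch
from diffusers import StableDiffusionXLImg2ImgPipeline
from diffusers.utils import load_image

pipe = StableDiffusionXLImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
).to("cuda")

drawing = load_image("castle_drawing.png")  # the image field
photo = pipe(
    prompt="a photorealistic castle on a hilltop",
    image=drawing,
    strength=0.8,  # prompt_strength: 1.0 discards the input image entirely
).images[0]
photo.save("castle_photo.png")
```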

Refinement

With SDXL you can use a separate refiner model to add finer detail to your output.

You can use the refiner in two ways:

  • As an ensemble of experts
  • One after the other (base_image_refiner option)

Ensemble of experts

  • In this mode the SDXL base model handles the steps at the beginning (high noise), before handing over to the refining model for the final steps (low noise)
  • You get a more detailed image from fewer steps
  • You can change the point at which that handover happens (the high_noise_frac input); the default is 0.8 (80%)
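The handover can be sketched with diffusers using the public SDXL base and refiner weights (an illustration, not this platform's internal pipeline): the base model stops at denoising_end=0.8 and passes its latents to the refiner, which starts at denoising_start=0.8, mirroring the high_noise_frac input.

```python
import torch
from diffusers import StableDiffusionXLPipeline, StableDiffusionXLImg2ImgPipeline

base = StableDiffusionXLPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-base-1.0",
    torch_dtype=torch.float16,
).to("cuda")
refiner = StableDiffusionXLImg2ImgPipeline.from_pretrained(
    "stabilityai/stable-diffusion-xl-refiner-1.0",
    text_encoder_2=base.text_encoder_2,  # share components to save memory
    vae=base.vae,
    torch_dtype=torch.float16,
).to("cuda")

high_noise_frac = 0.8  # the handover point
prompt = "A studio photo of a rainbow coloured cat"

# The base model denoises the first 80% of the schedule (high noise)...
latents = base(
    prompt=prompt,
    num_inference_steps=50,
    denoising_end=high_noise_frac,
    output_type="latent",
).images

# ...then the refiner finishes the remaining low-noise steps.
image = refiner(
    prompt=prompt,
    num_inference_steps=50,
    denoising_start=high_noise_frac,
    image=latents,
).images[0]
image.save("refined.png")
```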

Evaluation

[Chart: user preference for SDXL with and without refinement vs. SDXL 0.9, SD 1.5, and SD 2.1]

The chart above evaluates user preference for SDXL (with and without refinement) over SDXL 0.9 and Stable Diffusion 1.5 and 2.1. The SDXL base model performs significantly better than the previous variants, and the model combined with the refinement module achieves the best overall performance.