Z-Image Turbo (Burn)
Z-Image Turbo model weights converted to Burn's Burnpack format for use with the Burn deep learning framework.
Model Description
Z-Image is a fast image generation model based on flow-matching diffusion, designed for generating high-quality images from text prompts. This repository contains the model weights in Burnpack (.bpk) format, optimized for inference on Apple Silicon (Metal) and other Burn backends.
Files
| File | Size | Description |
|---|---|---|
| z_image_turbo_bf16.bpk | 11 GB | Transformer weights in BF16 precision |
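As a rough sanity check on the file size, BF16 stores each value in 2 bytes, so the weight file size divided by two approximates the parameter count. The sketch below assumes "11 GB" means 11 × 10^9 bytes and ignores any container metadata in the `.bpk` file; `approx_param_count` is a hypothetical helper, not part of Burn or this repository.

```rust
/// Bytes used by one BF16 value (16 bits).
const BF16_BYTES: u64 = 2;

/// Approximate parameter count of a BF16 weight file,
/// ignoring any header or metadata stored alongside the tensors.
fn approx_param_count(file_bytes: u64) -> u64 {
    file_bytes / BF16_BYTES
}

fn main() {
    // Assumption: the listed "11 GB" is 11 * 10^9 bytes.
    let file_bytes: u64 = 11_000_000_000;
    let params = approx_param_count(file_bytes);
    println!("~{:.1} billion parameters", params as f64 / 1e9);
}
```

Under these assumptions the file holds roughly 5.5 billion BF16 values.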
Additional Required Files
To run Z-Image, you also need:
- Tokenizer: From holgt/qwen3-0.6b-burn (tokenizer.json)
- Text Encoder: From Comfy-Org/z_image_turbo (qwen_3_4b.safetensors)
- Autoencoder: From black-forest-labs/FLUX.1-dev (ae.safetensors)
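Before loading anything, it can help to confirm that all four files are present in one directory. The following is a minimal sketch using only the Rust standard library. The `./models` directory and the `missing_model_files` helper are assumptions for illustration, and the file names are the ones listed in this README. The Rust code example in this README loads the tokenizer and text encoder under different local names, so adjust the list to match whatever names you actually use.

```rust
use std::path::{Path, PathBuf};

/// File names as listed in this README; edit if your local copies are renamed.
const REQUIRED_FILES: [&str; 4] = [
    "z_image_turbo_bf16.bpk", // transformer weights (this repository)
    "tokenizer.json",         // tokenizer
    "qwen_3_4b.safetensors",  // text encoder
    "ae.safetensors",         // autoencoder
];

/// Returns the paths of required files that do not exist in `model_dir`.
fn missing_model_files(model_dir: &Path) -> Vec<PathBuf> {
    REQUIRED_FILES
        .iter()
        .map(|name| model_dir.join(name))
        .filter(|path| !path.is_file())
        .collect()
}

fn main() {
    // Hypothetical location; point this at your own download directory.
    let model_dir = Path::new("./models");
    let missing = missing_model_files(model_dir);
    if missing.is_empty() {
        println!("All model files found in {}", model_dir.display());
    } else {
        for path in &missing {
            eprintln!("Missing: {}", path.display());
        }
    }
}
```

Running a check like this first gives a clearer message than a failed `unwrap()` during weight loading.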
Usage
With the z-image-app
See the z-image-app for a macOS GUI application.
Rust Code Example
use std::path::PathBuf;

use burn::backend::candle::{Candle, CandleDevice};
use half::bf16;
use qwen3_burn::{Qwen3Config, Qwen3Model, Qwen3Tokenizer};
use z_image::{GenerateFromTextOpts, modules::ae::AutoEncoderConfig, modules::transformer::ZImageModelConfig};

// Candle backend with BF16 floats and i64 integers.
type Backend = Candle<bf16, i64>;

fn main() {
    // Metal device 0 (Apple Silicon).
    let device = CandleDevice::metal(0);
    let model_dir = PathBuf::from("./models");

    // Load components. The file names below are the local names used in this
    // example; adjust them if your downloaded files are named differently.
    let tokenizer = Qwen3Tokenizer::from_file(model_dir.join("qwen3-tokenizer.json")).unwrap();

    let mut text_encoder: Qwen3Model<Backend> = Qwen3Config::z_image_text_encoder().init(&device);
    text_encoder.load_weights(model_dir.join("qwen3_4b_text_encoder.safetensors")).unwrap();

    let mut transformer = ZImageModelConfig::default().init(&device);
    transformer.load_weights(model_dir.join("z_image_turbo_bf16.bpk")).unwrap();

    let mut ae = AutoEncoderConfig::flux_ae().init(&device);
    ae.load_weights(model_dir.join("ae.safetensors")).unwrap();

    // Generate a 512x512 image and write it to output.png.
    let opts = GenerateFromTextOpts {
        prompt: "A beautiful sunset over mountains".to_string(),
        out_path: PathBuf::from("output.png"),
        width: 512,
        height: 512,
    };
    z_image::generate_from_text(&opts, &tokenizer, &text_encoder, &ae, &transformer, &device).unwrap();
}
Requirements
- Apple Silicon Mac with Metal support, or
- CUDA-capable GPU (with appropriate Burn backend)
- ~16 GB RAM for 512x512 images
- Rust toolchain with 2024 edition support (Rust 1.85 or later)
License
Apache 2.0
Acknowledgments
- Burn - Deep learning framework
- Comfy-Org - Original Z-Image model
- black-forest-labs - FLUX autoencoder