---
title: MoMask
emoji: 🎭
colorFrom: pink
colorTo: purple
sdk: gradio
sdk_version: "6.1.0"
app_file: app_new.py
pinned: false
python_version: "3.10"
short_description: Text-to-3D motion generation using ONNX models
---

# MoMask: Text-to-Motion Generation

Generate 3D human skeleton animations from text descriptions using [MoMask](https://github.com/EricGuo5513/momask-codes).

## Features
- Text-to-motion generation with classifier-free guidance
- Download BVH files for Blender import
- ~7 seconds of motion per generation
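
Classifier-free guidance, mentioned above, is usually implemented by running the model twice (with and without the text condition) and extrapolating from the unconditional logits toward the conditional ones. The snippet below is a minimal sketch of that standard formula only. The function name `apply_cfg` and the parameter `guidance_scale` are hypothetical, and how this Space wires guidance into its ONNX models is not shown in this README.

```python
import numpy as np


def apply_cfg(cond_logits, uncond_logits, guidance_scale):
    """Combine conditional and unconditional logits (generic CFG formula).

    Assumption: both inputs have the same shape, e.g. (tokens, vocab).
    guidance_scale = 1.0 returns the conditional logits unchanged;
    larger values push the result further toward the text condition.
    """
    cond = np.asarray(cond_logits, dtype=np.float32)
    uncond = np.asarray(uncond_logits, dtype=np.float32)
    return uncond + guidance_scale * (cond - uncond)


if __name__ == "__main__":
    # Toy logits for two tokens; real logits would come from the transformer.
    cond = [1.0, 2.0]
    uncond = [0.0, 1.0]
    print(apply_cfg(cond, uncond, guidance_scale=4.0))
```

A scale of 0 ignores the text prompt entirely, so values above 1 are the usual choice when stronger prompt adherence is wanted.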

## Model Architecture (ONNX FP32, ~416MB total)
| Model | Size | Purpose |
|-------|------|---------|
| CLIP Text Encoder | 254MB | Text embedding |
| Mask Transformer | 56MB | Initial motion tokens |
| Residual Transformer | 55MB | Refine motion details |
| VQ-VAE Decoder | 46MB | Decode to motion |
| Length Estimator | 0.5MB | Predict motion length |

## Usage
1. Enter a text description (e.g., "A person walks forward")
2. Optionally set duration and seed
3. Click Generate
4. Download the result as an MP4 video, or as a BVH file to import into Blender
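
Before importing a downloaded BVH file into Blender, it can help to check how many frames it contains and how long the clip is. The sketch below reads the `Frames:` and `Frame Time:` lines that the BVH format places at the start of its `MOTION` section. The helper name `bvh_motion_info` and the inline sample are illustrative assumptions, not part of this Space.

```python
import re


def bvh_motion_info(bvh_text):
    """Return (frame_count, frame_time_seconds, duration_seconds) for BVH text."""
    frames = re.search(r"^\s*Frames:\s*(\d+)", bvh_text, re.MULTILINE)
    frame_time = re.search(r"^\s*Frame Time:\s*([-+0-9.eE]+)", bvh_text, re.MULTILINE)
    if frames is None or frame_time is None:
        raise ValueError("BVH MOTION header (Frames / Frame Time) not found")
    count = int(frames.group(1))
    dt = float(frame_time.group(1))
    return count, dt, count * dt


# Tiny hand-written BVH used only for illustration (not output of this Space).
SAMPLE_BVH = """HIERARCHY
ROOT Hips
{
  OFFSET 0.0 0.0 0.0
  CHANNELS 3 Xposition Yposition Zposition
  End Site
  {
    OFFSET 0.0 1.0 0.0
  }
}
MOTION
Frames: 2
Frame Time: 0.05
0.0 0.0 0.0
0.0 0.1 0.0
"""

if __name__ == "__main__":
    count, dt, duration = bvh_motion_info(SAMPLE_BVH)
    print(f"{count} frames at {dt} s/frame, {duration:.2f} s total")
```

For a real file, pass the contents of the downloaded BVH (for example, `open(path).read()`) instead of `SAMPLE_BVH`.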

## Credits
Based on [MoMask](https://github.com/EricGuo5513/momask-codes) by Chuan Guo et al.