DiffInk: Glyph- and Style-Aware Latent Diffusion Transformer for Text to Online Handwriting Generation
- [2026/6/19] DiffMath is now on arXiv, focusing on structured content generation.
- [2026/3/21] Code and pretrained weights are released.
- [2026/1/29] DiffInk has been accepted to ICLR 2026 🎉🎉🎉.
- [2025/10/1] The DiffInk paper is available on arXiv.
Text-to-Online Handwriting Generation (TOHG) refers to the task of synthesizing realistic pen trajectories
(a) A two-stage pipeline combining handwritten font generation with layout post-processing; (b) DiffInk (Ours), which takes text and a style reference to directly output complete text lines. Unlike the two-stage pipeline, DiffInk generates more natural character connections rather than mechanically stitching bounding boxes.
- Create the environment:
  conda create -n diffink python=3.8 -y
- Install dependencies:
  conda activate diffink && pip install -r requirements.txt
Download the dataset and pretrained weights from Google Drive or Baidu Cloud.
- Train the InkVAE model with:
  bash scripts/train_vae_ddp.sh
- Then, train the InkDiT model with:
  bash scripts/train_dit_ddp.sh
- Finally, fine-tune the model on real data with:
  bash scripts/tune_dit_ddp.sh
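The three training stages above can be chained in one wrapper so that a failure in an earlier stage stops the later ones. The following is a minimal sketch, not part of the repository: the `run_stages` function and the `EXECUTE` variable are hypothetical names introduced here, and the script assumes it is launched from the repository root. By default it only prints the plan; set `EXECUTE=1` to actually launch the stages.

```shell
#!/usr/bin/env bash
# Sketch: run the three training stages in the order listed in this README.
# `run_stages` and `EXECUTE` are hypothetical helpers, not part of the repository.
set -euo pipefail

# Stage scripts, in order: InkVAE training, InkDiT training, fine-tuning on real data.
stages=(
  scripts/train_vae_ddp.sh
  scripts/train_dit_ddp.sh
  scripts/tune_dit_ddp.sh
)

run_stages() {
  local stage
  for stage in "${stages[@]}"; do
    if [ "${EXECUTE:-0}" = "1" ]; then
      echo "running: bash ${stage}"
      # With `set -e`, a non-zero exit here aborts the remaining stages.
      bash "${stage}"
    else
      # Default: dry run that only prints what would be executed.
      echo "would run: bash ${stage}"
    fi
  done
}

run_stages
```

Running the wrapper without `EXECUTE=1` is a quick way to confirm the stage order before committing GPU time.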
- Run inference with:
  CUDA_VISIBLE_DEVICES=0 python val_dit.py
- ConvNeXt-V2, DiT, F5-TTS, WriteLikeU, and OLHWG provide valuable inspiration for this work.
- The CASIA-OLHWDB and IAM-OnDB datasets are valuable resources for this work.
If you find this work useful or use this code in your research, please consider citing the following paper.
@inproceedings{pan2026diffink,
title={DiffInk: Glyph- and Style-Aware Latent Diffusion Transformer for Text to Online Handwriting Generation},
author={Wei Pan and Huiguo He and Hiuyi Cheng and Yilin Shi and Lianwen Jin},
booktitle={The Fourteenth International Conference on Learning Representations},
year={2026},
url={https://openreview.net/forum?id=XKOEQFKFdL}
}

Our code is released under the MIT License. The pretrained weights, which are trained using part of the CASIA-OLHWDB dataset, follow the original license of the dataset.

