Repulsive Guidance for Memorization Mitigation in Text-to-Music Diffusion Models
- Author(s)
- Kim, Taehyeon; Lee, Hangyeol; Ahn, Chang Wook; Kim, Man-Je
- Type
- Article
- Citation
- MATHEMATICS, v.14, no.9
- Issued Date
- 2026-04
- Abstract
- Recent progress in text-to-music generation has enabled high-quality audio synthesis from natural language prompts. However, such models are at risk of unintentionally replicating their training data, raising concerns regarding originality and intellectual property. While training-time mitigation strategies can address this issue, they typically require retraining or curated datasets, limiting their practicality for large-scale systems. Inference-time methods provide a lightweight alternative but often involve a trade-off between fidelity and memorization risk. This work introduces repulsive guidance (RG), a systematic inference-time mitigation strategy that reduces memorization without disrupting the intended conditional guidance from the text prompt. RG enforces divergence between dual diffusion trajectories through a repulsive term applied only during early denoising steps. Experiments on MusicBench with the Tango model demonstrate that RG offers a complementary mitigation strategy, providing new insights into balancing fidelity and memorization risk.
- Publisher
- MDPI
- DOI
- 10.3390/math14091512
- URI
- https://scholar.gist.ac.kr/handle/local/34147
Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.