Repulsive Guidance for Memorization Mitigation in Text-to-Music Diffusion Models

Author(s)
Kim, Taehyeon; Lee, Hangyeol; Ahn, Chang Wook; Kim, Man-Je
Type
Article
Citation
MATHEMATICS, v.14, no.9
Issued Date
2026-04
Abstract
Recent progress in text-to-music generation has enabled high-quality audio synthesis from natural language prompts. However, such models are at risk of unintended replication, raising concerns regarding originality and intellectual property. While training-time mitigation strategies can address this issue, they typically require retraining or curated datasets, limiting their practicality for large-scale systems. Inference-time methods provide a lightweight alternative but often involve a trade-off between fidelity and memorization risk. This work introduces repulsive guidance (RG), a systematic inference-time mitigation strategy that reduces memorization without disrupting the intended conditional guidance from the text prompt. RG operates by enforcing divergence between dual diffusion trajectories through a repulsive term applied only during early denoising steps, without reversing the conditional guidance from the prompt. Experiments on MusicBench with the Tango model demonstrate that RG offers a complementary mitigation strategy, providing new insights into balancing fidelity and memorization risk.
Publisher
MDPI
DOI
10.3390/math14091512
URI
https://scholar.gist.ac.kr/handle/local/34147
Access and License
  • Access type: Open
File List
  • No related files exist.

Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.