
NCWP: Unsupervised Semantic Embedding Alignment Post-processing for Improving RAG in Language Models

Author(s)
GangHo Lee
Type
Thesis
Degree
Master
Department
College of Information and Computing, Department of AI Convergence
Advisor
Lee, Yong-Gu
Abstract
This paper investigates structural limitations of large language model embeddings in Retrieval-Augmented Generation (RAG) systems, focusing on anisotropy and high dimensionality. When most variance is concentrated in a few principal directions, cosine similarity becomes distorted; at the same time, thousand-dimensional vectors incur substantial memory, indexing, and latency costs. Classical post-processing methods such as mean-centering, PCA/LPP, whitening, and random projection can partially restore isotropy by rescaling variances, but they do not explicitly learn to preserve neighborhood structure and rankings without labels. Contrastive fine-tuning of encoders can improve retrieval, yet it requires updating the whole model and is expensive to deploy.
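A minimal NumPy sketch of the classical whitening step discussed above, using a ZCA map with a shrinkage-regularized covariance (the kind of transform NCWP builds on). The `shrink` and `eps` values here are illustrative assumptions, not parameters taken from the thesis:

```python
import numpy as np

def zca_shrink_whiten(X, shrink=0.1, eps=1e-8):
    """Mean-center X (n, d), then apply ZCA whitening with a shrinkage-
    regularized covariance so small eigenvalues are not over-amplified.
    shrink/eps are illustrative, not the thesis's settings."""
    Xc = X - X.mean(axis=0, keepdims=True)
    d = Xc.shape[1]
    cov = Xc.T @ Xc / (len(Xc) - 1)
    # Shrink toward a scaled identity (average eigenvalue on the diagonal).
    cov = (1 - shrink) * cov + shrink * (np.trace(cov) / d) * np.eye(d)
    vals, vecs = np.linalg.eigh(cov)
    zca = vecs @ np.diag(1.0 / np.sqrt(vals + eps)) @ vecs.T  # symmetric ZCA map
    return Xc @ zca

# Toy anisotropic data: two directions carry most of the variance,
# mimicking the concentration of variance the abstract describes.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 8)) * np.array([10.0, 5.0, 1, 1, 1, 1, 1, 1])
Xw = zca_shrink_whiten(X)
```

After whitening, the output covariance is far closer to the identity than the input's, which is what restores meaningful cosine similarities; shrinkage trades exact isotropy for numerical stability on the small eigenvalues.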

To address this, the paper proposes NCWP (Neighbour-Contrastive Whitening Projection), a purely post-hoc method that keeps the backbone language model frozen and learns only a single linear projection 𝑊. NCWP first applies ZCA-shrink whitening to obtain an approximately isotropic initial space, then constructs positive pairs and hard negatives from k-nearest neighbors and trains 𝑊 with an InfoNCE-style contrastive loss. Output covariance regularization, orthogonal regularization, and periodic QR retraction are used to prevent collapse and maintain isotropy even at low dimensions. Experiments on a synthetic sentence corpus with STS-based labels (STS-Embed) and traditional IR-based labels (U2/U3 from TF-IDF, Jaccard, BM25) show that NCWP consistently outperforms PCA-Whitening, LPP, and Random Projection in mAP and nDCG@10, with particularly large gains for low dimensions (𝑟 ≤64). While the base model exhibits anisotropy_mean around 0.39, NCWP reduces this value to near zero across all tested dimensions, and self_sim decreases, indicating stronger separation between non-matching sentences. At the same time, dimensionality reduction with NCWP reduces latency and increases QPS by up to 2–3 times, yielding a better quality–efficiency Pareto trade-off than existing post-processing methods. These results demonstrate that NCWP is a practical embedding post-processing strategy for improving RAG retrieval quality without modifying or fine-tuning the underlying language model.
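The training loop the abstract describes (a frozen backbone, a single learned projection W, neighbor-derived positives, an InfoNCE-style loss, orthogonal regularization, and periodic QR retraction) can be sketched roughly as follows. This is an assumption-laden illustration, not the thesis's implementation: random vectors stand in for frozen whitened embeddings, in-batch negatives replace mined hard negatives, and the batch size, temperature, learning rate, and regularization weight are arbitrary choices:

```python
import torch

torch.manual_seed(0)
n, d, r = 512, 64, 16                     # corpus size, input dim, target dim (illustrative)
X = torch.randn(n, d)                     # stand-in for frozen, already-whitened embeddings
X = torch.nn.functional.normalize(X, dim=1)

# Positive pairs from nearest neighbours in the input space (k=1 for brevity).
sim = X @ X.T
sim.fill_diagonal_(-float("inf"))
pos_idx = sim.argmax(dim=1)

W = torch.nn.Parameter(torch.randn(d, r) / d**0.5)  # the only trainable parameter
opt = torch.optim.Adam([W], lr=1e-2)
tau, lam = 0.07, 1.0                      # temperature, orthogonality weight (assumed)

for step in range(200):
    batch = torch.randint(0, n, (128,))
    z = torch.nn.functional.normalize(X[batch] @ W, dim=1)
    zp = torch.nn.functional.normalize(X[pos_idx[batch]] @ W, dim=1)
    logits = z @ zp.T / tau               # InfoNCE with in-batch negatives
    loss = torch.nn.functional.cross_entropy(logits, torch.arange(len(batch)))
    loss = loss + lam * ((W.T @ W - torch.eye(r)) ** 2).sum()  # orthogonal regularizer
    opt.zero_grad(); loss.backward(); opt.step()
    if step % 50 == 49:                   # periodic QR retraction: snap W back to
        with torch.no_grad():             # orthonormal columns to prevent collapse
            Q, _ = torch.linalg.qr(W)
            W.copy_(Q)
```

The retraction keeps the learned map close to an isometry, which is why the projected space stays near-isotropic even at low target dimensions; retrieval then runs on `X @ W` instead of the original embeddings.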
URI
https://scholar.gist.ac.kr/handle/local/33792
Fulltext
http://gist.dcollection.net/common/orgView/200000952386
Alternative Author(s)
이강호
Appears in Collections:
Department of AI Convergence > 3. Theses(Master)
Access and License
  • Access type: Open
File list
  • No related files available.

Items in Repository are protected by copyright, with all rights reserved, unless otherwise indicated.