RT Journal Article
T1 Improving large-scale k-nearest neighbor text categorization with label autoencoders
A1 Ribadas Pena, Francisco Jose
A1 Cao, Shuyuan
A1 Darriba Bilbao, Victor Manuel
K1 1203.17 Computer Science
K1 12 Mathematics
K1 3304 Computer Technology
AB In this paper, we introduce a multi-label lazy learning approach to deal with automatic semantic indexing in large document collections in the presence of complex and structured label vocabularies with high inter-label correlation. The proposed method is an evolution of the traditional k-Nearest Neighbors algorithm which uses a large autoencoder trained to map the large label space to a reduced size latent space and to regenerate the predicted labels from this latent space. We have evaluated our proposal in a large portion of the MEDLINE biomedical document collection which uses the Medical Subject Headings (MeSH) thesaurus as a controlled vocabulary. In our experiments we propose and evaluate several document representation approaches and different label autoencoder configurations.
PB Mathematics
SN 22277390
YR 2022
FD 2022-08-11
LK http://hdl.handle.net/11093/3804
UL http://hdl.handle.net/11093/3804
LA eng
NO Mathematics, 10(16): 2867 (2022)
NO Ministerio de Ciencia e Innovación | Ref. PID2020-113230RB-C22
DS Investigo
RD 04-Dec-2024