An FCA-based method for multilingual documents clustering

Abstract:

Cross-Lingual Information Retrieval (CLIR) is a subfield of Information Retrieval (IR). It aims to identify relevant documents written in a language different from that of the user's query. One promising strategy for improving the performance of CLIR is to use an appropriate document-clustering method that retrieves multilingual documents. In this paper, we propose an FCA-based method for Multilingual Document Clustering (MDC). This method uses Formal Concept Analysis (FCA) to model corpus content, Latent Semantic Indexing (LSI) to define comparability relationships between documents in different languages, and Relational Concept Analysis (RCA) to upgrade the initial lattices using these comparability relationships. The resulting lattices, called upgraded lattices, are then used in the retrieval process.
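
To make the FCA modeling step more concrete, the following minimal sketch enumerates the formal concepts of a tiny document-term context. It is illustrative only: the documents (d1, d2, d3), their terms, and the function name formal_concepts are hypothetical, and the naive enumeration over document subsets is a stand-in chosen for brevity, not the lattice construction algorithm used in this paper. The LSI and RCA steps are not covered.

```python
from itertools import combinations


def formal_concepts(context):
    """Return all formal concepts (extent, intent) of a formal context.

    `context` maps each object (here, a document id) to its set of
    attributes (here, index terms). Naive enumeration: exponential in
    the number of objects, so only suitable for toy examples.
    """
    objects = list(context)
    all_attrs = set().union(*context.values()) if context else set()

    def extent(attrs):
        # Objects that carry every attribute in `attrs`.
        return frozenset(o for o in objects if attrs <= context[o])

    def intent(objs):
        # Attributes shared by every object in `objs`.
        if not objs:
            return frozenset(all_attrs)
        return frozenset(set.intersection(*(set(context[o]) for o in objs)))

    concepts = set()
    for r in range(len(objects) + 1):
        for subset in combinations(objects, r):
            b = intent(subset)   # common attributes of the subset
            a = extent(b)        # closure: all objects sharing them
            concepts.add((a, b))
    return concepts


# Hypothetical toy context: three documents described by index terms.
toy_context = {
    "d1": {"retrieval", "query"},
    "d2": {"retrieval", "cluster"},
    "d3": {"cluster", "lattice"},
}

concepts = formal_concepts(toy_context)
for ext, itt in sorted(concepts, key=lambda c: (len(c[0]), sorted(c[0]))):
    print(sorted(ext), sorted(itt))
```

Each printed pair is one node of the concept lattice: a maximal set of documents together with the terms they all share. Ordering these pairs by inclusion of their document sets gives the lattice structure that the method builds on.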