A Document Clustering Approach for Automatic Building of Ontologies

In Proc. of the Second IEEE International Workshop on Data Science Engineering and its Applications, 2018.
In the context of globalization, companies need to capitalize on their knowledge. A company's knowledge exists in two forms: tacit and explicit. Explicit knowledge comprises all formalized information, i.e., all documents (PDF, Word, ...). Tacit knowledge is present in documents and in the minds of employees; it is not formalized, and a reasoning process is needed to discover it. In this paper, we propose a novel approach to document clustering that is based on word clusters automatically extracted from the documents. The word clusters are considered as concept candidates. In a second step, the word clusters are used to ease the automatic building of ontologies from a given corpus. Experiments validate the whole approach. The chosen corpus is Reuters-21578, which is among the most widely used for text categorization research.
Knowledge management, Ontologies, Document Clustering
Publication Category:
International conference with proceedings
