Please use this identifier to cite or link to this item:
Title: Semantic Topic Modelling
Authors: Ferrugento, Adriana Figueiredo 
Advisor: Oliveira, Hugo Ricardo Gonçalo
Keywords: Semantic Topic Modelling
Issue Date: 15-Jul-2015
Serial title, monograph or event: Semantic Topic Modelling
Place of publication or event: Coimbra
Abstract: Topic models have improved the way large sets of texts are searched, browsed and summarized. These models are used to uncover the main themes of the documents in a corpus, where each topic is a probability distribution over a collection of words that is representative of a document. The most widely used topic model is Latent Dirichlet Allocation (LDA), which allows a document to be characterized by more than one topic. This gives a more accurate representation of real documents, where a text may have more than one underlying theme. However, this popular model is still far from producing excellent topics, because it does not account for the semantic relations between words. It may thus produce redundant topics that contain different words with the same meaning. This thesis offers a way to improve the LDA algorithm by taking the semantics of words into account. The proposed model uses the LDA algorithm as a starting point and modifies it to introduce semantic relations. A main component of the proposed model is WordNet, a lexical database for English, whose content is used to integrate semantics into the model.
Description: Master's dissertation in Informatics Engineering presented to the Faculdade de Ciências e Tecnologia da Universidade de Coimbra
Rights: openAccess
Appears in Collections: UC - Dissertações de Mestrado
FCTUC Eng.Informática - Teses de Mestrado

Files in This Item:
File: Semantic Topic Modelling.pdf
Size: 1.02 MB
Format: Adobe PDF

Page view(s): 20 (checked on Sep 21, 2020)
Download(s): 50 (checked on Sep 21, 2020)

Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.