WIKISER - Wikipedia Semantic Entity Relatedness

WIKISER - Wikipedia Semantic Entity Relatedness

By Joaquim Filipe
Posted 06 Jan 2014 | 11:14 GMT

The semantic relatedness of two concepts indicates how far apart these concepts are when represented in a conceptual network or taxonomy, by using in the computation of this distance value all the seman- tic relationships that exist between them(Ponzetto and Strube, 2007). These semantic relationships may be hierarchical (IS-A type, hypernymy, hyponomy), of equivalence (synonyms) or associative (cause/effect) (Jiang and Conrath, 1997). Semantic relatedness computation techniques may be grouped into two categories:

  1. Distributive measures, that rely on non-structured data, such as large corpora. The underlying hy- pothesis is that similar words appear in similar contexts, thus they must have similar meanings. These approaches capture relationships between words. 
  2. Measures based on structured databases, such as taxonomies or ontologies, where relationships be- tween semantic concepts are captured.

Our proposed measure takes into account both the distance between concepts and the relationship of the concepts with others in the taxonomy. It is then generalized to measure the relatedness between concept sets or entities. As defined in (Rodriguez and Egenhofer, 2003), the term entity refers to groups of concepts or objects in the real world that are somehow semantically related.

Our goal is then to present a cost function that allows us to make a decision on similarity between two or more entities, based on their representations as concept sets. As an ontological basis, we use the english version of Wikipedia to help establish semantic relationships between concepts and entities. In recent years, Wikipedia has been explored as a potential knowledge base for a number of information retrieval tasks, such as text categorization, named entity recognition and semantic relatedness computation (Zesch et al., 2008). However, a lot remains to be done.