Max's Tech Blog


Explicit semantic analysis

Explicit semantic analysis (ESA) is a vectorial representation of text that uses a document corpus as a knowledge base.

In ESA, a word is represented as a column vector in the tf-idf matrix of the text corpus, and a document is represented as the centroid of the vectors representing its words. Typically, the text corpus is Wikipedia, though other corpora have been used.
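The representation described above can be sketched in a few lines of Python. This is a minimal illustration, not a reference implementation: the three-article `CONCEPTS` corpus is a made-up stand-in for Wikipedia, the helper names (`build_word_vectors`, `esa_vector`, `cosine`) are hypothetical, and the tf-idf weighting used here (raw count times `log(N / df)`) is only one of several common variants.

```python
import math
from collections import Counter

# Toy stand-in for Wikipedia: each "article" is one concept (made-up data).
CONCEPTS = {
    "Cat": "cat feline pet purr whiskers pet",
    "Dog": "dog canine pet bark fetch",
    "Computer": "computer cpu memory program keyboard",
}

def build_word_vectors(concepts):
    """Return (concept names, {word: tf-idf column over the concepts})."""
    names = list(concepts)
    counts = [Counter(concepts[name].split()) for name in names]
    vocab = set().union(*counts)
    n_docs = len(names)
    vectors = {}
    for word in vocab:
        df = sum(1 for c in counts if word in c)  # document frequency
        idf = math.log(n_docs / df)               # assumed idf variant
        # One entry per concept: the word's tf-idf weight in that article.
        vectors[word] = [c[word] * idf for c in counts]
    return names, vectors

def esa_vector(text, vectors, n_concepts):
    """Centroid of the word vectors for the in-vocabulary words of `text`."""
    known = [vectors[w] for w in text.lower().split() if w in vectors]
    if not known:
        return [0.0] * n_concepts
    return [sum(column) / len(known) for column in zip(*known)]

def cosine(u, v):
    """Cosine similarity between two concept vectors (0.0 if either is zero)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

names, vectors = build_word_vectors(CONCEPTS)
doc = esa_vector("cat purr whiskers", vectors, len(names))
top_concept = names[doc.index(max(doc))]
```

Each entry of `doc` is the weight of one concept, so the text is literally a weighted mix of the corpus articles. Two texts can then be compared with `cosine`, even when they share no words, as long as their words occur in the same articles.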

  1. Represent texts as a weighted mix of a predetermined set of natural concepts

  2. Natural concepts: defined by humans, easily explained
