These are ambitious questions with enormous relevance for
pure scholarship, scientific knowledge, and all of our daily lives.
And they are questions which those of us working on the Infomap
project are trying to answer, in practical and robust ways.
Our main approach is to build words and meanings into mathematical spaces, in such a way that the relationships between words in these spaces reflect the way words and meanings are related in documents. We design our methods so that the spaces can be built directly from the documents themselves, without a human intermediary saying which bit should go where.
There are two reasons for this. The first is practical: building lexical resources like dictionaries and thesauri is time-consuming and expensive. The number of documents available and the terminology used in different fields are growing so quickly that we need tools which can help with this task automatically, especially if we want to provide good-quality resources for more languages than just English. The second is theoretical. The fact that words can be learnt and used to refer to concepts is fascinating, and any abstract system that can perform aspects of this task may shed light on it.
One way of building words into an abstract space is to give each word a list of identifying "co-ordinates", which measure how important a certain feature or property is in defining that word. A good analogy is the way we use latitude and longitude for describing the location of a point on the earth's surface. One of the best things about these numbers is that two places which are close together have similar co-ordinates. In a similar way, Infomap seeks the best way to assign "meaning co-ordinates" to a word, influenced by the words nearby in a sentence or document.
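To make the idea of "meaning co-ordinates" concrete, here is a minimal sketch in Python. It is not the Infomap implementation: it assumes a toy scheme in which a word's co-ordinates are simply counts of the words that appear within a small window around it, and it compares two words by the cosine of the angle between their co-ordinate lists. The function names, the window size, and the three example sentences are all invented for illustration.

```python
from collections import Counter, defaultdict
import math


def cooccurrence_vectors(sentences, window=2):
    """Give each word sparse "co-ordinates": counts of nearby words.

    Assumption for illustration only: whitespace tokenisation and a
    fixed window of `window` words on each side.
    """
    vectors = defaultdict(Counter)
    for sentence in sentences:
        tokens = sentence.lower().split()
        for i, word in enumerate(tokens):
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            for j in range(lo, hi):
                if j != i:
                    vectors[word][tokens[j]] += 1
    return vectors


def cosine(u, v):
    """Cosine similarity between two sparse co-ordinate lists."""
    dot = sum(u[k] * v[k] for k in u if k in v)
    norm_u = math.sqrt(sum(x * x for x in u.values()))
    norm_v = math.sqrt(sum(x * x for x in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


# Hypothetical miniature "corpus".
sentences = [
    "the doctor treated the patient in the hospital",
    "the nurse treated the patient in the clinic",
    "the pilot landed the plane at the airport",
]

vectors = cooccurrence_vectors(sentences)
sim_close = cosine(vectors["doctor"], vectors["nurse"])
sim_far = cosine(vectors["doctor"], vectors["plane"])
print(sim_close, sim_far)
```

In this toy corpus "doctor" and "nurse" occur in identical surroundings, so their co-ordinates coincide and they score higher than "doctor" and "plane". Very common words such as "the" dominate the raw counts here; a realistic system would need to weight or filter them, which this sketch does not attempt.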
Follow this link to read a simple description of how we accomplish
this in practice. We are working to improve the sensitivity of our
models and to broaden the range of our applications, so we encourage
you to visit us regularly. You can test some of the resulting models
on our demonstrations, and we would welcome your feedback.
A more thorough description of our technique can be found in:
Yasuhiro Takayama, Raymond Flournoy, Stefan Kaufmann,
and Stanley Peters:
Information Mapping: Concept-based Information Retrieval based on Word
Associations.
Watch this space.