Word-Sense Disambiguation (Questionnaire 2)

2009/04/28 at 7:59 pm

Word sense disambiguation (commonly abbreviated WSD) plays a part in software that is designed to interpret language. Many words and sentences are ambiguous: they have more than one sense and can be understood in different ways, although only one sense is intended. The main goal of disambiguation is to work out the intended meaning.

For example, the written word “bass” has two distinct senses:

  1. bass = a kind of fish
  2. bass = tones of low frequency

When the word “bass” appears in two different sentences, we know which sense is meant in each:

  1. I went fishing for some sea “bass”.
  2. The “bass” line of the song is too weak.

A human knows that in the first sentence the word “bass” refers to a kind of fish, and that in the second sentence it refers to tones of low frequency.

However, we have to bear in mind that developing algorithms (precise rules, or sets of rules, specifying how to solve a problem) that reproduce this human ability is often a hard task.

Word-sense disambiguation systems are tested by comparing their results against human judgements. But we have to take into account that humans do not always agree on which sense is the correct one. Since human judgement is the standard, it is not reasonable to expect a machine (a computer, in this case) to perform better than humans do: agreement between human annotators is effectively the upper bound. Research based on coarse-grained sense distinctions tends to be more effective, since humans agree more readily on coarse-grained distinctions than on fine-grained ones.
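To make the evaluation idea concrete, here is a minimal sketch in Python. The sense labels and the `agreement` helper are invented for illustration and are not taken from any real benchmark; the sketch only shows how a system's score can be set beside the rate at which two human annotators agree with each other.

```python
def agreement(labels_a, labels_b):
    """Return the fraction of items on which two label sequences coincide."""
    if len(labels_a) != len(labels_b):
        raise ValueError("label sequences must have the same length")
    if not labels_a:
        raise ValueError("label sequences must not be empty")
    matches = sum(a == b for a, b in zip(labels_a, labels_b))
    return matches / len(labels_a)


# Hypothetical sense labels for five occurrences of "bass" (illustrative data).
annotator_1 = ["fish", "music", "music", "fish", "music"]
annotator_2 = ["fish", "music", "fish", "fish", "music"]
system = ["fish", "music", "music", "music", "fish"]

# How often the two humans agree with each other (4 of 5 items).
human_agreement = agreement(annotator_1, annotator_2)

# How often the system matches the first annotator (3 of 5 items).
system_accuracy = agreement(system, annotator_1)

print(f"human agreement: {human_agreement:.2f}")
print(f"system accuracy: {system_accuracy:.2f}")
```

With this toy data the humans agree on 80% of the items and the system matches the first annotator on 60%. Raw agreement is the simplest possible measure; it does not correct for agreement that would happen by chance.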

Finally, there are four conventional approaches to WSD:

  1. DICTIONARY- AND KNOWLEDGE-BASED METHODS: these methods rely mainly on dictionaries, thesauri (books of synonyms), and lexical knowledge bases.
  2. SUPERVISED METHODS: these methods are based on the hypothesis that the context provides enough evidence to disambiguate words; they learn from sense-annotated examples.
  3. SEMI-SUPERVISED METHODS: semi-supervised or minimally supervised methods use a secondary source of knowledge, such as a small annotated corpus used as seed data in a bootstrapping process, or a word-aligned bilingual corpus.
  4. UNSUPERVISED METHODS (or word sense discrimination): these methods avoid external information almost completely and work directly from raw, unannotated corpora.
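As an illustration of the first family (dictionary-based methods), the sketch below implements a simplified variant of the Lesk idea in Python: choose the sense whose dictionary gloss shares the most words with the sentence. The two glosses, the stopword list, and the names `SENSES`, `content_words` and `simplified_lesk` are assumptions made up for this example; a real system would draw its glosses from an actual dictionary or lexical knowledge base.

```python
import re

# Toy sense inventory with hypothetical glosses (not from a real dictionary).
SENSES = {
    "bass": {
        "fish": "a kind of fish caught in the sea or in fresh water",
        "music": "tones of low frequency in a song or on an instrument",
    }
}

# Small, hand-picked stopword list so that function words do not count as overlap.
STOPWORDS = {"a", "an", "the", "of", "in", "on", "or", "for", "is", "i", "some", "too", "to"}


def content_words(text):
    """Lowercase the text and return its set of non-stopword tokens."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def simplified_lesk(word, sentence, senses=SENSES):
    """Pick the sense of `word` whose gloss overlaps most with the sentence.

    Ties are resolved in favour of the sense listed first in the inventory.
    """
    context = content_words(sentence) - {word}
    best_sense, best_overlap = None, -1
    for sense, gloss in senses[word].items():
        overlap = len(context & content_words(gloss))
        if overlap > best_overlap:
            best_sense, best_overlap = sense, overlap
    return best_sense


print(simplified_lesk("bass", "I went fishing for some sea bass."))
print(simplified_lesk("bass", "The bass line of the song is too weak."))
```

On the two example sentences used earlier, the first is resolved to the fish sense through the shared word "sea" and the second to the music sense through "song". The sketch also shows the weakness of pure word overlap: "fishing" does not match "fish", so without stemming or richer glosses the decision rests on a single shared word.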


