Token Type
I am trying to solve the following exercise from http://nltk.googlecode.com/svn/trunk/doc/book/ch02.html:
★ Use one of the predefined similarity measures to score the similarity of each of the following pairs of words. Rank the pairs in order of decreasing similarity. How close is your ranking to the order given here, an order that was established experimentally by (Miller & Charles, 1998): car-automobile, gem-jewel, journey-voyage, boy-lad, coast-shore, asylum-madhouse, magician-wizard, midday-noon, furnace-stove, food-fruit, bird-cock, bird-crane, tool-implement, brother-monk, lad-brother, crane-implement, journey-car, monk-oracle, cemetery-woodland, food-rooster, coast-hill, forest-graveyard, shore-woodland, monk-slave, coast-forest, lad-wizard, chord-smile, glass-magician, rooster-voyage, noon-string.
(1) First, I put the word pairs in a list, e.g.
pairs = [('car', 'automobile'), ('gem', 'jewel'), ('journey', 'voyage')]. According to http://nltk.googlecode.com/svn/trunk/doc/book/ch02.html, I need to put them in the following format to calculate the semantic similarity: wn.synset('right_whale.n.01').path_similarity(wn.synset('minke_whale.n.01')).
So I need a loop that iterates over each element of pairs. How can I refer to the individual words inside each element? What is the index for 'car' and for 'automobile'? Thanks for your tips.
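A minimal sketch of the indexing in question, assuming the words are stored as strings: each element of the list is a tuple, so pairs[0][0] is 'car' and pairs[0][1] is 'automobile', and a for loop can unpack both words of a pair at once. The synset_name helper is hypothetical; it only builds the 'word.n.01' string that wn.synset expects and assumes the first noun sense is the one wanted.

```python
# Word pairs stored as a list of 2-tuples of strings.
pairs = [('car', 'automobile'), ('gem', 'jewel'), ('journey', 'voyage')]

# Index the list first, then the tuple.
first_word = pairs[0][0]
second_word = pairs[0][1]


def synset_name(word):
    """Hypothetical helper: name of the first noun sense, e.g. 'car.n.01'."""
    return word + '.n.01'


# Tuple unpacking gives both words of each pair inside the loop.
names = []
for word1, word2 in pairs:
    names.append((synset_name(word1), synset_name(word2)))

print(first_word, second_word)
print(names)
```

The strings collected in names could then be passed to wn.synset(...) and compared with path_similarity, as in the format quoted from the book.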
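(2) Since I couldn't solve the indexing problem above, I tried using a dictionary instead:
from nltk.corpus import wordnet as wn
pairs = {'car': 'automobile', 'gem': 'jewel', 'journey': 'voyage'}
for key in pairs:
    word1 = wn.synset(key + '.n.01')          # first noun sense of the key word
    word2 = wn.synset(pairs[key] + '.n.01')   # first noun sense of its partner
    similarity = word1.path_similarity(word2)
    print('%s-%s %s' % (key, pairs[key], similarity))
This prints:
car-automobile 1.0
journey-voyage 0.25
gem-jewel 0.125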
Now I can calculate the semantic similarity for each pair in the dictionary. However, the exercise requires ranking the pairs, so I want to sort the results by similarity value before printing them. Can dictionary elements be sorted by their values? How can I order the pairs (e.g. car-automobile, journey-voyage, gem-jewel) by their similarity value?
Thanks for your tips.
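One possible way to get the ranking, sketched with the three scores printed above hard-coded so that it runs without WordNet: the dictionary itself is not sorted, but sorted() can order its items by value. The scores name is an assumption for illustration; in the real exercise the values would come from path_similarity.

```python
# Similarity scores copied from the output above.
scores = {
    'car-automobile': 1.0,
    'journey-voyage': 0.25,
    'gem-jewel': 0.125,
}

# Sort the (pair, score) items by score, highest first.
ranked = sorted(scores.items(), key=lambda item: item[1], reverse=True)

for pair, score in ranked:
    print(pair, score)
```

The resulting order can then be compared against the Miller & Charles order quoted in the exercise.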