Michael Granger
Hi fellow Rubyists,
I'd like to announce the second release of the Linguistics module, a
generic, language-neutral framework for extending Ruby objects with
linguistic methods.
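For readers new to the module: the `.en` calls in the examples below go through a per-language accessor on the object. The following is only a minimal sketch of one way such an accessor could be built, using a forwarding proxy. It is not the module's source, and `ToyEN`, `ToyProxy`, `#toy_en`, and `shout` are hypothetical names invented for illustration.

```ruby
# Hypothetical sketch of a language proxy; none of these names come from
# the Linguistics module itself.
module ToyEN
  # A "linguistic" function takes the wrapped object as its first argument.
  def self.shout(obj)
    obj.to_s.upcase + "!"
  end
end

class ToyProxy
  def initialize(obj, language_module)
    @obj = obj
    @language_module = language_module
  end

  # Forward any call the language module understands, passing the wrapped
  # object along as the first argument.
  def method_missing(name, *args, &block)
    if @language_module.respond_to?(name)
      @language_module.public_send(name, @obj, *args, &block)
    else
      super
    end
  end

  def respond_to_missing?(name, include_private = false)
    @language_module.respond_to?(name) || super
  end
end

class String
  # Stand-in for a per-language accessor such as #en.
  def toy_en
    ToyProxy.new(self, ToyEN)
  end
end

puts "hello".toy_en.shout
```

Keeping the functions in a separate module means the core classes gain only one accessor method per language, rather than one method per linguistic function.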
This release fixes some of the bugs that were discovered since the last
version, and adds a few new features:
== Infinitives
New in version 0.02:
"leaving".en.infinitive
# => "leave"
"left".en.infinitive
# => "leave"
"leaving".en.infinitive.suffix
# => "ing"
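The `suffix` call above shows that the returned infinitive remembers which ending was removed. As a rough sketch of how a result like that could be modeled (an assumption, not the module's implementation), a `String` subclass can carry the suffix alongside the word. `ToyInfinitive`, `toy_infinitive`, and the tiny rule table are all made up for this example.

```ruby
# Toy illustration only: a result string that remembers which suffix was
# stripped. The rule table is invented and far smaller than a real one.
class ToyInfinitive < String
  attr_reader :suffix

  def initialize(word, suffix)
    super(word)
    @suffix = suffix
  end
end

# Irregular forms are looked up directly.
TOY_IRREGULAR = { "left" => "leave" }.freeze

# Each rule: [pattern for the inflected ending, replacement, suffix name].
TOY_RULES = [
  [/ving\z/, "ve", "ing"],
  [/ing\z/,  "",   "ing"],
  [/ed\z/,   "",   "ed"],
].freeze

def toy_infinitive(word)
  if TOY_IRREGULAR.key?(word)
    return ToyInfinitive.new(TOY_IRREGULAR[word], "")
  end

  # The first matching rule wins, so more specific patterns come first.
  TOY_RULES.each do |pattern, replacement, suffix|
    if word =~ pattern
      return ToyInfinitive.new(word.sub(pattern, replacement), suffix)
    end
  end

  ToyInfinitive.new(word, "")
end

result = toy_infinitive("leaving")
puts result          # leave
puts result.suffix   # ing
```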
== WordNet® Integration
Also new in version 0.02: if you have the Ruby-WordNet module installed, you
can look up WordNet synsets using the Linguistics interface:
# Test to be sure the WordNet module loaded okay.
Linguistics::EN.has_wordnet?
# => true
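# A check like the one above is typically backed by an attempted require that
# tolerates failure. The sketch below shows that general Ruby idiom only;
# `optional_library_available?` is a hypothetical helper, not part of
# Linguistics, and the second library name is deliberately nonexistent.

```ruby
# Sketch of optional-dependency detection with require/rescue.
def optional_library_available?(name)
  require name
  true
rescue LoadError
  # The library is not installed; report that instead of raising.
  false
end

puts optional_library_available?("set")
puts optional_library_available?("no_such_library_for_this_demo")
```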
# Fetch the default synset for the word "balance"
"balance".en.synset
# => #<WordNet::Synset:0x40376844 balance (noun): "a state of equilibrium"
#    (derivations: 3, antonyms: 1, hypernyms: 1, hyponyms: 3)>
# Fetch the synset for the first verb sense of "balance"
"balance".en.synset( :verb )
# => #<WordNet::Synset:0x4033f448 balance, equilibrate, equilibrize,
#    equilibrise (verb): "bring into balance or equilibrium; "She has to
#    balance work and her domestic duties"; "balance the two weights""
#    (derivations: 7, antonyms: 1, verbGroups: 2, hypernyms: 1, hyponyms: 5)>
# Fetch the second noun sense
"balance".en.synset( 2, :noun )
# => #<WordNet::Synset:0x404ebb24 balance (noun): "a scale for weighing;
#    depends on pull of gravity" (hypernyms: 1, hyponyms: 5)>
# Fetch the second noun sense's hypernyms (more-general words, like a
# superclass)
"balance".en.synset( 2, :noun ).hypernyms
# => [#<WordNet::Synset:0x404e5620 scale, weighing machine (noun): "a
#    measuring instrument for weighing; shows amount of mass"
#    (derivations: 2, hypernyms: 1, hyponyms: 2)>]
# A simpler way of doing the same thing:
"balance".en.hypernyms( 2, :noun )
# => [#<WordNet::Synset:0x404e5620 scale, weighing machine (noun): "a
#    measuring instrument for weighing; shows amount of mass"
#    (derivations: 2, hypernyms: 1, hyponyms: 2)>]
# Fetch the first hypernym's hypernyms
"balance".en.synset( 2, :noun ).hypernyms.first.hypernyms
# => [#<WordNet::Synset:0x404c60b8 measuring instrument, measuring system,
#    measuring device (noun): "instrument that shows the extent or amount
#    or quantity or degree of something" (hypernyms: 1, hyponyms: 83)>]
# Find the synset to which both the second noun sense of "balance" and the
# default sense of "shovel" belong.
("balance".en.synset( 2, :noun ) | "shovel".en.synset)
# => #<WordNet::Synset:0x40473da4 instrumentality, instrumentation (noun):
#    "an artifact (or system of artifacts) that is instrumental in
#    accomplishing some end" (derivations: 1, hypernyms: 1, hyponyms: 13)>
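# One way to picture the lookup above is as a search for the nearest shared
# ancestor in the hypernym hierarchy. The sketch below does that over a toy
# table and is not how Ruby-WordNet implements `|`. The names
# `TOY_HYPERNYM`, `toy_ancestors`, and `toy_common_hypernym` are
# hypothetical; only the "balance" -> "scale" -> "measuring instrument"
# links and the shared "instrumentality" ancestor come from the output
# above, and the intermediate entries are filled in for illustration.

```ruby
# Toy hypernym table (child => parent), standing in for real WordNet data.
TOY_HYPERNYM = {
  "balance"              => "scale",
  "scale"                => "measuring instrument",
  "measuring instrument" => "instrument",
  "instrument"           => "device",
  "device"               => "instrumentality",
  "shovel"               => "hand tool",
  "hand tool"            => "tool",
  "tool"                 => "implement",
  "implement"            => "instrumentality",
}.freeze

# Walk from a word up to the root, collecting every ancestor on the way.
def toy_ancestors(word)
  chain = []
  while (word = TOY_HYPERNYM[word])
    chain << word
  end
  chain
end

# The first ancestor of `a` that is also an ancestor of `b`.
def toy_common_hypernym(a, b)
  others = toy_ancestors(b)
  toy_ancestors(a).find { |candidate| others.include?(candidate) }
end

puts toy_common_hypernym("balance", "shovel")   # instrumentality
```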
# Fetch just the words for the other kinds of "instruments"
"instrument".en.hyponyms.collect {|synset| synset.words}.flatten
# => ["analyzer", "analyser", "cautery", "cauterant", "drafting instrument",
#    "extractor", "instrument of execution", "instrument of punishment",
#    "measuring instrument", "measuring system", "measuring device",
#    "medical instrument", "navigational instrument", "optical instrument",
#    "plotter", "scientific instrument", "sonograph", "surveying instrument",
#    "surveyor's instrument", "tracer", "weapon", "arm", "weapon system",
#    "whip"]
There are many more WordNet methods supported – too many to list here. See
the documentation for the complete list.
== LinkParser Integration
Another new feature in version 0.02 is integration with the Ruby version of
the CMU Link Grammar Parser by Martin Chase. If you have the LinkParser module
installed, you can create linkages from English sentences that let you query
for parts of speech:
# Test to see whether or not the link parser is loaded.
Linguistics::EN.has_link_parser?
# => true
# Diagram the first linkage for a test sentence
puts "he is a big dog".en.sentence.linkages.first.to_s
    +---O*---+
    | +--Ds--+
 +Ss+ |  +-A-+
 |  | |  |   |
he is a big dog
# Find the verb in the sentence
"he is a big dog".en.sentence.verb.to_s
# => "is"
# Combined infinitive + LinkParser: Find the infinitive form of the verb of
# the given sentence.
"he is a big dog".en.sentence.verb.infinitive
# => "be"
# Find the direct object of the sentence
"he is a big dog".en.sentence.object.to_s
# => "dog"
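# The verb and object queries above can be read straight off the link
# labels in the diagram: in link grammar an S link joins a subject to its
# verb and an O link joins a verb to its object. The sketch below models
# the diagrammed linkage as plain data to show that idea. `ToyLink`,
# `TOY_LINKAGE`, `toy_verb`, and `toy_object` are hypothetical names and
# are not the LinkParser API.

```ruby
# One link between two words, as drawn in the diagram.
ToyLink = Struct.new(:label, :left, :right)

TOY_LINKAGE = [
  ToyLink.new("Ss", "he",  "is"),
  ToyLink.new("O*", "is",  "dog"),
  ToyLink.new("Ds", "a",   "dog"),
  ToyLink.new("A",  "big", "dog"),
].freeze

# The verb is the right-hand word of the first S (subject) link.
def toy_verb(linkage)
  link = linkage.find { |l| l.label.start_with?("S") }
  link && link.right
end

# The object is the right-hand word of the first O (object) link.
def toy_object(linkage)
  link = linkage.find { |l| l.label.start_with?("O") }
  link && link.right
end

puts toy_verb(TOY_LINKAGE)     # is
puts toy_object(TOY_LINKAGE)   # dog
```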
# Look at the raw LinkParser::Word for the direct object of the sentence.
"he is a big dog".en.sentence.object
# => #<LinkParser::Word:0x403da0a0 @definition=[[{@A-}, Ds-, {@M+}, J-],
#    [{@A-}, Ds-, {@M+}, Os-], [{@A-}, Ds-, {@M+}, Ss+, {@CO-}, {C-}],
#    [{@A-}, Ds-, {@M+}, Ss+, R-], [{@A-}, Ds-, {@M+}, SIs-],
#    [{@A-}, Ds-, {R+}, {Bs+}, J-], [{@A-}, Ds-, {R+}, {Bs+}, Os-],
#    [{@A-}, Ds-, {R+}, {Bs+}, Ss+, {@CO-}, {C-}],
#    [{@A-}, Ds-, {R+}, {Bs+}, Ss+, R-], [{@A-}, Ds-, {R+}, {Bs+}, SIs-]],
#    @right=[], @suffix="",
#    @left=[#<LinkParser::Connection:0x403da028
#    @rword=#<LinkParser::Word:0x403da0a0 ...>,
#    @lword=#<LinkParser::Word:0x403da0b4 @definition=[[Ss-, O+, {@MV+}],
#    [Ss-, B-, {@MV+}], [Ss-, P+], [Ss-, AF-], [RS-, Bs-, O+, {@MV+}],
#    [RS-, Bs-, B-, {@MV+}], [RS-, Bs-, P+], [RS-, Bs-, AF-],
#    [{Q-}, SIs+, O+, {@MV+}], [{Q-}, SIs+, B-, {@MV+}], [{Q-}, SIs+, P+],
#    [{Q-}, SIs+, AF-]], @right=[#<LinkParser::Connection:0x403da028 ...>],
#    @suffix="", @left=[], @name="is", @position=1>, @subName="*", @name="O",
#    @length=3>], @name="dog", @position=4>
# Combine WordNet + LinkParser to find the definition of the direct object
# of the sentence
"he is a big dog".en.sentence.object.gloss
# => "a member of the genus Canis (probably descended from the common wolf)
#    that has been domesticated by man since prehistoric times; occurs in
#    many breeds; \"the dog barked all night\""
To find out more, visit the project's home page:
<http://www.devEiate.org/code/linguistics.html>
You can also download the module directly from:
<http://www.devEiate.org/code/Linguistics-0.02.tar.gz>
Thanks for your time.