----- Original Message -----
From: "Mark Jeffcoat" <[email protected]>
Newsgroups: comp.lang.java.programmer
Sent: Monday, February 26, 2007 8:46 PM
Subject: Re: Algorithm considerations
Cool. Can you make your preposition list available? Your
problem is likely solved, but I'd be interested in looking at
the performance numbers for a couple of different approaches;
I'm curious about how much difference micro-optimizations
could really make in a case like this.
I'VE ENCLOSED IT AT THE BOTTOM AS SQL COMMANDS IN CASE SOMEONE WANTS TO USE
IT IN MYSQL AS I AM DOING.
On a tangent, I'd always write this function as returning a
String, returning where you have "//here's the match" (and
probably null on no match.) I think that style gives you
better functional decomposition, and makes it much simpler
to test startWithPrep in isolation.
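
A minimal sketch of the String-returning style suggested above. The signature, the class name PrepMatcher, and the short PREPS list are assumptions for illustration only; the original startWithPrep is not shown in this message. The list is ordered longest phrase first so that "in front of" is found before "in".

```java
import java.util.Arrays;
import java.util.List;

public class PrepMatcher {
    // Hypothetical subset of the preposition list, longest phrases first.
    static final List<String> PREPS =
        Arrays.asList("in front of", "according to", "in", "of");

    /**
     * Returns the preposition that begins at words[start],
     * or null when no preposition in the list matches there.
     */
    static String startWithPrep(String[] words, int start, List<String> preps) {
        for (String prep : preps) {
            String[] parts = prep.split(" ");
            if (start + parts.length > words.length) {
                continue; // phrase would run past the end of the sentence
            }
            boolean match = true;
            for (int i = 0; i < parts.length; i++) {
                if (!parts[i].equalsIgnoreCase(words[start + i])) {
                    match = false;
                    break;
                }
            }
            if (match) {
                return prep; // here's the match
            }
        }
        return null; // no match
    }

    public static void main(String[] args) {
        String[] words = {"He", "stood", "in", "front", "of", "us"};
        System.out.println(startWithPrep(words, 2, PREPS)); // prints "in front of"
        System.out.println(startWithPrep(words, 0, PREPS)); // prints "null"
    }
}
```

Because the function only reads its arguments and returns a value, a unit test can call it directly with a small word array and compare the result, with no tagging code involved.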
I'M ACTUALLY USING IT TO RETAG PARTS-OF-SPEECH, WITH BRILL POS TAGS. SO IN
MY INSTANTIATION OF IT, I AM PASSING THE FUNCTION ANOTHER ARRAY WHICH IS THE
TAGS CORRESPONDING TO THE WORDS, AND SWAPPING THE TAGS WITH 'IN' IN LIEU OF
WHAT THEY ARE (UNLESS THEY ARE ALREADY TAGGED 'CS').
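
A rough sketch of the retagging step described above, not the actual code in use. The names PrepRetagger, matchLength, and retag are hypothetical, and two details are assumptions: the longest matching phrase wins, and every word of a multi-word preposition receives the 'IN' tag. Tags that are already 'CS' are left alone.

```java
import java.util.Arrays;
import java.util.List;

public class PrepRetagger {
    /** Number of words in the longest preposition starting at words[start], or 0. */
    static int matchLength(String[] words, int start, List<String> preps) {
        int best = 0;
        for (String prep : preps) {
            String[] parts = prep.split(" ");
            if (parts.length <= best || start + parts.length > words.length) {
                continue;
            }
            boolean match = true;
            for (int i = 0; i < parts.length; i++) {
                if (!parts[i].equalsIgnoreCase(words[start + i])) {
                    match = false;
                    break;
                }
            }
            if (match) {
                best = parts.length;
            }
        }
        return best;
    }

    /** Overwrites tags with "IN" wherever a preposition is found, except "CS" tags. */
    static void retag(String[] words, String[] tags, List<String> preps) {
        int i = 0;
        while (i < words.length) {
            int n = matchLength(words, i, preps);
            if (n == 0) {
                i++;
                continue;
            }
            for (int k = i; k < i + n; k++) {
                if (!"CS".equals(tags[k])) {
                    tags[k] = "IN";
                }
            }
            i += n; // skip past the matched phrase
        }
    }

    public static void main(String[] args) {
        String[] words = {"She", "left", "after", "lunch", "in", "spite", "of", "rain"};
        String[] tags  = {"PRP", "VBD", "CS", "NN", "RB", "NN", "IN", "NN"};
        retag(words, tags, Arrays.asList("after", "in", "of", "in spite of"));
        // prints [PRP, VBD, CS, NN, IN, IN, IN, NN]
        System.out.println(Arrays.toString(tags));
    }
}
```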
HERE'S MY LIST. I AM NOT INCLUDING POSTPOSITIONS (LIKE 'ago').
ADDITIONALLY, MY alsoCS FIELD MEANS THAT THIS WORD IS ALSO OFTEN USED AS A
SUBORDINATING CONJUNCTION:
# --------------------------------------------------------
#
# Table structure for table `prepositions`
#
CREATE TABLE `prepositions` (
`id` int(3) NOT NULL auto_increment,
`preposition` varchar(40) NOT NULL default '',
`alsoCS` int(3) NOT NULL default '0',
PRIMARY KEY (`id`)
) ENGINE=InnoDB DEFAULT CHARSET=latin1;
#
# Dumping data for table `prepositions`
#
INSERT INTO `prepositions` VALUES (1, 'abaft', 0);
INSERT INTO `prepositions` VALUES (2, 'about', 0);
INSERT INTO `prepositions` VALUES (3, 'above', 0);
INSERT INTO `prepositions` VALUES (4, 'according to', 0);
INSERT INTO `prepositions` VALUES (5, 'across', 0);
INSERT INTO `prepositions` VALUES (6, 'after', 1);
INSERT INTO `prepositions` VALUES (7, 'against', 0);
INSERT INTO `prepositions` VALUES (8, 'ahead of', 0);
INSERT INTO `prepositions` VALUES (9, 'along', 0);
INSERT INTO `prepositions` VALUES (10, 'along with', 0);
INSERT INTO `prepositions` VALUES (11, 'alongside', 0);
INSERT INTO `prepositions` VALUES (12, 'amid', 0);
INSERT INTO `prepositions` VALUES (13, 'among', 0);
INSERT INTO `prepositions` VALUES (14, 'amongst', 0);
INSERT INTO `prepositions` VALUES (15, 'apart from', 0);
INSERT INTO `prepositions` VALUES (16, 'around', 0);
INSERT INTO `prepositions` VALUES (17, 'as', 1);
INSERT INTO `prepositions` VALUES (18, 'as far as', 0);
INSERT INTO `prepositions` VALUES (19, 'as well as', 0);
INSERT INTO `prepositions` VALUES (20, 'at', 0);
INSERT INTO `prepositions` VALUES (21, 'back of', 0);
INSERT INTO `prepositions` VALUES (22, 'before', 1);
INSERT INTO `prepositions` VALUES (23, 'behind', 0);
INSERT INTO `prepositions` VALUES (24, 'below', 0);
INSERT INTO `prepositions` VALUES (25, 'beneath', 0);
INSERT INTO `prepositions` VALUES (26, 'beside', 0);
INSERT INTO `prepositions` VALUES (27, 'between', 0);
INSERT INTO `prepositions` VALUES (28, 'beyond', 0);
INSERT INTO `prepositions` VALUES (29, 'but', 0);
INSERT INTO `prepositions` VALUES (30, 'by', 0);
INSERT INTO `prepositions` VALUES (31, 'concerning', 0);
INSERT INTO `prepositions` VALUES (32, 'contrary to', 0);
INSERT INTO `prepositions` VALUES (33, 'despite', 0);
INSERT INTO `prepositions` VALUES (34, 'down', 0);
INSERT INTO `prepositions` VALUES (35, 'during', 0);
INSERT INTO `prepositions` VALUES (36, 'except', 0);
INSERT INTO `prepositions` VALUES (37, 'excepting', 0);
INSERT INTO `prepositions` VALUES (38, 'for', 0);
INSERT INTO `prepositions` VALUES (39, 'from', 0);
INSERT INTO `prepositions` VALUES (40, 'in', 0);
INSERT INTO `prepositions` VALUES (41, 'in addition to', 0);
INSERT INTO `prepositions` VALUES (42, 'in back of', 0);
INSERT INTO `prepositions` VALUES (43, 'in front of', 0);
INSERT INTO `prepositions` VALUES (44, 'in lieu of', 0);
INSERT INTO `prepositions` VALUES (45, 'in place of', 0);
INSERT INTO `prepositions` VALUES (46, 'in regard to', 0);
INSERT INTO `prepositions` VALUES (47, 'in spite of', 0);
INSERT INTO `prepositions` VALUES (48, 'in view of', 0);
INSERT INTO `prepositions` VALUES (49, 'inside', 0);
INSERT INTO `prepositions` VALUES (50, 'instead of', 0);
INSERT INTO `prepositions` VALUES (51, 'into', 0);
INSERT INTO `prepositions` VALUES (52, 'like', 0);
INSERT INTO `prepositions` VALUES (53, 'near', 0);
INSERT INTO `prepositions` VALUES (54, 'of', 0);
INSERT INTO `prepositions` VALUES (55, 'off', 0);
INSERT INTO `prepositions` VALUES (56, 'on', 0);
INSERT INTO `prepositions` VALUES (57, 'on account of', 0);
INSERT INTO `prepositions` VALUES (58, 'out', 0);
INSERT INTO `prepositions` VALUES (59, 'out of', 0);
INSERT INTO `prepositions` VALUES (60, 'outside', 0);
INSERT INTO `prepositions` VALUES (61, 'over', 0);
INSERT INTO `prepositions` VALUES (62, 'past', 0);
INSERT INTO `prepositions` VALUES (63, 'rather than', 0);
INSERT INTO `prepositions` VALUES (64, 'regarding', 0);
INSERT INTO `prepositions` VALUES (65, 'round', 0);
INSERT INTO `prepositions` VALUES (66, 'since', 1);
INSERT INTO `prepositions` VALUES (67, 'through', 0);
INSERT INTO `prepositions` VALUES (68, 'throughout', 0);
INSERT INTO `prepositions` VALUES (69, 'till', 0);
INSERT INTO `prepositions` VALUES (70, 'to', 0);
INSERT INTO `prepositions` VALUES (71, 'together with', 0);
INSERT INTO `prepositions` VALUES (72, 'toward', 0);
INSERT INTO `prepositions` VALUES (73, 'towards', 0);
INSERT INTO `prepositions` VALUES (74, 'under', 0);
INSERT INTO `prepositions` VALUES (75, 'underneath', 0);
INSERT INTO `prepositions` VALUES (76, 'until', 1);
INSERT INTO `prepositions` VALUES (77, 'unto', 0);
INSERT INTO `prepositions` VALUES (78, 'up', 0);
INSERT INTO `prepositions` VALUES (79, 'up to', 0);
INSERT INTO `prepositions` VALUES (80, 'upon', 0);
INSERT INTO `prepositions` VALUES (81, 'versus', 0);
INSERT INTO `prepositions` VALUES (82, 'via', 0);
INSERT INTO `prepositions` VALUES (83, 'with', 0);
INSERT INTO `prepositions` VALUES (84, 'with regard to', 0);
INSERT INTO `prepositions` VALUES (85, 'within', 0);
INSERT INTO `prepositions` VALUES (86, 'without', 0);
INSERT INTO `prepositions` VALUES (87, 'worth', 0);
INSERT INTO `prepositions` VALUES (88, 'with regards to', 0);
INSERT INTO `prepositions` VALUES (89, 'aboard', 0);
INSERT INTO `prepositions` VALUES (90, 'absent', 0);
INSERT INTO `prepositions` VALUES (91, 'amidst', 0);
INSERT INTO `prepositions` VALUES (92, 'astride', 0);
INSERT INTO `prepositions` VALUES (93, 'atop', 0);
INSERT INTO `prepositions` VALUES (94, 'besides', 0);
INSERT INTO `prepositions` VALUES (95, 'following', 0);
INSERT INTO `prepositions` VALUES (96, 'notwithstanding', 0);
INSERT INTO `prepositions` VALUES (97, 'mid', 0);
INSERT INTO `prepositions` VALUES (98, 'minus', 0);
INSERT INTO `prepositions` VALUES (99, 'onto', 0);
INSERT INTO `prepositions` VALUES (100, 'opposite', 0);
INSERT INTO `prepositions` VALUES (101, 're', 0);
INSERT INTO `prepositions` VALUES (102, 'subsequent to', 0);
INSERT INTO `prepositions` VALUES (103, 'prior to', 0);
INSERT INTO `prepositions` VALUES (104, 'next to', 0);
INSERT INTO `prepositions` VALUES (105, 'near to', 0);
INSERT INTO `prepositions` VALUES (106, 'owing to', 0);
INSERT INTO `prepositions` VALUES (107, 'outside of', 0);
INSERT INTO `prepositions` VALUES (108, 'on to', 0);
INSERT INTO `prepositions` VALUES (109, 'in to', 0);
INSERT INTO `prepositions` VALUES (110, 'inside of', 0);
INSERT INTO `prepositions` VALUES (111, 'far from', 0);
INSERT INTO `prepositions` VALUES (112, 'as to', 0);
INSERT INTO `prepositions` VALUES (113, 'aside from', 0);
INSERT INTO `prepositions` VALUES (114, 'because of', 0);
INSERT INTO `prepositions` VALUES (115, 'close to', 0);
INSERT INTO `prepositions` VALUES (116, 'due to', 0);
INSERT INTO `prepositions` VALUES (117, 'by means of', 0);
INSERT INTO `prepositions` VALUES (118, 'in accordance with', 0);
INSERT INTO `prepositions` VALUES (119, 'on behalf of', 0);
INSERT INTO `prepositions` VALUES (120, 'on top of', 0);
INSERT INTO `prepositions` VALUES (121, 'in case of', 0);
INSERT INTO `prepositions` VALUES (122, 'betwixt', 0);
INSERT INTO `prepositions` VALUES (123, 'circa', 0);
INSERT INTO `prepositions` VALUES (124, 'anti', 0);
INSERT INTO `prepositions` VALUES (125, 'cum', 0);
INSERT INTO `prepositions` VALUES (126, 'per', 0);
INSERT INTO `prepositions` VALUES (127, 'qua', 0);
INSERT INTO `prepositions` VALUES (128, 'sans', 0);
INSERT INTO `prepositions` VALUES (129, 'vis-a-vis', 0);
INSERT INTO `prepositions` VALUES (130, 'vis a vis', 0);