intersection of 2 strings

Antoine Logean

Hi,

What is the easiest way to get the intersection of two strings in
Python (a kind of "and" operator)?
ex:

string_1 = "the_car_of_my_fried_is_bigger_as_mine_but_my_girlfriend_is_more_beautifull"

string_2 = "my_girlfriend_is_more_beautifull_and_has_blue_eyes"

and the intersection :
string_1 "and" string_2 = "my_girlfriend_is_more_beautifull"

thanks for your help

Antoine
 
Larry Bates

At the risk of doing your homework,
one way would be:

list_1=string_1.split('_')
list_2=string_2.split('_')
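#
# The next line uses some "magic" to eliminate duplicates: it reads the
# list being built through an undocumented CPython 2.x internal name,
# so it may not work on other versions or implementations.
#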
list_3=[x for x in list_1 if x in list_2 if x not in
locals()['_[1]'].__self__]
string_3="_".join(list_3)
print string_3
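
A portable alternative to the list-comprehension trick above, as a minimal sketch. It assumes Python 3, and the helper name common_words is hypothetical (it is not from the original post). It keeps the word order of the first string and drops repeats by remembering which words were already emitted.

```python
def common_words(first, second, sep="_"):
    """Return the words of `first` that also occur in `second`.

    Order follows `first`; each word is reported only once.
    """
    wanted = set(second.split(sep))  # fast membership test
    seen = set()                     # words already emitted
    result = []
    for word in first.split(sep):
        if word in wanted and word not in seen:
            seen.add(word)
            result.append(word)
    return sep.join(result)


string_1 = "the_car_of_my_fried_is_bigger_as_mine_but_my_girlfriend_is_more_beautifull"
string_2 = "my_girlfriend_is_more_beautifull_and_has_blue_eyes"
print(common_words(string_1, string_2))  # my_is_girlfriend_more_beautifull
```

Because this is a word-level intersection, the first "my" and "is" of string_1 are picked up before "girlfriend", so the result is not the contiguous phrase from the question.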

If you have Python 2.3 you could also use the sets module,
and it would be even easier.

HTH,
Larry Bates
 
Cousin Stanley

You might try the sets module ....

The following provides the intersection,
but I don't know how to maintain the order
from the original strings ....

sk@cpq1 : /mnt/win_k/Python/py_Work/Sets $ python
Python 2.3.4 (#2, Jul 5 2004, 09:15:05)
[GCC 3.3.4 (Debian 1:3.3.4-2)] on linux2
Type "help", "copyright", "credits" or "license" for more information.
>>> str_1 = "the_car_of_my_fried_is_bigger_as_mine_but_my_girlfriend_is_more_beautifull"
>>> str_2 = "my_girlfriend_is_more_beautifull_and_has_blue_eyes"
>>> list_1 = str_1.split( '_' )
>>> list_2 = str_2.split( '_' )
>>> import sets
>>> set_1 = sets.Set( list_1 )
>>> set_2 = sets.Set( list_2 )
>>> is_12 = set_1.intersection( set_2 )
>>> print is_12
Set(['girlfriend', 'is', 'my', 'beautifull', 'more'])
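
One possible way to keep an order with sets, as a sketch only: take the set intersection, then sort the result by where each word first appears in one of the original lists. It assumes Python 3 (built-in set instead of the sets module), and the function name ordered_intersection is hypothetical.

```python
def ordered_intersection(words_a, words_b):
    """Intersect two word lists, ordered by first appearance in words_a."""
    common = set(words_a) & set(words_b)
    # list.index returns the position of the first occurrence
    return sorted(common, key=words_a.index)


str_1 = "the_car_of_my_fried_is_bigger_as_mine_but_my_girlfriend_is_more_beautifull"
str_2 = "my_girlfriend_is_more_beautifull_and_has_blue_eyes"
list_1 = str_1.split("_")
list_2 = str_2.split("_")

print(ordered_intersection(list_1, list_2))
# ['my', 'is', 'girlfriend', 'more', 'beautifull']
print(ordered_intersection(list_2, list_1))
# ['my', 'girlfriend', 'is', 'more', 'beautifull']
```

Which list drives the order matters: ordering by str_2 happens to reproduce the phrase from the question, ordering by str_1 does not.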
 
Jørgen Cederberg

Hi

difflib seems to be appropriate: http://docs.python.org/lib/module-difflib.html
and especially
http://docs.python.org/lib/sequence-matcher.html

Here is some code that works

from difflib import SequenceMatcher
s1 = "my_girlfriend_is_a_python"
s2 = "my_girlfriend_is_more_beautifull"
m = SequenceMatcher(None, s1, s2)
print "Matches between: %s and %s" % (s1, s2)
for match in m.get_matching_blocks():
    i, j, n = match
    if n > 1:  # We don't want to match single chars.
        print s1[i:i+n]
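
If only the single longest shared run is wanted, SequenceMatcher.find_longest_match can be used instead of looping over all matching blocks. The following is a minimal sketch for Python 3; the wrapper name longest_common_block is hypothetical, and it is tried here on the strings from the original question rather than on s1 and s2 above.

```python
from difflib import SequenceMatcher


def longest_common_block(a, b):
    """Return the longest contiguous run shared by a and b ('' if none)."""
    # autojunk=False switches off the popularity heuristic that difflib
    # applies to long sequences, so no characters are silently ignored.
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    match = matcher.find_longest_match(0, len(a), 0, len(b))
    return a[match.a:match.a + match.size]


string_1 = "the_car_of_my_fried_is_bigger_as_mine_but_my_girlfriend_is_more_beautifull"
string_2 = "my_girlfriend_is_more_beautifull_and_has_blue_eyes"
print(longest_common_block(string_1, string_2))
# my_girlfriend_is_more_beautifull
```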

HTH
Jorgen Cederberg
 
Tim Churches

From your example, I suspect that you really want to find the "longest
common subsequence" of the two strings, rather than the intersection,
because surely the first occurrences of "my" and "is" in string_1 also
qualify for being in the intersection set of string_1 and string_2.
It is a bit hard to know what you mean by "intersection" when it is not
clear whether you regard the strings as sets of characters or sets of
words.

There is a public domain implementation of an LCS algorithm by Yusuke
Shinyama at http://www.unixuser.org/~euske/python/lcs.py - and it
produces the result you are expecting from your test data.
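
The expected result in the question is one contiguous run of characters, which is usually called the longest common substring. As an illustration only (this is not Yusuke Shinyama's code, and the function name is hypothetical), a small dynamic-programming sketch in Python 3 could look like this:

```python
def longest_common_substring(a, b):
    """Return the longest contiguous string found in both a and b."""
    best_len = 0
    best_end = 0                 # end index (exclusive) of the best run in a
    prev = [0] * (len(b) + 1)    # run lengths for the previous row of a
    for i in range(1, len(a) + 1):
        curr = [0] * (len(b) + 1)
        for j in range(1, len(b) + 1):
            if a[i - 1] == b[j - 1]:
                # extend the run that ended at a[i-2], b[j-2]
                curr[j] = prev[j - 1] + 1
                if curr[j] > best_len:
                    best_len = curr[j]
                    best_end = i
        prev = curr
    return a[best_end - best_len:best_end]


string_1 = "the_car_of_my_fried_is_bigger_as_mine_but_my_girlfriend_is_more_beautifull"
string_2 = "my_girlfriend_is_more_beautifull_and_has_blue_eyes"
print(longest_common_substring(string_1, string_2))
# my_girlfriend_is_more_beautifull
```

It runs in time proportional to len(a) * len(b), which is fine for short strings like these.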

--

Tim C

 
Peter Abel

If you take Jørgen Cederberg's solution you can inherit from
str and create your own "&" operator.
...     def __and__(self, other):
...         m = SequenceMatcher(None, self, other)
...         equals = []
...         for (i, j, n) in m.get_matching_blocks():
...             if n > 1:
...                 equals.append(self[i:i+n])
...         return equals
...
['my_girlfriend_is_more_beautifull']
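
The class header and the call did not survive in the session above, so here is a self-contained sketch of the same idea for Python 3. The class name IntersectableStr is an assumption made up for this example, as is the choice of the question's strings as test data.

```python
from difflib import SequenceMatcher


class IntersectableStr(str):
    """A str subclass whose "&" returns the shared blocks with another string."""

    def __and__(self, other):
        matcher = SequenceMatcher(None, self, other)
        # keep only blocks longer than one character, as in the post above
        return [self[i:i + n]
                for i, j, n in matcher.get_matching_blocks()
                if n > 1]


string_1 = IntersectableStr(
    "the_car_of_my_fried_is_bigger_as_mine_but_my_girlfriend_is_more_beautifull")
string_2 = "my_girlfriend_is_more_beautifull_and_has_blue_eyes"
print(string_1 & string_2)
# ['my_girlfriend_is_more_beautifull']
```

Only the left operand needs to be an IntersectableStr; the right one can be a plain str.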

Regards
Peter
 
