Concatenating dictionary values and keys, and further operations

Girish Sahani · Jun 5, 2006

I wrote the following code to concatenate every 2 keys of a dictionary and
their corresponding values.
E.g. if I have tiDict1 = {'a':[1,2],'b':[3,4,5]} I should get
tiDict2 = {'ab':[1,2][3,4,5]}, and similarly for dicts with a larger number
of features.
Now I want to check each pair to see if they are connected. Each pair takes
one element from the first list and one from the second, e.g.
for 'ab' I want to check if 1 and 3 are connected, then 1 and 4, then 1 and
5, then 2 and 3, then 2 and 4, then 2 and 5.
The connection information is in a text file as follows:
1,'a',2,'b'
3,'a',5,'a'
3,'a',6,'a'
3,'a',7,'b'
8,'a',7,'b'
..
..
This means 1 (type 'a') and 2 (type 'b') are connected, 3 and 5 are connected,
and so on.
I am not able to figure out how to do this. Any pointers would be helpful.
Here is the code I have written so far:

Code:

def genTI(tiDict):
    tiDict1 = {}
    tiList = [tiDict1.keys(),tiDict1.values()]
    length =len(tiDict1.keys())-1
    for i in range(0,length,1):
        for j in range(0,length,1):
            for k in range(1,length+1,1):
                if j+k <= length:
                    key = tiList[i][j] + tiList[i][j+k]
                    value = [tiList[i+1][j],tiList[i+1][j+k]]
                    tiDict2[key] = value
                    continue
                continue
            continue
        return tiDict2

Thanks in advance,
girish

Gerard Flanagan · Jun 5, 2006

It seems you want the Cartesian product of every pair of lists in the
dictionary, including the product of lists with themselves (but you
don't say why ;-)).

I'm not sure the following is exactly what you want or if it is very
efficient, but maybe it will start you off. It uses a function
'xcombine' taken from a recipe in the ASPN cookbook by David
Klaffenbach (2004).

(It should give every possibility, which you then check in your file)

Gerard

-------------------------------------------------------------------------

def nkRange(n,k):
    m = n - k + 1
    indexer = range(0, k)
    vector = range(1, k+1)
    last = range(m, n+1)
    yield vector
    while vector != last:
        high_value = -1
        high_index = -1
        for i in indexer:
            val = vector[i]
            if val > high_value and val < m + i:
                high_value = val
                high_index = i
        for j in range(k - high_index):
            vector[j+high_index] = high_value + j + 1
        yield vector

def kSubsets( alist, k ):
    n = len(alist)
    for vector in nkRange(n, k):
        ret = []
        for i in vector:
            ret.append( alist[i-1] )
        yield ret

data = { 'a': [1,2], 'b': [3,4,5], 'c': [1,4,7] }

pairs = list( kSubsets(data.keys(),2) ) + [ [k,k] for k in data.iterkeys() ]
print pairs
for s in pairs:
    # xcombine comes from the ASPN cookbook recipe mentioned above
    for t in xcombine( data[s[0]], data[s[1]] ):
        print "%s,'%s',%s,'%s'" % ( t[0], s[0], t[1], s[1] )

-------------------------------------------------------------------------

1,'a',1,'c'
1,'a',4,'c'
1,'a',7,'c'
2,'a',1,'c'
2,'a',4,'c'
2,'a',7,'c'
1,'a',3,'b'
1,'a',4,'b'
1,'a',5,'b'
2,'a',3,'b'
2,'a',4,'b'
2,'a',5,'b'
1,'c',3,'b'
1,'c',4,'b'
1,'c',5,'b'
4,'c',3,'b'
4,'c',4,'b'
4,'c',5,'b'
7,'c',3,'b'
7,'c',4,'b'
7,'c',5,'b'
1,'a',1,'a'
1,'a',2,'a'
2,'a',1,'a'
2,'a',2,'a'
1,'c',1,'c'
1,'c',4,'c'
1,'c',7,'c'
4,'c',1,'c'
4,'c',4,'c'
4,'c',7,'c'
7,'c',1,'c'
7,'c',4,'c'
7,'c',7,'c'
3,'b',3,'b'
3,'b',4,'b'
3,'b',5,'b'
4,'b',3,'b'
4,'b',4,'b'
4,'b',5,'b'
5,'b',3,'b'
5,'b',4,'b'
5,'b',5,'b'
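
For readers on a current Python, the same kind of listing can be sketched with the standard library alone. This is only an illustration, not the recipe used above: it assumes Python 3 and swaps `kSubsets` and `xcombine` for `itertools.combinations` and `itertools.product`. The helper name `candidate_lines` is made up here, and the order of the pairs may differ from the output above.

```python
from itertools import combinations, product

def candidate_lines(data):
    """Yield one "n1,'k1',n2,'k2'" line per candidate pair (hypothetical helper)."""
    keys = sorted(data)
    # Every unordered pair of distinct keys, then each key paired with itself.
    key_pairs = list(combinations(keys, 2)) + [(k, k) for k in keys]
    for k1, k2 in key_pairs:
        # Cartesian product of the two value lists.
        for n1, n2 in product(data[k1], data[k2]):
            yield "%s,'%s',%s,'%s'" % (n1, k1, n2, k2)

data = {'a': [1, 2], 'b': [3, 4, 5], 'c': [1, 4, 7]}
lines = list(candidate_lines(data))
print(len(lines))  # 43, the same number of lines as the listing above
```

Each generated line can then be looked up in the connection file.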

Gerard Flanagan · Jun 5, 2006

Gerard said:
It seems you want the Cartesian product of every pair of lists in the
dictionary, including the product of lists with themselves (but you
don't say why ;-)).

I'm not sure the following is exactly what you want or if it is very
efficient, but maybe it will start you off. It uses a function
'xcombine' taken from a recipe in the ASPN cookbook by David
Klaffenbach (2004).

http://aspn.activestate.com/ASPN/Cookbook/Python/Recipe/302478

Girish Sahani · Jun 6, 2006

I have a text file in the following format:

1,'a',2,'b'
3,'a',5,'c'
3,'a',6,'c'
3,'a',7,'b'
8,'a',7,'b'
..
..
..
Now I need to generate 2 things by reading the file:
1) A dictionary with the numbers as keys and the letters as values.
E.g. the above would give me a dictionary like
{1:'a', 2:'b', 3:'a', 5:'c', 6:'c' ........}
2) A list containing pairs of numbers from each line.
The above format would give me the list as
[[1,2],[3,5],[3,6],[3,7],[8,7]......]

I wrote the following code for both of these, but the problem is that
lines returns a list like ["1,'a',2,'b'","3,'a',5,'c'","3,'a',6,'c'".....]
Now due to the "" around each line, it is treated like one object
and I cannot access the elements of a line.

Code:

#code to generate the dictionary
def get_colocations(filename):
    lines = open(filename).read().split("\n")
    colocnDict = {}
    i = 0
    for line in lines:
        if i <= 2:
            colocnDict[line[i]] = line[i+1]
            i+=2
            continue
        return colocnDict

Code:

def genPairs(filename):
    lines = open(filename).read().split("\n")
    pairList = []
    for line in lines:
        pair = [line[0],line[2]]
        pairList.append(pair)
        i+=2
        continue
return pairList

Please help :(

K.S.Sreeram · Jun 6, 2006

Girish said:
1) A dictionary with the numbers as keys and the letters as values.
e.g the above would give me a dictionary like
{1:'a', 2:'b', 3:'a', 5:'c', 6:'c' ........}

def get_dict( f ) :
    out = {}
    for line in file(f) :
        n1,s1,n2,s2 = line.split(',')
        out.update( { int(n1):s1[1], int(n2):s2[1] } )
    return out

2) A list containing pairs of numbers from each line.
The above formmat would give me the list as
[[1,2],[3,5],[3,6][3,7][8,7]......]

def get_pairs( f ) :
    out = []
    for line in file(f) :
        n1,_,n2,_ = line.split(',')
        out.append( [int(n1),int(n2)] )
    return out

Regards
Sreeram

John Machin · Jun 6, 2006

I have a text file in the following format:

1,'a',2,'b'
3,'a',5,'c'
3,'a',6,'c'
3,'a',7,'b'
8,'a',7,'b'

Check out the csv module.

.
.
.
Now i need to generate 2 things by reading the file:
1) A dictionary with the numbers as keys and the letters as values.
e.g the above would give me a dictionary like
{1:'a', 2:'b', 3:'a', 5:'c', 6:'c' ........}
2) A list containing pairs of numbers from each line.
The above formmat would give me the list as
[[1,2],[3,5],[3,6][3,7][8,7]......]

I wrote the following codes for both of these but the problem is that
lines returns a list like ["1,'a',2,'b'","3,'a',5,'c","3,'a',6,'c'".....]
Now due to the "" around each line,it is treated like one object
and i cannot access the elements of a line.

You managed to split the file contents into lines using
lines = open(filename).read().split("\n")
Same principle applies to each line:

|>>> lines = ["1,'a',2,'b'","3,'a',5,'c","3,'a',6,'c'"]
|>>> lines[0].split(',')
['1', "'a'", '2', "'b'"]
|>>> lines[1].split(',')
['3', "'a'", '5', "'c"]
|>>>

Code:

#code to generate the dictionary
def get_colocations(filename):
    lines = open(filename).read().split("\n")
    colocnDict = {}
    i = 0
    for line in lines:
        if i <= 2:
            colocnDict[line[i]] = line[i+1]
            i+=2
            continue
        return colocnDict

The return is indented too far; would return after 1st line.

Code:

def genPairs(filename):
    lines = open(filename).read().split("\n")
    pairList = []
    for line in lines:
        pair = [line[0],line[2]]
        pairList.append(pair)
        i+=2

i is not defined. This would cause an exception. Please *always* post
the code that you actually ran.

        continue
return pairList

dedented too far!!

Please help :(

def get_both(filename):
    lines = open(filename).read().split("\n")
    colocnDict = {}
    pairList = []
    for line in lines:
        n1, b1, n2, b2 = line.split(",")
        n1 = int(n1)
        n2 = int(n2)
        a1 = b1.strip("'")
        a2 = b2.strip("'")
        colocnDict[n1] = a1
        colocnDict[n2] = a2
        pairList.append([n1, n2])
    return colocnDict, pairList

def get_both_csv(filename):
    import csv
    reader = csv.reader(open(filename, "rb"), quotechar="'")
    colocnDict = {}
    pairList = []
    for n1, a1, n2, a2 in reader:
        n1 = int(n1)
        n2 = int(n2)
        colocnDict[n1] = a1
        colocnDict[n2] = a2
        pairList.append([n1, n2])
    return colocnDict, pairList

HTH,
John

Gerard Flanagan · Jun 6, 2006

Girish said:

Thanks a lot Gerard and Roberto, but I think I should explain the exact
thing with an example.
Roberto, what I have right now is concatenating the keys and the
corresponding values:
e.g. {'a':[1,2],'b':[3,4,5],'c':[6,7]} should give me
{'ab':[1,2][3,4,5], 'ac':[1,2][6,7], 'bc':[3,4,5][6,7]}
The order doesn't matter here. It could be 'ac' followed by 'bc' and 'ac'.
Also, order doesn't matter in a string: the pair 'ab':[1,2][3,4,5] is the
same as 'ba':[3,4,5][1,2].
This representation means 'a' corresponds to the list [1,2] and 'b'
corresponds to the list [3,4,5].
Now, for each key-value pair, e.g. for 'ab', I must check each feature in the
list of 'a', i.e. [1,2], with each feature in the list of 'b', i.e. [3,4,5].
So I want to take the Cartesian product of ONLY the 2 lists [1,2] and [3,4,5].
Finally I want to check whether each pair is present in the file whose
format I had specified.
The code Gerard has specified takes Cartesian products of every 2 lists.

Hi Girish,

it's better to reply to the Group.

Now, for each key-value pair,e.g for 'ab' i must check each feature in the
list of 'a' i.e. [1,2] with each feature in list of 'b' i.e. [3,4,5].So I
want to take cartesian product of ONLY the 2 lists [1,2] and [3,4,5].

I'm confused. You say *for each* key-value pair, and you wrote above
that the keys were the 'concatenation' of "every 2 keys of a
dictionary".

Sorry, too early for me. Maybe if you list every case you want, given
the example data.

All the best.

Gerard

Girish Sahani · Jun 6, 2006

Really sorry for that indentation thing

I tried out the code you have given, and also the one sreeram had written.
In all of these,i get the same error of this type:
Error i get in Sreeram's code is:
n1,_,n2,_ = line.split(',')
ValueError: need more than 1 value to unpack

And error i get in your code is:
for n1, a1, n2, a2 in reader:
ValueError: need more than 0 values to unpack

Any ideas why this is happening?

Thanks a lot,
girish

John Machin · Jun 6, 2006

Really sorry for that indentation thing
I tried out the code you have given, and also the one sreeram had written.
In all of these,i get the same error of this type:
Error i get in Sreeram's code is:
n1,_,n2,_ = line.split(',')
ValueError: need more than 1 value to unpack

And error i get in your code is:
for n1, a1, n2, a2 in reader:
ValueError: need more than 0 values to unpack

Any ideas why this is happening?

In the case of my code, this is consistent with the line being empty,
probably the last line. As my mentor Bruno D. would say, your test data
does not match your spec. Which do you want to change, the spec or
the data?

You can change my csv-reading code to detect dodgy data like this (for
example):

for row in reader:
    if not row:
        continue # ignore empty lines, wherever they appear
    if len(row) != 4:
        raise ValueError("Malformed row %r" % row)
    n1, a1, n2, a2 = row

In the case of Sreeram's code, perhaps you could try inserting
print "line = ", repr(line)
before the statement that is causing the error.
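
Putting the checks from this reply together, a Python 3 version of the reader might look like the sketch below. It is an illustration only: `read_colocations` is a made-up name, and it takes any iterable of text lines so that it can be tried without a file. For a real file in Python 3, the `csv` documentation recommends opening it in text mode with `newline=""` (the `"rb"` mode used earlier in the thread applies to Python 2).

```python
import csv

def read_colocations(lines):
    """Parse rows like 1,'a',2,'b' from any iterable of text lines.

    For a real file, pass open(filename, newline="") (text mode in Python 3).
    """
    colocn_dict = {}
    pair_list = []
    for row in csv.reader(lines, quotechar="'"):
        if not row:
            continue  # ignore empty lines, wherever they appear
        if len(row) != 4:
            raise ValueError("Malformed row %r" % (row,))
        n1, a1, n2, a2 = row
        n1, n2 = int(n1), int(n2)
        colocn_dict[n1] = a1
        colocn_dict[n2] = a2
        pair_list.append([n1, n2])
    return colocn_dict, pair_list

# Sample input with an empty line in the middle, as in the failing data file.
sample = ["1,'a',2,'b'", "", "3,'a',5,'c'"]
colocn_dict, pair_list = read_colocations(sample)
print(colocn_dict)
print(pair_list)
```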

skip · Jun 6, 2006

Girish> I have a text file in the following format:
Girish> 1,'a',2,'b'
Girish> 3,'a',5,'c'
Girish> 3,'a',6,'c'
Girish> 3,'a',7,'b'
Girish> 8,'a',7,'b'
Girish> .
Girish> .
Girish> .
Girish> Now i need to generate 2 things by reading the file:
Girish> 1) A dictionary with the numbers as keys and the letters as values.
Girish> e.g the above would give me a dictionary like
Girish> {1:'a', 2:'b', 3:'a', 5:'c', 6:'c' ........}
Girish> 2) A list containing pairs of numbers from each line.
Girish> The above formmat would give me the list as
Girish> [[1,2],[3,5],[3,6][3,7][8,7]......]

Running this:

open("some.text.file", "w").write("""\
1,'a',2,'b'
3,'a',5,'c'
3,'a',6,'c'
3,'a',7,'b'
8,'a',7,'b'
""")

import csv

class dialect(csv.excel):
    quotechar = "'"
reader = csv.reader(open("some.text.file", "rb"), dialect=dialect)
mydict = {}
mylist = []
for row in reader:
    numbers = [int(n) for n in row[::2]]
    letters = row[1::2]
    mydict.update(dict(zip(numbers, letters)))
    mylist.append(numbers)

print mydict
print mylist

import os

os.unlink("some.text.file")

displays this:

{1: 'a', 2: 'b', 3: 'a', 5: 'c', 6: 'c', 7: 'b', 8: 'a'}
[[1, 2], [3, 5], [3, 6], [3, 7], [8, 7]]

That seems to be approximately what you're looking for.
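
The two slices do most of the work in that loop: `row[::2]` takes the fields at even positions (the numbers) and `row[1::2]` the fields at odd positions (the letters). A small sketch of just that step, written for Python 3 and using a hand-made `row` in place of what `csv.reader` would return:

```python
row = ['3', 'a', '7', 'b']  # one parsed line, quotes already removed by csv

numbers = [int(n) for n in row[::2]]  # fields 0 and 2
letters = row[1::2]                   # fields 1 and 3

print(numbers)                      # [3, 7]
print(dict(zip(numbers, letters)))  # {3: 'a', 7: 'b'}
```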

Skip

Girish Sahani · Jun 7, 2006

In the case of my code, this is consistent with the line being empty,
probably the last line. As my mentor Bruno D. would say, your test data
does not match your spec Which do you want to change, the spec or
the data?

Thanks John, i just changed my Data file so as not to contain any empty
lines, i guess that was the easier solution

Roberto Bonvallet · Jun 7, 2006

Girish said, through Gerard's forwarded message:

Thanks a lot Gerard and Roberto.but i think i should explain the exact
thing with an example.
Roberto what i have right now is concatenating the keys and the
corresponding values:
e.g {'a':[1,2],'b':[3,4,5],'c':[6,7]} should give me
{'ab':[1,2][3,4,5] 'ac':[1,2][6,7] 'bc':[3,4,5][6,7]}
The order doesnt matter here.It could be 'ac' followed by 'bc' and 'ac'.
Also order doesnt matter in a string:the pair 'ab':[1,2][3,4,5] is same as
'ba':[3,4,5][1,2].
This representation means 'a' corresponds to the list [1,2] and 'b'
corresponds to the list [3,4,5].

The problem is that the two lists aren't distinguishable when
concatenated, so what you get is [1, 2, 3, 4, 5]. You have to pack
both lists in a tuple: {'ab': ([1, 2], [3, 4, 5]), ...}

>>> d = {'a':[1, 2], 'b':[3, 4, 5], 'c':[6, 7]}
>>> d2 = dict(((i + j), (d[i], d[j])) for i in d for j in d if i < j)
>>> d2
{'ac': ([1, 2], [6, 7]), 'ab': ([1, 2], [3, 4, 5]), 'bc': ([3, 4, 5], [6, 7])}
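
On Python 3 a similar dictionary can be built with `itertools.combinations`, which yields each unordered pair of keys once. This is a sketch only, not the code from this post; sorting the keys first is an added assumption so that the combined key always comes out in alphabetical order.

```python
from itertools import combinations

d = {'a': [1, 2], 'b': [3, 4, 5], 'c': [6, 7]}

# One entry per unordered pair of keys; the tuple keeps the two lists apart.
d2 = {i + j: (d[i], d[j]) for i, j in combinations(sorted(d), 2)}

print(d2)
```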

Now, for each key-value pair,e.g for 'ab' i must check each feature in the
list of 'a' i.e. [1,2] with each feature in list of 'b' i.e. [3,4,5].So I
want to take cartesian product of ONLY the 2 lists [1,2] and [3,4,5].

You can do this without creating an additional dictionary:

>>> d = {'a':[1, 2], 'b':[3, 4, 5], 'c':[6, 7]}
>>> pairs = [i + j for i in d for j in d if i < j]
>>> for i, j in pairs:
...     cartesian_product = [(x, y) for x in d[i] for y in d[j]]
...     print i + j, cartesian_product
...
ac [(1, 6), (1, 7), (2, 6), (2, 7)]
ab [(1, 3), (1, 4), (1, 5), (2, 3), (2, 4), (2, 5)]
bc [(3, 6), (3, 7), (4, 6), (4, 7), (5, 6), (5, 7)]

You can do whatever you want with this cartesian product inside the loop.

I don't understand the semantics of the file format, so I leave this
as an exercise to the reader.

Best regards.
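
One possible reading of the file format, following Girish's first post, is that each line names two connected features. Under that assumption, the remaining check could be sketched as below (Python 3). The names `load_connections` and `connected_pairs` are made up for illustration, the sample `d` and `lines` are invented to resemble the file excerpt, connections are assumed to be symmetric, and a feature is identified by its number and type together.

```python
from itertools import combinations, product

def load_connections(lines):
    """Build a set of connections from lines such as 1,'a',2,'b'."""
    connections = set()
    for line in lines:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        n1, t1, n2, t2 = line.split(",")
        a = (int(n1), t1.strip("'"))
        b = (int(n2), t2.strip("'"))
        # frozenset makes the connection order-independent (an assumption).
        connections.add(frozenset((a, b)))
    return connections

def connected_pairs(d, connections):
    """Map each key pair such as 'ab' to the value pairs found in connections."""
    result = {}
    for i, j in combinations(sorted(d), 2):
        result[i + j] = [
            (x, y) for x, y in product(d[i], d[j])
            if frozenset(((x, i), (y, j))) in connections
        ]
    return result

d = {'a': [1, 3, 8], 'b': [2, 7]}
lines = ["1,'a',2,'b'", "3,'a',7,'b'", "8,'a',7,'b'", ""]
found = connected_pairs(d, load_connections(lines))
print(found)  # {'ab': [(1, 2), (3, 7), (8, 7)]}
```

Storing the connections in a set keeps each membership test cheap, which matters once the Cartesian products get large.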
