suggestions about data structures for a sparse matrix class

A

aaragon

Hello everyone,

Some time ago I coded a sparse matrix class using a hashed container
that mapped a std::pair<size_t,size_t> to a double (i,j representing
the location in the matrix and the double the value). This worked
fine, and it was quite fast, but I had problems when trying to get the
matrix-matrix multiplication so I went ahead and coded a second sparse
matrix class. This time I had a hashed map that mapped the index i
(representing the row) to another hashed map that mapped the j index
to the value. Now the new implementation takes more time of course
because there are two hashing functions involved. I'm wondering if the
creating of m + 1 hash tables (m being the number of rows) has a much
higher overhear than creating just a single hashed container for the
whole thing as in the first implementation.

Does anyone have any suggestions on how to improve this code? I
thought about maybe creating a regular array or an std::vector of hash
tables instead of using a hash table to map the rows, this way I could
avoid one hash function and access the row in constant time. But
before I decided to spend some time coding this, I wanted to ask other
people about it.

Thanks a lot,

aa
 
K

Kai-Uwe Bux

aaragon said:
Hello everyone,

Some time ago I coded a sparse matrix class using a hashed container
that mapped a std::pair<size_t,size_t> to a double (i,j representing
the location in the matrix and the double the value). This worked
fine, and it was quite fast, but I had problems when trying to get the
matrix-matrix multiplication so I went ahead and coded a second sparse
matrix class.

What exactly was the problem?
This time I had a hashed map that mapped the index i
(representing the row) to another hashed map that mapped the j index
to the value. Now the new implementation takes more time of course
because there are two hashing functions involved. I'm wondering if the
creating of m + 1 hash tables (m being the number of rows) has a much
higher overhear than creating just a single hashed container for the
whole thing as in the first implementation.

Does anyone have any suggestions on how to improve this code? I
thought about maybe creating a regular array or an std::vector of hash
tables instead of using a hash table to map the rows, this way I could
avoid one hash function and access the row in constant time. But
before I decided to spend some time coding this, I wanted to ask other
people about it.

One idea is to separate matrix operations from the underlying storage
policy. That way, you can reuse the code for addition, multiplication,
row-ops, etc., and try out hash maps in both arguments, vectors of hash
maps, dense storage, and so on.


BTW: Before rolling your own matrix class, you should just use a linear
algebra library. Those things (especially with floating point entries) are
very tricky. The obvious implementations from your course on linear algebra
can be numerically unstable, and a good choice of algorithms requires some
serious numerical analysis. If possible, hire an expert. The next best
thing is to use a well-tested library (and presumably, the expert would
recommend that anyway; but he would have an opinion to offer on which one
to choose!).


Best

Kai-Uwe Bux
 
A

aaragon

What exactly was the problem?

I found it was necessary to have access to a given row to carry out
the multiplication more efficiently. Having a single hashed container
didn't allow for this information.
One idea is to separate matrix operations from the underlying storage
policy. That way, you can reuse the code for addition, multiplication,
row-ops, etc., and try out hash maps in both arguments, vectors of hash
maps, dense storage, and so on.

Well, I don't think this is possible. I believe that the operations
are very tied to the type of storage they use. For example, if you
want to do matrix-matrix multiplication, you could accomplish this by
doing the ijk three loop, which is O(n^3), but then you loose the
advantages of sparsity (iterate only over non-zero entries in the
matrix).
BTW: Before rolling your own matrix class, you should just use a linear
algebra library. Those things (especially with floating point entries) are
very tricky. The obvious implementations from your course on linear algebra
can be numerically unstable, and a good choice of algorithms requires some
serious numerical analysis. If possible, hire an expert. The next best
thing is to use a well-tested library (and presumably, the expert would
recommend that anyway; but he would have an opinion to offer on which one
to choose!).

I thought about using something else in the beginning, but I also
found that it was fun to implement my own classes as well, and I've
enjoyed doing this. I actually coded an entire interface to BLAS using
expression templates since my first attempt to create a dense matrix
was very slow (using Bjarne Stroustrup's matrix example using
valarrays).
Best

Kai-Uwe Bux


I would still hear about suggestions on data structures to use with a
sparse matrix class. So if you have any, please comment on that.
Thanks for replying too.

aa
 
A

aaragon

I guess I'm missing the value of using hashing in the first place. We
use hashing for things like keyword searches, replacing the text string
with something faster to search on, like an index.  You've already got
the indices, so what's the value of hashing?

I'm sure I must be missing a vital point here.

Jack

P.S. Check my recent columns in Embedded Systems Design, on vector and
matrix classes in C++. There's no "sparseness" about them; everything is
implemented in the most straightforward way.  Still, you might find some
ideas.

Jack

Well, sparse matrices are usually implemented as three arrays.
However, when you're building the matrix, you have to perform search
operations to find out where in the array the elements are going to
be. I decided to implement a sparse matrix class that avoided this
search and yet it allowed constant time access to the elements.

aa
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,968
Messages
2,570,153
Members
46,699
Latest member
AnneRosen

Latest Threads

Top