Better Linked List Handling

  • Thread starter Jonathan Bartlett
  • Start date
B

Barry Schwarz

Just finished a new IBM DeveloperWorks article on linked lists, and
thought you all might be interested. It's not an introduction -- it
instead covers some of the more interesting aspects of linked lists.

http://www.ibm.com/developerworks/linux/library/l-listproc/

Maybe you want to run it through a good lint processor to eliminate
the undefined behavior. While you are at it, you might also get rid
of the useless cast which serves only as a bad example.


<<Remove the del for email>>
 
D

Derrick Coetzee

Jonathan said:
Just finished a new IBM DeveloperWorks article on linked lists, and
thought you all might be interested. It's not an introduction -- it
instead covers some of the more interesting aspects of linked lists.

http://www.ibm.com/developerworks/linux/library/l-listproc/

I like the emphasis on Scheme, and the explanation of tail-sharing.
Defining NULL as 0 might confuse some people, although it's quite
correct (see the comp.lang.c FAQ question 5.5 and related); it's safe to
assume they know about NULL.

There's also a better way to do the generic linked list data structure,
assuming that the value in each node is fixed (read-only): by placing
the next pointer first, then allocating just enough for the specific
data object, you can save space (allocate
offsetof(data)+sizeof(whatever)). The read-only restriction isn't too
big of a deal, since you can cut out an old node and put in a new one.

The tags are also quite wasteful of space, since most lists are of data
of a single type; for these list it's better to put the tag in the
list's data structure. If you are really interested in heterogenous
lists, one effective approach is to allocate the nodes out of pools,
where each pool only contains data of one type. You can determine which
pool an object is in by pointer comparison. Then you have a small table
mapping pool number to tag. As pools fill up you make more (getting
larger as you make more) and expand the table.

Finally, I think cdr coding and unrolled linked lists definitely deserve
a mention - these are two common solutions to the nasty space overhead
and cache problems associated with simple linked lists, although they
complicate some operations.
 
C

CBFalconer

Barry said:
Maybe you want to run it through a good lint processor to eliminate
the undefined behavior. While you are at it, you might also get rid
of the useless cast which serves only as a bad example.

Looks to me as if he has largely renamed lisp as scheme, and
avoided giving credit.
 
P

pete

Jonathan said:
Just finished a new IBM DeveloperWorks article on linked lists, and
thought you all might be interested. It's not an introduction -- it
instead covers some of the more interesting aspects of linked lists.

http://www.ibm.com/developerworks/linux/library/l-listproc/

You have:

Listing 4. Linked list using generic nodes

struct generic_node {
data the_data;
struct generic_node *next;
};

.... but


struct generic_node {
void *data;
struct generic_node *next;
};

is really more generic.
It's especially useful if data points to a structure
which has pointers to lists with different types of data.



struct generic_node *list_rev(struct generic_node *head)
{
struct generic_node *previous, *next_node;

if (head != NULL) {
next_node = NULL;
while (head -> next != NULL) {
previous = head;
head = previous -> next;
previous -> next = next_node;
next_node = previous;
}
head -> next = next_node;
}
return head;
}

void list_free(struct generic_node *node, void (*free_data)(void *))
{
struct generic_node *next_node;

while (node != NULL) {
next_node = node -> next;
free_data(node -> data);
free(node);
node = next_node;
}
}

struct generic_node *list_sort(struct generic_node *head,
int (*compar)(const struct generic_node *,
const struct generic_node *))
{
return node_sort(head, list_count(head), compar);
}

size_t list_count(struct generic_node *head)
{
size_t count;

for (count = 0; head != NULL; head = head -> next) {
++count;
}
return count;
}

struct generic_node *node_sort(struct generic_node *head, size_t count,
int (*compar)(const struct generic_node *,
const struct generic_node *))
{
size_t half;
struct generic_node *tail;

if (count > 1) {
half = count / 2;
tail = list_split(head, half);
tail = node_sort(tail, count - half, compar);
head = node_sort(head, half, compar);
head = list_merge(head, tail, compar);
}
return head;
}

struct generic_node *list_split(struct generic_node *head, size_t count)
{
struct generic_node *tail;

while (count-- > 1) {
head = head -> next;
}
tail = head -> next;
head -> next = NULL;
return tail;
}

struct generic_node *list_merge(struct generic_node *head, struct
generic_node *tail,
int (*compar)(const struct generic_node *,
const struct generic_node *))
{
struct generic_node *list, *sorted, **node;

node = compar(head, tail) > 0 ? &tail : &head;
sorted = list = *node;
*node = sorted -> next;
while (*node != NULL) {
node = compar(head, tail) > 0 ? &tail : &head;
sorted -> next = *node;
sorted = sorted -> next;
*node = sorted -> next;
}
sorted -> next = head != NULL ? head : tail;
return list;
}

int list_sorted(struct generic_node *list,
int (*compar)(const struct generic_node *,
const struct generic_node *))
{
if (list != NULL) {
while (list -> next != NULL) {
if (compar(list, list -> next) > 0) {
break;
}
list = list -> next;
}
}
return list == NULL || list -> next == NULL;
}
 
K

Keith Thompson

Derrick Coetzee said:
Defining NULL as 0 might confuse some people, although it's quite
correct (see the comp.lang.c FAQ question 5.5 and related); it's
safe to assume they know about NULL.
[...]

Actually, defining NULL at all in user code is a bad idea.
Just #include <stdlib.h>.
 
K

Keith Thompson

CBFalconer said:
Looks to me as if he has largely renamed lisp as scheme, and
avoided giving credit.

Actually, Scheme is a well known Lisp dialect. It was developed in
the 1970s.
 
D

Derrick Coetzee

pete said:
struct generic_node {
data the_data;
struct generic_node *next;
};

... but

struct generic_node {
void *data;
struct generic_node *next;
};

is really more generic.

Neither is typesafe, but the first has the notable advantage of internal
storage, which for the small data the author considers is quite a
significant advantage in space overhead (especially with your typical
high-overhead memory allocator), as well as having cache benefits and
saving lots of dereferences.

It's also relatively simple to have a linked list datatype generic to
all data types and also using internal storage (and typesafe too), by
using macro-generated inline functions as an interface to a library of
linked list functions parameterized by data size.
 
J

johnnyb

Someone on the Scheme list said that the article series is like a HOWTO
for Greenspun's tenth law of programming :)
 
A

Alan Balmer

Looks to me as if he has largely renamed lisp as scheme, and
avoided giving credit.

Scheme is a lisp derivative, but it's been around since 1975. It's
much smaller than Lisp, and the OP doesn't claim to have invented it.
Another such was named Plan - I remember a paper entitled "Why
Planning is Better Than Scheming." Or maybe the other way around.
 
R

Richard Tobin

Another such was named Plan - I remember a paper entitled "Why
Planning is Better Than Scheming." Or maybe the other way around.

I think you're thinking of "Why Conniving is Better than Planning":

ftp://publications.ai.mit.edu/ai-publications/0-499/AIM-255.ps

though I suppose there may have been a similarly named paper about
Scheme.

Scheme has been standardized by the IEEE. See:

http://www.cs.indiana.edu/scheme-repository/doc.standards.html

-- Richard
 
P

pete

Derrick said:
Neither is typesafe,
but the first has the notable advantage of internal
storage, which for the small data the author considers is quite a
significant advantage in space overhead (especially with your typical
high-overhead memory allocator), as well as having cache benefits and
saving lots of dereferences.

It's also relatively simple to have a linked list datatype generic to
all data types and also using internal storage (and typesafe too), by
using macro-generated inline functions as an interface to a library of
linked list functions parameterized by data size.

The void *data; is handy for lists of lists.
Then you can use the same functions on different kinds of lists
in the same program.
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

Forum statistics

Threads
474,155
Messages
2,570,871
Members
47,401
Latest member
CliffGrime

Latest Threads

Top