Algorithm for performing a rollup

C

Chris

This is a really basic question for an experienced programmer to ask,
but does anyone have a preferred algorithm for doing a rollup of items
in a list? I find myself writing ugly code over and over again to do this.

As a simple example, let's say we have an array of strings, sorted, and
we want to get list of the unique strings along with the count for each.
Sort of the way you do with a SQL "group by" clause:

input:
{"A", "A", "A", "B", "B", "C", "D", "D"}

output:
"A", 3
"B", 2
"C", 1
"D", 2

This is what I usually do:
***********************************************
String [] array = {"A", "A", "A", "B", "B", "C", "D", "D"};

String str = null;
String prevStr = null;
int count = 1;

for (int i = 0; i < array.length; i++) {
str = array;

if (str.equals(prevStr)) {
count++;
} else {
if (prevStr != null) {
System.out.println(prevStr + " " + count);
count = 1;
}
prevStr = str;
}
}

// catch the last one
if (str != null) {
System.out.println(prevStr + " " + count);
}
*************************************************

This is just plain ugly. I really hate the fact that I have to dump the
results of the rollup in two places.

To get rid of "catch the last one", I could try something that peeks at
the next element to see if it marks the start of a new run. The trouble
is, this peek can sometimes be expensive (depending what getting the
next element entails), and sometimes it's impossible. It can be
impossible if you're using an Iterator, because calling iterator.next()
burns the element.

Anybody got a better way?
 
?

=?ISO-8859-1?Q?Arne_Vajh=F8j?=

Chris said:
This is a really basic question for an experienced programmer to ask,
but does anyone have a preferred algorithm for doing a rollup of items
in a list? I find myself writing ugly code over and over again to do this.

As a simple example, let's say we have an array of strings, sorted, and
we want to get list of the unique strings along with the count for each.
Sort of the way you do with a SQL "group by" clause:

input:
{"A", "A", "A", "B", "B", "C", "D", "D"}

output:
"A", 3
"B", 2
"C", 1
"D", 2

This is what I usually do:
***********************************************
String [] array = {"A", "A", "A", "B", "B", "C", "D", "D"};

String str = null;
String prevStr = null;
int count = 1;

for (int i = 0; i < array.length; i++) {
str = array;

if (str.equals(prevStr)) {
count++;
} else {
if (prevStr != null) {
System.out.println(prevStr + " " + count);
count = 1;
}
prevStr = str;
}
}

// catch the last one
if (str != null) {
System.out.println(prevStr + " " + count);
}
*************************************************

This is just plain ugly. I really hate the fact that I have to dump the
results of the rollup in two places.


You could store string and count in an ArrayList and print at the end.

Arne
 
?

=?ISO-8859-1?Q?Arne_Vajh=F8j?=

Arne said:
You could store string and count in an ArrayList and print at the end.

If you used HashMap the array did not need to be sorted (but you may
need to sort keys when printing).

Arne
 
S

Stefan Ram

Chris said:
input: {"A", "A", "A", "B", "B", "C", "D", "D"} output:
"A", 3
"B", 2
"C", 1
"D", 2

I have not studied your code, I just took the first approach
which would come to my mind:

public class Main
{ public static void main( final java.lang.String[] args )
{ final java.lang.String source[] =
{ "A", "A", "A", "B", "B", "C", "D", "D", null };
for( int i = 0; i < source.length - 1; )
{ int j = i;
while( source[ i ].equals( source[ j ]))++j;
java.lang.System.out.println( "\"" + source[ i ]+ "\", " +( j - i ));
i = j; }}}

Your code does not fulfill the output-specification,
because you do not write the text enclosed in double quotes.

The »null« is called a »sentinel«. Search for »sentinel« in

http://www.oberon.ethz.ch/WirthPubl/AD.pdf
http://www.oberon.ethz.ch/WirthPubl/ProgInOberon.pdf

However, without the sentinel only

while( source[ i ].equals( source[ j ] ))++j;

would have to be slightly extended to

while( j < source.length && source[ i ].equals( source[ j ]))++j;

and the »- 1« would have to be removed.
 
R

Red Orchid

Message-ID: said:
input:
{"A", "A", "A", "B", "B", "C", "D", "D"}

output:
"A", 3
"B", 2
"C", 1
"D", 2

This is what I usually do:
***********************************************

[snip]

// catch the last one
if (str != null) {
System.out.println(prevStr + " " + count);
}
*************************************************

This is just plain ugly. I really hate the fact that I have to dump the
results of the rollup in two places.


I propose the following code:

<code 0>
void dumpArrayDuplicateInfo(String [] a) {
int dcount = 1;
for (int i = 0; i < a.length - 1; i++) {
if (a.equals(a[i + 1])) {
dcount++;
continue;
}
print(a, dcount);
dcount = 1;
}
if (a.length != 0) {
print(a[a.length - 1], dcount);
}
}

void print(String e, int dcount) {
System.out.println(e + " " + dcount);
}
</code 0>


But, if you hate dumping in two places, maybe, the following
code is possible:

<code 1>
void dumpArrayInfo(String [] arr) {
String[] a = new String[arr.length + 1];
System.arraycopy(arr, 0, a, 0, arr.length);

int dcount = 1;
for (int i = 0; i < a.length - 1; i++) {
if (a.equals(a[i + 1])) {
dcount++;
continue;
}
System.out.println(e + " " + dcount);
dcount = 1;
}
}
</code 1>
 
C

Chris Uppal

Chris said:
This is a really basic question for an experienced programmer to ask,
but does anyone have a preferred algorithm for doing a rollup of items
in a list? I find myself writing ugly code over and over again to do this.

"writing ugly code over and over again" -- that's a /big/ mistake. Even if the
code were as beautiful as the day, it would still be a mistake that a halfway
decent programmer should not make.

It seems as if a utility method which creates a LinkedHashMap<X, Integer> would
do the trick quite nicely. Alternatively, if you don't want the expense of
creating an extra Collection, you could set up a little helper class as in the
attached code; with it you can say:

for (Group<String> g : Group.groups(args))
System.out.println(g);

where (in this example) "args" is a List<String> or String[] array.

Code follows. All sorts of elaborations are possible.

BTW, it looks a bit long, but it's really very short -- almost no real code,
just lots of Java-ish padding...

-- chris


=========== Group.java ===========
import java.util.*;

public class Group<X>
{
private final X m_value;
private final int m_count;

public // ctor
Group(X value, int count)
{
m_value = value;
m_count = count;
}

public X
getValue()
{
return m_value;
}

public int
getCount()
{
return m_count;
}

public String
toString()
{
return m_count + " * " + m_value;
}

public static <T>
GroupingIteratable<T>
groups(List<T> list)
{
return new GroupingIteratable<T>(list);
}

public static <T>
GroupingIteratable<T>
groups(T... elements)
{
return new GroupingIteratable<T>(elements);
}
}
======= GroupingIteratable.java =======
import java.util.*;

public class GroupingIteratable<T>
implements Iterable<Group<T>>
{
private final List<T> m_list;

public // ctor
GroupingIteratable(List<T> list)
{
m_list = list;
}

public // ctor
GroupingIteratable(T... elements)
{
this(Arrays.asList(elements));
}

public Iterator<Group<T>>
iterator()
{
return new GroupingIterator<T>(m_list.iterator());
}
}
======= GroupingIterator.java =======
import java.util.*;

public class GroupingIterator<T>
implements Iterator<Group<T>>
{
private final Iterator<T> m_iterator;
private boolean m_hasReadAhead;
private T m_readAhead;

public // ctor
GroupingIterator(Iterator<T> iterator)
{
m_iterator = iterator;
m_hasReadAhead = false;
}

public boolean
hasNext()
{
return m_hasReadAhead || m_iterator.hasNext();
}

public Group<T>
next()
{
T current = m_hasReadAhead
? m_readAhead
: m_iterator.next();
m_hasReadAhead = false;
m_readAhead = null;
int count = 1;

while (m_iterator.hasNext())
{
T next = m_iterator.next();
if (!current.equals(next))
{
m_readAhead = next;
m_hasReadAhead = true;
break;
}

count++;
}

return new Group<T>(current, count);
}

public void
remove()
{
throw new UnsupportedOperationException();
}
}
===============================
 
C

Chris

Stefan said:
Chris said:
input: {"A", "A", "A", "B", "B", "C", "D", "D"} output:
"A", 3
"B", 2
"C", 1
"D", 2

I have not studied your code, I just took the first approach
which would come to my mind:

public class Main
{ public static void main( final java.lang.String[] args )
{ final java.lang.String source[] =
{ "A", "A", "A", "B", "B", "C", "D", "D", null };
for( int i = 0; i < source.length - 1; )
{ int j = i;
while( source[ i ].equals( source[ j ]))++j;
java.lang.System.out.println( "\"" + source[ i ]+ "\", " +( j - i ));
i = j; }}}

Thank you. That is more elegant. In cases where a sentinel is possible,
it can work. It does do a peek() at the next element, though, which
won't work in many cases in my code.
 
C

Chris

Red said:
Message-ID: said:
input:
{"A", "A", "A", "B", "B", "C", "D", "D"}

output:
"A", 3
"B", 2
"C", 1
"D", 2

This is what I usually do:
***********************************************

[snip]

// catch the last one
if (str != null) {
System.out.println(prevStr + " " + count);
}
*************************************************

This is just plain ugly. I really hate the fact that I have to dump the
results of the rollup in two places.


I propose the following code:

<code 0>
void dumpArrayDuplicateInfo(String [] a) {
int dcount = 1;
for (int i = 0; i < a.length - 1; i++) {
if (a.equals(a[i + 1])) {
dcount++;
continue;
}
print(a, dcount);
dcount = 1;
}
if (a.length != 0) {
print(a[a.length - 1], dcount);
}
}

void print(String e, int dcount) {
System.out.println(e + " " + dcount);
}
</code 0>


But, if you hate dumping in two places, maybe, the following
code is possible:

<code 1>
void dumpArrayInfo(String [] arr) {
String[] a = new String[arr.length + 1];
System.arraycopy(arr, 0, a, 0, arr.length);

int dcount = 1;
for (int i = 0; i < a.length - 1; i++) {
if (a.equals(a[i + 1])) {
dcount++;
continue;
}
System.out.println(e + " " + dcount);
dcount = 1;
}
}
</code 1>

Thanks. Similar to the code from Stefan above. Works when you can get at
elements in the array cheaply, but doesn't work when a peek ahead is
expensive or when you're iterating over elements using an Iterator.
 
L

Lew

Arne gave the hint with the Map idea. Here's a version of how I would attempt
such a thing:

First off, let's put the Strings into a List so we can go all Collections.

public class Foo
{

public static void main( String [] args )
{
List< String > starters =
Arrays.asList( "A", "A", "A", "B", "B", "C", "D", "D" );

Map< String, Integer > kounts = new HashMap();
for( String key : starters )
{
Integer k = kounts.get( key );
if ( k == null )
{
kounts.put( key, 1 );
}
else
{
kounts.put( key, k + 1 );
}
}

List< String > outters = new ArrayList< String >();
outters.addAll( kounts.keySet() );
Collections.sort( outters );

for ( String key : outters )
{
System.out.println( "\""+ key + "\", "+ kounts.get( key ));
}
}
}

-- Lew
 
L

Lew

I'd push things into Collections.

public class Foo
{

public static void main( String [] args )
{
List< String > starters =
Arrays.asList( "A", "A", "A", "B", "B", "C", "D", "D" );

Map< String, Integer > kounts = new HashMap< String, Integer >();
for( String key : starters )
{
Integer k = kounts.get( key );
kounts.put( key, Integer.valueOf(k == null? 1 : k + 1 ));
}

List< String > outters = new ArrayList< String >();
outters.addAll( kounts.keySet() );
Collections.sort( outters );

for ( String key : outters )
{
System.out.println( "\""+ key + "\", "+ kounts.get( key ));
}

}

}

-- Lew
 
S

Stefan Ram

Chris said:
Thank you. That is more elegant. In cases where a sentinel is possible,
it can work. It does do a peek() at the next element, though, which
won't work in many cases in my code.

OK, I remove the sentinel and the peek:

public class Main
{
final static java.lang.String source[] =
{ "A", "A", "A", "B", "B", "C", "D", "D" };
static int position = 0;
static java.lang.String getNext(){ return source[ position++ ]; }
static boolean hasNext(){ return position < source.length; }

public static void main( final java.lang.String[] args )
{ java.lang.String first = null;
java.lang.String next = null;
while( true )
{ if( next != null ){ first = next; next = null; }
else if( hasNext() )first = getNext(); else break;
int count = 1;
while( true )if( !hasNext() ){ next = null; break; }
else if( first.equals( next = getNext() ))++count; else break;
java.lang.System.out.println( "\"" + first + "\", " + count ); }}}
 
L

Lew

Stefan said:
Chris said:
Thank you. That is more elegant. In cases where a sentinel is possible,
it can work. It does do a peek() at the next element, though, which
won't work in many cases in my code.

OK, I remove the sentinel and the peek:

public class Main
{
final static java.lang.String source[] =
{ "A", "A", "A", "B", "B", "C", "D", "D" };
static int position = 0;
static java.lang.String getNext(){ return source[ position++ ]; }
static boolean hasNext(){ return position < source.length; }

public static void main( final java.lang.String[] args )
{ java.lang.String first = null;
java.lang.String next = null;
while( true )
{ if( next != null ){ first = next; next = null; }
else if( hasNext() )first = getNext(); else break;
int count = 1;
while( true )if( !hasNext() ){ next = null; break; }
else if( first.equals( next = getNext() ))++count; else break;
java.lang.System.out.println( "\"" + first + "\", " + count ); }}}

This is so complicated compared to Arne's suggestion of using a Map.

-- Lew
 
L

Lew

Stefan said:
public class Main
{
final static java.lang.String source[] =
{ "A", "A", "A", "B", "B", "C", "D", "D" };
static int position = 0;
static java.lang.String getNext(){ return source[ position++ ]; }
static boolean hasNext(){ return position < source.length; }

public static void main( final java.lang.String[] args )
{ java.lang.String first = null;
java.lang.String next = null;
while( true )
{ if( next != null ){ first = next; next = null; }
else if( hasNext() )first = getNext(); else break;
int count = 1;
while( true )if( !hasNext() ){ next = null; break; }
else if( first.equals( next = getNext() ))++count; else break;
java.lang.System.out.println( "\"" + first + "\", " + count ); }}}

Consider using indentation to make your code readable, as per the Sun Java
Coding Conventions extant since 1999.

-- Lew
 
C

Chris

Lew said:
I'd push things into Collections.

public class Foo
{

public static void main( String [] args )
{
List< String > starters =
Arrays.asList( "A", "A", "A", "B", "B", "C", "D", "D" );

Map< String, Integer > kounts = new HashMap< String, Integer >();
for( String key : starters )
{
Integer k = kounts.get( key );
kounts.put( key, Integer.valueOf(k == null? 1 : k + 1 ));
}

List< String > outters = new ArrayList< String >();
outters.addAll( kounts.keySet() );
Collections.sort( outters );

for ( String key : outters )
{
System.out.println( "\""+ key + "\", "+ kounts.get( key ));
}

}

}

Thanks. This works, and has the advantage that it doesn't require the
input to be in order. The downside is that it isn't scalable. It needs
to store every unique element in memory. When you're processing very
large data sets, memory fills quickly.
 
L

Lew

Chris said:
Thanks. This works, and has the advantage that it doesn't require the
input to be in order. The downside is that it isn't scalable. It needs
to store every unique element in memory. When you're processing very
large data sets, memory fills quickly.

Don't all the other solutions shown so far in this discussion require
maintaining all the elements in memory? It's a rhetorical question: of course
they do.

How is this different?

If your problem space is too large for memory, use an RDBMS. Get the report
you want with a SQL SELECT value, count(value) FROM somewhere GROUP BY value
ORDER BY value.

-- Lew
 
P

Patricia Shanahan

Lew said:
Don't all the other solutions shown so far in this discussion require
maintaining all the elements in memory? It's a rhetorical question: of
course they do.

How is this different?

Remember the original problem statement started with sorted data. It
could be coming from a disk file.

Patricia
 
C

Chris

Stefan said:
Chris said:
Thank you. That is more elegant. In cases where a sentinel is possible,
it can work. It does do a peek() at the next element, though, which
won't work in many cases in my code.

OK, I remove the sentinel and the peek:

public class Main
{
final static java.lang.String source[] =
{ "A", "A", "A", "B", "B", "C", "D", "D" };
static int position = 0;
static java.lang.String getNext(){ return source[ position++ ]; }
static boolean hasNext(){ return position < source.length; }

public static void main( final java.lang.String[] args )
{ java.lang.String first = null;
java.lang.String next = null;
while( true )
{ if( next != null ){ first = next; next = null; }
else if( hasNext() )first = getNext(); else break;
int count = 1;
while( true )if( !hasNext() ){ next = null; break; }
else if( first.equals( next = getNext() ))++count; else break;
java.lang.System.out.println( "\"" + first + "\", " + count ); }}}

This is very clever. An outer and an inner loop, with essentially a
two-element pipeline. (It is difficult to follow; I had to reformat it
to understand it).

My guess that there is a way to simplify the code a bit and get a good
general algorithm.
 
L

Lew

Patricia said:
Remember the original problem statement started with sorted data. It
could be coming from a disk file.

Here is the original statement of the inputs:
As a simple example, let's say we have an array of strings, sorted, and we want to get list of the unique strings along with the count for each. Sort of the way you do with a SQL "group by" clause:

input:
{"A", "A", "A", "B", "B", "C", "D", "D"}

This puts all the elements in memory at once. The OP then went on to show code
using an array kept entire in memory.

-- Lew
 
S

Stefan Ram

Chris said:
My guess that there is a way to simplify the code a bit and get
a good general algorithm.

I do not like the »while( true ){ ... break; ... }«-style
myself, but sometimes it is the last ressort, when the loop
control logic is too complex to fit into the parentheses.

The loop control with some embedded side-effects was put into
the body of the loop:



/* begin of the control section of the outer loop */
while( true )
{ if( next != null ){ first = next; next = null; }
else if( hasNext() )first = getNext(); else break;
/* end of the control section of the outer loop */

/* begin of the actual body of the outer loop */
int count = 1;


/* begin of the control section of the inner loop */
while( true )if( !hasNext() ){ next = null; break; }
else if( !first.equals( next = getNext() ))break; else
/* end of the control section of the inner loop */

/* begin of the actual body of the inner loop */
++count;
/* end of the actual body of the inner loop */


java.lang.System.out.println( "\"" + first + "\", " + count );
/* end of the actual body of the outer loop */ }



The problem reminds me of the question how to structure a read
loop, where the user has to redo his input if it is not valid
without repeating the input statement »i = input()« within the
source code, as in:

i = input(); while( invalid( i )){ i = input(); }

This appears in languages without a »do{ ... }while( ...)«-loop
or when additional constraints are given.

It can be solved in the same way, using a »dirty« »while( true
){ ... break; ... }« style.

while( true ){ i = input(); if( valid( i ))break; }

In fact, there was a time when such questions were research
topics, namely, if I recall correct, Donald E. Knuth wrote an
article about this input-loop in the 1960ies. Or, possibly, it
was Niklaus Wirth, who wrote this.
I had to reformat it to understand it

Feel free not to read the following rationale for my
formatting style. I append it just for people who enjoy
reading about formatting styles.



One Way to Format Parentheses

There are several different ways to format texts with braces
and parentheses. One of them is being described here.

Indentation within Braces

An indentation of just one space often is too small to be seen
clearly, because the natural width and form of characters
often varies by an amount that is not very much smaller than a
space. Therefore, the indentation should amount to at least
two positions. In order not to waste horizontal spaces, an
indentation of exactly two positions is chosen. This means,
that the left position of the next level is two larger than
the position of the directly enclosing level.

Indentation by two positions within a block

{ ++x;
++x; }
^ ^
0 2

Bad A small indentation by one position is not always visible
clearly

{++x;
++x; }

Good The indentation by two positions is visible clearly

{ ++x;
++x; }

Bad A large indentation by more than two positions wastes
horizontal space with no additional benefit

{ ++x;
++x; }

Spaces within braces

In mathematics, there are often no spaces at the inner side of
parentheses or braces in expressions, but spaces are used
indeed at the inner side of braces in set notation, when the
braces contain a description (not when they contain a list).
Spaces in set notation

{ x | x > 2 }

This style is adopted here: One space is written at the inner
side of braces.

Spaces at the inner side of parentheses within a block

{ ++x; }

This style is consistent with the indentation by two
positions, because only using this style, corresponding parts
of two lines have the same position.

Bad No space after the first brace, the two statements are
misaligned

{++x;
++x; }

Good One space after the first brace, the two statements are
properly aligned

{ ++x;
++x; }

Bad Two spaces after the first brace, the two statements are
misaligned

{ ++x;
++x; }

There are some exceptions to this rule: No spaces are used
within empty braces "{}" and between two or more closing
braces of the same direction "}}", except, when the first one
of them is part of an empty pair "{} }" (an empty pair of
braces if treated like a single non-braces character).
Unified rules for all Brackets

For simplicity and uniformity, the rules from above apply to
all kinds of brackets, including parentheses, braces (curly
brackets), square brackets, and angle brackets.

Spaces within parentheses and square brackets

{ y = f( x )+ g() + a[ 2 ]; }

Binary operators are sorrounded by a space, but the space is
omitted, when there already is a space on the other side of a
sequence of bracket directly beside the operator: By this rule
" )+" is written instead of " ) +".

Representation of the Syntactical Structure

A method declaration in Java consists of a head and a body.
The following representation shows this structure:

Good formatting according to the structure

void alpha() // head
{ beta(); } // body

The following formatting is misleading, because the line break
does not match the structural break:

Bad line break within the body

void alpha() { // head and the beginning of the body
beta(); } // the rest of the body

This formatting also would make no sense for blocks within
blocks. So it is often not used for such blocks. Therefore
even the adopters of this style can not use it uniformly.

Left Braces Look Like "bullets"

There is a well known style to publish lists in typography
using bullets sticking out on the left, looking like this:
Common list representation with bullets in typography

o This is the first point
of this list, it is written
here just as an example.

o Here is another entry

o This is another example given
just as an example to show
an example

The braces of the beginnings of blocks stand out on the left
just the same, when the formatting being described here is
used, so they look quite naturally as beginning-of-a-block
markers, when one is used to the typographical list notation:

Left braces look like bullets to mark blocks

{ printf(); printf();
printf(); printf(); printf();
printf(); printf(); }

{ printf(); printf(); }

{ printf(); printf(); printf();
printf(); printf();
printf(); }

Neutrality

Someone wrote this C code:

Code someone wrote

while( fgets( eingabe, sizeof eingabe, stdin ))
if( sscanf( eingabe, "%d", &wert )!= 1 )
fprintf( stderr, "Please enter a number!\n" );
else
summe += wert;

It amazes me that I can add braces by my style conventions
without the need to change the position of any character of
the given code:

The code from above plus braces

while( fgets( eingabe, sizeof eingabe, stdin ))
{ if( sscanf( eingabe, "%d", &wert )!= 1 )
{ fprintf( stderr, "Please enter a number!\n" ); }
else
{ summe += wert; }}
 
L

Lew

Stefan said:
Chris said:
My guess that there is a way to simplify the code a bit and get
a good general algorithm.

I do not like the »while( true ){ ... break; ... }«-style
myself, but sometimes it is the last ressort, when the loop
control logic is too complex to fit into the parentheses.

The loop control with some embedded side-effects was put into
the body of the loop:



/* begin of the control section of the outer loop */
while( true )
{ if( next != null ){ first = next; next = null; }
else if( hasNext() )first = getNext(); else break;
/* end of the control section of the outer loop */

/* begin of the actual body of the outer loop */
int count = 1;


/* begin of the control section of the inner loop */
while( true )if( !hasNext() ){ next = null; break; }
else if( !first.equals( next = getNext() ))break; else
/* end of the control section of the inner loop */

/* begin of the actual body of the inner loop */
++count;
/* end of the actual body of the inner loop */


java.lang.System.out.println( "\"" + first + "\", " + count );
/* end of the actual body of the outer loop */ }



The problem reminds me of the question how to structure a read
loop, where the user has to redo his input if it is not valid
without repeating the input statement »i = input()« within the
source code, as in:

i = input(); while( invalid( i )){ i = input(); }

This appears in languages without a »do{ ... }while( ...)«-loop
or when additional constraints are given.

It can be solved in the same way, using a »dirty« »while( true
){ ... break; ... }« style.

while( true ){ i = input(); if( valid( i ))break; }

In fact, there was a time when such questions were research
topics, namely, if I recall correct, Donald E. Knuth wrote an
article about this input-loop in the 1960ies. Or, possibly, it
was Niklaus Wirth, who wrote this.
I had to reformat it to understand it

Feel free not to read the following rationale for my
formatting style. I append it just for people who enjoy
reading about formatting styles.



One Way to Format Parentheses

There are several different ways to format texts with braces
and parentheses. One of them is being described here.

Indentation within Braces

An indentation of just one space often is too small to be seen
clearly, because the natural width and form of characters
often varies by an amount that is not very much smaller than a
space. Therefore, the indentation should amount to at least
two positions. In order not to waste horizontal spaces, an
indentation of exactly two positions is chosen. This means,
that the left position of the next level is two larger than
the position of the directly enclosing level.

Indentation by two positions within a block

{ ++x;
++x; }
^ ^
0 2

Bad A small indentation by one position is not always visible
clearly

{++x;
++x; }

Good The indentation by two positions is visible clearly

{ ++x;
++x; }

Bad A large indentation by more than two positions wastes
horizontal space with no additional benefit

{ ++x;
++x; }

Spaces within braces

In mathematics, there are often no spaces at the inner side of
parentheses or braces in expressions, but spaces are used
indeed at the inner side of braces in set notation, when the
braces contain a description (not when they contain a list).
Spaces in set notation

{ x | x > 2 }

This style is adopted here: One space is written at the inner
side of braces.

Spaces at the inner side of parentheses within a block

{ ++x; }

This style is consistent with the indentation by two
positions, because only using this style, corresponding parts
of two lines have the same position.

Bad No space after the first brace, the two statements are
misaligned

{++x;
++x; }

Good One space after the first brace, the two statements are
properly aligned

{ ++x;
++x; }

Bad Two spaces after the first brace, the two statements are
misaligned

{ ++x;
++x; }

There are some exceptions to this rule: No spaces are used
within empty braces "{}" and between two or more closing
braces of the same direction "}}", except, when the first one
of them is part of an empty pair "{} }" (an empty pair of
braces if treated like a single non-braces character).
Unified rules for all Brackets

For simplicity and uniformity, the rules from above apply to
all kinds of brackets, including parentheses, braces (curly
brackets), square brackets, and angle brackets.

Spaces within parentheses and square brackets

{ y = f( x )+ g() + a[ 2 ]; }

Binary operators are sorrounded by a space, but the space is
omitted, when there already is a space on the other side of a
sequence of bracket directly beside the operator: By this rule
" )+" is written instead of " ) +".

Representation of the Syntactical Structure

A method declaration in Java consists of a head and a body.
The following representation shows this structure:

Good formatting according to the structure

void alpha() // head
{ beta(); } // body

The following formatting is misleading, because the line break
does not match the structural break:

Bad line break within the body

void alpha() { // head and the beginning of the body
beta(); } // the rest of the body

This formatting also would make no sense for blocks within
blocks. So it is often not used for such blocks. Therefore
even the adopters of this style can not use it uniformly.

Left Braces Look Like "bullets"

There is a well known style to publish lists in typography
using bullets sticking out on the left, looking like this:
Common list representation with bullets in typography

o This is the first point
of this list, it is written
here just as an example.

o Here is another entry

o This is another example given
just as an example to show
an example

The braces of the beginnings of blocks stand out on the left
just the same, when the formatting being described here is
used, so they look quite naturally as beginning-of-a-block
markers, when one is used to the typographical list notation:

Left braces look like bullets to mark blocks

{ printf(); printf();
printf(); printf(); printf();
printf(); printf(); }

{ printf(); printf(); }

{ printf(); printf(); printf();
printf(); printf();
printf(); }

Neutrality

Someone wrote this C code:

Code someone wrote

while( fgets( eingabe, sizeof eingabe, stdin ))
if( sscanf( eingabe, "%d", &wert )!= 1 )
fprintf( stderr, "Please enter a number!\n" );
else
summe += wert;

It amazes me that I can add braces by my style conventions
without the need to change the position of any character of
the given code:

The code from above plus braces

while( fgets( eingabe, sizeof eingabe, stdin ))
{ if( sscanf( eingabe, "%d", &wert )!= 1 )
{ fprintf( stderr, "Please enter a number!\n" ); }
else
{ summe += wert; }}

There are two conventional styles for brace placement in Java, dating back to
C days. The first, recommended by Sun since 1999, is to place the opening
brace at the end of the control structure line, and the closing brace to the
indent position of the control line, on its own line. The statements in the
controlled block are indented one notch further, two spaces by your
convention. Thus:

while ( condition ) {
doSomething();
}

The other variant is to place the opening brace on its own line, indented the
same as the control line:

while ( condition )
{
doSomething();
}

Placing the braces on the same line as contained block statements makes the
source structure harder to discern.

I recommend that you follow one of these two conventions with respect to
braces, since that will align you with the Java subculture.

It's a good idea to familiarize yourself with Sun's guidelines, whether you
choose to follow them or not (at your peril):
<http://java.sun.com/docs/codeconv/>

-- Lew
 

Ask a Question

Want to reply to this thread or ask your own question?

You'll need to choose a username for the site, which only take a couple of moments. After that, you can post your question and our members will help you out.

Ask a Question

Members online

No members online now.

Forum statistics

Threads
473,995
Messages
2,570,230
Members
46,819
Latest member
masterdaster

Latest Threads

Top