A
Aguilar, James
Hello all,
To begin, yes, this -is- a homework assignment. However, it is for my
Algorithms class, and my instructor has given us explicit permission to use
"expert" groups like newsgroups, so if that's your only reason not to help,
please do. Otherwise, I guess it's OK. But, just remember, I'm not asking
you to do my work for me, just to point out my error.
My problem is not with the algorithm itself (standard divide and conquer on
a set of points to find the closest pair), but with the C++, that is, my
writing of it. I seem to be getting a stack overflow exception when the
input is too large, and I don't know what to do about it. This exception
has occurred when I compile on g++ in Cygwin and when I compile using VC++
(whatever the newest version is) on Windows. Cygwin does not do anything
when it fails, it simply stops in the middle of printing the diagnostics
(that is, printing a running tally of the closest points). I loaded up the
Beta version of the Visual C++ ide since I don't know gdb very well/at all.
When I got to a certain point in the code, VC++ told me that a stack
overflow exception had been thrown.
I know my algorithm is correct, because it works on random cases between 10
and 100, so I think that if I can defeat this stack overflow, I will have
defeated the entire problem. I am including the code that I think is
relevant: there is other code in the project, but it just performs
administrative tasks. This is the entire algorithm. More talking at the
end.
----- CODE BEGINS -----
// CLOSEST-PAIR.CC
// Implementation of the closest-pair algorithm
#include <iostream>
#include <cfloat>
#include "closest-pair.hh"
const double INFINITY = DBL_MAX; //bigger than any possible interpoint
distance
void printPointPair(const Point&, const Point&);
void printPointPair(const PointPair&);
PointPair divide(Point**, Point**, int);
/*
Closest Pair Functions
Given a collection of nPoints points, find and ***print***
* the closest pair of points
* the distance between them
in the form "(x1, y1) (x2, y2) distance"
INPUTS:
- points sorted in nondecreasing order by X coordinate
- points sorted in nondecreasing order by Y coordinate
- # of input points
*/
PointPair bruteForce(Point* pointsByX[], Point* pointsByY[], int nPoints)
{
Point a, b;
double min = INFINITY, dist;
for (int i = 0; i < nPoints - 1; ++i)
for (int j = i + 1; j < nPoints; ++j) {
dist = pointsByX->distance(pointsByX[j]);
if (min > dist) {
min = dist;
a = *pointsByX;
b = *pointsByX[j];
}
}
printPointPair(a, b);
return PointPair(a, b);
}
PointPair divCon(Point* pointsByX[], Point* pointsByY[], int nPoints)
{
PointPair p = divide(pointsByX, pointsByY, nPoints);
printPointPair(p);
return p;
}
PointPair divide(Point* pointsByX[], Point* pointsByY[], int nPoints)
{
using namespace std;
if (nPoints > 3) {
//divide
// set line
int bisect = pointsByX[nPoints/2]->x();
// sort Y lists to create new ones
Point* yL[nPoints];
Point* yR[nPoints];
int lenL = 0, lenR = 0;
for (int i = 0; i < nPoints; ++i) {
if (pointsByY->x() <= bisect) {
yL[lenL] = pointsByY;
++lenL;
}
if (pointsByY->x() >= bisect) {
yR[lenR] = pointsByY;
++lenR;
}
}
//conquer
// pass sorted arrays for repeat
PointPair pair1 = divide(pointsByX, yL, lenL);
PointPair pair2 = divide(pointsByX + nPoints - lenR, yR, lenR);
PointPair correct;
//combine
// find mindist
double p1Dist = pair1.distance(),
p2Dist = pair2.distance(),
min = 0;
if (p1Dist < p2Dist) {
correct = pair1;
min = p1Dist;
} else {
correct = pair2;
min = p2Dist;
}
// remove all non-2(mindist) points from y array
// I reuse yL here, since it is the longer than the array I need anyway
int stripLen = 0;
for (int i = 0; i < nPoints; ++i) {
if (pointsByY->x() >= bisect - min
|| pointsByY->x() <= bisect + min) {
yL[stripLen] = pointsByY;
++stripLen;
}
}
// then check next seven repeated until the end of the list
for (int i = 0; i < stripLen; ++i)
for (int j = i+1; j-i < 8 && j < stripLen; ++j)
if (yL->distance(yL[j]) < min) {
min = yL->distance(yL[j]);
correct.p1() = yL;
correct.p2() = yL[j];
}
return correct;
} else {
return bruteForce(pointsByX, pointsByY, nPoints);
}
}
void printPointPair(const PointPair& pp)
{
using namespace std;
double dist = pp.distance();
cout << pp << " " << dist << endl;
}
void printPointPair(const Point& a, const Point& b)
{
using namespace std;
double dist = a.distance(&b);
cout << a << " " << b << " " << dist << endl;
}
----- CODE ENDS -----
I thought that I might be concerned about the temporary arrays that I
create. However, I've been informed that since they are temporary (here I
am referring to yL and yR) they are deleted every time I exit their scope.
If this is the case, it would seem (to me) logical that the stack overflow
would occur on the first branch, but this is not the case. The first
branch, even on very large inputs (up to n = 10000) completes successfully
(according to the debugger and my diagnostics). Also, for reference, when I
compile it on G++ in Cygwin, n = 187 completes successfully, but n = 188
does not. This is consistently the case, and I can show it to repeat.
If needed, I can post the entire project, but I think with as many smart
guys as there are here, someone will find my problem just in this post.
Yours,
James
To begin, yes, this -is- a homework assignment. However, it is for my
Algorithms class, and my instructor has given us explicit permission to use
"expert" groups like newsgroups, so if that's your only reason not to help,
please do. Otherwise, I guess it's OK. But, just remember, I'm not asking
you to do my work for me, just to point out my error.
My problem is not with the algorithm itself (standard divide and conquer on
a set of points to find the closest pair), but with the C++, that is, my
writing of it. I seem to be getting a stack overflow exception when the
input is too large, and I don't know what to do about it. This exception
has occurred when I compile on g++ in Cygwin and when I compile using VC++
(whatever the newest version is) on Windows. Cygwin does not do anything
when it fails, it simply stops in the middle of printing the diagnostics
(that is, printing a running tally of the closest points). I loaded up the
Beta version of the Visual C++ ide since I don't know gdb very well/at all.
When I got to a certain point in the code, VC++ told me that a stack
overflow exception had been thrown.
I know my algorithm is correct, because it works on random cases between 10
and 100, so I think that if I can defeat this stack overflow, I will have
defeated the entire problem. I am including the code that I think is
relevant: there is other code in the project, but it just performs
administrative tasks. This is the entire algorithm. More talking at the
end.
----- CODE BEGINS -----
// CLOSEST-PAIR.CC
// Implementation of the closest-pair algorithm
#include <iostream>
#include <cfloat>
#include "closest-pair.hh"
const double INFINITY = DBL_MAX; //bigger than any possible interpoint
distance
void printPointPair(const Point&, const Point&);
void printPointPair(const PointPair&);
PointPair divide(Point**, Point**, int);
/*
Closest Pair Functions
Given a collection of nPoints points, find and ***print***
* the closest pair of points
* the distance between them
in the form "(x1, y1) (x2, y2) distance"
INPUTS:
- points sorted in nondecreasing order by X coordinate
- points sorted in nondecreasing order by Y coordinate
- # of input points
*/
PointPair bruteForce(Point* pointsByX[], Point* pointsByY[], int nPoints)
{
Point a, b;
double min = INFINITY, dist;
for (int i = 0; i < nPoints - 1; ++i)
for (int j = i + 1; j < nPoints; ++j) {
dist = pointsByX->distance(pointsByX[j]);
if (min > dist) {
min = dist;
a = *pointsByX;
b = *pointsByX[j];
}
}
printPointPair(a, b);
return PointPair(a, b);
}
PointPair divCon(Point* pointsByX[], Point* pointsByY[], int nPoints)
{
PointPair p = divide(pointsByX, pointsByY, nPoints);
printPointPair(p);
return p;
}
PointPair divide(Point* pointsByX[], Point* pointsByY[], int nPoints)
{
using namespace std;
if (nPoints > 3) {
//divide
// set line
int bisect = pointsByX[nPoints/2]->x();
// sort Y lists to create new ones
Point* yL[nPoints];
Point* yR[nPoints];
int lenL = 0, lenR = 0;
for (int i = 0; i < nPoints; ++i) {
if (pointsByY->x() <= bisect) {
yL[lenL] = pointsByY;
++lenL;
}
if (pointsByY->x() >= bisect) {
yR[lenR] = pointsByY;
++lenR;
}
}
//conquer
// pass sorted arrays for repeat
PointPair pair1 = divide(pointsByX, yL, lenL);
PointPair pair2 = divide(pointsByX + nPoints - lenR, yR, lenR);
PointPair correct;
//combine
// find mindist
double p1Dist = pair1.distance(),
p2Dist = pair2.distance(),
min = 0;
if (p1Dist < p2Dist) {
correct = pair1;
min = p1Dist;
} else {
correct = pair2;
min = p2Dist;
}
// remove all non-2(mindist) points from y array
// I reuse yL here, since it is the longer than the array I need anyway
int stripLen = 0;
for (int i = 0; i < nPoints; ++i) {
if (pointsByY->x() >= bisect - min
|| pointsByY->x() <= bisect + min) {
yL[stripLen] = pointsByY;
++stripLen;
}
}
// then check next seven repeated until the end of the list
for (int i = 0; i < stripLen; ++i)
for (int j = i+1; j-i < 8 && j < stripLen; ++j)
if (yL->distance(yL[j]) < min) {
min = yL->distance(yL[j]);
correct.p1() = yL;
correct.p2() = yL[j];
}
return correct;
} else {
return bruteForce(pointsByX, pointsByY, nPoints);
}
}
void printPointPair(const PointPair& pp)
{
using namespace std;
double dist = pp.distance();
cout << pp << " " << dist << endl;
}
void printPointPair(const Point& a, const Point& b)
{
using namespace std;
double dist = a.distance(&b);
cout << a << " " << b << " " << dist << endl;
}
----- CODE ENDS -----
I thought that I might be concerned about the temporary arrays that I
create. However, I've been informed that since they are temporary (here I
am referring to yL and yR) they are deleted every time I exit their scope.
If this is the case, it would seem (to me) logical that the stack overflow
would occur on the first branch, but this is not the case. The first
branch, even on very large inputs (up to n = 10000) completes successfully
(according to the debugger and my diagnostics). Also, for reference, when I
compile it on G++ in Cygwin, n = 187 completes successfully, but n = 188
does not. This is consistently the case, and I can show it to repeat.
If needed, I can post the entire project, but I think with as many smart
guys as there are here, someone will find my problem just in this post.
Yours,
James