bcomeara
I am writing a program which needs to include a large amount of data.
Basically, the data are p values for different possible outcomes from
trials with different numbers of observations (the p values are
necessarily based on slow simulations rather than on a standard
function, so I estimated them once and want the program to include
this information). Currently, I have this stored as a vector of
vectors of varying sizes: the outer vector is indexed by the number of
observations for the trial, and for each number of observations there
is a vector containing a p value for each number of successes. These
inner vectors get longer as the number of observations (and therefore
of possible successes) increases. I created a class containing
this vector of vectors; my program, on starting, creates an object of
this class. However, the file containing just this class is ~50,000
lines long and 10 MB in size, and takes a great deal of time to
compile, especially with optimization turned on. Is there a better way
of building large amounts of data into C++ programs? I could just
ship a separate data file and have the program read it on
startup, but that would require the program to know where
the file is, even after I distribute it. In case this helps, I am
already using the GNU Scientific Library in the program, so using any
functions there is an easy option. My apologies if this question has
an obvious, standard solution I should already know about.
Excerpt from class file (CDFvectorholder) containing vector of
vectors:
vector<vector<double> > CDFvectorholder::Initialize() {
    vector<vector<double> > CDFvectorcontents;
    vector<double> contentsofrow;
    contentsofrow.push_back(0.33298);
    contentsofrow.push_back(1);
    CDFvectorcontents.push_back(contentsofrow); //comparison where ntax=3
    contentsofrow.clear();
    contentsofrow.push_back(0.07352);
    contentsofrow.push_back(0.14733);
    contentsofrow.push_back(0.33393);
    contentsofrow.push_back(0.78019);
    contentsofrow.push_back(1);
    CDFvectorcontents.push_back(contentsofrow); //comparison where ntax=4
    contentsofrow.clear();
    contentsofrow.push_back(0.01209);
    contentsofrow.push_back(0.03292);
    contentsofrow.push_back(0.04202);
    contentsofrow.push_back(0.0767);
    contentsofrow.push_back(0.13314);
    contentsofrow.push_back(0.23417);
    contentsofrow.push_back(0.40921);
    contentsofrow.push_back(0.58934);
    contentsofrow.push_back(0.82239);
    contentsofrow.push_back(0.98537);
    contentsofrow.push_back(1);
    CDFvectorcontents.push_back(contentsofrow); //comparison where ntax=5
    //ETC
    return CDFvectorcontents;
}
and the main program file, initializing the vector of vectors:
vector<vector<double> > CDFvector;
CDFvectorholder bob;
CDFvector=bob.Initialize();
and using it:
double cdfundermodel=CDFvector[integerB][integerA];
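To make the layout concrete, here is a minimal self-contained sketch of the jagged table and the lookup, using only the three rows shown in the excerpt. The names BuildExampleTable and LookupCdf are hypothetical and not part of my actual program, and the sketch assumes that row 0 of the table corresponds to ntax=3 (as the comments in the excerpt suggest), so the row index is ntax minus 3.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical helper: builds only the first three rows of the table,
// copying the values from the excerpt above (ntax = 3, 4, 5).
std::vector<std::vector<double> > BuildExampleTable() {
    static const double row3[] = {0.33298, 1};
    static const double row4[] = {0.07352, 0.14733, 0.33393, 0.78019, 1};
    static const double row5[] = {0.01209, 0.03292, 0.04202, 0.0767,
                                  0.13314, 0.23417, 0.40921, 0.58934,
                                  0.82239, 0.98537, 1};

    std::vector<std::vector<double> > table;
    // The range constructor copies each plain array into one row.
    table.push_back(std::vector<double>(row3, row3 + sizeof(row3) / sizeof(row3[0])));
    table.push_back(std::vector<double>(row4, row4 + sizeof(row4) / sizeof(row4[0])));
    table.push_back(std::vector<double>(row5, row5 + sizeof(row5) / sizeof(row5[0])));
    return table;
}

// Hypothetical helper: looks up one p value.
// Assumption: row 0 holds ntax = 3, so the row index is ntax - 3.
// at() throws std::out_of_range if either index is past the end.
double LookupCdf(const std::vector<std::vector<double> >& table,
                 int ntax, std::size_t outcome) {
    const int kFirstNtax = 3;  // assumed from the "ntax=3" comment on the first row
    assert(ntax >= kFirstNtax);
    return table.at(static_cast<std::size_t>(ntax - kFirstNtax)).at(outcome);
}
```

With this sketch, LookupCdf(table, 4, 2) reads the third entry of the ntax=4 row, which plays the same role as CDFvector[integerB][integerA] above.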
Thank you,
Brian O'Meara