Leandro
I'm writing an FDTD code (electromagnetic simulation) and I'm having some trouble with its performance.
We have two versions. The first one runs just the calculation (version1). The second one is the whole application (GUI version, using wxWidgets), with the same calculation routine (version2). The problem is that version2 runs almost twice as slowly as version1, and I can't understand why.
The FDTD calculation consists of many loops (one loop in time and three in the x, y and z directions). So I removed all of them but one and used "Very Sleepy" to profile the code. It shows that exactly the same piece of code runs at very different speeds in the two versions, as reproduced below (variables indexed with [] are of type TNT::Array3D - http://math.nist.gov/tnt/overview.html).
Here are the results, compiled with g++:
Version1:
0.15s  void FdtdSolver::CalculateDx()
       {
           int i, j, k;
           double curl_h;
           // Calculate the Dx field
           for(i = 1; i < ia; i++)
           {
0.01s          for(j = 1; j < Ny; j++)
               {
0.06s              for(k = 1; k < Nz; k++)
                   {
                       curl_h = cay[j]*(Hz[j][k] - Hz[j-1][k]) -
0.38s                           caz[k]*(Hy[j][k] - Hy[j][k-1]);
0.10s                  idxl[j][k] = idxl[j][k] + curl_h;
                       Dx[j][k] = gj3[j]*gk3[k]*Dx[j][k] +
0.29s                             gj2[j]*gk2[k]*(curl_h + gi1*idxl[j][k]);
                   }
               }
           }
           // Other loops with the same behavior...
       }
Version2:
0.01s  void FDTDEngine::CalculateDx()
       {
           int i, j, k;
           double curl_h;
           // Calculate the Dx field
           for(i = 1; i < ia; i++)
           {
0.00s          for(j = 1; j < Ny; j++)
               {
0.06s              for(k = 1; k < Nz; k++)
                   {
0.01s                  curl_h = cay[j]*(Hz[j][k] - Hz[j-1][k]) -
0.53s                           caz[k]*(Hy[j][k] - Hy[j][k-1]);
0.10s                  idxl[j][k] = idxl[j][k] + curl_h;
0.02s                  Dx[j][k] = gj3[j]*gk3[k]*Dx[j][k] +
0.36s                             gj2[j]*gk2[k]*(curl_h + gi1*idxl[j][k]);
                   }
               }
           }
           // Other loops with the same behavior...
       }
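To cross-check the sampling profiler, the same arithmetic could also be timed directly with std::chrono in both builds. What follows is only a minimal sketch under assumptions: Fields, update_dx and time_update are hypothetical names that are not in either version, the arrays are flat std::vector planes of size Ny x Nz instead of TNT arrays, and all coefficients are placeholders.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for the solver state. Flat Ny x Nz planes are an
// assumption made for this sketch; the real code uses TNT arrays.
struct Fields {
    std::size_t Ny, Nz;
    std::vector<double> Hy, Hz, Dx, idxl;  // Ny*Nz entries each
    std::vector<double> cay, gj2, gj3;     // Ny entries each
    std::vector<double> caz, gk2, gk3;     // Nz entries each
    double gi1;

    Fields(std::size_t ny, std::size_t nz)
        : Ny(ny), Nz(nz),
          Hy(ny * nz, 0.0), Hz(ny * nz, 0.0), Dx(ny * nz, 0.0), idxl(ny * nz, 0.0),
          cay(ny, 1.0), gj2(ny, 1.0), gj3(ny, 1.0),
          caz(nz, 1.0), gk2(nz, 1.0), gk3(nz, 1.0),
          gi1(0.0) {}
};

// Same arithmetic as the inner j/k loops of CalculateDx() above.
void update_dx(Fields& f) {
    assert(f.Hz.size() == f.Ny * f.Nz);
    const std::size_t Nz = f.Nz;
    for (std::size_t j = 1; j < f.Ny; ++j) {
        for (std::size_t k = 1; k < Nz; ++k) {
            const std::size_t c = j * Nz + k;  // flat index of element (j, k)
            const double curl_h = f.cay[j] * (f.Hz[c] - f.Hz[c - Nz]) -
                                  f.caz[k] * (f.Hy[c] - f.Hy[c - 1]);
            f.idxl[c] += curl_h;
            f.Dx[c] = f.gj3[j] * f.gk3[k] * f.Dx[c] +
                      f.gj2[j] * f.gk2[k] * (curl_h + f.gi1 * f.idxl[c]);
        }
    }
}

// Wall-clock seconds spent in `steps` consecutive calls to update_dx().
double time_update(Fields& f, int steps) {
    using clock = std::chrono::steady_clock;
    const auto t0 = clock::now();
    for (int n = 0; n < steps; ++n) {
        update_dx(f);
    }
    const std::chrono::duration<double> dt = clock::now() - t0;
    return dt.count();
}
```

Calling time_update() with the same grid size and step count from the console program and from the wxWidgets application would give one wall-clock number per build to compare against the per-line samples above.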
The question is: what kind of thing can I do to solve this problem?
Thanks!
P.S.: Sorry for the language. I'm not a native speaker...