Do not want to use memset function

Immortal Nephi

I wrote my own Fill_Memory function because I want to add SSE2 code
to it, but the memset function does not have SSE2 code. Everything in
my function works very well.
I turned on the C++ compiler's optimization. It automatically inserts
a call to memset without my asking for it. This happens after
Fill_Memory has processed more than 128 bytes and two to four bytes
are left over: those last bytes are filled by memset instead of my
own code.
How can I tell the C++ compiler to stop inserting memset and let
Fill_Memory do the job itself?
 
Marcel Müller

I wrote my own Fill_Memory function because I want to add SSE2 code
to it, but the memset function does not have SSE2 code.

This is /your/ statement.
I have already seen SSE instructions in trivial operations like memcpy
or even assignment of large objects.
How can I tell the C++ compiler to stop inserting memset and let
Fill_Memory do the job itself?

You can't do that in a standard-conforming way because it requires changes
to the runtime environment. Of course, you might provide your own memset
implementation and force the linker to prefer /your/ implementation by
using the same symbol. But replacing only parts of the runtime is
undefined behavior in general, although it works from time to time.


Marcel
 
gwowen

You can't do that in a standard-conforming way because it requires changes
to the runtime environment.

True, but I think what is happening here is that the compiler is
optimizing something to memset() that is not an explicit memset()
call. For example gcc may optimize:

void my_memclear(char *ptr, size_t num)
{
    for (size_t i = 0; i < num; ++i) ptr[i] = 0;
}

to

void my_memclear(char *ptr, size_t num)
{
    __builtin_memset(ptr, 0, num);
}

As you say, though, there's every chance that __builtin_memset() *does*
use SSE when the architecture flags allow it.
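
If the goal is to keep gcc from making that substitution in one particular function, a possible approach is a function-level optimize attribute. The following is only a sketch: the macro name NO_LOOP_TO_LIBCALL is made up for illustration, the attribute is specific to GCC (the macro expands to nothing elsewhere, so other compilers may still replace the loop), and the same effect can be had for a whole translation unit with the -fno-tree-loop-distribute-patterns flag.

```cpp
#include <cassert>
#include <cstddef>

// GCC-only: ask the optimizer not to turn loops in this function into
// library calls such as memset. Other compilers get an empty macro, so
// there the loop may still be replaced.
#if defined(__GNUC__) && !defined(__clang__)
#  define NO_LOOP_TO_LIBCALL \
     __attribute__((optimize("no-tree-loop-distribute-patterns")))
#else
#  define NO_LOOP_TO_LIBCALL
#endif

// Byte-clearing helper, named after the example above.
NO_LOOP_TO_LIBCALL
void my_memclear(char* ptr, std::size_t num)
{
    assert(ptr != nullptr || num == 0);
    for (std::size_t i = 0; i < num; ++i)
        ptr[i] = 0;
}
```

Whether the loop really survives is best checked in the generated assembly (for example with -S), since the result depends on the compiler version and optimization level.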
 
Jorgen Grahn

This is /your/ statement.
I have already seen SSE instructions in trivial operations like memcpy
or even assignment of large objects.

Seems reasonable. It's important to realize that the compiler may not
be generating code specifically for the CPU it is itself running on.
E.g. gcc on x86 will by default generate code which runs on the
ancient 80386 and upwards. You have to tell it somehow that it's OK
to use SSE2 instructions.
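
As a rough illustration of what an explicit SSE2 fill could look like once the target allows it (for instance -msse2 with gcc on 32-bit x86, or /arch:SSE2 with MSVC), here is a minimal sketch. The name fill_memory and the FILL_HAS_SSE2 macro are hypothetical stand-ins, not the original poster's code, and the scalar loop is used for everything when SSE2 is not enabled at compile time.

```cpp
#include <cassert>
#include <cstddef>

// Detect at compile time whether SSE2 code generation is enabled.
#if defined(__SSE2__) || defined(_M_X64) || \
    (defined(_M_IX86_FP) && _M_IX86_FP >= 2)
#  include <emmintrin.h>   // SSE2 intrinsics
#  define FILL_HAS_SSE2 1
#else
#  define FILL_HAS_SSE2 0
#endif

// Hypothetical stand-in for the Fill_Memory function discussed in the thread.
void fill_memory(void* dst, unsigned char value, std::size_t count)
{
    assert(dst != nullptr || count == 0);
    unsigned char* p = static_cast<unsigned char*>(dst);

#if FILL_HAS_SSE2
    // Broadcast the byte into all 16 lanes, then store 16 bytes at a time.
    // The unaligned store makes no assumption about the alignment of dst.
    const __m128i pattern = _mm_set1_epi8(static_cast<char>(value));
    while (count >= 16) {
        _mm_storeu_si128(reinterpret_cast<__m128i*>(p), pattern);
        p += 16;
        count -= 16;
    }
#endif

    // Tail bytes (or the whole buffer when SSE2 is not enabled).
    while (count > 0) {
        *p++ = value;
        --count;
    }
}
```

A real implementation would probably align the destination first and use aligned stores for large blocks; that is left out here to keep the sketch short.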

/Jorgen
 
Nephi Immortal

This is /your/ statement.
I have already seen SSE instructions in trivial operations like memcpy
or even assignment of large objects.


You can't do that in a standard-conforming way because it requires changes
to the runtime environment. Of course, you might provide your own memset
implementation and force the linker to prefer /your/ implementation by
using the same symbol. But replacing only parts of the runtime is
undefined behavior in general, although it works from time to time.

I use MSVC++ 2010. I looked at memset's assembly code. It does not
have any SSE2 code, but memcpy does have SSE2 code.

Why should memset be called at all when fewer than four bytes are left?

The tail loop looks like this:

while( count > 0 )
{
    char*& dst_8 = reinterpret_cast< char*& >( dst );

    *dst_8 = value;
    ++dst_8;
    --count;
}

This loop is replaced with a call to memset even when count is less
than 4.

If you write the loop differently, memset is not inserted
automatically:

if( count > 0 )
{
    do
    {
        char*& dst_8 = reinterpret_cast< char*& >( dst );

        *dst_8 = value;
        ++dst_8;
        --count;
    }
    while( count > 0 );
}
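
Since the do/while form only happens to escape the compiler's pattern matching, a variant that relies less on how the loop is spelled may be to store through a volatile-qualified pointer. This is a sketch under assumptions: fill_tail is a hypothetical name, and it relies on compilers in practice emitting every store through a volatile lvalue individually, which also rules out vectorization (acceptable for a tail of a few bytes).

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical tail helper: fills the last few bytes one at a time.
void fill_tail(void* dst, char value, std::size_t count)
{
    assert(dst != nullptr || count == 0);
    // Each store through a volatile lvalue is emitted as written, so the
    // loop is not collapsed into a single library call.
    volatile char* p = static_cast<char*>(dst);
    while (count > 0) {
        *p = value;
        ++p;
        --count;
    }
}
```

Called as fill_tail(dst, value, count) after the wide stores are done, it would take the place of either tail loop above.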
 
Adam Skutt

I use MSVC++ 2010.  I looked at memset's assembly code.  It does not
have any SSE2 code, but memcpy does have SSE2 code.

I'm a little confused as to how you have looked at that code. Anyway,
bugs like this: http://connect.microsoft.com/Visual...0304/memset-causes-corruption-using-arch-sse2
suggest that you are incorrect and that the compiler will happily
generate the instructions if you ask it to do so.

My guess is that you have not correctly asked it to do so, as it is not
the default behavior. Regardless, answers to both this problem and your
original problem are best sought in the compiler manual, not here.

Adam
 
