Arnold Hendriks
In Europe, software patents cannot be enforced at all. (Although they can be
registered, to keep the EPO busy)
Juha Nieminen said: That may be true in Europe but not in the US.
Mingnan G. said: By downloading the source code, you will find the HNXGC_MT
directory, which uses the Global Memory Fence technology to elide
acquire/release semantics in a multiprocessor environment, such as the
Windows 64-bit IA64 platform. The binary code for Windows 64-bit IA64 is
also available.
Chris Thomasson said:
Mingnan G. said: By downloading the source code, you will find the HNXGC_MT
directory, which uses the Global Memory Fence technology to elide
acquire/release semantics in a multiprocessor environment, such as the
Windows 64-bit IA64 platform. The binary code for Windows 64-bit IA64 is
also available.
[...]
I can't seem to find the sign-up page. I sent you an e-mail requesting
further instruction.
[...]
Joe Seigh said: (sorry for the repost but it probably would help if I actually
included c.p.t) [...]
CC'ing c.p.t
I don't know what the GlobalMemoryFence is yet since that part is not
filled in yet. Generally with shared pointers you need acquire/release
semantics which can't be elided.
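To make the point about shared pointers concrete, here is a minimal sketch of where acquire/release ordering normally appears in a reference count. It uses only standard C++ atomics; `RefCounted`, `add_ref` and `release` are hypothetical names chosen for illustration and are not taken from HnxGC.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical intrusive reference count; NOT HnxGC's implementation.
struct RefCounted {
    std::atomic<int> refs{1};   // the creator owns the first reference
    int payload = 0;

    void add_ref() {
        // The caller already owns a reference, so only atomicity is needed.
        refs.fetch_add(1, std::memory_order_relaxed);
    }

    // Returns true when the caller dropped the last reference.
    bool release() {
        // Release: this owner's writes to the object are published
        // before the count is seen to drop.
        if (refs.fetch_sub(1, std::memory_order_release) == 1) {
            // Acquire: the final owner sees every other owner's writes
            // before it goes on to destroy the object.
            std::atomic_thread_fence(std::memory_order_acquire);
            return true;
        }
        return false;
    }
};
```

The release on the decrement and the acquire before destruction are the ordering that is usually considered necessary; whether a given scheme can safely drop them is exactly what is being questioned here.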
[...]
I took a quick look at the code. AFAICT, he creates a thread per-processor
and binds it to that processor.
So if you're running on a 64-processor system, he creates:
Thread 1 - Bound To CPU 1
Thread 2 - Bound To CPU 2
Thread 3 - Bound To CPU 3
Thread 4 - Bound To CPU 4
Thread 5 - Bound To CPU 5
Thread 6 - Bound To CPU 6
[and on and on to 64]
Those threads basically sit on an event and wait. When they are awoken, they
execute a membar, atomically decrement a counter, and check to see if it
dropped to zero, which means that every "CPU" thread has executed the membar.
AFAICT, this is exactly the same as the synchronize_rcu() function
implemented with a daemon per CPU. So there is really nothing new here at
all. I think he is going to have a problem with prior art for sure.
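The scheme described above can be sketched in portable C++ roughly as follows. This is only an illustration of the structure (helpers wait, run a fence, decrement a counter, the last one signals completion). The class and member names are hypothetical, it is not HnxGC's code, and the CPU binding is left out because it is platform specific; without pinning each helper to its own processor the fences do not have the per-CPU effect the real scheme relies on.

```cpp
#include <atomic>
#include <cassert>
#include <condition_variable>
#include <mutex>
#include <thread>
#include <vector>

// Hypothetical sketch of the described scheme; NOT HnxGC's actual code.
// A real version would pin each helper thread to its own CPU.
class GlobalFenceSketch {
public:
    explicit GlobalFenceSketch(unsigned helpers) {
        for (unsigned i = 0; i < helpers; ++i)
            workers_.emplace_back([this] { run(); });
    }

    ~GlobalFenceSketch() {
        {
            std::lock_guard<std::mutex> lk(m_);
            stop_ = true;
        }
        wake_.notify_all();
        for (auto& t : workers_) t.join();
    }

    // Wake every helper and block until each one has executed a fence.
    void fence() {
        std::lock_guard<std::mutex> serial(callers_);  // one round at a time
        std::unique_lock<std::mutex> lk(m_);
        pending_.store(static_cast<int>(workers_.size()), std::memory_order_relaxed);
        ++generation_;                                 // the "event"
        wake_.notify_all();
        done_.wait(lk, [this] {
            return pending_.load(std::memory_order_acquire) == 0;
        });
    }

    int fences_executed() const { return executed_.load(); }

private:
    void run() {
        unsigned long seen = 0;
        std::unique_lock<std::mutex> lk(m_);
        for (;;) {
            wake_.wait(lk, [&] { return stop_ || generation_ != seen; });
            if (stop_) return;
            seen = generation_;
            lk.unlock();
            std::atomic_thread_fence(std::memory_order_seq_cst);  // the membar
            executed_.fetch_add(1, std::memory_order_relaxed);
            // Decrement the counter; whoever reaches zero was the last helper.
            bool last = pending_.fetch_sub(1, std::memory_order_acq_rel) == 1;
            lk.lock();
            if (last) done_.notify_one();
        }
    }

    std::mutex callers_;
    std::mutex m_;
    std::condition_variable wake_, done_;
    unsigned long generation_ = 0;
    bool stop_ = false;
    std::atomic<int> pending_{0};
    std::atomic<int> executed_{0};
    std::vector<std::thread> workers_;
};
```

A caller would construct one instance with as many helpers as there are processors and call `fence()` whenever every processor must be known to have passed a barrier.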
As for the smart pointer reference counting, I haven't looked yet as I was
mainly interested in GlobalMemoryFence (a.k.a. synchronize_rcu()).
Any thoughts?
Mingnan G. said:
[...]
synchronize_rcu is for locking (lock-free),
GlobalMemoryFence is for
memory semantics for MP.
Totally two different things.
Joe Seigh said:
[...]
I figured it was some form of RCU w/ memory barriers in the quiescent
states.
That by itself doesn't let you elide memory barriers. Whether you can elide
a membar depends on the specific situation. And the proofs can be a bit
complicated depending on the situation (at least for SMR+RCU).
I'm guessing from the docs, the refcounting is a bit like atomic_ptr with
the distinguishing of local non-shared vs. global heap shared reference
counts.
With refcounting, it was always safe to use a raw reference as long as you
had (owned) one refcount. There seems to be a wrapper class that is the
logical equivalent.
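The rule about raw references can be illustrated with the standard `std::shared_ptr`, used here as a stand-in; HnxGC's own wrapper class is not shown in this thread, and `Node`, `bump` and `bump_while_owned` are hypothetical names.

```cpp
#include <cassert>
#include <memory>

struct Node { int value = 0; };

// The callee works through a raw reference and never touches the count.
inline void bump(Node& n) { ++n.value; }

// Hypothetical helper: the caller owns one count for the whole call,
// so the raw reference cannot dangle while bump() runs.
inline int bump_while_owned(const std::shared_ptr<Node>& shared) {
    std::shared_ptr<Node> owned = shared;  // take (own) one reference count
    bump(*owned);                          // raw access, no count traffic
    return owned->value;                   // 'owned' is released on return
}
```

The only count traffic is the single increment and decrement around the call; everything inside works on the raw reference.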
The burden of proof is on him. He needs to show how his stuff is different
from prior art. That means explicitly listing the prior art and contrasting
it with his stuff.
Mingnan G. said: HnxGC can perform lock-free concurrent garbage collection
even without GlobalMemoryFence. GlobalMemoryFence is designed to remove the
extra memory ordering semantics of HnxGC, such as Acquire/Release semantics
for IA64.
We don't need to detect quiescent states as RCU does to
perform lock-free access.
Although GMF is very interesting, it is only useful for some
processor platforms. For example, the x86 architecture has enforced memory
semantics for atomic operations, so GMF is not needed even in an x86
multiprocessor environment.
I will describe and discuss GMF later in
the documentation at hnxgc.harnixworld.com
Anyway, thanks for your interest in HnxGC.
I don't have experience with NUMA, but it would be great if you
can make it work on NUMA and get some performance benchmarks.
Mingnan G. wrote: You worry too much. For a non-commercial educational
purpose, you can apply for a *FREE* license for this software. See follows:
[blah]
If your development is so breathtakingly innovative and of general
applicability to C++ programming, why don't you make it available to the
whole programming community as free software under a liberal license?
Nick Keighley said: Why should he?
Because locking it up with a patent effectively removes it from the
common pool of knowledge (at least in the U.S.). It isn't even as if it
hadn't been invented yet -- it essentially cannot be (re-)invented by
someone else (until the patent runs out, of course) and thus is
effectively lost to humanity (or at least, the US software community.)
As I don't live in the US, I personally couldn't care less about the
software patent nonsense but still.
Nick said: There's a difference between not patenting it and releasing it as
free software.
Chris Thomasson said: Sure. But that's more expensive. Can the code as is run
without GlobalMemoryFence and still elide Acquire/Release?
AFAICT, you are using GMF to lower the overhead of reference counter
mutations. I can't quite see if you're using it to get release semantics. As
for acquire, well, I am talking about avoiding a #StoreLoad | #StoreStore
barrier; I guess that would be a store-acquire to be more precise...
Some sort of GMF is required on x86 to get the #StoreLoad barrier in a
lock-free reader pattern. You can use GC for that pattern and it has
similar concerns indeed. For instance, you have to use a GMF to implement
SMR.
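For context, this is roughly where the #StoreLoad barrier sits in a hazard-pointer (SMR) style reader when no global fence is available. It is a sketch with a single reader slot and hypothetical names (`g_shared`, `g_hazard`, `protect`, `unprotect`); it is not code from HnxGC. As I read the post, the explicit fence below is the reader-side cost that a GMF issued by the reclaiming side would be meant to remove.

```cpp
#include <atomic>
#include <cassert>

struct Item { int value; };

std::atomic<Item*> g_shared{nullptr};  // pointer that writers may swap out
std::atomic<Item*> g_hazard{nullptr};  // this reader's single hazard slot

// Hypothetical SMR-style acquire for one reader; illustration only.
inline Item* protect() {
    Item* p = g_shared.load(std::memory_order_relaxed);
    for (;;) {
        g_hazard.store(p, std::memory_order_relaxed);      // announce intent
        // #StoreLoad: the announcement must be globally visible
        // before the re-check load below is performed.
        std::atomic_thread_fence(std::memory_order_seq_cst);
        Item* again = g_shared.load(std::memory_order_acquire);
        if (again == p) return p;                          // still current
        p = again;                                         // changed: retry
    }
}

inline void unprotect() {
    g_hazard.store(nullptr, std::memory_order_release);
}
```

A reclaimer would scan the hazard slots before freeing an item it has unlinked; that side is omitted here.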
[...]
Chris Thomasson said:
[...]
Joe Seigh said: (sorry for the repost but it probably would help if I actually
included c.p.t)
Some observations on your pointer manipulation functions...
_________________________________________________________________
- I notice that the _hnxgc_assign_lptr function is made up of more than an
interlocked update or a simple distributed reference count increment. I
notice multiple lines of code that make calls into other object
functions, which are themselves made up of multiple lines of code.
[...]
Have you benchmarked this against a proxy garbage collector yet? This
would be in the context of a reader/writer solution, of course. I am
thinking that it will outperform your GC simply because the pointer access
within the collected region can be completely naked on most of the
existing architectures out there. Another advantage of this type of
garbage collection is that you can clearly separate the writers from the
readers. Also, refer to the last few paragraphs of the following message:
http://groups.google.com/group/comp.lang.c++/browse_frm/thread/95007ffdf246d50e