API for aggregator of asynchronous calls


Scott Sauyet

I needed a technique to aggregate the results of several asynchronous
calls. Obviously there are many ways of doing this. I wrote an
Aggregator constructor function and was wondering if the API this
presents makes sense to people here.

Here is some demo code that uses this function:

var agg = new Aggregator();

var myHandler = function(data, statuses, errors) {
  // statuses = {"blue": "completed", "red": "completed",
  //             "yellow": "completed"}
  // errors = {}
  // data = {"red": { /* ... */ }, "blue": { /* ... */ },
  //         "yellow": { /* ... */ }}

  // do something with data.blue, data.red, data.yellow
};

agg.onComplete(myHandler);

// Or var agg = new Aggregator(myHandler);

agg.start("blue");
agg.start("red");
agg.start("yellow");

getJsonAjax({
  url: "blue/",
  success: function(data, status) {
    agg.complete("blue", data);
  },
  failure: function(status, err) {
    agg.error("blue", status);
  }
});

getJsonAjax({
  url: "red/",
  success: function(data, status) {
    agg.complete("red", data);
  },
  failure: function(status, err) {
    agg.error("red", status);
  }
});

getJsonAjax({
  url: "yellow/",
  success: function(data, status) {
    agg.complete("yellow", data);
  },
  failure: function(status, err) {
    agg.error("yellow", status);
  }
});

(I don't actually have a getJsonAjax() function, but it should be
obvious enough what it's meant to do.)

The user associates each asynchronous process with a String (above
"red", "blue", and "yellow"). When each process completes, the
relevant returned data is supplied with the complete() call.

The completion handlers accept three parameters. The first represents
the actual results of the various asynchronous calls, the second, the
status flags of each call ("started", "completed", "error", possibly
"aborted"), and the third, any error messages generated by these
calls. Any actual combination of the results is left to the caller,
presumably in the completion handlers. You can pass such handlers to
the onComplete() method or to the constructor.

I do have an implementation of this, but at the moment, I'm mostly
concerned with whether the API makes sense. Is there a cleaner way to
do this? Is this general enough? Are the String identifiers for the
calls robust enough? Should the constructor also accept the
identifiers? What suggestions do you have?
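For concreteness, one minimal way an Aggregator matching this API could work might look like the following. This is only an illustrative sketch, not the actual implementation under discussion:

```javascript
// Hypothetical sketch of the proposed API: start(), complete(), error(),
// onComplete(), with an optional handler passed to the constructor.
function Aggregator(handler) {
  this.statuses = {};
  this.data = {};
  this.errors = {};
  this.handlers = handler ? [handler] : [];
}

Aggregator.prototype.onComplete = function (handler) {
  this.handlers.push(handler);
};

Aggregator.prototype.start = function (name) {
  this.statuses[name] = "started";
};

Aggregator.prototype.complete = function (name, data) {
  this.statuses[name] = "completed";
  this.data[name] = data;
  this._check();
};

Aggregator.prototype.error = function (name, err) {
  this.statuses[name] = "error";
  this.errors[name] = err;
  this._check();
};

// Fire the handlers once no task remains in the "started" state.
Aggregator.prototype._check = function () {
  for (var name in this.statuses) {
    if (this.statuses[name] === "started") { return; }
  }
  for (var i = 0; i < this.handlers.length; i++) {
    this.handlers[i](this.data, this.statuses, this.errors);
  }
};
```

In this sketch the handler fires as soon as every named task has left the "started" state, whether it completed or errored, which matches the three-argument handler signature in the demo.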
 

Asen Bozhilov

Scott said:
I needed a technique to aggregate the results of several asynchronous
calls.  Obviously there are many ways of doing this.  I wrote an
Aggregator constructor function and was wondering if the API this
presents makes sense to people here.
Here is some demo code that uses this function:

    var agg = new Aggregator();

    var myHandler = function(data, statuses, errors) {
      // statuses = {"blue": "completed", "red": "completed",
      //             "yellow": "completed"}
      // errors = {}
      // data = {"red": { /* ... */}, "blue": { /* ... */ },
      //         "yellow": { /* ... */ }}

      // do something with data.blue, data.red, data.yellow
    }

    agg.onComplete(myHandler);

    // Or var agg = new Aggregator(myHandler);

    agg.start("blue");
    agg.start("red");
    agg.start("yellow");
    getJsonAjax({
      url: "blue/",
      success: function(data, status) {
        agg.complete("blue", data);
      },
      failure: function(status, err) {
        agg.error("blue", status);
      }
    });

Let the `Aggregator' take care of your requests. I do not see any
reason for:

agg.start("yellow");

First I have to push in an aggregated request, and after that I have to
explicitly inform the `Aggregator' of the completion of that request:

agg.complete("blue", data);

That is a little bit inconsistent.

The following is my point of view of the problem.

function Aggregator() {
    var that = this;
    this._aggregated = [];
    this._count = 0;

    this._handler = function (req, data) {
        that._count--;
        if (that._count == 0) {
            that.onComplete();
        }
    };
}

Aggregator.prototype = {
    onComplete : function () {},

    push : function (reqWrapper) {
        reqWrapper.success = this._handler;
        this._count++;
        this._aggregated.push(reqWrapper);
    },

    start : function () {
        for (var i = 0, len = this._aggregated.length; i < len; i++) {
            this._aggregated[i].send();
        }
    }
};

function RequestWrapper() {}
RequestWrapper.prototype.send = function () {
    //...
};

var agg = new Aggregator();

agg.onComplete = function () {
    //...
};

agg.push(new RequestWrapper());
agg.push(new RequestWrapper());
agg.push(new RequestWrapper());

agg.start();
 

Scott Sauyet

Asen said:
Scott said:
I needed a technique to aggregate the results of several asynchronous
calls.  Obviously there are many ways of doing this.  I wrote an
Aggregator constructor function and was wondering if the API this
presents makes sense to people here. [ ... ]
Let the `Aggregator' take care of your requests. I do not see any
reason for:

agg.start("yellow");

First I have to push in an aggregated request, and after that I have to
explicitly inform the `Aggregator' of the completion of that request:

agg.complete("blue", data);

That is a little bit inconsistent.

First off, thank you Asen for taking the time to reply.

I'm not sure I see an inconsistency here, but it certainly might seem
awkward.


The following is my point of view of the problem.

function Aggregator() {
    var that = this;
    this._aggregated = [];
    this._count = 0;

    this._handler = function (req, data) {
        that._count--;
        if (that._count == 0) {
            that.onComplete();
        }
    };

}

Aggregator.prototype = {
    onComplete : function () {},

    push : function (reqWrapper) {
        reqWrapper.success = this._handler;
        this._count++;
        this._aggregated.push(reqWrapper);
    },

    start : function () {
        for (var i = 0, len = this._aggregated.length; i < len; i++) {
            this._aggregated[i].send();
        }
    }

};

function RequestWrapper() {}
RequestWrapper.prototype.send = function () {
    //...

};

var agg = new Aggregator();

agg.onComplete = function () {
    //...

};

agg.push(new RequestWrapper());
agg.push(new RequestWrapper());
agg.push(new RequestWrapper());

agg.start();


(Sorry for the long quote, can't find any way to trim it without
removing something essential.)

This would certainly be clearer and cleaner to use for the problem I
presented in my demo.

It probably is enough for my current needs, too. But there is a real
possibility that I will need some additional features I didn't mention
in my initial message, but which did prompt the API I used. First of
all, sometimes the asynchronous process I'm waiting for might be user
input rather than an AJAX call. Second, some calls might lead me to
make additional ones, and I want my wrap-up function to run only after
all the results are in. I might still be able to get away without
listing the calls, though, and only counting them, but I don't think I
could use the push-push-push-start system. I'd need one in which each
process is started independently and the aggregator checks, after each
one completes, whether any processes are still running.

I will look into whether I need the labels at all. I might be able to
avoid them.
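The counting scheme described here, where processes register as they start and the wrap-up fires once the count of running processes drops to zero, could be sketched roughly like this (hypothetical names, an illustration only):

```javascript
// Hypothetical sketch: tasks can join at any time, even while others
// are already running, and the wrap-up fires when the count hits zero.
function CountingAggregator(onAllDone) {
  this._running = 0;
  this._onAllDone = onAllDone;
}

// Register one more in-flight process.
CountingAggregator.prototype.begin = function () {
  this._running++;
};

// Mark one process finished; fire the wrap-up once none remain.
CountingAggregator.prototype.end = function () {
  this._running--;
  if (this._running === 0) {
    this._onAllDone();
  }
};
```

Note the caveat with pure counting: if every started task finishes before the next one begins, the wrap-up can fire prematurely, so callers need to register follow-up work before reporting the triggering task as done.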

Thanks again,
 

Scott Sauyet

Stefan said:
I needed a technique to aggregate the results of several asynchronous
calls.  [ ... ]
I do have an implementation of this, but at the moment, I'm mostly
concerned with whether the API makes sense.  Is there a cleaner way to
do this?  Is this general enough?  Are the String identifiers for the
calls robust enough?  Should the constructor also accept the
identifiers?  What suggestions do you have?

I don't see any major problems with the API you suggested. If you ask 10
people to design an Aggregator object, you'll very likely end up with 10
different designs, and the one you come up with yourself will always
feel the most natural.

Of course. That's why when I don't have others to immediately work
with an API, I like to ask knowledgeable people if the API makes
sense.
But since you asked for feedback, here are some random unsorted thoughts
that came to my mind:

If "agg = new Aggregator(handler)" and "agg.onComplete(handler)" are
equivalent, then the onComplete method is misnamed, because the handler
will also get called on failures (and possibly other status changes).
"setHandler" might be a better name.

Absolutely right. I changed my stop(name, data) method to
complete(name, data) at the last minute because I was using the status
values of "started", "completed", or "errored" for the various tasks
and it made sense to make the verbs match these statuses. I didn't
consider that this is too close to the onComplete() method. I'm not
sure I like setHandler() because I'm allowing multiple callback
functions (quite possibly for no good reason). I think I'll keep the
onComplete() and rename complete(name, data), but I'm not sure to
what. Perhaps I should use startTask() and stopTask(), although
"errorTask()" does not roll off the tongue properly.

One thing I've learned in a project where we had very complex and highly
automated forms with more than one layer of "aggregators" before the
actual Ajax transport layer, is that the handler function won't always
be interested in the complete current state of all requests, but in the
changes. From the handler's perspective, it would ask itself

  - why am I being called now?
  - has a job completed?
  - has a job failed?
  - is there a new job to consider?
etc.

To answer these questions it would have to keep an internal copy of the
previous state for comparison, but that would only duplicate what's
already available in the Aggregator object. You could find a way to pass
delta information to the handler. It could still get at the complete
state information if it has a reference to the Aggregator (in any of
several possible ways, or maybe even passed as an argument in the
callback), and the Aggregator exposes methods to query its state.

That is a fascinating concept. I've had systems where writing this in
a general way would have been very useful. For my current needs and
those anticipated relatively soon, this would be overkill, but I can
almost see how it could be done elegantly with a fairly simple API. I
think if I find a little spare time, I might try that just for the fun
of it.
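Stefan's delta idea could be sketched roughly as follows. The `DeltaAggregator`, `update`, and `statusOf` names are invented for illustration; this is not code from the thread:

```javascript
// Hypothetical sketch: the handler receives only what changed (the
// delta), plus a reference to the aggregator to query full state.
function DeltaAggregator(handler) {
  this.statuses = {};
  this._handler = handler;
}

DeltaAggregator.prototype.update = function (name, status, data) {
  var previous = this.statuses[name];
  this.statuses[name] = status;
  // The delta describes only this change; full state stays queryable
  // through the aggregator reference passed as the second argument.
  this._handler({ name: name, from: previous, to: status, data: data }, this);
};

DeltaAggregator.prototype.statusOf = function (name) {
  return this.statuses[name];
};
```

This way the handler can answer "why am I being called now?" directly from the delta, without keeping its own copy of the previous state.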

I don't know how general you want to keep the Aggregator. Code reuse was
never my strong suit; I tend to copy and adjust instead of writing one
perfect class/object/function and leaving it untouched afterwards. From
that standpoint, I would probably integrate the Ajax requests into the
Aggregator.

Oh, I definitely want it more general than that. One AJAX call might
actually add a new task to the aggregator. Other asynchronous
processes for the aggregator might involve waiting for user input.
The main thing is that I want a simple way to know when all the data
needed for processing, however it's gathered from disparate sources,
is available. So it should be fairly general.

I guess there are parts of the API you haven't posted. From what I've
seen, the Aggregator could almost be considered as semantic sugar for a
simple hash-like registry object:

  // create "Aggregator"
  var agg = {};

  // start "red" job
  agg.red = { status: "pending" };
  myHandler(agg);

  getJsonAjax({
      url: "red/",
      success: function(data, status) {
        agg.red = { status: "completed", data: data };
        myHandler(agg);
      },
      failure: function(status, err) {
        agg.red = { status: "failed", error: err };
        myHandler(agg);
      }
  });

Okay, that's a little too simplistic, and not very pretty, either ;-)

To some extent that's right. But the trouble with such a hash is the
lack of knowledge of overall completion. In such a system, that
calculation would have to be pushed down into the handler function.
Avoiding that is really the motivator for this. I've done such things
fairly often, but the scenarios are getting more complicated, and if
certain expected requirements come through, they will get far more
complicated.

At the other end of the spectrum, you could go in a more object oriented
direction and create Job objects which would know how they should be
transported, and which an Aggregator could then run and manage. This
could also help to eliminate the "stringiness" of your job identifiers.

PS: I just saw Asen's reply. It looks like his RequestWrapper objects
are similar to what I was trying to say in my last paragraph.

Yes, I hadn't really considered going that far, but it might really be
useful. I'm not quite sure of the API, though. Are you thinking
something like this?:

var agg = new Aggregator(myHandler);
var job1 = new Job(function() {
  getJsonAjax({
    url: "red/",
    success: function(data, status) {job1.stop(data);},
    failure: function(status, err) {job1.error(status);}
  });
});
agg.add(job1);


Thank you for your thoughtful and detailed reply,
 

Thomas 'PointedEars' Lahn

Scott said:
var agg = new Aggregator();

var myHandler = function(data, statuses, errors) {
  // statuses = {"blue": "completed", "red": "completed",
  //             "yellow": "completed"}
  // errors = {}
  // data = {"red": { /* ... */ }, "blue": { /* ... */ },
  //         "yellow": { /* ... */ }}

  // do something with data.blue, data.red, data.yellow
};

agg.onComplete(myHandler);

The `myHandler' variable does not appear to serve a useful purpose here.
Maybe it is only because of the example.

Make it

agg.oncomplete = myHandler;

if viable for your application design. Saves one call and is more
intuitive.
// Or var agg = new Aggregator(myHandler);

agg.start("blue");
agg.start("red");
agg.start("yellow");

getJsonAjax({

Rename getJsonAjax() and make it a method of `agg' inherited from
`Aggregator.prototype' that does not send(), and you will not have to worry
about agg.start(). Then agg.send() at last.
[...]
The user associates each asynchronous process with a String (above
"red", "blue", and "yellow"). When each process completes, the
relevant returned data is supplied with the complete() call.

The completion handlers accept three parameters. The first represents
the actual results of the various asynchronous calls, the second, the
status flags of each call ("started", "completed", "error", possibly
"aborted"), and the third, any error messages generated by these
calls. Any actual combination of the results is left to the caller,
presumably in the completion handlers. You can pass such handlers to
the onComplete() method or to the constructor.

I do have an implementation of this, but at the moment, I'm mostly
concerned with whether the API makes sense. Is there a cleaner way to
do this? Is this general enough? Are the String identifiers for the
calls robust enough?

Use either "constants" or object references instead. As for the latter, let
the caller deal with names when and if they need them (per variables or
properties).
Should the constructor also accept the identifiers?

I do not think it a bad idea if the callback(s) could optionally be supplied
with the constructor call. I have done so in JSX:httprequest.js myself.
BTW, thanks for this "aggregator" idea.
What suggestions do you have?

HTH


PointedEars
 

Scott Sauyet

Thomas said:
Scott said:
    var agg = new Aggregator();
    var myHandler = function(data, statuses, errors) {
// [ ... ]
      // do something with data.blue, data.red, data.yellow
    }
    agg.onComplete(myHandler);

The `myHandler' variable does not appear to serve a useful purpose here.  
Maybe it is only because of the example.

In practice, I would almost certainly pass an anonymous function to
the constructor or to the event registration method. I was just
trying to make things more explicit for the example.
Make it

  agg.oncomplete = myHandler;

if viable for your application design.  Saves one call and is more
intuitive.

I think it's just old habit to allow for multiple listeners. I can't
think of any circumstances where I would use more than one here, so it
might be best to drop the event registration and simply pass it in the
constructor.

Rename getJsonAjax() and make it a method of `agg' inherited from
`Aggregator.prototype' that does not send(), and you will not have to worry
about agg.start().  Then agg.send() at last.

That could work. But I think I like Stefan's suggestion of Job
objects being managed by the aggregator. I don't have it all worked
out yet, but something like:

var agg = new Aggregator([
  new AjaxJob("red/"),
  new AjaxJob("blue/"),
  new AjaxJob("yellow/"),
  new OtherJob("whatever", "parameters")
], myCallback);

agg.start(); // maybe not necessary.
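One rough sketch of how such Job objects and an aggregator might fit together follows. The `JobAggregator` and `ValueJob` names are invented for illustration; an AjaxJob would invoke its transport inside run() and call done() from its success callback:

```javascript
// Hypothetical sketch of the Job-based design: each Job knows how to
// run itself and reports its result back to the aggregator.
function JobAggregator(jobs, callback) {
  this._jobs = jobs;
  this._pending = jobs.length;
  this._callback = callback;
  this.results = [];
}

JobAggregator.prototype.start = function () {
  var that = this;
  for (var i = 0; i < this._jobs.length; i++) {
    this._jobs[i].run(function (result) {
      that.results.push(result);
      that._pending--;
      if (that._pending === 0) {
        that._callback(that.results);
      }
    });
  }
};

// A trivial Job for illustration; it completes immediately with a
// fixed value instead of making an asynchronous call.
function ValueJob(value) { this.value = value; }
ValueJob.prototype.run = function (done) { done(this.value); };
```

With Jobs carrying their own transport, the string identifiers disappear: the caller can hold references to the Job objects themselves to distinguish results.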

Use either "constants" or object references instead.  As for the latter, let
the caller deal with names when and if they need them (per variables or
properties).

I've been assuming that the callback function does the actual
aggregation (which perhaps means that the main class is poorly
named). For my current needs, the results of the individual calls all
have the same structure, and I could use some generic aggregation, but
I would rather not assume it will stay that way. So I need some means
to distinguish the various results from one another; using Stefan's
technique, this could be handled simply by the job objects themselves.

I do not think it a bad idea if the callback(s) could optionally be supplied
with the constructor call.  I have done so in JSX:httprequest.js myself.

Yes, that was always my intent. But in the above, I can also supply
the initial set of Job objects to the Aggregator.

BTW, thanks for this "aggregator" idea.

You're welcome. I can't take credit for the name. I had the idea,
and could not come up with a name for it. It was driving me a little
crazy, too. A coworker came up with "Aggregator" half-way through my
explanation.


Very much so. Thank you very much.
 

Asen Bozhilov

Scott said:
It probably is enough for my current needs, too.  But there is a real
possibility that I will need some additional features I didn't mention
in my initial message, but which did prompt the API I used.  First of
all, sometimes the asynchronous process I'm waiting for might be user
input rather than an AJAX call.  Second, some calls might lead me to
make additional ones, and I want my wrap-up function to run only after
all the results are in.  I might still be able to get away without
listing the calls, though, and only counting, but I don't think I
could use the push-push-push-start system, but rather one in which
each process is started and the aggregator checks after each one
completes whether there are still any processes still running.

Given these requirements, the following is probably enough.

function Aggregator() {
    var that = this;
    this.data = [];
    this._count = 0;

    this._handler = function (request, response) {
        that.data.push({response : response, request : request});
        request.success = null;
        that._count--;
        if (that._count == 0) {
            that.onComplete();
        }
    };
}

Aggregator.prototype = {
    onComplete : function () {},
    add : function (request) {
        this._count++;
        request.success = this._handler;
    }
};

var agg = new Aggregator();

agg.onComplete = function () {
    /**
     * Get the data of the aggregated requests
     */
    this.data;
};

var req1 = new RequestWrapper(),
    req2 = new RequestWrapper(),
    req3 = new RequestWrapper();

agg.add(req1);
agg.add(req2);

req1.send();
req2.send();

agg.add(req3);
req3.send();

The interesting point for me is how you want to associate the response
data with the requests, and in which order. In my current API they are
in the order of the responses, not the order in which they were added
to the `Aggregator'. With your approach, mapping string identifiers in
a native object, you would have one problem: you cannot rely on any
particular order. If you use `for-in' to iterate over the responses,
you cannot know in which order they will be enumerated, because that
is implementation-dependent.

Think about these points, and you will probably get more ideas for
your API. The general problem is the order in which you expect the
responses.
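One way around the `for-in' ordering problem is to keep the named lookup but record arrival order explicitly in an array. A minimal hypothetical sketch:

```javascript
// Hypothetical sketch: byName gives labeled lookup (data.red etc.),
// while order records the sequence in which results arrived, so the
// caller never depends on for-in enumeration order.
function OrderedResults() {
  this.byName = {};
  this.order = [];
}

OrderedResults.prototype.add = function (name, data) {
  if (!(name in this.byName)) {
    this.order.push(name);
  }
  this.byName[name] = data;
};
```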
 

Scott Sauyet

Asen said:
Scott said:
[ additional requirement for aggregator ]

Given these requirements, the following is probably enough.

function Aggregator() {
    var that = this;
    this.data = [];
    this._count = 0;

    this._handler = function (request, response) {
        that.data.push({response : response, request : request});
        request.success = null;
        that._count--;
        if (that._count == 0) {
            that.onComplete();
        }
    };

}

Aggregator.prototype = {
    onComplete : function () {},
    add : function (request) {
        this._count++;
        request.success = this._handler;
    }
};

var agg = new Aggregator();
agg.onComplete = function () {
// [ ... ]
};

var req1 = new RequestWrapper(),
    req2 = new RequestWrapper(),
    req3 = new RequestWrapper();

agg.add(req1);
agg.add(req2);

req1.send();
req2.send();

agg.add(req3);
req3.send();

This should work well, I think. I would add "request.send();" to the
Aggregator.prototype.add method so we could lose the "reqN.send()"
calls.

I will probably proceed down the path I've discussed earlier in the
thread with Stefan, but that's pretty close to yours.

The interesting point for me is how you want to associate the response
data with the requests, and in which order. In my current API they are
in the order of the responses, not the order in which they were added
to the `Aggregator'. With your approach, mapping string identifiers in
a native object, you would have one problem: you cannot rely on any
particular order. If you use `for-in' to iterate over the responses,
you cannot know in which order they will be enumerated, because that
is implementation-dependent.

Honestly, I thought of this as one of the strengths of that approach.
I can arbitrarily assign a handle that labels the response data, the
errors, and the status of each request. If I find I care about the
order they come in, the Aggregator can easily track that too, but I
don't see that as likely ever to be important. And because I have the
labels, my callback function can work separately with data.yellow,
data.red, etc. as needed.

Right now the one handler I've written for real code is making calls
to several similar systems and simply concatenating and sorting their
output. It replaces code which called the second system in a success
callback to the first AJAX call and the next one in a callback to
that. It was ugly but worked fine for several years, until the first
service failed but the others would still have worked. In my case, I
do need to know the source of the data, so each response item was
tagged with the label ("red", "yellow", "blue" in the example). What
may be added soon is a backup system for some of the systems checked.
That would be why the scheduled jobs might have to have links back to
the Aggregator: if System A fails, I want to add a call to System A'
to the aggregator. I'm still a little queasy about that part of the
API; it seems a circular dependency that I'd rather not introduce.
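The fallback scenario described here, where a failed task hands a backup task to the aggregator before reporting itself finished, might be sketched like this (hypothetical names, an illustration only):

```javascript
// Hypothetical sketch: a task is a function taking (succeed, fail)
// callbacks. On failure it may pass a replacement task to fail(),
// which the aggregator runs before the count can reach zero.
function FallbackAggregator(onAllDone) {
  this._running = 0;
  this._onAllDone = onAllDone;
  this.data = [];
}

FallbackAggregator.prototype.run = function (task) {
  var that = this;
  this._running++;
  task(
    function (result) {           // success: record the result
      that.data.push(result);
      that._finish();
    },
    function (fallbackTask) {     // failure: optionally queue a backup
      if (fallbackTask) { that.run(fallbackTask); }
      that._finish();
    }
  );
};

FallbackAggregator.prototype._finish = function () {
  this._running--;
  if (this._running === 0) {
    this._onAllDone(this.data);
  }
};
```

Because the failing task registers its backup before reporting itself done, the running count never reaches zero early; the tasks only see the two callbacks they are handed, which avoids giving each job a direct reference back to the aggregator.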

You should reply on these things which I wrote, and probably you would
get more clever ideas for your API. The general problem is the order
of response which you expected.

For me, the main thing is to be able to associate the response with
the request when I go to put them together. Order is simply not
important.

Again, thank you for your helpful replies.
 
