Joe Van Dyk
Hi,
I've been assigned the task of writing a program that lets a user
start and monitor applications on a group of computers/nodes (called a
cluster).
Requirements:
- Multiple people can start the GUI and view what's going on on the
cluster.
- People need to be able to assign specific programs to specific nodes.
- Need to be able to give each program values for its command-line
arguments and environment variables.
- Be able to easily view the log files for each application
- Be able to start/kill applications at will
- See whether an application is still running. Be notified if a
program dies unexpectedly.
- Start and kill a group of applications as if it's one application
- Users shouldn't be able to kill another user's application
So, I'm thinking that on each node there could be a DRb server that could:
- Listen for start application commands
- Listen for kill application commands
- When it gets an "update" command, send back a status report
listing which applications are running, the load average, and
similar details.
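
To make the per-node idea concrete, here is a minimal sketch of what such a DRb-exposed object might look like. The class name NodeAgent, its method names, the port 9000 and the /proc/loadavg read (Linux only) are all assumptions made for illustration, and it ignores the per-user ownership requirement.

```ruby
require 'drb/drb'
require 'socket'

# Hypothetical per-node agent; all names here are illustrative.
class NodeAgent
  def initialize
    @pids = {}         # application name => pid of the spawned child
    @lock = Mutex.new  # DRb handles each remote call in its own thread
  end

  # Start argv (an array: program plus arguments) with extra
  # environment variables. Returns the child's pid.
  def start(name, argv, env = {})
    @lock.synchronize { @pids[name] = Process.spawn(env, *argv) }
  end

  # Send TERM to a named application and reap it.
  # Returns false if the name is unknown or the process is already gone.
  def kill(name)
    pid = @lock.synchronize { @pids.delete(name) }
    return false unless pid
    Process.kill('TERM', pid)
    Process.wait(pid)
    true
  rescue Errno::ESRCH, Errno::ECHILD
    false
  end

  # The "update" report: which applications are alive, plus load average.
  def status
    @lock.synchronize do
      apps = @pids.map { |name, pid| [name, running?(pid)] }.to_h
      { host: Socket.gethostname, apps: apps, load: load_average }
    end
  end

  private

  # Non-blocking check: waitpid with WNOHANG returns nil while the child runs.
  def running?(pid)
    Process.waitpid(pid, Process::WNOHANG).nil?
  rescue Errno::ECHILD
    false # already reaped, so it has exited
  end

  # Assumption: Linux exposes the 1-minute load average in /proc/loadavg.
  def load_average
    File.exist?('/proc/loadavg') ? File.read('/proc/loadavg').split.first.to_f : nil
  end
end

# Expose the agent over DRb (the port is an arbitrary choice). Blocks forever.
def serve(uri = 'druby://0.0.0.0:9000')
  DRb.start_service(uri, NodeAgent.new)
  DRb.thread.join
end
```

A GUI could then call something like DRbObject.new_with_uri('druby://node1:9000').status and poll it; an application whose entry flips to false without a kill request would be one that died unexpectedly.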
And then each node could register itself somehow with Rinda? And
then every GUI that started could connect to that Rinda server.
Except that in the future the GUI will have to deal with multiple
clusters, and so there might have to be multiple Rinda servers (one
for each cluster).
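
As a rough sketch of the registration step, each node could write a tuple into a Rinda::TupleSpace and each GUI could read the matching tuples back. The tuple layout [:node, cluster, host, uri], the helper names and the 30-second expiry are assumptions for illustration. Putting the cluster name in the tuple is one possible alternative to running a separate Rinda server per cluster, not necessarily the better one.

```ruby
require 'rinda/tuplespace'

# Node side: announce this node's DRb URI under a cluster name.
# The tuple expires after ttl seconds, so a node that stops
# re-registering eventually drops out of the list.
def register_node(space, cluster, host, agent_uri, ttl = 30)
  space.write([:node, cluster, host, agent_uri], ttl)
end

# GUI side: list the nodes of one cluster as { host => uri }.
# nil in a Rinda template matches any value in that position.
def nodes_in(space, cluster)
  space.read_all([:node, cluster, nil, nil])
       .map { |_tag, _cluster, host, uri| [host, uri] }
       .to_h
end

# In a real deployment the space would itself be shared over DRb, e.g.
#   DRb.start_service('druby://0.0.0.0:7640', Rinda::TupleSpace.new)
# and nodes and GUIs would wrap it with
#   Rinda::TupleSpaceProxy.new(DRbObject.new_with_uri(uri))
# (the port number is an arbitrary choice).
```

Used locally, two register_node calls for cluster 'alpha' followed by nodes_in(space, 'alpha') would return a hash with both hosts, while nodes registered under another cluster name stay out of the result.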
Does that approach sound reasonable? Are there any libraries or
existing applications out there that would help me with this?
Thanks,
Joe