The current L4 system contains what can be considered a prototype harness:

diski    - disk input
neto     - transfer datastream to processing computers via network, adding
           end-of-run markers
neti     - receive datastream from network
nodi     - transfer datastream to user processes on this computer, adding
           database records and event steering records (e.g. random number
           seed, event duplicate)
nodo     - collect output from user processes on this computer, synchronising
           on end-of-run
logo     - output this stream to network
receiver - collect the output streams from all computers
logging  - disk output, dump job submission, tcl job submission
master   - starts some of the tasks (receiver, logo)
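The end-of-run markers that neto inserts (and nodo synchronises on) imply a framed record stream. A minimal sketch in Python; the record-type values and (type, length) header layout are illustrative, not the actual L4 wire format:

```python
import struct

# Record types in the framed datastream (illustrative values only).
REC_EVENT = 1
REC_END_OF_RUN = 2

def frame(rec_type: int, payload: bytes) -> bytes:
    """Prefix a record with a big-endian (type, length) header."""
    return struct.pack(">II", rec_type, len(payload)) + payload

def unframe(stream: bytes):
    """Split a byte stream back into (type, payload) records."""
    records = []
    off = 0
    while off < len(stream):
        rec_type, length = struct.unpack_from(">II", stream, off)
        off += 8
        records.append((rec_type, stream[off:off + length]))
        off += length
    return records

# neto-style sender: append an end-of-run marker after the run's events,
# so downstream tasks (nodo, receiver) know when a run is complete.
stream = b"".join(frame(REC_EVENT, e) for e in [b"evt1", b"evt2"])
stream += frame(REC_END_OF_RUN, b"")

records = unframe(stream)
assert records[-1][0] == REC_END_OF_RUN
assert [p for t, p in records if t == REC_EVENT] == [b"evt1", b"evt2"]
```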
These tasks use PVM for data transfer (over TCP) and shared memory for
communication with the user application.
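The shared-memory handoff between the harness and the user application can be sketched with Python's multiprocessing.shared_memory: the harness places an event in a segment and the user process works on it in place, with no socket copy. Segment size, names and the in-place "processing" here are illustrative only:

```python
from multiprocessing import Process, shared_memory

def user_task(name: str) -> None:
    # User-application side: attach to the harness's segment by name
    # and process the event in place (here: just uppercase it).
    shm = shared_memory.SharedMemory(name=name)
    shm.buf[:4] = bytes(shm.buf[:4]).upper()
    shm.close()

def run_roundtrip() -> bytes:
    # Harness side (nodi/nodo roles): create the segment, hand the
    # event to a user process, collect the result.
    shm = shared_memory.SharedMemory(create=True, size=4)
    try:
        shm.buf[:4] = b"evt1"
        p = Process(target=user_task, args=(shm.name,))
        p.start()
        p.join()
        return bytes(shm.buf[:4])
    finally:
        shm.close()
        shm.unlink()

if __name__ == "__main__":
    assert run_roundtrip() == b"EVT1"
```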
This harness runs on LynxOS/IRIX and, apart from the move to Linux, is
missing several features which will be needed by a general user job harness:
Resource allocation and process management: on which machines should the job
be run? (Currently hard-coded in master and in configuration files for the
h1l4ioX machines.)
- automatic choice according to data file location, network capabilities and
  system loading
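Such an automatic choice could be a scoring function over per-machine metrics. A sketch; the machine names, metric fields and weights below are all hypothetical, not part of the existing harness:

```python
# Hypothetical per-machine metrics (whether the data file is already
# local, current load average fraction, link speed).
machines = {
    "h1l4io1": {"has_file": True,  "load": 0.9, "net_mbit": 1000},
    "h1l4io2": {"has_file": False, "load": 0.2, "net_mbit": 1000},
    "h1l4io3": {"has_file": True,  "load": 0.1, "net_mbit": 100},
}

def score(m: dict) -> float:
    """Prefer machines already holding the data file, then low load,
    then a fast network link (illustrative weights)."""
    return (
        (100.0 if m["has_file"] else 0.0)  # avoid staging over the network
        - 50.0 * m["load"]                 # penalise busy machines
        + 0.01 * m["net_mbit"]             # tie-break on link speed
    )

def choose(machines: dict) -> str:
    return max(machines, key=lambda name: score(machines[name]))

assert choose(machines) == "h1l4io3"  # local file and nearly idle
```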
Security: user authentication - we don't want to have to install every user
account on all machines.
Health and status: monitoring of the health and status of the system.
- a PVM daemon per user OR per system -> interference between different
  user jobs
- the PVM master daemon runs on a single system -> may become a bottleneck
Data transfer speed/efficiency: TCP/IP flow control may not be appropriate
and has a high overhead, especially for "just round the corner" high-speed
commodity networks, e.g. gigabit Ethernet. PVM introduces further data
buffering (and hence copying), reducing performance.
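Some of the TCP overhead can at least be tuned per socket with standard options: larger kernel buffers for high-bandwidth links, and disabling Nagle batching for low-latency LAN transfers. A sketch; the buffer size chosen is illustrative:

```python
import socket

def tune_for_lan(sock: socket.socket, bufsize: int = 4 << 20) -> None:
    """Sketch: enlarge kernel send/receive buffers and disable the
    Nagle algorithm for a low-latency gigabit LAN link (the 4 MiB
    value is illustrative; the kernel may cap it)."""
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, bufsize)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_RCVBUF, bufsize)
    sock.setsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY, 1)

s = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
tune_for_lan(s)
assert s.getsockopt(socket.IPPROTO_TCP, socket.TCP_NODELAY) != 0
s.close()
```

This does not remove PVM's extra buffering layer, only the kernel-level costs underneath it.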
Executable management: construction, caching and location of executables -
fast distribution of executables to all machines in a user job.
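A content-addressed cache is one way to get caching and fast location: identical builds hash to the same key, so each machine copies a given binary at most once. A sketch with hypothetical file and directory names:

```python
import hashlib
import os
import shutil
import tempfile

def cache_key(path: str) -> str:
    """Content hash of the executable, so identical builds share a key."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(65536), b""):
            h.update(chunk)
    return h.hexdigest()

def locate(path: str, cache_dir: str) -> str:
    """Return the cached copy of the executable, copying it into the
    cache on first use."""
    cached = os.path.join(cache_dir, cache_key(path))
    if not os.path.exists(cached):
        shutil.copy2(path, cached)  # first request: populate the cache
    return cached

with tempfile.TemporaryDirectory() as tmp:
    exe = os.path.join(tmp, "user_job")          # hypothetical binary
    with open(exe, "wb") as f:
        f.write(b"\x7fELF...fake executable...")
    cache = os.path.join(tmp, "cache")
    os.makedirs(cache)
    first = locate(exe, cache)
    second = locate(exe, cache)                  # cache hit: same path
    assert first == second and os.path.exists(first)
```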
Remote data access: fpack causes file staging - but there is no pre-staging,
no automatic temporary output file creation, and no migration on close.
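The missing output-file behaviour could look like this: the job writes to a temporary file on local scratch, and the harness migrates it to its final destination only when the file is closed. A sketch of these assumed semantics (not fpack's actual behaviour):

```python
import os
import shutil
import tempfile

class MigratingOutput:
    """Sketch: write to fast local scratch; migrate the file to its
    destination on close (assumed semantics, illustrative names)."""

    def __init__(self, dest: str, scratch: str):
        self.dest = dest
        fd, self.tmp = tempfile.mkstemp(dir=scratch)  # auto temp file
        self.f = os.fdopen(fd, "wb")

    def write(self, data: bytes) -> None:
        self.f.write(data)

    def close(self) -> None:
        self.f.close()
        shutil.move(self.tmp, self.dest)  # migration on close

with tempfile.TemporaryDirectory() as tmp:
    scratch = os.path.join(tmp, "scratch")
    os.makedirs(scratch)
    dest = os.path.join(tmp, "archive.out")
    out = MigratingOutput(dest, scratch)
    out.write(b"run output")
    assert not os.path.exists(dest)   # still only on local scratch
    out.close()
    assert open(dest, "rb").read() == b"run output"
```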