Running long jobs on DAMTP workstations
We like to ensure that the person using a computer at the console (physically working at the computer) enjoys reasonable interactive performance. As a courtesy to them, long-running jobs should not interfere with their use of the computer. A long job is one that requires more than about 10 minutes of CPU time. This includes Matlab, Maple and Reduce runs as well as conventional FORTRAN or C programs.
The Rules
- NICE: always run long jobs in the background with a "nice" value of 19:
nice -19 command <input >output &    (sh, bash)
nice +19 command <input >output &    (csh, tcsh)
The nice command minimises the effect of your job on other logged-on users who need a good interactive response for editing etc. Use the appropriate form of the command for the shell which you use. (The default shell in DAMTP is bash.) If you start a job and it runs longer than you expect, you can alter its nice value with the renice command (see man renice), for example:
ps -ef | grep program_name    # to find the PID of the job/process
renice +19 -p ${PID}
- MEMORY: a minimum of 1GB of memory should be left free for the console user. You must take care not to overload computers by starting jobs which use an excessive amount of memory. You should always be aware of your program's approximate memory requirements, the total amount of memory available on the computer, and how much memory is being used by other users (see Memory and Swapping below).
- I/O: a job which does very large amounts of I/O may be inappropriate for running on a machine that someone is using as a console: Linux's handling of I/O is still far from ideal. If a background job causes a noticeable delay when the machine responds to keystrokes or changes focus, it is inappropriate.
- MULTIPLE JOBS: you should not start multiple jobs in competition with another user's job (or users' jobs) to gain an unfair share of a computer's processing power. Similarly you should not run jobs simultaneously on several computers to gain an unfair share of the total available processing power. If you require more power than is available on a single workstation you should discuss your needs with the IT Manager.
- BACKGROUND/LOGOUT: do not log on to a workstation console and leave a program running unattended so that others cannot use the console. Use the nohup command to ensure that a background job keeps running after you log out:
nohup nice -19 command <input >output &
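The NICE and BACKGROUND/LOGOUT rules can be combined in a small wrapper. The following is only a sketch for sh or bash: the function name run_long_job and the variable LONG_JOB_PID are made up for this illustration, and it uses the POSIX spelling `nice -n 19`, which has the same effect as `nice -19` above.

```bash
# run_long_job LOGFILE COMMAND [ARGS...]
# Hypothetical helper (not a DAMTP-provided tool): start COMMAND in the
# background at nice 19 so that it keeps running after you log out.
run_long_job() {
    job_log=$1
    shift
    # nohup         : ignore the hangup signal sent at logout
    # nice -n 19    : lowest scheduling priority
    # </dev/null    : do not keep the terminal open for input
    # >"$job_log"   : collect stdout and stderr in the log file
    nohup nice -n 19 "$@" < /dev/null > "$job_log" 2>&1 &
    LONG_JOB_PID=$!
    echo "started PID $LONG_JOB_PID, output in $job_log"
}
```

For example, `run_long_job run1.log ./my_program` (where ./my_program stands for your own executable) starts the job and leaves its PID in LONG_JOB_PID, which can be passed to renice or kill later.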
Preparing Programs for Running, Checkpointing
You should always take care to ensure that your program has been optimised (if you don't know about optimisation, ask).
You should also make sure that long jobs are restartable. This means that if you expect the run to last more than a day, dump internal tables and data to a disk periodically so that in the event of a computer being restarted you will not have to start your job from the beginning. This is commonly known as checkpointing. If you have difficulty with this, you should split your long jobs up into sections which will take no more than about a day to run.
Jobs which do periodic checkpointing should also trap the TERM signal and perform a checkpoint when they receive it. This means that if the jobs are killed during a shutdown or reboot, less of the work will be wasted. As of May 2009 we have altered the shutdown and reboot procedures of the Linux machines to send TERM signals to user jobs much earlier than before and to allow a few extra seconds for jobs to save any state.
- To understand when DAMTP computers get restarted or rebooted, please read the reboot policy
- Planned reboots are advertised on a mailing list
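The checkpointing advice above (save state periodically, and again when a TERM signal arrives) might look roughly like this in a shell-driven job. This is a minimal sketch, not a prescribed method: the file name checkpoint.state, the step counts, and the functions save_checkpoint, load_checkpoint, do_one_step and run_job are all invented for the illustration, and do_one_step stands in for the real computation.

```bash
# Assumed settings for the sketch; override them in the environment if needed.
STATE_FILE=${STATE_FILE:-checkpoint.state}
TOTAL_STEPS=${TOTAL_STEPS:-1000}
CHECKPOINT_EVERY=${CHECKPOINT_EVERY:-100}

step=0
term_requested=0

save_checkpoint() {
    # Write to a temporary file and rename it, so that an interrupted
    # write never leaves a half-written checkpoint behind.
    printf '%s\n' "$step" > "$STATE_FILE.tmp" && mv "$STATE_FILE.tmp" "$STATE_FILE"
}

load_checkpoint() {
    # Resume from the saved step if a readable checkpoint exists.
    if [ -r "$STATE_FILE" ]; then
        read -r step < "$STATE_FILE"
    fi
}

do_one_step() {
    # Placeholder for one unit of real work.
    step=$((step + 1))
}

run_job() {
    # The trap only sets a flag; the checkpoint is written at the next
    # step boundary, when the state is consistent.
    trap 'term_requested=1' TERM
    load_checkpoint
    while [ "$step" -lt "$TOTAL_STEPS" ]; do
        do_one_step
        if [ "$term_requested" -eq 1 ]; then
            save_checkpoint
            return 143    # conventional status: 128 + 15 (SIGTERM)
        fi
        if [ $((step % CHECKPOINT_EVERY)) -eq 0 ]; then
            save_checkpoint
        fi
    done
    save_checkpoint
}
```

Calling run_job starts or resumes the loop. Because the shell runs a trap only after the current command finishes, each step should be short compared with the few seconds allowed at shutdown.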
Using top
top displays information about active processes ranked in order of CPU usage. Items of interest include:
NICE  The process's nice value - must be 19 for long jobs
TIME  Execution time, minutes:seconds
WCPU  Time-averaged percentage of CPU used
SIZE  The total size of the process in kilobytes
RES   The amount of memory actually in use (resident size)
You should always use top to check for other running jobs before starting your own. You can monitor your program's CPU and memory requirements by running top in a separate xterm window. See Memory and Swapping for the meaning of SIZE and RES.
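The same NICE and TIME information can be checked from a script. The sketch below uses ps rather than top because its output is easier to filter; it lists processes that have used more than 10 minutes of CPU time but are not running at nice 19. The function name long_jobs_not_niced is hypothetical, and the input format (PID, nice value, CPU time as [DD-]HH:MM:SS or MM:SS, command name) is an assumption that matches what `ps -eo pid=,ni=,time=,comm=` prints on Linux.

```bash
# Reads lines of "PID NI TIME COMMAND" on standard input and prints those
# with more than 600 seconds of CPU time and a nice value below 19.
long_jobs_not_niced() {
    awk '
    $2 ~ /^-?[0-9]+$/ {
        days = 0
        t = $3
        # Optional leading "DD-" part for jobs older than a day of CPU time.
        if (split(t, d, "-") == 2) { days = d[1]; t = d[2] }
        # Fold HH:MM:SS (or MM:SS) into seconds.
        n = split(t, f, ":")
        secs = 0
        for (i = 1; i <= n; i++) secs = secs * 60 + f[i]
        secs += days * 86400
        if (secs > 600 && $2 < 19) print $1, $2, $3, $4
    }'
}
```

A typical call would be `ps -eo pid=,ni=,time=,comm= | long_jobs_not_niced`; any of your own processes that appear in the output are candidates for renice.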
Load Average
Load average is basically the average number of processes which are competing for a share of the computer processor, and ready to run, i.e. not waiting for input or halted. Therefore, if there are three non-interactive jobs running on a computer, you would expect the load average to be 3. Of course, other user and system processes also increase the load average temporarily, so you might expect the actual load average reading to be more than 3.
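Before starting a job, the load average can be read and compared with a limit of your choosing. The following sketch assumes a Linux machine, where the 1-minute load average is the first field of /proc/loadavg; the function names load_exceeds and one_minute_load are invented for this example.

```bash
# load_exceeds LOAD LIMIT
# Succeeds (exit status 0) if LOAD is numerically greater than LIMIT.
load_exceeds() {
    awk -v load="$1" -v limit="$2" 'BEGIN { exit !(load > limit) }'
}

# one_minute_load
# Prints the 1-minute load average (Linux-specific: reads /proc/loadavg).
one_minute_load() {
    read -r load1 _rest < /proc/loadavg
    echo "$load1"
}
```

For instance, `load_exceeds "$(one_minute_load)" 3 && echo "busy"` reports a machine on which more than three processes are, on average, competing for the processor. The limit of 3 is only an illustration; what counts as busy depends on the machine.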
Memory and Swapping
Linux and UNIX operating systems provide "virtual memory" for executing processes using a combination of real memory (RAM) and "swap space" on disk. The amount of virtual memory available is typically 2-4 times that of the real memory installed in the computer.
The total amount of virtual memory required by all the processes running in the computer is often greater than the amount of real memory, and the operating system has to move "pages" of memory to and from disk as different processes become active. (A process cannot be executed while essential pages of its memory are swapped out on the disk.)
The top command tells you how much memory each process is using. SIZE is the total size of the process in kilobytes: the amount of swap space allocated to the process on the swap disk. RES is the resident size of the process in kilobytes: an estimate of the amount of real memory currently needed by the process. RES is the size of the process's "working set": the code and data which are being accessed frequently. Little-used data and code (e.g. initialisation code) will soon get swapped out to disk and will not be included in the RES figure.
In normal operations the system can select unused or seldom used pages of memory as candidates for swapping and the computer works at maximum efficiency. When the total amount of resident memory needed by active processes exceeds the amount of available real memory, the computer will "page" to disk excessively and give a much reduced overall performance.
This will often make the computer unusable by other logged-on users, and lead to greatly increased elapsed time to completion of all jobs running on the computer. Users should always check that their jobs do not require more resident working set memory than is available on the computer, and if necessary use a computer with more memory.
See the computer table for the amount of memory installed in DAMTP computers.
When several users are running jobs on a computer they should cooperate to make sure the computer does not become overloaded. If necessary a running process can be suspended using the kill command:
kill -STOP process_id
and resumed with
kill -CONT process_id
The vmstat command can be used to display the computer's paging activity (si, so) in Kbytes/s. The computer will typically become overloaded if paging rates exceed several Mbyte/s. The iostat command shows the I/O traffic to and from local devices; however, it will not report activity to network-mounted file systems. The free command displays the total amount of physical memory and swap space for the system. It is quite normal for the free memory reported to be very low, since the system will often use any otherwise unused memory for buffering files (the traditional buffer cache having long since been replaced by a unified VM system on most modern machines).
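The MEMORY rule (leave at least 1GB for the console user) can be turned into a quick pre-flight check. This is a sketch under stated assumptions: it reads the MemAvailable line of /proc/meminfo, which exists only on reasonably recent Linux kernels, and the function name fits_in_memory and the variables MEMINFO and RESERVE_KB are hypothetical. You still need to supply your job's expected resident size yourself.

```bash
MEMINFO=${MEMINFO:-/proc/meminfo}   # assumed source of memory figures
RESERVE_KB=$((1024 * 1024))         # 1GB kept free for the console user

# fits_in_memory JOB_KB
# Succeeds if starting a job with a resident size of JOB_KB kilobytes
# would still leave RESERVE_KB kilobytes of available memory.
fits_in_memory() {
    avail_kb=$(awk '/^MemAvailable:/ { print $2 }' "$MEMINFO")
    # Fail if the kernel does not report MemAvailable at all.
    [ -n "$avail_kb" ] && [ $((avail_kb - $1)) -ge "$RESERVE_KB" ]
}
```

For example, `fits_in_memory 2000000 || echo "not enough memory here"` checks whether a job of about 2GB could run on the current machine without eating into the reserve.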
Please email any suggestions, corrections, broken links, or errors to itweb.