Parallel multi-process bash with return codes
Have you ever needed to run a bunch of long-running processes from a bash script and get their return codes? I come across this issue quite frequently in my line of work. The most common case is where I need to run rsync to collect files from many machines and then, if successful, run some other task. Depending on the number of servers and the amount of data, this can take several hours to run sequentially, and I don't really like waiting around to check the output so that I can run the next task.
How to speed it up? The obvious way would be to background the rsync commands, but then I don't know if they were all successful. What if one fails? How would I know which one? Somehow I needed to catch the return codes of all the sub-shells and be able to match them to a command. This is where the bash command wait comes into play.
~]$ help wait
wait: wait [id]
Wait for job completion and return exit status.

Waits for the process identified by ID, which may be a process ID or a
job specification, and reports its termination status. If ID is not
given, waits for all currently active child processes, and the return
status is zero. If ID is a job specification, waits for all processes
in the job's pipeline.

Exit Status:
Returns the status of ID; fails if ID is invalid or an invalid option is
given.
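To see the behaviour described in the help text in isolation, here is a minimal sketch. The background sub-shell and its exit status of 3 are made up for illustration; any long-running command would do in its place.

```shell
#!/bin/bash
# Start a background sub-shell that sleeps briefly and then exits with status 3.
( sleep 1; exit 3 ) &
pid=$!    # $! holds the PID of the most recently started background job

# wait blocks until that PID has finished and returns its exit status.
# Capturing it with || also keeps the script alive if "set -e" is in effect.
status=0
wait "$pid" || status=$?

echo "process $pid exited with status $status"
```

The important point is that the status comes from the background command itself, not from the act of starting it.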
The idea is to collect the PID of each sub-shell using the $! variable and add it to a list. We then call wait on each PID in turn, which blocks until that sub-shell has finished and returns its exit status. By adding these return codes to a second list, in the same order, we can iterate over them and match them up with the original list. Towards the end we split the original list into an array so we can reference the individual items by index.
In the following example I have used wget and deliberately changed testfile03.txt to testfile06.txt, which is not one of the test files, to show an example of a non-zero return code.
testfile01
testfile02
testfile03
testfile04
#!/bin/bash

# URLs to fetch; testfile06.txt is the deliberate failure
my_list="
http://www.thegoldfish.org/wp-content/uploads/2010/08/testfile01.txt
http://www.thegoldfish.org/wp-content/uploads/2010/08/testfile02.txt
http://www.thegoldfish.org/wp-content/uploads/2010/08/testfile06.txt
http://www.thegoldfish.org/wp-content/uploads/2010/08/testfile04.txt
"

results=''
pids=''

# Start one wget per URL in the background and remember its PID
for X in $my_list ; do
    wget -o /dev/null "$X" &
    pid=$!
    pids="$pids $pid"
done

# Wait for each PID in start order and record its return code
for pid in $pids
do
    wait "$pid"
    result=$?
    results="$results $result"
done

echo $results

# Split the list into an array so each URL can be matched to its return code by index
i=0
my_array=( $my_list )
for ret_val in $results; do
    echo "${my_array[$i]} returned $ret_val"
    ((i++))
done
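Going back to the original motivation of only running the next task if everything succeeded, the same pattern can be reduced to a failure counter. This is a rough sketch rather than part of the script above: the three `sh -c "exit N"` commands are hypothetical stand-ins for the real rsync or wget calls, and the `failed` and `message` variables are names I have made up for the example.

```shell
#!/bin/bash
failed=0
pids=""

# Hypothetical stand-ins for the real commands: two succeed, one exits with 2.
for cmd in "exit 0" "exit 2" "exit 0"; do
    sh -c "$cmd" &
    pids="$pids $!"
done

# wait returns the exit status of each PID; count every non-zero one.
for pid in $pids; do
    wait "$pid" || failed=$((failed + 1))
done

# Only carry on to the follow-up task when nothing failed.
if [ "$failed" -eq 0 ]; then
    message="all jobs succeeded, starting the next task"
else
    message="$failed job(s) failed, skipping the next task"
fi
echo "$message"
```

This loses the information about which command failed, so it is best suited to cases where a simple go or no-go decision is enough.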
I could not get it to work properly but your script was of great help and now I can finally get the exit statuses of all my background processes. Thanks!
With GNU Parallel you could do this to get a logfile:
$ parallel --joblog /tmp/logfile wget -o /dev/null ::: [... URLS ...]
$ cat /tmp/logfile
Seq Host Starttime Runtime Send Receive Exitval Signal Command
3 : 1312373549 1 0 0 8 0 wget url1
1 : 1312373549 5 0 0 0 0 wget url2
2 : 1312373549 9 0 0 0 0 wget url3
4 : 1312373549 17 0 0 0 0 wget url4
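The Exitval column in a log like this can be filtered to list only the jobs that failed. The sketch below assumes the log is tab-separated and that Exitval is the seventh column and Command the ninth, as in the listing above. It builds a two-line sample in a temporary file so that it runs on its own; a real run would read /tmp/logfile instead.

```shell
#!/bin/bash
# Sample data copied from the listing above, written tab-separated to a temp file.
logfile=$(mktemp)
printf 'Seq\tHost\tStarttime\tRuntime\tSend\tReceive\tExitval\tSignal\tCommand\n' > "$logfile"
printf '3\t:\t1312373549\t1\t0\t0\t8\t0\twget url1\n' >> "$logfile"
printf '1\t:\t1312373549\t5\t0\t0\t0\t0\twget url2\n' >> "$logfile"

# Skip the header line (NR > 1), then print the exit value and command
# of every job whose Exitval (column 7) is non-zero.
failed_jobs=$(awk -F '\t' 'NR > 1 && $7 != 0 { print $7 ": " $9 }' "$logfile")
echo "$failed_jobs"

rm -f "$logfile"
```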
Watch the intro video to learn more: http://www.youtube.com/watch?v=OpaiGYxkSuQ
Thanks Ole, that looks like a fantastic tool. Thanks for the heads up.
For those of you who want to take a look, you can find it at the GNU Parallel home page. I am sure that I will use it in the future.