timeout a command in ksh

on 08.02.2006 00:13:24 by jrw32982

I'm trying to write some shell code which will run a specified command
but timeout if the command takes too long. I've browsed the archives
and found many interesting ideas, but none of them seems to address all
of my requirements. Here are the requirements:

1) This code is not a separate script but can be embedded within an
existing script.
2) The "watchdog" process is cleaned up if the command exits early.
3) The exit status of the command is reported properly.

The best I've been able to come up with is the following. It looks
like it should address all my requirements. It does appear to work on
Solaris, but on AIX it exits 0 even though the child process has been
killed.

#!/bin/ksh

timeout_handler() {
    [[ $TIMER_CMD_PID != "" ]] && kill $TIMER_CMD_PID # 2>/dev/null
}

run_with_timeout() {
    TIMER_TIMEOUT=$1
    shift
    sleep $TIMER_TIMEOUT && kill -s USR1 $$ &
    TIMER_SLEEP_PID=$!
    "$@" &
    TIMER_CMD_PID=$!
    wait $TIMER_CMD_PID
    TIMER_RC=$?
    unset TIMER_CMD_PID
    trap "" USR1
    kill $TIMER_SLEEP_PID 2>/dev/null
    return $TIMER_RC
}

trap timeout_handler USR1
run_with_timeout 3 /bin/sleep 60
exit $?

This code runs a sleep command for 60 seconds but that command is
terminated early. BTW, I found that the ALRM signal is useless under
older (1988) versions of ksh, as it appears to be ignored (contrary to
the documentation, which makes it appear that it should work). So I
use the USR1 signal instead.

Any ideas would be most appreciated.

-- John Wiersba

Re: timeout a command in ksh

on 08.02.2006 09:45:14 by Michael Paoli

jrw32982@yahoo.com wrote:
> I'm trying to write some shell code which will run a specified command
> but timeout if the command takes too long. I've browsed the archives
> and found many interesting ideas, but none of them seems to address all
> of my requirements. Here are the requirements:
> 1) This code is not a separate script but can be embedded within an
> existing script.
> 2) The "watchdog" process is cleaned up if the command exits early.
> 3) The exit status of the command is reported properly.
> The best I've been able to come up with is the following. It looks
> like it should address all my requirements. It does appear to work on
> Solaris, but on AIX it exits 0 even though the child process has been
> killed.

Well, at least for illustrative/conceptual purposes*, how about an
example:
$ time ./timeout 5 3; echo $?

real 0m3.018s
user 0m0.000s
sys 0m0.010s
143
$ time ./timeout 3 5; echo $?

real 0m3.019s
user 0m0.000s
sys 0m0.010s
0
$ cat timeout
#!/bin/sh

# our watchdog timeout in seconds
maxseconds="${2-2}"

# how long the thing we want to timeout happens to run,
# we can play with that value for testing purposes
itrunsforseconds="${1-1000}"

# thing that might take too long
{
sleep $itrunsforseconds
} &
waitforpid=$!

{
sleep $maxseconds
# one can be much more elaborate about how to zap it when it's taken
# too long, but for simplistic illustration purposes ...
2>>/dev/null kill -0 $waitforpid && kill -15 $waitforpid
} &
killerpid=$!

>>/dev/null 2>&1 wait $waitforpid
# this is the exit value we care about, so save it and use it when we
# exit
mypidexitedwith=$?

# zap our watchdog if it's still there, since we no longer need it
2>>/dev/null kill -0 $killerpid && kill -15 $killerpid

exit $mypidexitedwith

With ksh, however, results vary significantly, depending, among other
factors, upon which ksh implementation is used, e.g.:
$ time ksh ./timeout 5 3; echo $?

real 0m5.029s
user 0m0.000s
sys 0m0.020s
0
$ time ksh ./timeout 5 3; echo $?

real 0m3.03s
user 0m0.01s
sys 0m0.02s
98
$

A preliminary look suggests that the observed behavior of wait in
ksh/pdksh is not consistent with what the man pages document (note
also that on some systems, /bin/ksh and/or /usr/bin/ksh is pdksh).

*primarily intended for illustrative purposes, no guarantees it's fully
tested, bug free, or complete

references:
http://groups.google.com/group/comp.unix.shell/browse_frm/thread/1ca39669fb14920/05185289ab0d3ab1#05185289ab0d3ab1
Message-ID: <1139354004.898448.299200@g47g2000cwa.googlegroups.com>
http://www.rawbw.com/~mp/unix/sh/#Good_Programming_Practices

Re: timeout a command in ksh

on 08.02.2006 17:47:46 by jrw32982

Michael Paoli wrote:
> With ksh, however, results vary significantly, depending, among other
> factors, upon which ksh implementation is used, e.g:
> $ time ksh ./timeout 5 3; echo $?
>
> real 0m5.029s
> user 0m0.000s
> sys 0m0.020s
> 0
> $ time ksh ./timeout 5 3; echo $?
>
> real 0m3.03s
> user 0m0.01s
> sys 0m0.02s
> 98
> $
>
> Preliminary looks would seem to indicate the observed behavior of wait
> in ksh/pdksh and that documented in the man pages is not consistent
> (note also that on some systems, /bin/ksh and/or /usr/bin/ksh is
> pdksh).

Thanks, Michael! Your approach (no trap) seems to work better on AIX.
I'm curious: how did you get the aberrant behavior quoted above? What
OS/shell were you using? I seem to always get an exit status of 143
(not 98) if the command is interrupted and I haven't seen the strange
behavior of the command *not* being interrupted as your first example
shows.

-- John

Re: timeout a command in ksh

on 09.02.2006 12:07:36 by Michael Paoli

jrw32982@yahoo.com wrote:
> Michael Paoli wrote:
(in Message-ID:
<1139388314.487341.261160@o13g2000cwo.googlegroups.com>)
$ cat timeout
#!/bin/sh
maxseconds="${2-2}"
itrunsforseconds="${1-1000}"
{
sleep $itrunsforseconds
} &
waitforpid=$!
{
sleep $maxseconds
2>>/dev/null kill -0 $waitforpid && kill -15 $waitforpid
} &
killerpid=$!
>>/dev/null 2>&1 wait $waitforpid
mypidexitedwith=$?
2>>/dev/null kill -0 $killerpid && kill -15 $killerpid
exit $mypidexitedwith
> > With ksh, however, results vary significantly, depending, among other
> > factors, upon which ksh implementation is used, e.g:
> > $ time ksh ./timeout 5 3; echo $?
> >
> > real 0m5.029s
> > user 0m0.000s
> > sys 0m0.020s
> > 0
> > $ time ksh ./timeout 5 3; echo $?
> >
> > real 0m3.03s
> > user 0m0.01s
> > sys 0m0.02s
> > 98
> > $
> >
> > Preliminary looks would seem to indicate the observed behavior of wait
> > in ksh/pdksh and that documented in the man pages is not consistent
> > (note also that on some systems, /bin/ksh and/or /usr/bin/ksh is
> > pdksh).

> Thanks, Michael! Your approach (no trap) seems to work better on AIX.
> I'm curious: how did you get the aberrant behavior quoted above? What
> OS/shell were you using? I seem to always get an exit status of 143
> (not 98) if the command is interrupted and I haven't seen the strange
> behavior of the command *not* being interrupted as your first example
> shows.

The first of those aberrant examples was from
Debian GNU/Linux 3.1 a.k.a. "sarge" with pdksh as ksh per alternatives
configuration, and the second from HP-UX 11.11.

On Debian GNU/Linux 3.1 a.k.a. "sarge" the alternatives also indicates
that zsh4 could be used. Trying zsh4 I also get aberrant behavior:

$ time zsh4 timeout 5 3; echo $?

real 0m5.128s
user 0m0.000s
sys 0m0.020s
143
$
Somewhat different aberrant behavior, however, in that case. It does
give the expected exit value (128+15=143), but it takes unexpectedly
long. With a first argument of 5 and a second argument of 3, we would
expect our longer-running process (~5 seconds, per the first argument)
to be timed out by the "watchdog" after the shorter timeout (~3
seconds, per the second argument). We see the correct exit value, but
it took too long (it should have been much closer to 3 seconds, rather
than over 5).
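To make the 128+15 arithmetic concrete, here is a minimal sketch (not from the thread) that kills a background sleep with signal 15 and decodes the status that wait reports. It assumes a shell that reports death by signal as 128 plus the signal number; some shells, ksh93 among them, use a different offset, so the check only tests for a value above 128. The names pid, status and signame are illustrative.

```shell
#!/bin/sh
# Start something long-running, then terminate it with SIGTERM (15).
sleep 30 &
pid=$!
kill -15 "$pid"

# Collect the exit status of the killed process.
status=0
wait "$pid" 2>/dev/null || status=$?

if [ "$status" -gt 128 ]; then
    # kill -l with an exit status prints the name of the signal behind it.
    signame=`kill -l "$status"`
    echo "killed by signal $signame (exit status $status)"
else
    echo "exited on its own with status $status"
fi
```

A caller can use the same test to tell "terminated by the watchdog" apart from "failed on its own", provided the command itself never exits with a status above 128.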

Putzing and tweaking a bit, I find an issue/dependency with pdksh/zsh4
on Debian GNU/Linux 3.1 a.k.a. "sarge". I haven't rechecked whether the
same issue explains what was observed on HP-UX 11.11. The relevant man
pages don't seem to explain, at least not quickly or readily, why we
get this precise behavior (most notably with a backgrounded { } list
command), though the answer may be in there somewhere ... and if not
there, ... in the source.
Anyway, examples, then the demo script:
$ ./timeout2
command runs 10 seconds, timeout in 3 seconds
bare command returned 143 in about 3 seconds
{} command returned 143 in about 3 seconds
() command returned 143 in about 3 seconds
$ ksh ./timeout2
command runs 10 seconds, timeout in 3 seconds
bare command returned 143 in about 3 seconds
{} command returned 0 in about 10 seconds
() command returned 143 in about 3 seconds
$ zsh4 ./timeout2
command runs 10 seconds, timeout in 3 seconds
bare command returned 143 in about 3 seconds
{} command returned 143 in about 10 seconds
() command returned 143 in about 3 seconds
$ cat timeout2
#!/bin/sh
maxseconds="${2-3}"
itrunsforseconds="${1-10}"
echo command runs $itrunsforseconds seconds, timeout in $maxseconds seconds
s=`date '+%s'`
sleep $itrunsforseconds &
waitforpid=$!
{ sleep $maxseconds
2>>/dev/null kill -0 $waitforpid && kill -15 $waitforpid
} &
killerpid=$!
>>/dev/null 2>&1 wait $waitforpid
r=$?
e=`date '+%s'`
2>>/dev/null kill -0 $killerpid && kill -15 $killerpid
echo bare command returned $r in about `expr $e - $s` seconds
s=`date '+%s'`
{ sleep $itrunsforseconds; } &
waitforpid=$!
{ sleep $maxseconds
2>>/dev/null kill -0 $waitforpid && kill -15 $waitforpid
} &
killerpid=$!
>>/dev/null 2>&1 wait $waitforpid
r=$?
e=`date '+%s'`
2>>/dev/null kill -0 $killerpid && kill -15 $killerpid
echo '{}' command returned $r in about `expr $e - $s` seconds
s=`date '+%s'`
(sleep $itrunsforseconds) &
waitforpid=$!
{ sleep $maxseconds
2>>/dev/null kill -0 $waitforpid && kill -15 $waitforpid
} &
killerpid=$!
>>/dev/null 2>&1 wait $waitforpid
r=$?
e=`date '+%s'`
2>>/dev/null kill -0 $killerpid && kill -15 $killerpid
echo '()' command returned $r in about `expr $e - $s` seconds
$
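A variation that may be worth trying on the shells that misbehave in the { } case: put exec in front of the command inside the braces, so that $! is the PID of the command itself rather than of any intermediate shell process. Whether this changes the pdksh or zsh4 results shown above is an assumption and has not been checked here. The sketch reuses the variable names from timeout2, with shorter default times.

```shell
#!/bin/sh
maxseconds="${2-1}"
itrunsforseconds="${1-10}"
s=`date '+%s'`

# exec replaces the background shell process with sleep itself,
# so the PID saved from $! is the PID of sleep.
{ exec sleep $itrunsforseconds; } &
waitforpid=$!

# Watchdog, as in timeout2; its output is discarded.
{ sleep $maxseconds
  kill -0 $waitforpid && kill -15 $waitforpid
} >/dev/null 2>&1 &
killerpid=$!

r=0
wait $waitforpid >/dev/null 2>&1 || r=$?
e=`date '+%s'`

# Zap the watchdog if it is still around.
2>>/dev/null kill -0 $killerpid && kill -15 $killerpid
echo "{ exec } command returned $r in about $((e - s)) seconds"
```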

references/excerpts:
$ cat /etc/debian_version
3.1
$ uname -mo
i686 GNU/Linux
$ /usr/sbin/update-alternatives --verbose --display ksh
ksh - status is auto.
link currently points to /bin/pdksh
/bin/pdksh - priority 10
slave usr.bin.ksh: /bin/pdksh
slave ksh.1.gz: /usr/share/man/man1/pdksh.1.gz
/bin/zsh4 - priority 5
slave usr.bin.ksh: /bin/zsh4
slave ksh.1.gz: /usr/share/man/man1/zsh.1.gz
Current `best' version is /bin/pdksh.
$
pdksh(1):
{ list }
Compound construct; list is executed, but not in a subshell.
Note that { and } are reserved words, not meta-characters.
zsh(1)/zshmisc(1)/zshall(1):
{ list }
Execute list.
Message-ID: <1139388314.487341.261160@o13g2000cwo.googlegroups.com>

Re: timeout a command in ksh

on 09.02.2006 17:28:46 by jrw32982

Michael Paoli wrote:
> The first of those aberrant examples was from
> Debian GNU/Linux 3.1 a.k.a. "sarge" with pdksh as ksh per alternatives
> configuration, and the second from HP-UX 11.11.
>
> On Debian GNU/Linux 3.1 a.k.a. "sarge" the alternatives also indicates
> that zsh4 could be used. Trying zsh4 I also get aberrant behavior:

Thanks, Michael! I'm running/testing on AIX 5.3 and Solaris 9. My
current example code looks like:

#!/bin/ksh
function timeout {
    typeset timeout=$1
    shift
    "$@" & # command which might hang
    typeset cmd_pid=$!
    sleep $timeout && kill -TERM $cmd_pid 2>/dev/null &
    typeset sleep_pid=$!
    wait $cmd_pid 2>/dev/null
    typeset rc=$?
    kill $sleep_pid 2>/dev/null
    return $rc # will be non-zero if command was timed out
}
timeout 3 /bin/sleep 100 # sample call to timeout function

-- John Wiersba
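For shells without the ksh-specific function and typeset keywords, the same idea can be sketched in plain POSIX sh syntax. This is only an illustration under stated assumptions, not code from the thread: run_limited and the rl_ variable names are hypothetical, the variables are global because POSIX sh has no local scope, and the status test assumes death by signal is reported as a value above 128.

```shell
#!/bin/sh
# run_limited SECONDS COMMAND [ARG...]   (hypothetical name)
run_limited() {
    rl_limit=$1
    shift
    "$@" &                      # command which might hang
    rl_cmd_pid=$!
    # Watchdog. Its output is discarded so that a sleep left behind
    # after an early exit cannot hold the caller's stdout/stderr open.
    ( sleep "$rl_limit" && kill -TERM "$rl_cmd_pid" ) >/dev/null 2>&1 &
    rl_watchdog_pid=$!
    rl_rc=0
    wait "$rl_cmd_pid" 2>/dev/null || rl_rc=$?
    kill "$rl_watchdog_pid" 2>/dev/null
    return "$rl_rc"
}

# Sample call: distinguish success, timeout and ordinary failure.
if run_limited 1 sleep 10; then
    echo "command completed in time"
else
    rc=$?
    if [ "$rc" -gt 128 ]; then
        echo "command was terminated by a signal (status $rc)"
    else
        echo "command failed on its own (status $rc)"
    fi
fi
```

Note that killing the watchdog subshell does not necessarily kill the sleep it started; that sleep may linger until its time is up, which is harmless here but worth knowing about.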