[RADIATOR] AuthHEIMDALDIGEST and defunct kdigest processes

Heikki Vatiainen hvn at open.com.au
Tue Feb 5 17:14:31 UTC 2019


On 31/01/2019 15.10, Johan Wassberg wrote:

> We are using Radiator with AuthHEIMDALDIGEST and recently upgraded from 4.15 to
> 4.22. We have noticed that 4.22 is leaving a lot of defunct `kdigest` processes
> which over time is causing Radiator to crash due to trouble forking new
> `kdigest`s.

Thanks for reporting this. I was able to reproduce this.

> One solution could be to run `waitpid` with "-1" in a loop instead of
> the real pid and therefore handle all children that sent `SIGCHLD` in the
> next authentication. Another solution could be to remove `WNOHANG`, making
> `waitpid` block until the child returns. Not sure if that has any
> performance issues or why you implemented `waitpid` with `WNOHANG` from
> the beginning.
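
For illustration, the first suggestion quoted above (calling waitpid with -1 in a loop) could look roughly like the sketch below. It is written in Python only to keep the example short and is not Radiator code. The helper names and the immediately exiting child, which stands in for kdigest, are assumptions made for the demonstration.

```python
import os
import time


def reap_all_children():
    """Collect every child that has already exited, without blocking."""
    reaped = []
    while True:
        try:
            # pid -1 means "any child"; with WNOHANG the call returns
            # (0, 0) when children exist but none has exited yet.
            pid, _status = os.waitpid(-1, os.WNOHANG)
        except ChildProcessError:
            break  # no children left at all
        if pid == 0:
            break  # nothing to reap right now
        reaped.append(pid)
    return reaped


def spawn_short_lived_child():
    """Hypothetical stand-in for forking kdigest: the child exits at once."""
    pid = os.fork()
    if pid == 0:
        os._exit(0)
    return pid


# Three children exit and become zombies until the loop collects them.
pids = [spawn_short_lived_child() for _ in range(3)]
time.sleep(0.5)  # give the children time to exit
reaped = reap_all_children()
```

With this approach a child missed by one authentication is picked up by the next one, so zombies do not accumulate, although each may linger until the next request arrives.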

I'd say removing WNOHANG could work here. When I wrapped waitpid with
two debug log calls, the time stamps they reported were typically less
than a millisecond apart. This makes me think that waitpid will
typically wait for only a very short time.

In fact, at first I could not reproduce this. It only showed up after I
added additional load to the test machine. After this the number of
zombies started slowly going up. Based on this I think there was a race,
just like you guessed. It also appears that the wait time is very short
and thus it's worth just letting waitpid run a bit longer without WNOHANG.
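
The race can be sketched as follows. This is a minimal Python illustration, not Radiator code: the child writes its output and closes the pipe, but is still alive for a moment when the parent calls waitpid. The 0.2 second linger is an assumption that stands in for a loaded machine delaying the child's exit, and `run_child` is a hypothetical helper.

```python
import os
import time


def run_child(linger_seconds):
    """Fork a child that writes output, closes the pipe, then lingers.

    The linger is a hypothetical stand-in for system load delaying
    the child's exit after its output is complete.
    """
    read_fd, write_fd = os.pipe()
    pid = os.fork()
    if pid == 0:
        os.close(read_fd)
        os.write(write_fd, b"digest output\n")
        os.close(write_fd)          # parent now sees end of file
        time.sleep(linger_seconds)  # child has not exited yet
        os._exit(0)
    os.close(write_fd)
    with os.fdopen(read_fd, "rb") as pipe:
        output = pipe.read()        # returns once the child closes its end
    return pid, output


pid, output = run_child(0.2)

# With WNOHANG the call returns (0, 0) because the child is still running.
# If the caller never retries, the child later becomes a zombie.
nonblocking = os.waitpid(pid, os.WNOHANG)

# Without WNOHANG the call waits for the exit and reaps the child.
blocking = os.waitpid(pid, 0)
```

On an idle machine the child usually exits before the parent gets to waitpid, which would explain why the zombies only appeared under load.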

WNOHANG was likely used because with it zombies were not seen (but the
test system should have been under more load).

> Let me know if there is anything else I can provide to easier resolve
> this issue.

Can you try removing WNOHANG and configuring radiusd to use
LogMicroseconds? If there are visible changes in performance, then the
log lines after the last 'digest command output' and the return code
(ACCEPT, etc.) from the AuthBy can be observed more closely. Or you could
modify the source to trigger a log message if waitpid starts taking too
long. Maybe 2 milliseconds could be a good trigger.

Thanks,
Heikki

-- 
Heikki Vatiainen <hvn at open.com.au>

Radiator: the most portable, flexible and configurable RADIUS server
anywhere. SQL, proxy, DBM, files, LDAP, TACACS+, PAM, Active Directory,
EAP, TLS, TTLS, PEAP, WiMAX, RSA, Vasco, Yubikey, HOTP, TOTP,
DIAMETER etc. Full source on Unix, Windows, MacOSX, Solaris, VMS, etc.

