(RADIATOR) Deadhost marking

Fri Apr 28 04:20:26 CDT 2006

Hello,

I have been thinking about and experimenting with dead host marking. I
think the current code in Radiator is not as good as it could be.

On host r1orgC.etest.cesnet.cz I'm working with this configuration:

<Handler Realm=/^orgC\.etest\.cesnet\.cz$|^r1orgC\.etest\.cesnet\.cz$/i>
	AuthBy	CheckFILE
	AuthLog authlogger
</Handler>

<Handler TunnelledByTTLS=1>
	AuthBy	CheckFILE
	AuthLog authlogger
</Handler>

<Handler TunnelledByPEAP=1>
	AuthBy	CheckFILE
	AuthLog authlogger
</Handler>

<Handler>
        <AuthBy RADIUS>
                RetryTimeout            1
                Retries                 1
                FailureBackoffTime      60

                UseExtendedIds

                <Host r1nren.etest.cesnet.cz>
                        AuthPort                1812
                        AcctPort                1813
                        Secret                  testing
                </Host>
                <Host r2nren.etest.cesnet.cz>
                        AuthPort                1812
                        AcctPort                1813
                        Secret                  testing
                </Host>
        </AuthBy>

        AddToReplyIfNotExist    Tunnel-Private-Group-ID=1:100
        AddToReply              Tunnel-Type=1:VLAN,\
                                Tunnel-Medium-Type=1:Ether_802
</Handler>

This says:

1) if a request has the realm @orgC.etest.cesnet.cz or
@r1orgC.etest.cesnet.cz, it is processed locally

2) otherwise it is sent to the two upstream RADIUS servers, r1nren and
r2nren.
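
The routing described above can be sketched roughly as follows. This is
only an illustration in Python, not Radiator code: the function name
route() and the tunnelled flag (standing in for the TunnelledByTTLS and
TunnelledByPEAP handlers) are assumptions made for the example. The regular
expression is the one from the first Handler.

```python
import re

# Same pattern as the first <Handler Realm=...> above, case-insensitive.
LOCAL_REALM = re.compile(
    r"^orgC\.etest\.cesnet\.cz$|^r1orgC\.etest\.cesnet\.cz$", re.IGNORECASE
)


def route(user_name, tunnelled=False):
    """Return "local" or "proxy" for a User-Name (hypothetical helper).

    Handlers are checked in order and the first match wins; the final
    catch-all handler proxies to r1nren and r2nren.
    """
    # The realm is taken as everything after the last "@", or empty if none.
    realm = user_name.rpartition("@")[2] if "@" in user_name else ""
    if LOCAL_REALM.search(realm):
        return "local"   # handled by AuthBy CheckFILE
    if tunnelled:
        return "local"   # inner TTLS/PEAP request, also CheckFILE
    return "proxy"       # <AuthBy RADIUS> to the upstream servers
```

For example, route("semik@orgC.etest.cesnet.cz") gives "local", while a
user with any other realm goes to "proxy" unless the request is an inner
tunnelled one.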

r1orgA and r2orgA are also connected to r1nren and r2nren, simulating
another interconnected "eduroam" site. All RADIUS servers have equivalent
configurations.

I ran a simple simulation:

1) all servers are up

   result: everyone is happy, everything is working

2) r1nren is down, a user from orgA visits orgC

   - r1orgC sends an Access-Request to r1nren
   - r1nren is down, so the request times out after 1 sec and r1orgC
     retransmits
   - another second passes and r1nren is marked as dead
   - r1orgC continues with r2nren, which finally responds

   result: the user gets authenticated with approx. 3 sec delay, pretty good

3) a user from an organization that is not connected, for example orgB,
   visits orgC

   - r1orgC sends an Access-Request to r1nren
   - r1nren does not know orgB, so it simply IGNOREs the request
   - 1 sec timeout on r1orgC, retransmission, 1 sec timeout
   - r1nren is marked as DEAD even though it isn't
   - the same happens with r2nren

3b) a user from orgA opens his laptop

   result: because both r1nren and r2nren are marked as dead, his
   Access-Request times out.
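
The effect in 3) and 3b) can be reproduced with a small model. This is a
simplified sketch in Python, not the Radiator implementation: the Proxy
class, its simulated clock and the answers() callback are assumptions made
for illustration. Only the three timing values come from the configuration
above.

```python
RETRY_TIMEOUT = 1          # seconds (RetryTimeout)
RETRIES = 1                # retransmissions (Retries)
FAILURE_BACKOFF = 60       # seconds (FailureBackoffTime)


class Proxy:
    """Toy model of strict dead-host marking with a simulated clock."""

    def __init__(self, hosts):
        self.hosts = hosts
        self.dead_until = {h: 0.0 for h in hosts}
        self.now = 0.0

    def forward(self, realm, answers):
        """answers(host, realm) is True if the host replies, False if silent."""
        for host in self.hosts:
            if self.now < self.dead_until[host]:
                continue                      # skipped while marked dead
            if answers(host, realm):
                return host
            # No reply to the first send or to any retransmission.
            self.now += RETRY_TIMEOUT * (1 + RETRIES)
            self.dead_until[host] = self.now + FAILURE_BACKOFF
        return None                           # nobody left to ask


def nren(host, realm):
    # Both upstream servers know orgA and silently ignore anything else.
    return realm == "orgA"


proxy = Proxy(["r1nren", "r2nren"])
orgb_result = proxy.forward("orgB", nren)     # 3): both hosts get marked dead
orga_result = proxy.forward("orgA", nren)     # 3b): valid user is locked out
```

In this model both results are None: the orgB request costs 4 simulated
seconds and marks both hosts dead, and the orgA request that follows finds
no host it is allowed to try until the backoff expires.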

I did a similar experiment with a CISCO AIRONET 1200 AP. It had two
RADIUS servers configured. There were two clients. One notebook was trying
to authenticate with a non-existent realm, which resulted in timeouts. The
AP quickly marked both of its RADIUS servers as dead. At that moment another
notebook with a working realm started up. The AP gave it a try and sent an
Access-Request to one of its RADIUS servers. WOW! The RADIUS server
responded, and the user got authenticated and could work. The first user
kept trying, and after a few seconds both RADIUS servers were marked as
DEAD again.

I like this CISCO approach very much! For the duration of the backoff time
it allows falling back to the 2nd, 3rd, ... RADIUS server if the first,
second, ... is not working and, more critically, valid users can still get
onto the network and work.

I did a quick hack to Radius/AuthRADIUS.pm which implements the CISCO way.
It works pretty well for me! In my simulated situation 3b) the user gets
authenticated :)
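
The patch itself is not reproduced in this message, so the following is
only one possible reading of the CISCO-style behaviour, written in Python
rather than Perl. The function name candidates() and its arguments are
hypothetical. The idea shown: hosts marked dead are skipped while at least
one live host remains, but if every host is marked dead they are all tried
again anyway.

```python
def candidates(hosts, dead_until, now):
    """Return the hosts to try, in order, for one request (hypothetical).

    dead_until maps a host name to the time its dead mark expires;
    hosts missing from the map are treated as alive.
    """
    alive = [h for h in hosts if dead_until.get(h, 0) <= now]
    if alive:
        return alive        # normal case: fall back to the 2nd, 3rd, ... host
    return list(hosts)      # everything is "dead": try all of them anyway
```

With r1nren marked dead only r2nren is tried. With both marked dead, as in
situation 3b), both are tried again, so a valid request still has a chance
of being answered.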

Could this patch please be evaluated and possibly included in Radiator?

Another interesting idea taken from that CISCO AP is that RADIUS hosts can
be marked as dead only if it gets a connection refused or some other sort
of ICMP message sent in response to a UDP packet that cannot reach its
destination.
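
To illustrate the difference between "refused" and "ignored", here is a
minimal Python sketch. It is not how Radiator or the AP does it: probe()
and its one-byte payload are placeholders invented for the example, and a
real check would send a valid RADIUS packet. It relies on the fact that a
connected UDP socket reports an ICMP port-unreachable as a "connection
refused" error, which is how Linux behaves; other systems may differ.

```python
import socket


def probe(host, port, payload=b"\x00", timeout=1.0):
    """Send one UDP datagram and classify the outcome (hypothetical helper).

    Returns "reply", "refused" (the OS reported an ICMP port-unreachable)
    or "silent" (nothing came back before the timeout).
    """
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        s.connect((host, port))   # connected UDP sockets see ICMP errors
        try:
            s.send(payload)
            s.recv(4096)
            return "reply"
        except ConnectionRefusedError:
            return "refused"      # a candidate for marking the host dead
        except socket.timeout:
            return "silent"       # may just be an ignored request
```

Under this scheme only "refused" would count towards marking a host dead.
A "silent" result, like the ignored orgB request in 3), would not. Other
ICMP errors, such as host unreachable, surface as different OSError
subclasses and are left unhandled here.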

Best regards
-- 
-----------------------
Jan Tomasek aka Semik
http://www.tomasek.cz/
-------------- next part --------------
An embedded and charset-unspecified text was scrubbed...
Name: AuthRADIUS.pm.patch
URL: <http://www.open.com.au/pipermail/radiator/attachments/20060428/68c59ca2/attachment.ksh>