[RADIATOR] clarification on AuthBy ROUNDROBIN failover

Andrew Clark adc at umn.edu
Wed Nov 10 10:14:00 CST 2010


Thanks for the reply, Mike.  I've done some further investigation in
the lab, and it seems like AuthBy ROUNDROBIN behavior along with
round-robin DNS doesn't work as I'd expect.  I might be expecting it
to work differently than it does, so I'd like to know if this is
working as designed or not when one combines AutthBy ROUNDROBIN with a
round-robin DNS entry.

I have an AuthBy ROUNDROBIN with a single Host entry, one that
resolves to three A records.  Authentication load is evenly split
among the three, but failover only accounts for one of them.  If I
replace the single Host entry with three separate ones, failover then
accounts for all three.

Here's a trace 4 log showing the behavior with a single Host entry
(that resolves to three hosts) - the three servers in question aren't
supposed to answer for this source, which is how I'm testing the
failover behavior.  And my session db isn't up :-)

Anyhow - it only queries the first in the set of three and then times
out without querying the other two.

Wed Nov 10 10:00:03 2010: NOTICE: Server started: Radiator 4.7 on test-aaa
Wed Nov 10 10:00:06 2010: DEBUG: Packet dump:
*** Received from 134.84.88.41 port 41461 ....
Code:       Access-Request
Identifier: 111
Authentic:  <17>jk2<133><166><225>{<229><206><161><217><151><182>#m
Attributes:
        User-Name = "mikem"
        Service-Type = Framed-User
        NAS-IP-Address = 192.168.238.210
        NAS-Identifier = "203.63.154.1"
        NAS-Port = 1234
        Called-Station-Id = "123456789"
        Calling-Station-Id = "987654321"
        NAS-Port-Type = Async
        User-Password =
<223><248><30><236><131>;<142><147><238>8<15><221><230><214><159><134>

Wed Nov 10 10:00:06 2010: DEBUG: Handling request with Handler
'Client-Identifier=lab-wireless', Identifier 'lab-wireless-default'
Wed Nov 10 10:00:06 2010: DEBUG: session_db_mysql Deleting session for
mikem, 192.168.238.210, 1234
Wed Nov 10 10:00:06 2010: ERR: Could not connect to SQL database with
DBI->connect DBI:mysql:NTS_RADIUS:127.0.0.1:3306, radiator,
ucGhy40HW.kSM:  Can't connect to MySQL server on '127.0.0.1' (111)
Wed Nov 10 10:00:06 2010: ERR: Could not connect to any SQL database.
Request is ignored. Backing off for 600 seconds
Wed Nov 10 10:00:06 2010: DEBUG: do query is: 'delete from RADONLINE
where ACCTSESSIONID=''':
Wed Nov 10 10:00:06 2010: DEBUG: Handling with Radius::AuthRADIUS
Wed Nov 10 10:00:06 2010: DEBUG: AuthBy RADIUS creates new local
socket '0.0.0.0:0' for sending requests
Wed Nov 10 10:00:06 2010: DEBUG: Packet dump:
*** Sending to 134.84.119.107 port 1836 ....
Code:       Access-Request
Identifier: 1
Authentic:  <17>jk2<133><166><225>{<229><206><161><217><151><182>#m
Attributes:
        User-Name = "mikem"
        Service-Type = Framed-User
        NAS-IP-Address = 192.168.238.210
        NAS-Port = 1234
        Called-Station-Id = "123456789"
        Calling-Station-Id = "987654321"
        NAS-Port-Type = Async
        User-Password = <228><164>c|<203><148><182><0><4><160><152>]<127>(<138>h
        NAS-Identifier = "WIRELESS"
        Proxy-State = OSC-Extended-Id=1

Wed Nov 10 10:00:11 2010: DEBUG: Timed out, retransmitting
Wed Nov 10 10:00:11 2010: DEBUG: Packet dump:
*** Sending to 134.84.119.107 port 1836 ....
Code:       Access-Request
Identifier: 1
Authentic:  <17>jk2<133><166><225>{<229><206><161><217><151><182>#m
Attributes:
        User-Name = "mikem"
        Service-Type = Framed-User
        NAS-IP-Address = 192.168.238.210
        NAS-Port = 1234
        Called-Station-Id = "123456789"
        Calling-Station-Id = "987654321"
        NAS-Port-Type = Async
        User-Password = <228><164>c|<203><148><182><0><4><160><152>]<127>(<138>h
        NAS-Identifier = "WIRELESS"
        Proxy-State = OSC-Extended-Id=1

Wed Nov 10 10:00:16 2010: DEBUG: Timed out, retransmitting
Wed Nov 10 10:00:16 2010: DEBUG: Packet dump:
*** Sending to 134.84.119.107 port 1836 ....
Code:       Access-Request
Identifier: 1
Authentic:  <17>jk2<133><166><225>{<229><206><161><217><151><182>#m
Attributes:
        User-Name = "mikem"
        Service-Type = Framed-User
        NAS-IP-Address = 192.168.238.210
        NAS-Port = 1234
        Called-Station-Id = "123456789"
        Calling-Station-Id = "987654321"
        NAS-Port-Type = Async
        User-Password = <228><164>c|<203><148><182><0><4><160><152>]<127>(<138>h
        NAS-Identifier = "WIRELESS"
        Proxy-State = OSC-Extended-Id=1

Wed Nov 10 10:00:21 2010: DEBUG: Timed out, retransmitting
Wed Nov 10 10:00:21 2010: DEBUG: Packet dump:
*** Sending to 134.84.119.107 port 1836 ....
Code:       Access-Request
Identifier: 1
Authentic:  <17>jk2<133><166><225>{<229><206><161><217><151><182>#m
Attributes:
        User-Name = "mikem"
        Service-Type = Framed-User
        NAS-IP-Address = 192.168.238.210
        NAS-Port = 1234
        Called-Station-Id = "123456789"
        Calling-Station-Id = "987654321"
        NAS-Port-Type = Async
        User-Password = <228><164>c|<203><148><182><0><4><160><152>]<127>(<138>h
        NAS-Identifier = "WIRELESS"
        Proxy-State = OSC-Extended-Id=1

Wed Nov 10 10:00:26 2010: INFO: AuthRADIUS CAH-wireless2008: No reply
after 20 seconds and 3 retransmissions to 134.84.119.107:1836 for
mikem (111). Now have 1 consecutive failures over 0 seconds. Backing
off for 300 seconds
Wed Nov 10 10:00:26 2010: INFO: AuthROUNDROBIN: Retry 1,
firstHostTried 0, lastHostTried 0
Wed Nov 10 10:00:26 2010: WARNING: AuthROUNDROBIN: Request was tried
for 1 times. All alive server from the RoundRobin list were tried.
Wed Nov 10 10:00:26 2010: INFO: AuthRADIUS CAH-wireless2008: Could not
find a working host to forward mikem (1) after 20 seconds. Ignoring
Wed Nov 10 10:00:26 2010: DEBUG: AuthBy ROUNDROBIN result: IGNORE,


--------------------------------------------------------------------


And here's when I use three Host entries that each resolve to the
three IP addresses involved - it works as I'd expect it to:

Wed Nov 10 10:02:07 2010: NOTICE: Server started: Radiator 4.7 on test-aaa
Wed Nov 10 10:02:14 2010: DEBUG: Packet dump:
*** Received from 134.84.88.41 port 45673 ....
Code:       Access-Request
Identifier: 239
Authentic:  6-<145><131><166><149>tKp(1e<205>z<241><177>
Attributes:
        User-Name = "mikem"
        Service-Type = Framed-User
        NAS-IP-Address = 192.168.238.210
        NAS-Identifier = "203.63.154.1"
        NAS-Port = 1234
        Called-Station-Id = "123456789"
        Calling-Station-Id = "987654321"
        NAS-Port-Type = Async
        User-Password =
^s<191><214><2><182><189><150><237>!<206><170><152><2><237><251>

Wed Nov 10 10:02:14 2010: DEBUG: Handling request with Handler
'Client-Identifier=lab-wireless', Identifier 'lab-wireless-default'
Wed Nov 10 10:02:14 2010: DEBUG: session_db_mysql Deleting session for
mikem, 192.168.238.210, 1234
Wed Nov 10 10:02:14 2010: ERR: Could not connect to SQL database with
DBI->connect DBI:mysql:NTS_RADIUS:127.0.0.1:3306, radiator,
ucGhy40HW.kSM:  Can't connect to MySQL server on '127.0.0.1' (111)
Wed Nov 10 10:02:14 2010: ERR: Could not connect to any SQL database.
Request is ignored. Backing off for 600 seconds
Wed Nov 10 10:02:14 2010: DEBUG: do query is: 'delete from RADONLINE
where ACCTSESSIONID=''':
Wed Nov 10 10:02:14 2010: DEBUG: Handling with Radius::AuthRADIUS
Wed Nov 10 10:02:14 2010: DEBUG: AuthBy RADIUS creates new local
socket '0.0.0.0:0' for sending requests
Wed Nov 10 10:02:14 2010: DEBUG: Packet dump:
*** Sending to 134.84.119.207 port 1836 ....
Code:       Access-Request
Identifier: 1
Authentic:  6-<145><131><166><149>tKp(1e<205>z<241><177>
Attributes:
        User-Name = "mikem"
        Service-Type = Framed-User
        NAS-IP-Address = 192.168.238.210
        NAS-Port = 1234
        Called-Station-Id = "123456789"
        Calling-Station-Id = "987654321"
        NAS-Port-Type = Async
        User-Password = <18>
<241><241>p<227><159><200><208><158><7><216>Q<163>V<192>
        NAS-Identifier = "WIRELESS"
        Proxy-State = OSC-Extended-Id=1

Wed Nov 10 10:02:19 2010: DEBUG: Timed out, retransmitting
Wed Nov 10 10:02:19 2010: DEBUG: Packet dump:
*** Sending to 134.84.119.207 port 1836 ....
Code:       Access-Request
Identifier: 1
Authentic:  6-<145><131><166><149>tKp(1e<205>z<241><177>
Attributes:
        User-Name = "mikem"
        Service-Type = Framed-User
        NAS-IP-Address = 192.168.238.210
        NAS-Port = 1234
        Called-Station-Id = "123456789"
        Calling-Station-Id = "987654321"
        NAS-Port-Type = Async
        User-Password = <18>
<241><241>p<227><159><200><208><158><7><216>Q<163>V<192>
        NAS-Identifier = "WIRELESS"
        Proxy-State = OSC-Extended-Id=1

Wed Nov 10 10:02:24 2010: DEBUG: Timed out, retransmitting
Wed Nov 10 10:02:24 2010: DEBUG: Packet dump:
*** Sending to 134.84.119.207 port 1836 ....
Code:       Access-Request
Identifier: 1
Authentic:  6-<145><131><166><149>tKp(1e<205>z<241><177>
Attributes:
        User-Name = "mikem"
        Service-Type = Framed-User
        NAS-IP-Address = 192.168.238.210
        NAS-Port = 1234
        Called-Station-Id = "123456789"
        Calling-Station-Id = "987654321"
        NAS-Port-Type = Async
        User-Password = <18>
<241><241>p<227><159><200><208><158><7><216>Q<163>V<192>
        NAS-Identifier = "WIRELESS"
        Proxy-State = OSC-Extended-Id=1

Wed Nov 10 10:02:29 2010: DEBUG: Timed out, retransmitting
Wed Nov 10 10:02:29 2010: DEBUG: Packet dump:
*** Sending to 134.84.119.207 port 1836 ....
Code:       Access-Request
Identifier: 1
Authentic:  6-<145><131><166><149>tKp(1e<205>z<241><177>
Attributes:
        User-Name = "mikem"
        Service-Type = Framed-User
        NAS-IP-Address = 192.168.238.210
        NAS-Port = 1234
        Called-Station-Id = "123456789"
        Calling-Station-Id = "987654321"
        NAS-Port-Type = Async
        User-Password = <18>
<241><241>p<227><159><200><208><158><7><216>Q<163>V<192>
        NAS-Identifier = "WIRELESS"
        Proxy-State = OSC-Extended-Id=1

Wed Nov 10 10:02:34 2010: INFO: AuthRADIUS CAH-wireless2008: No reply
after 20 seconds and 3 retransmissions to 134.84.119.207:1836 for
mikem (239). Now have 1 consecutive failures over 0 seconds. Backing
off for 300 seconds
Wed Nov 10 10:02:34 2010: INFO: AuthROUNDROBIN: Retry 1,
firstHostTried 0, lastHostTried 1
Wed Nov 10 10:02:34 2010: DEBUG: Packet dump:
*** Sending to 134.84.119.107 port 1836 ....
Code:       Access-Request
Identifier: 1
Authentic:  6-<145><131><166><149>tKp(1e<205>z<241><177>
Attributes:
        User-Name = "mikem"
        Service-Type = Framed-User
        NAS-IP-Address = 192.168.238.210
        NAS-Port = 1234
        Called-Station-Id = "123456789"
        Calling-Station-Id = "987654321"
        NAS-Port-Type = Async
        User-Password = <18>
<241><241>p<227><159><200><208><158><7><216>Q<163>V<192>
        NAS-Identifier = "WIRELESS"
        Proxy-State = OSC-Extended-Id=1

Wed Nov 10 10:02:39 2010: DEBUG: Timed out, retransmitting
Wed Nov 10 10:02:39 2010: DEBUG: Packet dump:
*** Sending to 134.84.119.107 port 1836 ....
Code:       Access-Request
Identifier: 1
Authentic:  6-<145><131><166><149>tKp(1e<205>z<241><177>
Attributes:
        User-Name = "mikem"
        Service-Type = Framed-User
        NAS-IP-Address = 192.168.238.210
        NAS-Port = 1234
        Called-Station-Id = "123456789"
        Calling-Station-Id = "987654321"
        NAS-Port-Type = Async
        User-Password = <18>
<241><241>p<227><159><200><208><158><7><216>Q<163>V<192>
        NAS-Identifier = "WIRELESS"
        Proxy-State = OSC-Extended-Id=1

Wed Nov 10 10:02:44 2010: DEBUG: Timed out, retransmitting
Wed Nov 10 10:02:44 2010: DEBUG: Packet dump:
*** Sending to 134.84.119.107 port 1836 ....
Code:       Access-Request
Identifier: 1
Authentic:  6-<145><131><166><149>tKp(1e<205>z<241><177>
Attributes:
        User-Name = "mikem"
        Service-Type = Framed-User
        NAS-IP-Address = 192.168.238.210
        NAS-Port = 1234
        Called-Station-Id = "123456789"
        Calling-Station-Id = "987654321"
        NAS-Port-Type = Async
        User-Password = <18>
<241><241>p<227><159><200><208><158><7><216>Q<163>V<192>
        NAS-Identifier = "WIRELESS"
        Proxy-State = OSC-Extended-Id=1

Wed Nov 10 10:02:49 2010: DEBUG: Timed out, retransmitting
Wed Nov 10 10:02:49 2010: DEBUG: Packet dump:
*** Sending to 134.84.119.107 port 1836 ....
Code:       Access-Request
Identifier: 1
Authentic:  6-<145><131><166><149>tKp(1e<205>z<241><177>
Attributes:
        User-Name = "mikem"
        Service-Type = Framed-User
        NAS-IP-Address = 192.168.238.210
        NAS-Port = 1234
        Called-Station-Id = "123456789"
        Calling-Station-Id = "987654321"
        NAS-Port-Type = Async
        User-Password = <18>
<241><241>p<227><159><200><208><158><7><216>Q<163>V<192>
        NAS-Identifier = "WIRELESS"
        Proxy-State = OSC-Extended-Id=1

Wed Nov 10 10:02:54 2010: INFO: AuthRADIUS CAH-wireless2008: No reply
after 40 seconds and 3 retransmissions to 134.84.119.107:1836 for
mikem (239). Now have 1 consecutive failures over 0 seconds. Backing
off for 300 seconds
Wed Nov 10 10:02:54 2010: INFO: AuthROUNDROBIN: Retry 2,
firstHostTried 0, lastHostTried 2
Wed Nov 10 10:02:54 2010: DEBUG: Packet dump:
*** Sending to 134.84.119.7 port 1836 ....
Code:       Access-Request
Identifier: 1
Authentic:  6-<145><131><166><149>tKp(1e<205>z<241><177>
Attributes:
        User-Name = "mikem"
        Service-Type = Framed-User
        NAS-IP-Address = 192.168.238.210
        NAS-Port = 1234
        Called-Station-Id = "123456789"
        Calling-Station-Id = "987654321"
        NAS-Port-Type = Async
        User-Password = <18>
<241><241>p<227><159><200><208><158><7><216>Q<163>V<192>
        NAS-Identifier = "WIRELESS"
        Proxy-State = OSC-Extended-Id=1

Wed Nov 10 10:02:59 2010: DEBUG: Timed out, retransmitting
Wed Nov 10 10:02:59 2010: DEBUG: Packet dump:
*** Sending to 134.84.119.7 port 1836 ....
Code:       Access-Request
Identifier: 1
Authentic:  6-<145><131><166><149>tKp(1e<205>z<241><177>
Attributes:
        User-Name = "mikem"
        Service-Type = Framed-User
        NAS-IP-Address = 192.168.238.210
        NAS-Port = 1234
        Called-Station-Id = "123456789"
        Calling-Station-Id = "987654321"
        NAS-Port-Type = Async
        User-Password = <18>
<241><241>p<227><159><200><208><158><7><216>Q<163>V<192>
        NAS-Identifier = "WIRELESS"
        Proxy-State = OSC-Extended-Id=1

Wed Nov 10 10:03:04 2010: DEBUG: Timed out, retransmitting
Wed Nov 10 10:03:04 2010: DEBUG: Packet dump:
*** Sending to 134.84.119.7 port 1836 ....
Code:       Access-Request
Identifier: 1
Authentic:  6-<145><131><166><149>tKp(1e<205>z<241><177>
Attributes:
        User-Name = "mikem"
        Service-Type = Framed-User
        NAS-IP-Address = 192.168.238.210
        NAS-Port = 1234
        Called-Station-Id = "123456789"
        Calling-Station-Id = "987654321"
        NAS-Port-Type = Async
        User-Password = <18>
<241><241>p<227><159><200><208><158><7><216>Q<163>V<192>
        NAS-Identifier = "WIRELESS"
        Proxy-State = OSC-Extended-Id=1

Wed Nov 10 10:03:09 2010: DEBUG: Timed out, retransmitting
Wed Nov 10 10:03:09 2010: DEBUG: Packet dump:
*** Sending to 134.84.119.7 port 1836 ....
Code:       Access-Request
Identifier: 1
Authentic:  6-<145><131><166><149>tKp(1e<205>z<241><177>
Attributes:
        User-Name = "mikem"
        Service-Type = Framed-User
        NAS-IP-Address = 192.168.238.210
        NAS-Port = 1234
        Called-Station-Id = "123456789"
        Calling-Station-Id = "987654321"
        NAS-Port-Type = Async
        User-Password = <18>
<241><241>p<227><159><200><208><158><7><216>Q<163>V<192>
        NAS-Identifier = "WIRELESS"
        Proxy-State = OSC-Extended-Id=1

Wed Nov 10 10:03:14 2010: INFO: AuthRADIUS CAH-wireless2008: No reply
after 60 seconds and 3 retransmissions to 134.84.119.7:1836 for mikem
(239). Now have 1 consecutive failures over 0 seconds. Backing off for
300 seconds
Wed Nov 10 10:03:14 2010: INFO: AuthROUNDROBIN: Retry 3,
firstHostTried 0, lastHostTried 0
Wed Nov 10 10:03:14 2010: WARNING: AuthROUNDROBIN: Request was tried
for 3 times. All alive server from the RoundRobin list were tried.
Wed Nov 10 10:03:14 2010: INFO: AuthRADIUS CAH-wireless2008: Could not
find a working host to forward mikem (1) after 60 seconds. Ignoring
Wed Nov 10 10:03:14 2010: DEBUG: AuthBy ROUNDROBIN result: IGNORE,


-- 
Andrew D. Clark
Network Operations Engineer
University of Minnesota, Networking/Telecom Services
2218 University Ave SE
Minneapolis, MN 55414-3029
Phone: 612-626-4880

On Tue, Oct 26, 2010 at 3:58 PM, Mike McCauley <mikem at open.com.au> wrote:
> Hello Andrew,
>
> On Wednesday 27 October 2010 01:38:12 am Andrew Clark wrote:
>> First one I sent to the list must've fallen through the cracks.
>>
>> I'm seeking some clarification on the failover behavior of AuthBy
>> ROUNDROBIN and how to read the logs when servers are marked dead.  I
>> have three hosts in the round-robin pool (via a round-robin DNS host
>> name) and I can see that requests are being distributed correctly to
>> all three.  What is unclear is the meaning of the logs when a server
>> is marked out.  The three servers are of course at three different IP
>> addresses, but I only see log messages about one of three IP addresses
>> being marked down:
>>
>> Tue Oct 12 16:14:52 2010: INFO: AuthRADIUS: No reply after 3
>> retransmissions to 134.84.119.107:1836 for foo  (). Now have 1
>> consecutive failures over 0 seconds. Backing off for 300 seconds
>> Tue Oct 12 16:14:52 2010: INFO: AuthROUNDROBIN: Retry 1,
>> firstHostTried 0, lastHostTried 0
>> Tue Oct 12 16:14:52 2010: WARNING: AuthROUNDROBIN: Request was tried
>> for 1 times. All alive server from the RoundRobin list were tried.
>> Tue Oct 12 16:14:52 2010: INFO: AuthRADIUS could not find a working
>> host to forward to. Ignoring
>> Tue Oct 12 16:14:53 2010: INFO: AuthRADIUS: No reply after 3
>> retransmissions to 134.84.119.107:1836 for foo  (171). Now have 1
>> consecutive failures over 0 seconds. Backing off for 300 seconds
>> Tue Oct 12 16:14:53 2010: INFO: AuthROUNDROBIN: Retry 1,
>> firstHostTried 0, lastHostTried 0
>
> This means there was only one 'non-dead' server left in our list of server.
>
>
>> Tue Oct 12 16:14:53 2010: WARNING: AuthROUNDROBIN: Request was tried
>> for 1 times. All alive server from the RoundRobin list were tried.
>
> This last message means that there was no reply from any of the 'non-dead'
> servers it tried, and it ran out of servers to try.
>
> Looks to me like at this stage 2 of the 3 servers had been marked as down (due
> to no repsonse), and then there was no response from the third.
>
> You may want to investigate why all the downstream servers failed to reply.
>
> Hope that helps.
>
> Cheers.
>
> --
> Mike McCauley                               mikem at open.com.au
> Open System Consultants Pty. Ltd
> 9 Bulbul Place Currumbin Waters QLD 4223 Australia   http://www.open.com.au
> Phone +61 7 5598-7474                       Fax   +61 7 5598-7070
>
> Radiator: the most portable, flexible and configurable RADIUS server
> anywhere. SQL, proxy, DBM, files, LDAP, NIS+, password, NT, Emerald,
> Platypus, Freeside, TACACS+, PAM, external, Active Directory, EAP, TLS,
> TTLS, PEAP, TNC, WiMAX, RSA, Vasco, Yubikey, MOTP, HOTP, TOTP,
> DIAMETER etc. Full source on Unix, Windows, MacOSX, Solaris, VMS, NetWare etc.
>


More information about the radiator mailing list