[RADIATOR] radiator Timeout handling
Michael
ringo at vianet.ca
Thu Sep 16 14:43:46 CDT 2010
Debian 4.0
Perl, v5.8.8 built for x86_64-linux-gnu-thread-multi
DBI v1.53
mysqld Ver 5.0.32-Debian_7etch12-log for pc-linux-gnu on x86_64 (Debian
etch distribution)
since my radiator config is quite lengthy, i'll just post what i think
is the relevant parts. if you would like to know more, let me know.
There are also a few very custom things I do to identify what type of
service the connection is coming from, handling of Acct Gigawords and
Acct Octets, etc... so, there's going to be a few things in the
debug/config that you wont recognize.
<Client> section allows specific NAS's
#All accounting packets for all NAS'S handled here:
<Handler Request-Type = "Accounting-Request", \
Acct-Status-Type = Start|Stop>
Identifier accounting
SessionDatabase sessdb
PreProcessingHook file:"%D/conf/hook.PreProcessingHook"
AuthByPolicy
AuthBy SQL-1-Accounting
</Handler>
<AuthBy SQL>
Identifier SQL-1-Accounting
DBSource dbi:mysql:yyy:yyy
DBUsername yyy
DBAuth yyy
Timeout 3
FailureBackoffTime 300
# a few attributes added to radius packet via previous Hook file.
AccountingTable `acct`
AcctColumnDef `submitted_un`,%u,formatted
AcctColumnDef `auth_un`,auth-un
AcctColumnDef `zone`,zone
AcctColumnDef `uid`,uid
AcctColumnDef `timestamp`,Timestamp
AcctColumnDef `type`,Acct-Status-Type
AcctColumnDef `acct_delay_time`,Acct-Delay-Time
AcctColumnDef `upload`,total-input-octets
AcctColumnDef `download`,total-output-octets
AcctColumnDef `sess_id`,Acct-Session-Id
AcctColumnDef `sess_time`,Acct-Session-Time
AcctColumnDef `term_cause`,Acct-Terminate-Cause
AcctColumnDef `nas_ip_address`,NAS-IP-Address
AcctColumnDef `ip_address`,Framed-IP-Address
AcctColumnDef `service`,service
AcctFailedLogFileName
%L/accounting-missed/logfile.sql-failed.acct.%Y%m%d
</AuthBy>
Thu Sep 16 15:19:48 2010: DEBUG: Packet dump:
*** Received from 192.168.100.1 port 1646 ....
Code: Accounting-Request
Identifier: 75
Authentic:
<133><139><188><253><225><157><240>[Lo<175><236><127><144><161><31>
Attributes:
Acct-Session-Id = "00000620"
Framed-Protocol = PPP
Framed-IP-Address = 192.168.100.100
User-Name = "dsltest"
Acct-Authentic = RADIUS
Acct-Status-Type = Start
NAS-Port-Type = Ethernet
NAS-Port = 100
NAS-Port-Id = "0/0/0/431"
Class = "FED,100,dsltest"
Service-Type = Framed-User
NAS-IP-Address = 192.168.100.1
Ascend-Session-Svr-Key = "AA39FB66"
NAS-Identifier = "TEST-LNS-1"
Acct-Delay-Time = 0
Thu Sep 16 15:19:48 2010: DEBUG: hook.service-label: finding
service-label for NAS-Port=(100)
Thu Sep 16 15:19:48 2010: DEBUG: hook.service-label: service-label=(testing)
Thu Sep 16 15:19:48 2010: DEBUG: hook.service-label:
add_attr(service=testing)
Thu Sep 16 15:19:48 2010: DEBUG: Handling request with Handler
'Request-Type = "Accounting-Request", Acct-Status-Type = Start|Stop',
Identifier 'accounting'
Thu Sep 16 15:19:48 2010: DEBUG: hook.PreProcessingHook: executing.
Thu Sep 16 15:19:48 2010: DEBUG: hook.PreProcessingHook: changed
Acct-Status-Type Start to login.
Thu Sep 16 15:19:48 2010: DEBUG: hook.PreProcessingHook: loading attrs
from Class.
Thu Sep 16 15:19:48 2010: DEBUG: Handling with Radius::AuthSQL:
SQL-1-Accounting
Thu Sep 16 15:19:48 2010: DEBUG: Handling accounting with Radius::AuthSQL
Thu Sep 16 15:19:48 2010: DEBUG: do query is: 'insert into `acct`
(`acct_delay_time`,`auth_un`,`ip_address`,`nas_ip_address`,`service`,`sess_id`,`submitted_un`,`timestamp`,`type`,`uid`,`zone`)
values
('0','dsltest','192.168.100.100','192.168.100.1','testing','00000620','dsltest','1284664788','login','100','FED')':
Thu Sep 16 15:19:51 2010: ERR: do failed for 'insert into `acct`
(`acct_delay_time`,`auth_un`,`ip_address`,`nas_ip_address`,`service`,`sess_id`,`submitted_un`,`timestamp`,`type`,`uid`,`zone`)
values
('0','dsltest','192.168.100.100','192.168.100.1','testing','00000620','dsltest','1284664788','login','100','FED')':
SQL Timeout
Thu Sep 16 15:19:54 2010: ERR: do failed for 'insert into `acct`
(`acct_delay_time`,`auth_un`,`ip_address`,`nas_ip_address`,`service`,`sess_id`,`submitted_un`,`timestamp`,`type`,`uid`,`zone`)
values
('0','dsltest','192.168.100.100','192.168.100.1','testing','00000620','dsltest','1284664788','login','100','FED')':
SQL Timeout
Thu Sep 16 15:19:54 2010: DEBUG: AuthBy SQL result: IGNORE, Database failure
### timeout value 3 * 2 queries = 6 seconds and 2 possible accounting
duplicates.
### omitted here is the exact same debug lines from another Start packet
from the NAS since it timed out and retransmitted.
### here's the Stop packet coming through trying an insert again, not
obeying FailureBackoffTime.
*** Received from 192.168.100.1 port 1646 ....
Code: Accounting-Request
Identifier: 76
Authentic: <130><28><130><231>C<233><2><135><30><142><206><148><156>T~<160>
Attributes:
Acct-Session-Id = "00000620"
Framed-Protocol = PPP
Framed-IP-Address = 192.168.100.100
User-Name = "dsltest"
Acct-Authentic = RADIUS
Acct-Session-Time = 4
Acct-Input-Octets = 880
Acct-Output-Octets = 116
Acct-Input-Packets = 12
Acct-Output-Packets = 7
Acct-Terminate-Cause = User-Request
Acct-Status-Type = Stop
NAS-Port-Type = Ethernet
NAS-Port = 100
NAS-Port-Id = "0/0/0/431"
Class = "FED,100,dsltest"
Service-Type = Framed-User
NAS-IP-Address = 192.168.100.1
Ascend-Session-Svr-Key = "AA39FB66"
NAS-Identifier = "TEST-LNS-1"
Acct-Delay-Time = 0
Thu Sep 16 15:19:54 2010: DEBUG: hook.service-label: finding
service-label for NAS-Port=(100)
Thu Sep 16 15:19:54 2010: DEBUG: hook.service-label: service-label=(testing)
Thu Sep 16 15:19:54 2010: DEBUG: hook.service-label:
add_attr(service=testing)
Thu Sep 16 15:19:54 2010: DEBUG: Handling request with Handler
'Request-Type = "Accounting-Request", Acct-Status-Type = Start|Stop',
Identifier 'accounting'
Thu Sep 16 15:19:54 2010: DEBUG: hook.PreProcessingHook: executing.
Thu Sep 16 15:19:54 2010: DEBUG: hook.PreProcessingHook: changed
Acct-Status-Type Stop to logout.
Thu Sep 16 15:19:54 2010: DEBUG: hook.PreProcessingHook: loading attrs
from Class.
Thu Sep 16 15:19:54 2010: DEBUG: hook.PreProcessingHook: calculating
total-output-octets and total-input-octets.
Thu Sep 16 15:19:54 2010: DEBUG: Handling with Radius::AuthSQL:
SQL-1-Accounting
Thu Sep 16 15:19:54 2010: DEBUG: Handling accounting with Radius::AuthSQL
Thu Sep 16 15:19:54 2010: DEBUG: do query is: 'insert into `acct`
(`acct_delay_time`,`auth_un`,`download`,`ip_address`,`nas_ip_address`,`service`,`sess_id`,`sess_time`,`submitted_un`,`term_cause`,`timestamp`,`type`,`uid`,`upload`,`zone`)
values
('0','dsltest','116','192.168.100.100','192.168.100.1','testing','00000620','4','dsltest','User-Request','1284664794','logout','100','880','FED')':
Thu Sep 16 15:19:57 2010: ERR: do failed for 'insert into `acct`
(`acct_delay_time`,`auth_un`,`download`,`ip_address`,`nas_ip_address`,`service`,`sess_id`,`sess_time`,`submitted_un`,`term_cause`,`timestamp`,`type`,`uid`,`upload`,`zone`)
values
('0','dsltest','116','192.168.100.100','192.168.100.1','testing','00000620','4','dsltest','User-Request','1284664794','logout','100','880','FED')':
SQL Timeout
Thu Sep 16 15:20:00 2010: ERR: do failed for 'insert into `acct`
(`acct_delay_time`,`auth_un`,`download`,`ip_address`,`nas_ip_address`,`service`,`sess_id`,`sess_time`,`submitted_un`,`term_cause`,`timestamp`,`type`,`uid`,`upload`,`zone`)
values
('0','dsltest','116','192.168.100.100','192.168.100.1','testing','00000620','4','dsltest','User-Request','1284664794','logout','100','880','FED')':
SQL Timeout
Thu Sep 16 15:20:00 2010: DEBUG: AuthBy SQL result: IGNORE, Database failure
### omitted here is the exact same debug lines from another Stop packet
from the NAS since it timed out and retransmitted.
On 10-09-16 03:00 PM, Hugh Irvine wrote:
>
> Hello Michael -
>
> We'll need to see a copy of the configuration file (no secrets), together with a more complete trace 4 debug showing what is happening.
>
> We will also need to know what hardware/software platform you are running on, what version of Perl, what version of DBI/DBD, what SQL database, and anything else that might be useful.
>
> regards
>
> Hugh
>
>
> On 16 Sep 2010, at 13:33, Michael wrote:
>
>> Hi,
>>
>> I'm having a couple issues with<AuthBy SQL>. Maybe it would be considered a bug i'm not sure.
>>
>>
>> 1. the Timeout handling.
>> ------------------------
>> > From my testing, it appears that radiator times out at this value, but seems to retry the sql query a second time, creating in another timeout count.
>>
>> eg debug:
>> Tue Sep 14 12:48:21 2010: DEBUG: Handling accounting with Radius::AuthSQL
>> Tue Sep 14 12:48:21 2010: DEBUG: do query is: 'insert into `acct`<snip>
>> Tue Sep 14 12:48:25 2010: ERR: do failed for 'insert into `acct`<snip> SQL Timeout
>> Tue Sep 14 12:48:29 2010: ERR: do failed for 'insert into `acct`<snip> SQL Timeout
>> Tue Sep 14 12:48:29 2010: DEBUG: AuthBy SQL result: IGNORE, Database failure
>>
>> Timeout is set for 4 seconds...
>> so, query executed at 12:48:21, ERR timed out 4 seconds later, appeared to re-try but didn't say anything, and another ERR timeout 4 seconds after that. That's 8 seconds of course. It doubles the Timeout value.
>>
>> This is no good, for me. If I set my SQL timeout value for 4 seconds, and my NAS timeout for 5 seconds, I expect my radiator to timeout before my NAS re-transmits. my NAS will retry after 5 seconds because radiator hasn't responded. And, radiator hasn't obeyed the timeout so it's still waiting for 8 seconds. This causes the same accounting packet to enter radiator again, and causing another 8 seconds delay, and of course duplicate entries in the accounting logging since I'm also using AcctFailedLogFileName so the packet will eventually end up in the SQL table.
>>
>>
>> 2. SQL Timeout issue #2.
>> ------------------------
>> using the same debug example above, when the SQL query times out, it doesn't seem to use the FailureBackoffTime value. It only seems to use FailureBackoffTime when there is a connection failure, not a timeout. So, every query is still presented to the SQL server. If the timeout is due to lets say a write lock, when the lock releases, all the queued insert statements are executed creating in sometimes up to 10 duplicate accounting entries.
>>
>>
>> 3. AuthBy result after failure
>> ------------------------------
>> an IGNORE occurs when the SQL query fails, but if AcctFailedLogFileName is used, and successfully wrote the accounting packet to the file, should radiator not then reply with an ACCEPT? Dumping the accounting packet to a file, but replying with IGNORE will cause the NAS to go to the next radius server possibly accepting the entry, therefore the packet in the acct failed logfile, is a double. Even if there's only 1 radiator server in play, there will be an accounting packet entry in the failed log for each time the NAS retries.
>>
>>
>> Thanks in advanced for any advice,
>> Michael
>>
>> _______________________________________________
>> radiator mailing list
>> radiator at open.com.au
>> http://www.open.com.au/mailman/listinfo/radiator
>
>
>
> NB:
>
> Have you read the reference manual ("doc/ref.html")?
> Have you searched the mailing list archive (www.open.com.au/archives/radiator)?
> Have you had a quick look on Google (www.google.com)?
> Have you included a copy of your configuration file (no secrets),
> together with a trace 4 debug showing what is happening?
>
More information about the radiator
mailing list