(RADIATOR) Multiple radius instances problem (possible remote consulting and professional services)
Hugh Irvine
hugh at open.com.au
Thu Apr 26 16:46:31 CDT 2007
Hello Sergio -
Thanks for the additional information.
It is not clear to me why the dialup instances take so much longer
than the DSL instances to do the authentication. You also don't show
how long the accounting is taking.
In any case, if the problem is slow LDAP and SQL databases you should
address those issues first.
I am guessing that there is some event like a DSL RAS rebooting that
is causing a burst of authentication requests that swamp the
authentication server(s).
How many RADIUS requests per second are hitting the boxes?
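
One way to answer that question from a trace is to bucket request arrival times by second. The following is a minimal sketch in Python, assuming the arrival timestamps have already been extracted from the log as non-negative epoch seconds. The parsing step is omitted because it depends on the log format, and the function names are illustrative and not part of Radiator.

```python
from collections import Counter


def requests_per_second(timestamps):
    """Map each whole second to the number of requests that arrived in it.

    `timestamps` is an iterable of non-negative epoch times (floats),
    assumed to have been parsed from the trace beforehand.
    """
    return Counter(int(t) for t in timestamps)


def peak_rate(timestamps):
    """Return the highest number of requests seen in any single second."""
    counts = requests_per_second(timestamps)
    return max(counts.values(), default=0)


if __name__ == "__main__":
    # Hypothetical arrival times, for illustration only.
    sample = [100.1, 100.5, 100.9, 101.2, 103.0]
    print(peak_rate(sample))
```

Comparing the peak rate with the average rate shows whether the load is steady or arrives in bursts.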
BTW - the numbers you show for the Sun LDAP server are consistent
with what I have observed at other sites - it doesn't seem to be able
to process more than about 10 requests per second. That being the
case, whenever more than 10 requests per second arrive at Radiator,
requests will queue up and you will have a problem.
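
To make the arithmetic concrete, here is a rough sketch in Python. It uses a simple fluid approximation (constant arrival rate, no retransmissions, no timeouts), which is an assumption made for illustration, and the function names are hypothetical. The 25 req/sec and 0.15 sec figures are the ones quoted in the message below.

```python
def backlog_after(arrival_rate, service_rate, seconds):
    """Requests still waiting after `seconds` of sustained load.

    Fluid approximation: the queue grows by (arrival - service) per
    second while arrivals exceed what the backend can process.
    """
    return max(0, (arrival_rate - service_rate) * seconds)


def capacity_from_latency(latency_s, workers=1):
    """Upper bound on requests/sec when each worker handles one request
    at a time and every request takes `latency_s` seconds."""
    return workers / latency_s


if __name__ == "__main__":
    # 25 req/sec arriving, backend sustaining about 10 req/sec, for one minute.
    print(backlog_after(25, 10, 60))
    # A single instance whose lookups take 0.15 sec each.
    print(round(capacity_from_latency(0.15), 2))
```

With these inputs, 25 req/sec against a backend that sustains 10 req/sec leaves 900 requests waiting after one minute, and a 0.15 sec lookup gives a single instance a ceiling of roughly 6.7 req/sec.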
There may also be a problem with inserting the accounting data into
the MySQL database, but you have not provided any information on that.
regards
Hugh
On 27 Apr 2007, at 07:28, Sergio Gonzalez wrote:
> Hello,
>
> A customer has the following configuration to authenticate more than
> 20,000 concurrent users every day:
>
> - Sun A: v240Z with 2 GB RAM, Solaris 9 64-bit, Radiator 3.14, Perl
> 5.8.7
> - Sun B: v240Z with 2 GB RAM, Solaris 9 64-bit, Radiator 3.14, Perl
> 5.8.7, MySQL Professional 5.0.17c
> - Sun C: v880 with 4 GB RAM, Solaris 9 64-bit, Radiator 3.14, Perl
> 5.8.7, MySQL Professional 5.0.17c
> - Sun D: v440 with 16 GB RAM, Solaris 9 64-bit, Sun LDAP Server 5
> - Sun E: v440 with 16 GB RAM, Solaris 9 64-bit, Sun LDAP Server 5
>
> Those radius servers answer requests from:
>
> - Around 35 dial-up RASes with more than 150 ports each
> - 4 DSL RASes with more than 7,000 ports each
>
> Each RADIUS server runs authentication and accounting instances. The
> authentication instances query the LDAP server (only one in practice;
> if the first fails, they query the other) and also the MySQL
> servers (in the same fashion as LDAP: the first, and if it fails, the
> second).
>
> Taking a Trace -1 and a LogMicroseconds from those instances I got:
>
> Dial-up instances: 8 req/sec max. Each authentication request takes
> 0.15 sec to complete. This means around 7 req/sec before going into
> the udp queue.
> DSL instances: 25 req/sec max. Each authentication request takes
> sec to complete. This means
>
> The auth and acct requests are distributed across the three servers
> like this:
>
> Sun A: 2 auth instances for Dial-up and 2 acct instances for Dial-up.
> Sun B: 2 auth instances for DSL and 2 acct instances for DSL.
> Sun C: 2 auth instances for DSL and 2 acct instances for DSL.
>
> Five days ago, the two dial-up auth instances on Sun A stalled.
> Not even radpwtst worked, but judging by the logfile the process
> seemed to be up and running (many entries were written every
> second, that is, many Access-Accept and Access-Reject records, so
> the whole process was working fine from Radiator's point of view).
> A few seconds after the Sun A instances stalled, the auth instances
> on Sun C stalled as well. The only way to recover was to deploy a
> config file for those instances with a "bypass" that simply accepts
> every request.
>
>
> On the three Radiator Sun servers the udp_recv_hiwat parameter is
> set to more than 8 million and the UDP buffer is set to the maximum,
> 64k (Solaris boundary). Also, when the instances stalled, many
> Access-Accepts never left the boxes, and many Access-Requests
> coming from the RASes never reached the Radiator application.
> It seems to be a socket buffer overflow problem.
>
> How do I fix this?
>
>
> Best Regards.
>
> Sergio Gonzalez
>
>
>
>
NB:
Have you read the reference manual ("doc/ref.html")?
Have you searched the mailing list archive (www.open.com.au/archives/
radiator)?
Have you had a quick look on Google (www.google.com)?
Have you included a copy of your configuration file (no secrets),
together with a trace 4 debug showing what is happening?
Have you checked the RadiusExpert wiki:
http://www.open.com.au/wiki/index.php/Main_Page