(RADIATOR) Suggestions for high volume system

Mon Apr 29 00:50:39 CDT 2002

Hello Viraj -

On Sun, 28 Apr 2002 23:06, Viraj Alankar wrote:
> Hello,
>
> I am wondering what's the best design for a high volume radius system. We
> are looking at on the order of 100-150 requests/second (auth+acct) on
> average. Does anyone here have a load balancing system setup? If so, I'd
> appreciate any tips on how you set this up.
>

I will send you a separate mail containing a copy of a paper I wrote for one 
of our customers that deals with part of this issue.

> After using Radiator for quite a while, I've found that the main things that
> cause slowdowns are database queries or network outages. I've noticed during
> network outages, some RASes (we have mostly Ascend) and proxy servers start
> flooding the server once the connectivity comes back. These appear to be
> queued requests (mostly accounting) on the systems. In this situation it
> pretty much kills our radius server (CPU -> 99%) and many times we have to
> run Radiator in a very basic configuration (no database, no authentication)
> for some time to cool things down. Many times I've even had to go to our
> firewall and block some RAS traffic.
>

You should have a look at a trace 4 debug using the LogMicroseconds parameter 
(requires Time-HiRes from CPAN). This will tell you how much time each 
processing step is taking, and in consequence how many requests per second 
you can deal with.
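
As a rough illustration of that arithmetic: if a single Radiator process 
handles requests one at a time, the sum of the per-step times bounds the rate 
it can sustain. The step names and timings below are made-up placeholders, not 
values from a real trace, and the function is only a sketch.

```python
def max_requests_per_second(step_seconds):
    """Upper bound on throughput when requests are processed one at a time.

    step_seconds: iterable of per-step processing times, in seconds,
    for a single request.
    """
    total = sum(step_seconds)
    if total <= 0:
        raise ValueError("total processing time must be positive")
    return 1.0 / total


# Hypothetical per-request timings (seconds); substitute the deltas
# you read off your own debug log.
steps = {"parse": 0.0005, "db_lookup": 0.008, "reply": 0.0015}

rate = max_requests_per_second(steps.values())
print(f"roughly {rate:.0f} requests/second per process")
```

With 10 ms of total processing per request the bound is about 100 requests 
per second, so the slowest step (usually the database) is the one to attack.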

> So I am just looking for some tips on how to setup a scalable system. We
> have a test system setup with a Foundry switch load balancing to 2 Radiator
> servers via roundrobin. However, in our tests we are noticing that the load
> balancing is not even when the source UDP port stays constant, which is for
> example when another Radiator is forwarding requests to it. It only seems
> to load balance properly when the source ports change. Anyone have any
> ideas what could be wrong here?
>

It sounds like the switch is using address/port pairs to determine how to 
load share. Radiator has three different load balancing modules that 
implement different algorithms. The most useful in this respect is usually 
the AuthBy LOADBALANCE module that distributes requests according to the 
response time of each target.
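
To see why a constant source port defeats that kind of load sharing, here is 
a small model. It is only an assumption about how such a switch might behave 
(hashing the source address/port pair), not Foundry's actual algorithm, and 
the server names are hypothetical.

```python
import zlib
from collections import Counter

SERVERS = ["radiator-1", "radiator-2"]  # hypothetical backend names


def pick_by_flow(src_ip, src_port, servers=SERVERS):
    """Choose a backend from a hash of the source address/port pair."""
    key = f"{src_ip}:{src_port}".encode()
    return servers[zlib.crc32(key) % len(servers)]


def distribution(flows):
    """Count how many requests each backend receives."""
    return Counter(pick_by_flow(ip, port) for ip, port in flows)


# A proxying Radiator that always sends from the same source port:
# every request hashes to the same backend.
fixed = distribution([("10.0.0.5", 1645)] * 1000)

# Clients whose source port changes per request: requests spread out.
varying = distribution([("10.0.0.5", 1024 + i) for i in range(1000)])

print("fixed source port:  ", dict(fixed))
print("varying source port:", dict(varying))
```

In this model the fixed-port traffic all lands on one server, which matches 
what you are seeing when another Radiator forwards requests to the switch.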

> What I was thinking was to instead setup one Radiator system that uses the
> AuthBy loadbalance clause instead of the Foundry switch. Any thoughts on
> this instead of hardware load balancing?
>

As mentioned above, you should check the switch to see what options you have 
for selecting the load balancing algorithm.

> The next issue is database slowdowns. I am thinking that the best setup
> would be for the RASes to go directly to Radiators that do not have any
> sort of DB dependency, and instead they proxy to respective servers that do
> have DB dependencies. For example:
>
>        A
>       / \
>      /   \
>     B     C
>    / \   / \
>   D   E F   G
>
> A = Radiator doing AuthBy loadbalance to B and C (or hardware switch)
> B/C = Radiator with only AuthBy RADIUS clauses
> D/E/F/G = Radiator with DB access
>
> The B and C trees would be identical. Does this sound like a proper setup?
>

Well, the problem here is that you will still have a single throttle point 
that will result in everything running at the speed of the database. In other 
words, B and C will still not send a reply to the NAS until the database 
query(s) complete.

> As far as the type of database access, we've mostly seen that accounting is
> what causes problems. I believe this is due to our table designs. For
> example, we have unique indexes to drop duplicate accounting, indexed on
> many fields. At some point when there is a lot of data, inserts become slow.
> I was thinking that Radiator's access to the DB should be made as fast as
> possible, and that Radiator should instead use the DB as sort of a log
> table for accounting (with no indexes at all), similar to writing to raw
> files. Then, periodically, an external process would process this data and
> move to the real accounting tables (with indexes, etc). This way, DB query
> time is kept to a minimum for accounting.
>

What you describe is a good solution - keep the processing that Radiator 
itself does to an absolute minimum.

You might also consider running two instances of Radiator on each host - one 
for authentication and the other for accounting.

> Another problem we have is the number of Handlers. We handle requests
> depending on the following:
>
> RAS IP
> RAS IP+DNIS
> RAS IP+DNIS+Realm
>
> With all of our devices, the number of handlers is getting quite large. I'm
> wondering what would be an upper bound on this and if there is a better way
> to handle this. We have almost 500 handlers at this point.
>

It is difficult to say anything sensible about your setup without seeing the 
configuration file and understanding your requirements.

Note that we do offer consulting and design services on a contract basis and 
we have done a large number of custom installations all around the world for 
many of our customers.

If you are interested in this service, please contact Joanne.

> Anyhow, I'd appreciate any info or tips anyone has on a large setup like
> this.
>

There have been a number of discussions on this topic on the mailing list, so 
I suggest you have a look at the archive site and do some searching.

regards

Hugh

-- 
Radiator: the most portable, flexible and configurable RADIUS server
anywhere. Available on *NIX, *BSD, Windows 95/98/2000, NT, MacOS X.
-
Nets: internetwork inventory and management - graphical, extensible,
flexible with hardware, software, platform and database independence.
===
Archive at http://www.open.com.au/archives/radiator/
Announcements on radiator-announce at open.com.au
To unsubscribe, email 'majordomo at open.com.au' with
'unsubscribe radiator' in the body of the message.