(RADIATOR) Radiator going down after Oracle SQL Timeout

Mariano Absatz radiator at lists.com.ar
Fri Dec 14 12:13:50 CST 2001


The point is: when you add one more server to the "back farm", why do you do 
it? Because you can't process enough RADIUS requests, or because the servers 
themselves are overloaded? The most important factor is usually the 
database. What are you using? SQL? DBM? LDAP? /etc/passwd? Where does it 
reside?
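
As a rough back-of-the-envelope illustration of why the database matters so 
much for a single-threaded server, here is a minimal sketch. The function 
name and the latency figures are made up for illustration (they are not 
measurements from this thread), and it assumes each process handles one 
request at a time and simply blocks while it waits for the database.

```python
def max_requests_per_second(db_wait_s, cpu_s, processes=1):
    """Upper bound on throughput when every process handles one request at a
    time and blocks on the database for db_wait_s seconds per request."""
    if processes < 1:
        raise ValueError("need at least one process")
    per_request = db_wait_s + cpu_s
    if per_request <= 0:
        raise ValueError("per-request time must be positive")
    return processes / per_request


# Hypothetical figures: 1 ms of RADIUS work, 49 ms waiting on the database.
single = max_requests_per_second(db_wait_s=0.049, cpu_s=0.001)
several = max_requests_per_second(db_wait_s=0.049, cpu_s=0.001, processes=4)
print(f"1 process: {single:.0f} req/s, 4 processes: {several:.0f} req/s")
```

With these invented numbers almost all of the per-request time is database 
wait, so extra processes only help as long as the database can actually 
answer the additional concurrent queries. If the database itself is the 
bottleneck, adding boxes to the farm will not buy much.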

A happy new year to you too!

On 14 Dec 2001 at 14:23, Harrison Ng wrote:

> Hello Mariano, 
> 
> Your Radiator splitting method sounds interesting to us. 
> We will try it in our test environment and gain some experience with it. 
> 
> What we are doing here is using two Radiator instances to proxy 
> auth/acct requests to a bunch of Radiator servers. 
> We add more Radiator servers to the bunch for scaling. 
> Eventually we end up having to manage too many Linux boxes. 
> This costs us a lot of administrative overhead and money 
> for maintenance. 
> 
> Merry Christmas and Happy New Year. 
> 
> Regards, 
> Harrison 
> 
> 
> -----Original Message----- 
> From: Mariano Absatz [mailto:radiator at lists.com.ar] 
> Sent: Thursday, December 13, 2001 9:28 PM 
> To: Harrison Ng 
> Cc: Radiator List 
> Subject: RE: (RADIATOR) Radiator going down after Oracle SQL Timeout 
> 
> 
> Well, 
> 
> I think this was discussed quite a few times on the list and was
> recommended by Hugh. 
> 
> The point is, precisely, the "single-thread-ness" of Radiator (inherited
> from the still-unstable state of Perl's multi-threading). 
> 
> While Radiator IS really fast, the databases it interfaces with are not 
> necessarily fast (nor always available, as the problem I had shows). 
> 
> In my case, I'm using an Oracle database to authenticate users and also
> to store accounting records and on-line users. For now, these all reside
> in the same database on the same host (not the same host that is running
> Radiator), but I designed it so that it can scale and the databases can be
> divided functionally.
> 
> 
> But even with everything on the same host, splitting Radiator into
> separate authentication and accounting processes means that database
> delays while querying the tables to authenticate don't stop the accounting
> process from receiving and storing accounting records and maintaining the
> on-line users table, and vice versa. 
> 
> If I found that the process was still too slow and the culprit was the 
> database, I might even be tempted to leave 2 Radiator instances
> listening on the standard ports for authentication and accounting records
> and have them load-balance among a bunch of authentication and accounting
> Radiator processes, all running on non-standard ports on the same host. 
> 
> On 13 Dec 2001 at 10:48, Harrison Ng wrote: 
> 
> > Hello Mariano, 
> > 
> > Do you mind telling me the purpose of running 
> > two instances of Radiator on the same Unix box? 
> > 
> > I've heard that Radiator is a single-threaded Perl application, 
> > so it can't fully utilize system resources. 
> > 
> > Harrison 
> > SmarTone BroadBand Services Ltd. 
> > 
> > 
> > 
> > -----Original Message----- 
> > From: owner-radiator at open.com.au [mailto:owner-radiator at open.com.au] On 
> > Behalf Of Hugh Irvine 
> > Sent: Thursday, December 13, 2001 9:14 AM 
> > To: Mariano Absatz; Radiator List 
> > Subject: Re: (RADIATOR) Radiator going down after Oracle SQL Timeout 
> > 
> > 
> > 
> > Hello Mariano - 
> > 
> > What you describe below sounds to me like a problem with the DBD-Oracle 
> > module. I would suggest that you try to use the "restartWrapper" program 
> > that we provide in the distribution ("goodies/restartWrapper") instead of 
> > "supervise" (at least for debugging this problem). The restartWrapper 
> > program can be set up with a delay before restarting, and it can also be 
> > configured to email a designated email address with the exit status and 
> > any error messages that were written to stderr. We should then be able to 
> > see what is causing Radiator to die. 
> > 
> > regards 
> > 
> > Hugh 
> > 
> > 
> > On Thu, 13 Dec 2001 08:14, Mariano Absatz wrote: 
> > > Hi, 
> > > 
> > > I'm having the following problem: 
> > > 
> > > I'm using Radiator (2.18.4) and have all of my data on a remote Oracle 
> > > (8.1.6) server. 
> > > 
> > > Both machines are Sun Netra with Solaris 8. Perl version is 5.6.1. 
> > > 
> > > There are two instances of Radiator (one for authentication and the 
> > > other for accounting). 
> > > 
> > > The problem is the following. If the Oracle server goes down, the 
> > > queries time out (that's reasonable). The point is that sometimes (not 
> > > after every SQL timeout, but after some of them) Radiator goes down. It 
> > > seems that this happens when the query in question is a necessary part 
> > > of the authentication (e.g. during a username lookup or a 
> > > simultaneous-use or port-limit check), but not when it is nonessential 
> > > (such as a deletion from the radonline table for the nas/port recently 
> > > received or an insertion in an AuthLog). 
> > > 
> > > On only one occasion I saw the "Could not connect to any SQL database. 
> > > Request is ignored. Backing off for 600 second" message, but even that 
> > > time, Radiator went down. 
> > > 
> > > I'm using daemontools' supervise (http://cr.yp.to/daemontools.html) to 
> > > keep the servers running, so the server starts up again almost 
> > > immediately. I see the messages when it is starting again in the log. 
> > > 
> > > The question is, why is Radiator silently shutting down rather than 
> > > backing off? 
> > > 
> > > One of the main problems is that on the almost immediate restart, the 
> > > first thing Radiator tries to do is to read the client list from the 
> > > database. If Oracle is still down, it won't read it, it won't retry, and 
> > > (since there are no hardwired <Client>'s in the config file) it won't 
> > > accept anything from any NAS. 
> > > 
> > > Unfortunately, supervise's log is autorotated and autoerased on a size 
> > > basis and I don't have the output to correlate with Radiator's log. 
> > > 
> > > I'm attaching parts of the logs showing the SQL Timeout error 
> > > immediately followed by Radiator starting up again (via supervise). 
> > > 
> > > The "DEBUG: Adding Clients from SQL database" is the first message 
> > > issued by a NEW Radiator starting. 
> > > 
> > > I'm also attaching the whole set of configuration files (the main one 
> > > is radius-main.cfg) in a zip file. 
> > 
> > -- 
> > Radiator: the most portable, flexible and configurable RADIUS server 
> > anywhere. Available on *NIX, *BSD, Windows 95/98/2000, NT, MacOS X. 
> > - 
> > Nets: internetwork inventory and management - graphical, extensible, 
> > flexible with hardware, software, platform and database independence. 
> > === 
> > Archive at http://www.open.com.au/archives/radiator/ 
> > Announcements on radiator-announce at open.com.au 
> > To unsubscribe, email 'majordomo at open.com.au' with 
> > 'unsubscribe radiator' in the body of the message. 
> > 
> > 
> 
> 
> -- 
> Mariano Absatz 
> El Baby 
> ---------------------------------------------------------- 
> Logic: The art of being wrong with confidence... 
> 
> 
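
On the restartWrapper suggestion quoted above: the general idea (restart 
after a delay, and keep the exit status and stderr so they can be examined) 
can be sketched roughly as follows. This is NOT the actual 
goodies/restartWrapper script nor its options; the function name, the 
parameters and the stand-in child command are all hypothetical, and a 
callback takes the place of the email notification.

```python
import subprocess
import sys
import time


def run_with_restart(cmd, delay_s=5.0, max_restarts=3, report=print):
    """Run cmd; each time it exits, report its exit status and stderr,
    wait delay_s seconds, then start it again (at most max_restarts times)."""
    history = []
    for attempt in range(max_restarts + 1):
        # Capture stderr so the reason for the exit is not lost.
        proc = subprocess.run(cmd, stderr=subprocess.PIPE, text=True)
        history.append((proc.returncode, proc.stderr))
        # A real wrapper might email this; here a callback stands in for that.
        report(f"attempt {attempt}: exit status {proc.returncode}, "
               f"stderr: {proc.stderr.strip()!r}")
        if attempt < max_restarts:
            # The delay gives a failed database some time to come back.
            time.sleep(delay_s)
    return history


# Stand-in for the server: a child that writes to stderr and exits with 3.
child = [sys.executable, "-c",
         "import sys; sys.stderr.write('db timeout\\n'); sys.exit(3)"]
history = run_with_restart(child, delay_s=0.1, max_restarts=1)
```

A real server normally never exits, so a real wrapper would loop 
indefinitely; max_restarts is only there to keep the sketch finite. The 
delay is what matters for the problem described in this thread: it avoids 
restarting immediately into a database that is still down.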


--
Mariano Absatz
El Baby
----------------------------------------------------------
This time it will surely run. I just found the last bug. 



