(RADIATOR) Radiator stalled.

Sergio Alejandro Gonzalez Z (S2010) sagonzal at sky.net.co
Tue Aug 23 11:48:38 CDT 2005


Hello there:

I have a weird problem, so I'm posting it here looking for
advice.

I have two Sun V440 servers, each with 2 GB of RAM and 2
UltraSPARC processors. Both of them are running:

- One proxy load-balanced instance for authentication
(1645/UDP).
- One proxy load-balanced instance for accounting
(1646/UDP).
- One instance for authentication (1745/UDP) and
- One instance for accounting (1845/UDP).

The authentication instances connect to two LDAP servers in
an AuthBy Group clause (for HA).

Both the authentication and the accounting proxy balancers
send requests to the inner authentication and accounting
instances on both servers, like this:

Let's say A and B are the two Sun V440s.

The authentication load-balancing Radiator on server A
sends requests in round-robin fashion to the authentication
instance on server A and to the authentication instance on
server B.

The accounting load-balancing Radiator on server A sends
requests in round-robin fashion to the accounting instance
on server A and to the accounting instance on server B.

The same happens on server B (with the roles of A and B
reversed).
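
The rotation described above can be sketched in a few lines
of Python. This is only an illustration of the round-robin
idea, not Radiator's own implementation; the host names "A"
and "B" and the RoundRobin class are hypothetical names
chosen for the example, while the ports are the ones listed
earlier.

```python
from itertools import cycle

# Hypothetical backend lists: "A" and "B" stand for the two V440s,
# the ports are the inner instance ports given in this post.
AUTH_BACKENDS = [("A", 1745), ("B", 1745)]
ACCT_BACKENDS = [("A", 1845), ("B", 1845)]


class RoundRobin:
    """Hand out backends in strict rotation, one per request."""

    def __init__(self, backends):
        # cycle() repeats the list forever in its original order.
        self._ring = cycle(list(backends))

    def next_backend(self):
        return next(self._ring)


if __name__ == "__main__":
    auth_lb = RoundRobin(AUTH_BACKENDS)
    for i in range(4):
        print("auth request", i, "->", auth_lb.next_backend())
```

Note that a plain rotation like this keeps sending every
second request to a backend that has stopped answering,
which matches the symptom of a stalled inner instance
affecting requests arriving at either balancer.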

Everything worked OK on both servers for about a month and
a half, until this morning. Authentication started to fail,
but accounting was OK.

The authentication instances' log files on both servers
(A, B) showed that the connections to both LDAP servers had
been lost. Looking for the cause, I ran an ldapsearch from
both servers A and B, with successful results. Servers A
and B can see the LDAP servers, so the problem (I guessed)
was in the connections opened by each authentication
instance on servers A and B to the LDAP servers. I then
started another authentication instance on the same servers
A and B, listening for requests on a different UDP port
(11000/UDP), and it worked OK. Still guessing that the
problem was the socket to the LDAP servers, I restarted the
"production" authentication instances on both servers, but
this "trick" didn't work: radpwtst got no reply from the
1745/UDP port of either instance, not even when tested
directly (radpwtst -auth_port).
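
For readers who want to see what such a direct probe amounts
to, here is a minimal Python sketch that sends one RADIUS
Access-Request (RFC 2865) to a UDP port and reports whether
any reply arrives. It is not radpwtst and not part of
Radiator; the function names, user name, password and shared
secret are all hypothetical placeholders for illustration.

```python
import hashlib
import os
import socket
import struct


def hide_password(password: bytes, secret: bytes, authenticator: bytes) -> bytes:
    """Hide User-Password as described in RFC 2865 section 5.2."""
    # Pad with NULs to a multiple of 16 bytes (at least one block).
    padded = password + b"\x00" * (-len(password) % 16) or b"\x00" * 16
    out, prev = b"", authenticator
    for i in range(0, len(padded), 16):
        digest = hashlib.md5(secret + prev).digest()
        block = bytes(p ^ d for p, d in zip(padded[i:i + 16], digest))
        out += block
        prev = block  # each block is chained to the previous ciphertext
    return out


def build_access_request(user: bytes, password: bytes, secret: bytes,
                         ident: int = 1) -> bytes:
    """Build an Access-Request (code 1) with User-Name and User-Password."""
    authenticator = os.urandom(16)
    hidden = hide_password(password, secret, authenticator)
    attrs = struct.pack("!BB", 1, 2 + len(user)) + user        # User-Name
    attrs += struct.pack("!BB", 2, 2 + len(hidden)) + hidden   # User-Password
    header = struct.pack("!BBH", 1, ident, 20 + len(attrs))
    return header + authenticator + attrs


def probe(host: str, port: int, packet: bytes, timeout: float = 3.0):
    """Send the packet; return the reply's RADIUS code, or None if no reply."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as s:
        s.settimeout(timeout)
        try:
            s.sendto(packet, (host, port))
            data, _ = s.recvfrom(4096)
        except OSError:  # includes socket.timeout
            return None
    return data[0] if data else None


if __name__ == "__main__":
    # Hypothetical credentials and secret; 1745 is the inner auth port.
    pkt = build_access_request(b"testuser", b"testpass", b"mysecret")
    print("reply code:", probe("127.0.0.1", 1745, pkt))
```

A result of None corresponds to the "no reply" case
described above; 2 would be an Access-Accept and 3 an
Access-Reject. The sketch does not verify the response
authenticator, so it only shows whether the instance
answers at all.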

Only after 4 kill -9 to the authentication instances' PIDs
did they start to answer again.


My question is: can the balancing scheme I'm using here (I
mean the proxy load balancing) be the problem? If not, what
do you suggest?


Thanks a lot in advance.




Sergio Gonzalez
IT Engineer.

--
Archive at http://www.open.com.au/archives/radiator/
Announcements on radiator-announce at open.com.au
To unsubscribe, email 'majordomo at open.com.au' with
'unsubscribe radiator' in the body of the message.