[Mageia-sysadm] Alamut memory problem
Michael Scherer
misc at zarb.org
Sun Mar 6 12:57:28 CET 2011
Hi,
it seems that alamut's memory usage has been growing for a few days (in
fact, we only noticed it now that we monitor it thanks to the xymon
installation, but it has probably been going on for much longer).
After some hours of debugging with dmorgan, I narrowed the problem down
to viewvc, after looking carefully at the logs and with the help of
watch -d and ps. Each time we see an IOError line in error.log, that is
the sign that memory has been leaked (as we can see on the graphs if we
look carefully). I suspect that the problem is on the viewvc side, as
explained at:
http://viewvc.tigris.org/ds/viewMessage.do?dsForumId=4254&dsMessageId=2364399
Previously, we allowed wsgi web software to register custom signal
handlers (
http://svnweb.mageia.org/adm/puppet/modules/apache/templates/mod_wsgi.conf?r1=908&r2=1051
), but this prevented apache from restarting, and so it was disabled again (
http://svnweb.mageia.org/adm/puppet/modules/apache/templates/mod_wsgi.conf?r1=1051&r2=1195
).
So basically, when we see this :
[Tue Mar 01 07:31:43 2011] [error] [client 74.125.16.66] Traceback (most recent call last):
[Tue Mar 01 07:31:43 2011] [error] [client 74.125.16.66] File "/usr/share/viewvc/bin/wsgi/viewvc.wsgi", line 40, in application
[Tue Mar 01 07:31:43 2011] [error] [client 74.125.16.66] viewvc.main(server, cfg)
[Tue Mar 01 07:31:43 2011] [error] [client 74.125.16.66] File "/usr/share/viewvc/lib/viewvc.py", line 4415, in main
[Tue Mar 01 07:31:43 2011] [error] [client 74.125.16.66] view_error(server, cfg)
[Tue Mar 01 07:31:43 2011] [error] [client 74.125.16.66] File "/usr/share/viewvc/lib/viewvc.py", line 4403, in view_error
[Tue Mar 01 07:31:43 2011] [error] [client 74.125.16.66] debug.PrintException(server, exc_dict)
[Tue Mar 01 07:31:43 2011] [error] [client 74.125.16.66] File "/usr/share/viewvc/lib/debug.py", line 81, in PrintException
[Tue Mar 01 07:31:43 2011] [error] [client 74.125.16.66] server.write("<h3>An Exception Has Occurred</h3>\\n")
[Tue Mar 01 07:31:43 2011] [error] [client 74.125.16.66] File "/usr/share/viewvc/lib/sapi.py", line 252, in write
[Tue Mar 01 07:31:43 2011] [error] [client 74.125.16.66] self._wsgi_write(s)
[Tue Mar 01 07:31:43 2011] [error] [client 74.125.16.66] IOError: client connection closed
there is a problem.
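To line these errors up against the memory graphs, something like the following could pull the IOError timestamps out of error.log. This is only a sketch: it assumes each log entry sits on one line in the format shown above, and the names ioerror_events and ioerrors_per_hour are made up for the illustration.

```python
import re
from collections import Counter
from datetime import datetime

# Prefix of an error.log line as shown in the traceback above, e.g.
# [Tue Mar 01 07:31:43 2011] [error] [client 74.125.16.66] IOError: ...
LINE_RE = re.compile(
    r"^\[(?P<ts>[^\]]+)\] \[error\] \[client (?P<client>[^\]]+)\]\s+(?P<msg>.*)$"
)


def ioerror_events(lines):
    """Yield (timestamp, client) for every line whose message starts with IOError."""
    for line in lines:
        match = LINE_RE.match(line)
        if match and match.group("msg").startswith("IOError"):
            ts = datetime.strptime(match.group("ts"), "%a %b %d %H:%M:%S %Y")
            yield ts, match.group("client")


def ioerrors_per_hour(lines):
    """Count IOError lines per hour, to compare with the hourly memory graph."""
    counts = Counter()
    for ts, _client in ioerror_events(lines):
        counts[ts.replace(minute=0, second=0)] += 1
    return counts
```

Feeding it an open error.log (for example ioerrors_per_hour(open(path)), with path being wherever the log lives on the server) gives one counter entry per hour in which at least one IOError was logged.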
At first, I thought it was some users with a poor internet connection,
but the client in this example is googlebot.
So far, we have a few options:
- going back to cgi would mask the errors, at the cost of speed
- asking apache to recycle its children more often would also mask the
error
- re-allowing mod_wsgi to add a signal handler would likely do it too
- fixing the leak on the viewvc side would IMHO be better, but is maybe
too complex
So I propose the following:
- use python-flup to make viewvc run as an external fastcgi process, so it
can register signal handlers as it sees fit and does not interfere with
apache. This also brings consistency (since we use fastcgi for almost
everything, except bugzilla and django applications), and would allow
finer-grained control (as we can see each process taking memory and
plan resources based on this).
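A rough sketch of what the external process could look like with flup follows. The application function here is only a placeholder standing in for viewvc's real WSGI entry point, and the socket path is an invented example; the only flup calls used are WSGIServer(..., bindAddress=...) and run() from flup.server.fcgi.

```python
def application(environ, start_response):
    """Placeholder WSGI app; the real one would come from viewvc's wsgi script."""
    body = b"viewvc placeholder\n"
    start_response(
        "200 OK",
        [("Content-Type", "text/plain"), ("Content-Length", str(len(body)))],
    )
    return [body]


def serve(socket_path="/tmp/viewvc-fcgi.sock"):
    """Run the app as a standalone FastCGI server (socket path is hypothetical)."""
    # Imported here so the module can be loaded without flup installed.
    from flup.server.fcgi import WSGIServer

    # This runs in its own process, so any signal handler installed here
    # (by flup or by the application) cannot interfere with apache itself.
    WSGIServer(application, bindAddress=socket_path).run()
```

Calling serve() from an init script would start the process, and apache would then only need to be pointed at the socket. The memory used by viewvc would also show up under its own pid in ps.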
WDYT ?
--
Michael Scherer