[Mageia-sysadm] Alamut memory problem

Michael Scherer misc at zarb.org
Sun Mar 6 12:57:28 CET 2011


 Hi,

 it seems that alamut's memory usage has been growing for a few days (in
 fact, we only noticed it now that we monitor it thanks to the xymon
 installation, but it has probably been going on for much longer).

 After some hours of debugging with dmorgan, I narrowed the problem down
 to viewvc, after looking carefully at the logs and with the help of
 watch -d and ps. Each time we see an IOError line in error.log, it is a
 sign that memory has been leaked (as we can see on the graphs if we
 look carefully). I suspect that the problem is on the viewvc side, as
 explained at:

 http://viewvc.tigris.org/ds/viewMessage.do?dsForumId=4254&dsMessageId=2364399
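 The watch -d and ps check above can be approximated with a small script.
 This is only a sketch, assuming a Linux ps that accepts "-eo rss,comm";
 the function names rss_by_command and snapshot are made up for this
 example and are not part of any existing tool.

```python
import subprocess
from collections import defaultdict


def rss_by_command(ps_output):
    """Sum resident memory (KiB) per command name from `ps -eo rss,comm` output."""
    totals = defaultdict(int)
    for line in ps_output.splitlines()[1:]:  # skip the header line
        parts = line.split(None, 1)
        if len(parts) != 2 or not parts[0].isdigit():
            continue  # ignore anything that does not look like "<rss> <command>"
        totals[parts[1].strip()] += int(parts[0])
    return dict(totals)


def snapshot():
    """Run ps once and return the per-command RSS totals (assumes procps-style ps)."""
    result = subprocess.run(
        ["ps", "-eo", "rss,comm"], capture_output=True, text=True, check=True
    )
    return rss_by_command(result.stdout)
```

 Calling snapshot() at regular intervals and comparing the total for the
 apache processes should show the same growth as the graphs.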

 Previously, we allowed wsgi web software to register custom signal
 handlers (
 http://svnweb.mageia.org/adm/puppet/modules/apache/templates/mod_wsgi.conf?r1=908&r2=1051
 ), but this prevented apache from restarting, so it was disabled again (
 http://svnweb.mageia.org/adm/puppet/modules/apache/templates/mod_wsgi.conf?r1=1051&r2=1195
 ).

  So basically, when we see this :

  [Tue Mar 01 07:31:43 2011] [error] [client 74.125.16.66] Traceback
  (most recent call last):
  [Tue Mar 01 07:31:43 2011] [error] [client 74.125.16.66]   File
  "/usr/share/viewvc/bin/wsgi/viewvc.wsgi", line 40, in application
  [Tue Mar 01 07:31:43 2011] [error] [client 74.125.16.66]
  viewvc.main(server, cfg)
  [Tue Mar 01 07:31:43 2011] [error] [client 74.125.16.66]   File
  "/usr/share/viewvc/lib/viewvc.py", line 4415, in main
  [Tue Mar 01 07:31:43 2011] [error] [client 74.125.16.66]
  view_error(server, cfg)
  [Tue Mar 01 07:31:43 2011] [error] [client 74.125.16.66]   File
  "/usr/share/viewvc/lib/viewvc.py", line 4403, in view_error
  [Tue Mar 01 07:31:43 2011] [error] [client 74.125.16.66]
  debug.PrintException(server, exc_dict)
  [Tue Mar 01 07:31:43 2011] [error] [client 74.125.16.66]   File
  "/usr/share/viewvc/lib/debug.py", line 81, in PrintException
  [Tue Mar 01 07:31:43 2011] [error] [client 74.125.16.66]
  server.write("<h3>An Exception Has Occurred</h3>\\n")
  [Tue Mar 01 07:31:43 2011] [error] [client 74.125.16.66]   File
  "/usr/share/viewvc/lib/sapi.py", line 252, in write
  [Tue Mar 01 07:31:43 2011] [error] [client 74.125.16.66]
  self._wsgi_write(s)
  [Tue Mar 01 07:31:43 2011] [error] [client 74.125.16.66] IOError:
  client connection closed

 some memory has been leaked.
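 To correlate these tracebacks with the memory graphs, the IOError lines
 can be counted per client. This is a minimal sketch, assuming the log
 format shown above, where each entry carries a "[client <ip>]" field and
 the IOError message sits on a single line in the real error.log; the
 function name count_closed_connections is hypothetical.

```python
import re
from collections import Counter

# Matches the "[client 74.125.16.66]" field of an Apache error log entry.
CLIENT_RE = re.compile(r"\[client ([^\]]+)\]")


def count_closed_connections(lines):
    """Count log lines containing 'IOError:' and group them by client address."""
    counts = Counter()
    for line in lines:
        if "IOError:" not in line:
            continue
        match = CLIENT_RE.search(line)
        counts[match.group(1) if match else "unknown"] += 1
    return counts
```

 Passing it an open error.log file object, for example
 count_closed_connections(open(path)), gives one count per client, which
 makes it easy to see whether crawlers such as googlebot dominate.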

 At first, I thought it was some users with crappy internet connections,
 but the client in this example is googlebot.

 So far, we have a few options:
 - going back to cgi would mask the errors, at the cost of speed
 - asking apache to recycle its children more often would also mask the
 error
 - re-allowing mod_wsgi to add a signal handler would likely do it too
 - fixing the leak on the viewvc side would IMHO be better, but may be
   too complex

 So I propose the following:

 - use python-flup to make viewvc run as an external fastcgi process, so
 it can register signal handlers as it sees fit and does not interfere
 with apache. This also brings consistency (since we use fastcgi for
 almost everything, except bugzilla and django applications), and would
 allow finer-grained control (as we can see how much memory each process
 takes and plan resources based on this).
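 As a rough idea of what the flup side could look like, here is a
 sketch, not the actual viewvc setup. The application function is a
 placeholder standing in for the viewvc WSGI entry point, and serve and
 its socket_path argument are names invented for this example. It
 assumes a flup version that provides flup.server.fcgi.WSGIServer for
 the Python in use.

```python
def application(environ, start_response):
    """Placeholder WSGI app; a real deployment would call into viewvc here."""
    body = b"hello from an external FastCGI process\n"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


def serve(socket_path):
    """Serve `application` over FastCGI on a unix socket, outside of apache."""
    # Imported here so the module can be loaded and tested without flup installed.
    from flup.server.fcgi import WSGIServer

    WSGIServer(application, bindAddress=socket_path).run()
```

 Since the process runs outside of apache, it is free to install its own
 signal handlers, and it shows up as a separate entry in ps.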

 WDYT ?

-- 
  Michael Scherer


