[Mageia-sysadm] questions about our infrastructure setup & costs

Michael Scherer misc at zarb.org
Mon Apr 2 21:00:30 CEST 2012

On Mon, Apr 02, 2012 at 06:02:48PM +0200, Romain d'Alverny wrote:
> On Mon, Apr 2, 2012 at 16:59, Michael Scherer <misc at zarb.org> wrote:
> > On Monday 02 April 2012 at 15:23 +0200, Romain d'Alverny wrote:
> > That's a rather odd question, since with your treasurer hat, you should
> > have all infos, so I do not really see what we can answer to that.
> If I ask, it's that I don't. *\o/*

Well, you have the hardware we paid for, no?
The accounting is on the bank account, and despite requiring some searching, it
was published.

If you need information on the hardware we got at the beginning, I can send
to this list a partial history of where
every piece of hardware came from, if that is what you need, but I prefer to
be sure that's really what you need before spending an afternoon on it.

When the servers were set up, I proposed deploying GLPI to have an inventory done
automatically, but I was told that a hand-made one should be sufficient, so I didn't.
And the hand-made one never appeared.

We can publish the yaml file from puppet, which gives enough information for someone
wanting an inventory now: mac address, bios information, serial number, memory, etc.
Or we can just run a for loop with lshw and push the output somewhere, so people can
see what hardware we have.
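That for loop could look roughly like this. The host list below is only a partial, illustrative one, and ssh access as root is an assumption; the function just prints the commands, so the sketch is safe to dry-run before piping it to sh.

```shell
#!/bin/sh
# Sketch: dump lshw output from each server into one file per host.
# Host names are illustrative, not the complete machine list.
hosts="valstar alamut champagne krampouezh"

inventory_cmd() {
    # Build the command for one host; pipe the output to sh to execute.
    # lshw -quiet suppresses the progress display.
    printf 'ssh root@%s lshw -quiet > inventory-%s.txt\n' "$1" "$1"
}

for h in $hosts; do
    inventory_cmd "$h"
done
```

The resulting per-host files could then be pushed to a public place so anyone can grasp what hardware we have.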

Or we can deploy glpi/fusion-inventory/etc, since they are packaged, if that's also
the type of information you need.

> And it would greatly appreciated that sysadmin decipher these things
> for a more accessible understanding of the infrastructure, not only
> for me, but for other future people that may not have your technical
> background.

I still do not know what to answer.
Basically:

Ecosse and jonund are purely for the buildsystem.

Alamut is where all our web applications are running.
See the zone file, as it reflects exactly what is running where:

The only exceptions are the blog and planet, on krampouezh and champagne
( one with the mysql db, the other one with the application ).

Champagne also holds most of the static websites.

Mailing lists are on alamut too.
postgresql is on alamut, and so is the processing of our mail aliases
( ie, postfix + spamassassin, etc ).

Valstar controls the buildsystem ( job dispatch ),
serves as puppetmaster, and hosts git, svn and the ldap master.

So the buildsystem is roughly 3 servers + an ARM board, until we start to use
postgresql, in which case we would have to take alamut into account.

We can also take rabbit into account for the iso production.

> > The list of servers is in puppet :
> > http://svnweb.mageia.org/adm/puppet/manifests/nodes/
> >
> > and each has some module assigned to it, take this as functional chunk.
> >
> > Unfortunately, the servers are all doing more than one tasks, so
> > splitting them in functional chunks do not mean much.
> Yes it does. That's exactly this other view that I'm asking for.
> Because it makes sense to know what the "buildsystem" chunk/unit
> costs, as a whole, in storage/bandwidth/hosting options, in comparison
> with the "user accounts" one, with the "Web sites" one, with the
> "mailing-lists" one, etc. so that we can consider different hosting
> options for each of them.

We do not have per-chunk accounting, so the bandwidth figures are aggregated.
We can add accounting, but that would require some iptables tricks, and
we would have to let it run for a month to have proper data.
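One way to do the iptables trick, sketched under the assumption that each chunk can be identified by a service port ( the port/label pairs below are illustrative ): a rule with no -j target never diverts traffic, it only increments its packet/byte counters, which is exactly what accounting needs.

```shell
#!/bin/sh
# Sketch: per-service traffic accounting with counter-only iptables
# rules. A rule without a -j target does not affect packet flow; it
# only updates its counters. Ports and labels are illustrative.
acct_rules() {
    port=$1 label=$2
    # Count traffic to the service ( inbound ) and from it ( outbound ).
    printf 'iptables -A INPUT -p tcp --dport %s -m comment --comment %s\n' "$port" "$label"
    printf 'iptables -A OUTPUT -p tcp --sport %s -m comment --comment %s\n' "$port" "$label"
}

# Emit the rules; pipe through sh ( as root ) to install them.
acct_rules 80 www
acct_rules 25 smtp
# A month later, read the byte counters with: iptables -L -v -x -n
```

This only separates traffic by port, so chunks sharing a port ( e.g. several web applications behind the same httpd ) would still be aggregated.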

For the housing, we would need the size of each server ( I can provide
pictures of each server if someone wants to count ) and the power consumption,
and I do not know how to get the latter.

Storage is the only thing we can get:
svn is 180 GB ( 82 on alamut, ie without the svn for binaries )
the distrib tree is 550 GB
the bin repo is 100 GB

the whole postgresql setup is 1 GB
the sympa archives are 5.8 GB

The rest likely does not need to be counted for storage.

And for the specs of the servers, see the start of this mail.

> It makes sense too to have a dependencies graph at some level to
> quickly explain/see how our infrastructure components work with each
> other.

We have done a not-so-bad job of pooling resources; for example,
both bugzilla and sympa use the same ldap and the same postgresql.

Splitting them would mean more work ( the current setup needs some work to use
more than a single sql server, for example, but it could be done; there are
hooks for that ).

> Because, these considerations need to be understood/thought out/taken
> also by people that are _not_ sysadmins, such as persons that may
> focus on organizing our financial resources or contacts or donations
> for that end.
> Think about this as "enabling other community members to grasp what it
> takes to make Mageia.org function and to contribute means to that
> end".
> For instance, if we can have functional chunks, we may decide:
>  - which ones are critical (and should be moved to a solution that has
> the greatest possible availability): for instance LDAP, identity,
> mailing-lists, www, blogs;
>  - which ones may more safely shutdown without shutting down other
> independent services from a functional point of view; (for instance,
> buildsystem could go down without injuring the code repository itself,
> or the mailing-lists; code repositories may shut down too, if a later,
> read-only copy is available somewhere else
>  - which ones may be redundant and how.

So on the redundancy topic, there are two parts:
- redundant hardware ( RAID, dual power supplies, dual ethernet cards, etc )
- redundant software

For the hardware part, we have several limitations, the most notable being that
we have no spare servers, nor a warranty on most of them. And since they are far
away, the only documentation is the photos I took, which are currently on my phone
( because we didn't have time to do everything that was planned last time and
finished just in time ).

For the software part, most services would require a redundant file system.
More than half of our services depend on a single filesystem.
- Puppetmaster depends on a sqlite database ( the plan is to migrate to postgresql
sooner or later, as sqlite does not scale ), so for now we cannot make it redundant.
- identity depends on ldap write access, and we didn't set it up that way ( ie, there is a single master )
- epoll depends on the FS, and postgresql
- bugzilla depends on the FS ( for attachments ), and postgresql ( mostly R/W )
- transifex depends on the FS ( for files ), and postgresql ( R/W )
- the buildsystem depends on the FS ( for the queuing of jobs ), but the builders are redundant
- svn and git depend on the FS ( but there is a replica on alamut for viewvc )
- viewvc can be made redundant without trouble
- planet and blog depend on the FS ( for pictures, cache ) and mysql. Planet can however
be made redundant quite easily.
- mga::mirrors depends on postgresql, but read-only, so it could be made redundant without trouble
- sympa depends on the FS and postgresql, read/write
- postfix can be made redundant, and already is ( it could be improved, however )
- xymon depends on the FS
- mediawiki depends on the FS ( attachments ) and postgresql ( R/W )
- youri report depends on postgresql, and is stateless, so it can be restarted from scratch
- pkgcpan is stateless too, so it can be moved somewhere else fast, like most static websites
( hugs, releases, www, static )
- dns is fully redundant and can be deployed fast, provided the puppetmaster is not impacted
- the main mirror can be duplicated, that's the goal of a mirror, but it would require contacting several people
- maintdb depends on the FS

So we can:
- make mga::mirrors redundant ( it would be a good idea, in fact )
  - requires a redundant setup for postgresql
  - round-robin DNS
  - make sure urpmi/drakx work fine in case of failure
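For the round-robin DNS part, the zone file change would be roughly the following ( the addresses are placeholders from the documentation range, not our real ones ):

```
; Round-robin: several A records for one name; resolvers rotate
; between them, spreading clients over the redundant servers.
mirrors   300   IN   A   192.0.2.10
mirrors   300   IN   A   192.0.2.11
```

A short TTL ( 300 here ) keeps failover reasonably fast when one of the addresses has to be pulled out.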

- make planet and viewvc redundant. Not so useful IMHO

- improve postfix redundancy ( ie, letting krampouezh distribute mail to aliases )
  - would require a small tweak to the postfix module, but it is currently a bit messy

- make puppetmaster scalable and redundant
  - needs someone to fix thin, and then I can move from sqlite to postgresql using
    some ruby script

- improve the identity/catdap/ldap setup to have it work in case of problems
  - applications should be tested for proper failover
  - ldap should be set up as multi-master

But again, I am not sure identity is the most important thing to improve, and ldap
is already doubled, at least as read-only.

And for everything that depends on the FS, the problem is simple:
- if the filesystem is toast, we cannot do much, except restore from backups
- if we want to make those services redundant, we need to make the FS redundant and
shared, and make sure the software supports it. For example, I am not sure that svn
would work if the repository was shared over nfs ( maybe that has changed ). Same goes for

There are various ways of doing this. We can either do it at the filesystem level
( lustre, gfs, ocfs2, gluster ), or export over nfs from some NAS ( like a netapp ).
Solutions at the hardware level are expensive. Solutions at the filesystem level
can be divided into 2 categories:
- those I do not know
- those where people told me they are cra^W^W have a lot of potential to be improved

As a side note, I doubt we ship the required userspace tools in Mageia for the 4 filesystems I listed.

Another solution would be to work on fast failure recovery, ie being able to reinstall
a server fast ( that's half done by puppet; backup restoration should be the other half ).

Or to work in a way that would be less centralized ( for example, for the buildsystem,
using git instead of svn for packages, or using a more scalable system than the current
one, for example something based on amqp with redundant queues, etc ).
There are distributed bugtrackers ( ditz, SD, Bugs Everywhere, see https://lwn.net/Articles/281849/ ),
there are distributed wikis ( mostly based on git, bzr ), so we could have a different set
of services that would work fine in case of failure.

> Yes, of course, some systems already split up like this (www, blog,
> ldap are not in Marseille here). But apart from sysadmin, no one
> knows. No one knows either what are the plans.

Because no one took the time to even search, or to post on the mailing list,
which is sad.

All the solutions for a redundant setup I can currently think of would require more
than hardware, so no matter how long I explain the current setup, it would
unfortunately still require someone with enough sysadmin skills to finish
the job.

Hence the proposal https://wiki.mageia.org/en/SummerOfCode2012#Ease_the_integration_of_new_sysadmins_.28draft.29

> Splitting by function, dependency seems a good way to know what the
> system does, and how we can lay it out, not in one or two data
> centers, but more globally.

I am not saying it's not a good idea to document ( I started to write
various SOPs on the wiki ), just that I fear that no one will
be motivated to do it and to keep it up to date, especially for such a huge task.

Michael Scherer
