[Mageia-webteam] Forum VM needs
Maât
maat-ml at vilarem.net
Sat Jan 15 12:41:20 CET 2011
On 13/01/2011 13:29, Michael Scherer wrote:
> On Wednesday 12 January 2011 at 21:27 +0100, Maât wrote:
>> Hi there,
>>
>> As it seems VM creation takes a little bit of time, due
>> to people being under heavy load at work, Anne and misc
>> considered the option of creating the Xen VM on one of
>> our servers (we could migrate the VM to atalante later)
> The exact technology should not matter much, that's also what puppet is
> made for. Ie, unless we plan to do a migration at the system image
> level, we could simply install the 2nd vm, install puppet, clone the
> computer, and migrate the db and ip.
>
> ( not that I do not like xen, but I would prefer something else ).
Well I mentioned Xen because Atalante uses it... but if this is not a problem for puppet, then it's ok for me :)
>> For that misc asked for Forum needs...
> I think I didn't make myself clear. I wanted information to deploy it,
> like where the git is stored ( a url, not "it is on a server" ), who
> will need what access, etc. But the information you gave is also
> important ( and brings lots of questions, as you can see ).
>
Atm it's stored on Ennael's dedibox :)
>> For the beginning I'll consider that we are going to put everything on the same machine
>> (DB and PHP). Virtualizing DB servers is not really brilliant, but I guess this will
>> not kill the VM in the first months as the tables will not be big.
> AFAIK, using virtio and proper cache, this should not be much of a problem.
ok then
>> So phpBB needs a LAMP stack: Apache + PHP5 + MySQL5 (it prefers to have the MySQLi extension)
> No specific requirement in terms of version, using the 2010.1 rpms should be
> ok, I assume ?
indeed
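For the record, the base install should boil down to something like this (just a sketch; the exact package names for the php extensions should be double-checked against the 2010.1 repositories):

    urpmi apache apache-mod_php mysql
    # php extensions -- names to confirm in the 2010.1 repos
    urpmi php-mysqli php-gd php-xml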
>> And with php we'll need the optional extras:
>> -- zlib compression (better to have it)
>> -- remote ftp support (well... I'm not in favor, even if the documentation asks for it)
> We could drop outgoing connections if needed.
>
> ( yes, php makes me paranoid in terms of security )
>
quite understandable :o)
>> -- XML support (better to have it)
>> -- ImageMagick support (better to have it)
> php-image-magick. I do think there is a conspiracy to make me have a
> stroke. Security research by a friend of mine on ImageMagick does not make
> me feel safe knowing we will use it, but if this is required, we have no
> choice.
>
> Just to know, what will it be used for ( I assume this will be used to
> resize avatars ) ?
yes
>> -- GD support (same as ImageMagick)
> Does the forum support suhosin, or various php hardening measures ?
>
Dunno... but we can check that. The phpBB forums have a few threads about errors mentioning suhosin, but I suspect they did not try very hard
> Did you do various testing with a hardened configuration, with dangerous
> calls disabled ( mainly remote url access for a start, but I also think
> we can use open_basedir restrictions, etc, etc )?
>
> Does it have non-regression testing ( so we can enable stuff and see if
> anything breaks ) ?
>
Testing and integration envs are perfect for that :)
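For the hardened configuration, I'd expect something along these lines in php.ini as a starting point (just a sketch; the forum path is an assumption and the disable_functions list will need testing against phpBB):

    allow_url_fopen = Off
    allow_url_include = Off
    open_basedir = /var/www/forum:/tmp
    disable_functions = exec,passthru,shell_exec,system,proc_open,popen
    expose_php = Off

We can then click through the testing env with that enabled and see what breaks.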
>> For source management git will be used... so we'll need it too :)
> Just git clone ?
> I have a puppet module for this, just need tests before I commit.
>
> For git hosting, again, while I am in favor, there are a few questions to
> answer and prepare; see my previous mail about what is needed.
>
We need cloning, branch/tag switching and syncing with the reference repos.
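Concretely, on the deployment side that's nothing more exotic than this (sketch, with a placeholder url until the reference repo moves off Ennael's dedibox):

    git clone git://example.org/mageia-forum.git /var/www/forum
    cd /var/www/forum
    git fetch origin
    git checkout <branch-or-tag>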
>> As forums often have to face bruteforce attempts, having Fail2ban would be really great...
>> for every open service like ssh
> At the ssh level, and for me, that's a vote in favor of "no". We use ssh
> keys only for admins, so fail2ban will just cause trouble.
...
>> but also for the forums... I'd like to have Fail2ban
>> parse a file of phpBB failed logins to trigger a low-level IP ban for a
>> few hours or more...
> Well, if you give us the configuration, we can see.
ok
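To give an idea of what I have in mind (a sketch only: it assumes we make phpBB log failed logins to a dedicated file, which is not stock behaviour, and the path and regex are placeholders):

    # /etc/fail2ban/jail.local
    [phpbb-login]
    enabled  = true
    filter   = phpbb-login
    port     = http,https
    logpath  = /var/www/forum/logs/failed_logins.log
    maxretry = 5
    bantime  = 7200

    # /etc/fail2ban/filter.d/phpbb-login.conf
    [Definition]
    failregex = ^.* failed login for .* from <HOST>\s*$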
> We can also use the trick that Olivier deployed on d-c to avoid numerous
> connections from the same IP ( in case someone decides to be smart and do
> simultaneous login attempts ).
ok too... I'm curious to see what the trick is :)
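(I obviously don't know what was deployed on d-c, but for reference a common way to cap simultaneous connections per IP is an iptables connlimit rule, e.g.

    iptables -A INPUT -p tcp --syn --dport 80 -m connlimit --connlimit-above 20 -j REJECT

with the threshold to be tuned, of course.)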
>> For forum management we'll need :
> ---- 8<----
> or for those who are not CxO-fluent ( private joke ), who is 'we', in
> terms of organisation ( ie, do we need to create an ldap group, etc ) ?
>
We = those who maintain the forum at forum-admin level (ash and I atm)
For integration and testing we'll perhaps need a little bit more
For production you can keep that for sysadmins I guess (until now I have been both sysadmin & forum admin, so this will need a little bit of tuning on my side)
>> -- access to sources (read/write)
> I would rather keep this automated from git, for security reasons and to avoid
> human errors. I would even add a cron job that does a git diff or
> something similar, to detect if someone uploaded a file manually, or
> touched it using apache.
>
> In fact, as a security measure, I think the user that will write the
> source should put it read only for apache. Ie, use a separate system
> user for that.
yup, that's wise
but some dirs (avatars upload dir, file upload dir, emoticon upload dir, cache dir) will need to be writable for apache...
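Something like this for the layout (sketch only; the paths are the phpBB3 defaults under an assumed /var/www/forum docroot, and "forumdeploy" is a hypothetical deploy user):

    # code owned by the deploy user, read-only for apache
    chown -R forumdeploy:apache /var/www/forum
    chmod -R u=rwX,g=rX,o= /var/www/forum
    # only the upload/cache dirs writable by apache
    chown -R apache /var/www/forum/cache /var/www/forum/files \
        /var/www/forum/store /var/www/forum/images/avatars/upload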
>> -- access to data zones (avatars and uploaded things) (read/write)
> You mean apache will need it, no ?
indeed... apache will need it in the first place, but sometimes there can be a need to delete an avatar, resize it, replace it with something not offensive, or make it read-only to prevent the user from changing it again and again...
> Direct access seems to me a pretty rare event; we can grant access if
> there are really lots of requests, ie if you annoy the admins enough to make
> them give it rather than doing it themselves.
Well, you're right, we can tune the organisation later... so let's say that the sysadmins will take care of low-level tasks on the production forum...
>> -- access to accesslogs and errorlogs (read)
> then this should be merged with the webmasters concept that Romain
> explained. For now, we didn't set up anything ( we didn't even split the log
> files on alamut, even if this should be trivial ).
>
indeed, forum admins obviously have tasks that are close to this "webmaster" concept
>> -- ability to change php log levels
> This can be done by php, I think.
>
if the global php config allows it.
>> -- access to php logs (read)
> same as accesslogs
yup
>> -- console access to database(s) (I'd prefer to completely avoid phpMyAdmin on the forum server)
> I would prefer to avoid giving console access until there is a real need. I would then favor remote
> mysql access, forcing ssl, maybe even limited to a fixed ip address if you wish to avoid bruteforce.
>
> ( I will not go to the point of proposing to use a vpn too, but
> almost ).
>
> Maybe we could think of some kind of ssh bastion for such access ( or
> maybe that's overkill too ).
>
well, for forum administration I use the sql CLI daily because that's loads more effective than the phpBB SQL interface or uglier things like phpMyAdmin :-/
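If we go the remote-access route, a grant along these lines would cover it (sketch; the database name, user, password and ip are placeholders):

    GRANT SELECT, INSERT, UPDATE, DELETE ON forum.*
        TO 'forumadmin'@'203.0.113.10'
        IDENTIFIED BY 'changeme' REQUIRE SSL;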
>> For performance questions: I guess the forum opening will trigger a rather vast
>> number of people coming (at least to register their nicks)... I'd be happy to
>> avoid the server being loaded to death.
> Registration will be done on catdap from what I think we agreed on, no ?
> ( correct me if I am wrong ).
you are right, but once registered they'll go to the forum (in particular if they came to the registration form through a link on the forum)
> So we need to work on that part ( starting more processes, and so
> letting us tune that with puppet; this is hardcoded now, AFAIK ). So
> depending on where we host the forum, we can surely avoid this effect.
:-/
>> So I'm targeting at least one thousand simultaneous active users on the
>> forum... that will do for apache tuning.
> Ok, so let's say 120 simultaneous processes for apache, which also means we
> need to keep the apache processes as lean as possible ( ie, no unused modules
> loaded ). I assume that there is no guarantee of thread safety from
> php and its associated libraries, so we will use mpm-prefork.
>
> Since the server is isolated and serves only for php hosting, I guess
> using fast-cgi will not bring much to the equation, when compared to
> mod_php.
>
> Let's also assume 30 processes for forum registration on catdap ( if I
> am not wrong on that part, of course ) ? We could surely mitigate the
> potential overload by not announcing this on every possible channel at
> the same time ( ie, first a mail, then a blog post, then
> identica/twitter ).
>
> Should we also maybe need to tune ldap ?
well, nginx could be an option, as Thierry said... dunno if ldap will be loaded though
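For the apache side, the prefork tuning would look roughly like this (sketch; the figures just reflect the 120-process estimate above and will need benchmarking):

    <IfModule prefork.c>
        StartServers          10
        MinSpareServers       10
        MaxSpareServers       25
        ServerLimit          120
        MaxClients           120
        MaxRequestsPerChild 4000
    </IfModule>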
>> For database that will mean 800 to 1200 requests per second...
>>
>> We'll have 2 - 3 months to see the tables grow and tune the indexes and the memory accordingly.
> That means that we will have to deploy some monitoring, and we haven't
> decided anything yet ( buchan proposed hobbit, I proposed munin, purely by
> familiarity ).
>
That can be done a little bit later :)
> What metrics would you need so we can work on them in priority ( once we
> start to set up something ) ?
>
the ram used by the database and by the web server, the number of threads, the cpu load, and perhaps some data from the database server (cache fill level, number of locked queries, time taken by long queries - over 1 sec)...
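Most of the database-side figures can be pulled without extra tooling, for instance (sketch; status variable names as in MySQL 5.x):

    -- log queries slower than 1 second
    SET GLOBAL long_query_time = 1;
    -- cache fill level, lock contention, thread count
    SHOW GLOBAL STATUS LIKE 'Qcache%';
    SHOW GLOBAL STATUS LIKE 'Table_locks_waited';
    SHOW GLOBAL STATUS LIKE 'Threads_connected';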
>> But I think our needs will stabilize around 4-6 GB of RAM if the forum gets really
>> used (we'll have to tune mysql to keep many requests in cache), apache+mysql all
>> included... if we later split apache and mysql onto separate machines, the needs on
>> each machine will obviously be lower.
> No cache ( squid, varnish ) ?
> No php-level cache either ?
>
> ( not that it may be requested now )
we can see that later indeed :)
>> For app disk space the code is under 50 megs... and with hundreds of avatars uploaded
>> we will not grow above 1 GB
>>
>> For database disk space, even after years of activity we'll remain under 5 GB
> Ok, so let's allocate a 10 GB partition for the db and such, on lvm.
> We should take into account logs, and log backups ( French law requires 1
> year of logs ).
>
> How many logs are to be expected per day ?
> The only busy webserver I can think of is d-c, but Nanar and I just
> discovered that the configuration is not good.
> So for now, that's 5 GB of logs, uncompressed, per month.
>
it will depend on the success of Mageia, but that could be rather high
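For reference, on the lvm side that should just be something like (sketch; the volume group name and the filesystem choice are assumptions):

    lvcreate -L 10G -n forum_db vg_forum
    mkfs.ext4 /dev/vg_forum/forum_db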
>> We'll need to set up some tables (not all) with heavy read and write access as InnoDB:
>> it would be great to have the innodb one-file-per-table option enabled
> Ok, I guess it should be safe to enable it for all mysql dbs.
yes
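For reference that's a single line in my.cnf (note it only applies to tables created after the option is set):

    [mysqld]
    innodb_file_per_table = 1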
>> Note: I'd like to use https (at least for admin access)... so that will mean enabling
>> ssl and also opening port 443
> We did not plan to let people send their password in cleartext at all.
> Centralized auth has been set up ( and should be used for the forum too ),
> so people will reuse their password, the same one used on other parts of the
> infrastructure, and that means svn, bugzilla, etc. Since people with
> access will use it, no cleartext at all when the password is sent ( or
> over my dead body, after fighting my ghost ).
so https for everything? ok for me, but that will be a little bit heavier on the cpu
> I guess we can make an exception for the cookie, as long as it is not
> shared ( ie, we will have to rethink the scheme if we deploy SSO ).
>
> That also means that people will complain because of firefox if we do not
> buy a certificate.
>
:-(
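On the apache side the ssl part is short anyway (sketch; the hostname and certificate paths are placeholders until we know which certificate we buy):

    <VirtualHost *:443>
        ServerName forums.mageia.org
        SSLEngine on
        SSLCertificateFile    /etc/ssl/apache/forums.crt
        SSLCertificateKeyFile /etc/ssl/apache/forums.key
        DocumentRoot /var/www/forum
    </VirtualHost>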
>> That's all at the system level... I think directory structures (which concern the apache web root config) can be dealt with later...
>>
>> Tell me if you have got everything you need for VM creation...
> What I needed was more information for forum deployment, not vm
> requirements; I guess I didn't express myself clearly. The requirements
> for the vm in terms of memory/disk have been roughly drafted before. What
> I would prefer is a deployment document.
>
ash and I are writing one atm
your mail changes some things... so let's adapt it :)
----8<----
(pfeeew... sorry for the length, guys)
Maât