[Mageia-dev] Mirror layout, round two

andre999 andr55 at laposte.net
Mon Nov 29 10:09:55 CET 2010

Michael Scherer wrote:
> On Sat, Nov 27, 2010 at 08:00:17PM +0200, Thomas Backlund wrote:
>> Michael Scherer wrote on 27.11.2010 10:43:
>>> On Fri, Nov 26, 2010 at 10:29:14PM +0200, Thomas Backlund wrote:
>> [...]
>>>> Then we come to the "problematic" part:
>>> This part looks really too complex to me.
>>>> ------
>>>> /x86_64/
>>>>         /media/
>>>>               /codecs/ (disabled by default)
>>> so, ogg, webm, being codec, should go there or not ?
>>> What about patents problem about something else than codec ?
>>> ( freetype, image such as gif, DRM stuff )
>> Actually this is the "maybe_legal_greyzone" repo,
>> but since flagging it as "codecs" would really make people
>> react, I named it so for now...
> Sorry to be so direct, but that doesn't answer the question :/
>>>>               /core/ (old main+contrib)
>>>>                    /backports/ (disabled by default)
>>>>                    /backports_testing/ (disabled by default)
>>>>                    /release/
>>>>                    /testing/ (disabled by default)
> May I suggest naming this one "updates_testing", for consistency ?
> ( consistency with backports_testing, and because this explains what goes in
> more clearly. This also looks simpler ).
>>>>                    /updates/
>>>>               /extra/ (unmaintained, disabled by default)
>>> If used by people, then why does no one step up to maintain anything ?
>> Yeah, that's the problem.
> If this is the problem, how does it help to get people to maintain
> the application ?
> So far, the only way that really works is
> "someone takes care or we shoot the do^W rpm".
> So maybe we could just be more active with cleaning ?
>> And reality shows we have a lot of packages assigned to nomaintainer@ ...
>>>>               /firmware/ (disabled by default)
>>> Why separate firmware from non_free ? What does it bring ?
>>> Since both of them are disabled by default, they can be simply merged.
>> Well, this suggestion is partly based on the fact that we have users
>> that want a firmware free install, which this would satisfy...
> I do not think this warrants a full media, maybe just a way to filter packages.
> Using a media seems overkill to me, since this brings complexity in dialog boxes, from
> easyurpmi to rpmdrake and installer, and since it brings complexity on mirrors, on BS
> and on our policy.
> Maybe we could find a way to tag them "firmware", like a rpmgroup.
The filtering will be more involved, but rpmdrake/urpmi need overhauling.
We just need to add an rpm group "system/firmware", and move firmware 
packages from "system/kernel+hardware".
> The benefit is the complexity will only be on rpmdrake side, not on mirroring and BS
> side.
> Moreover, this would be much more flexible ( ie, see the games option I propose later ).
>> But yes, if we ignore those suggestions, we split the firmwares in
>> GPL ->  /core/ and the rest to /non-free/
>>>>               /games/ (disabled by default)
>>> That's a simplification that makes no sense.
>>> Not all games are big, not all big packages are games ( tetex, openoffice ).
>> It's not only a size question, it's also a nice option for companies
>> to not mirror games ("employees should work, not play...")
> Such companies likely already have admins to prevent users from installing games.
> Maybe we could add a feature in rpmdrake for that ( like "do not show packages
> that match such conditions : group =~ games/, maintainer =~ nomaintainer@, requires =~ python" ).
Excellent idea.  I like the nomaintainer option :)
You can set some urpmi options by editing a config file.  There have 
already been suggestions to make this directly accessible in rpmdrake.  
Once done, we just need to add more options.
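
As a rough illustration of the kind of client-side filter discussed above, here is a minimal Python sketch. It is not how urpmi or rpmdrake work; the `Package` class, the rule table and the function names are all assumptions made up for the example, and the three rules simply mirror the conditions quoted above.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Package:
    # Hypothetical, simplified package record (not a real urpmi structure).
    name: str
    group: str
    maintainer: str
    requires: list = field(default_factory=list)


# Hypothetical exclusion rules: attribute name -> regular expression.
EXCLUDE_RULES = {
    "group": re.compile(r"games/", re.IGNORECASE),
    "maintainer": re.compile(r"^nomaintainer@"),
    "requires": re.compile(r"^python"),
}


def is_hidden(pkg, rules=EXCLUDE_RULES):
    """Return True if any rule matches the corresponding package attribute."""
    for attr, pattern in rules.items():
        value = getattr(pkg, attr)
        # 'requires' is a list, the other attributes are plain strings.
        values = value if isinstance(value, list) else [value]
        if any(pattern.search(v) for v in values):
            return True
    return False


def visible_packages(packages, rules=EXCLUDE_RULES):
    """Keep only the packages that no exclusion rule hides."""
    return [p for p in packages if not is_hidden(p, rules)]
```

Passing a smaller rule table (for example only the "maintainer" rule) would give the nomaintainer option on its own, without touching mirrors or the BS.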

> The problem of private internal companies mirrors is really not our concern.
> And their software policy, even if they may decide to apply it on a public mirror,
> should not leak on our side.

Right.  No point in confusing issues.
>> And we have some contributors that already have stated that they
>> plan to add all possible games so it will grow.
>> and we all know games are the fastest growing /space demanding...
> Well, so either that will cause a problem on our side, in which case this will
> just be unhelpful on our primary mirrors, or it will only cause issues on some mirrors,
> and in this case, there are lots of other things that can take space that we do not
> take into account :
> - debug
> - source code ( except that's a GPL requirement )
> - adding another arch ( like arm/mips )
> - adding more iso ( something that is asked each time, like 64 bits one, etc )
> So if we decide "mirrors will not handle the load, so we need to split games", then we
> should also say "mirrors will not handle the load, so we need to do less iso/offer to not
> mirror debug/offer to not mirror some architecture", and we end with an inconsistent
> network of mirrors, with lots of complexity on our side to handle the possible choices
> made by mirrors. I am not sure that users
> will truly benefit from this. And I am sure that we will not benefit from the complexity.
> If space is an issue ( and I think that's one of the main ones ), then we should decide
> based on metrics. Ie, we plan to have no more than X% growth in mirror size for 1 year.
> If we hit some soft limit, then we investigate and decide ( ie, stop adding big backports,
> stop adding new packages, etc ).
> And decide the metrics based on mirrors input, and based on packagers input.
> But so far, apart from Olivier and Wolfgang, we do not have much metrics and
> requirements :/
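
The "no more than X% growth per year" rule above could be checked with something as small as the following Python sketch. The function names and the use of gigabytes are assumptions for illustration; the actual value of X would have to come from the mirror admins and packagers, as said above.

```python
def growth_percent(previous_size_gb, current_size_gb):
    """Growth of the mirror tree between two measurements, in percent."""
    return (current_size_gb - previous_size_gb) * 100.0 / previous_size_gb


def exceeds_soft_limit(previous_size_gb, current_size_gb, max_growth_percent):
    """True when growth went past the agreed soft limit (the X% above)."""
    return growth_percent(previous_size_gb, current_size_gb) > max_growth_percent
```

Hitting the limit would only trigger the "investigate and decide" step; the sketch does not decide anything by itself.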
>>>>               /non-free/ (disabled by default)
>>>>               /debug_*/ (disabled by default)
>>> And what are the requirement relations ?
>>> Ie, what can require non_free, codecs, games, etc ?
>> IMHO /core/ should be selfcontained.
>> We are promoting open source after all.
> Yes, but what about the others ?
> Ie, can a game require a codec or not ? a package in extra ?
> If we remove a package from extra, do we remove everything
> that requires it ?
>>> And what about something that can go in both media, ie a non_free
>>> game goes where ? An unmaintained codec goes where ?
>> Yeah, to be precise, that would need a games_non-free
> another media ? Really, I think most users are already lost with the
> current media selection.
> For core, we have 15/20 medias ( src + debug + binary ( 1 or 2 ) * update/release/testing/backport/
> backport testing ). Each media we add at the level of core will therefore add 15 to 20 medias too.
> So firmware, game, extras, codecs, non_free, that would make the total around 80 to 90 medias for a single
> arch ( I assume that firmware may not have debug_* )
> While it can be partially solved with a better interface for selecting media,
> we cannot do miracles if there are too many things :/
> So let's try to think how we can reduce the number of media.
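
A quick check of the media arithmetic above, as a Python sketch. It assumes five sections per top-level media (as in the proposed layout) and reads "binary ( 1 or 2 )" as one or two binary variants; the function name and the `no_debug` parameter are invented for the example.

```python
# The five sections listed under /core/ in the proposed layout.
SECTIONS = ["release", "updates", "testing", "backports", "backports_testing"]


def media_count(top_level, binary_variants, no_debug=()):
    """Count medias: (src + debug + binary variants) per section, per top-level media."""
    total = 0
    for media in top_level:
        kinds = 1 + binary_variants      # src plus the binary variant(s)
        if media not in no_debug:
            kinds += 1                   # debug, unless this media has none
        total += kinds * len(SECTIONS)
    return total


ALL_MEDIA = ["core", "firmware", "games", "extra", "codecs", "non-free"]
```

Under these assumptions `media_count(["core"], 1)` and `media_count(["core"], 2)` give the 15 and 20 mentioned above, and `media_count(ALL_MEDIA, 1, no_debug=("firmware",))` gives 85, which fits the "80 to 90" estimate; with two binary variants the same call gives 115.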
> We have 2 kinds of issues we try to solve at mirror level :
> - the concern of mirror admins
> - the concern of users.
> with impact on BS and packagers
> Mirror admins are concerned by :
> - size and growth ( see Wobo mail in the past thread )
> - content ( or at least, we think )
> Content part is mainly a legal matter, but I haven't heard any admin
> telling "we can't do that", so that's my interpretation. The concern is
> mainly around DMCA and EUCD, even if lesser known laws also exist around
> the world ( like Paragraph 202C of German law, which bans "hacking tools" ).
> For DMCA, there is some protection for them :
> http://www.benedict.com/Digital/Internet/DMCA/DMCA-SafeHarbor.aspx .
> For EUCD and the rest, I do not know.
> Users are concerned with a wide range of issues, some contradictory :
> - some want newer stuff, some don't
> - some want stable stuff, some do not care as much
> - some want non_free, some don't want it
> - some want firmware, some don't
> - etc
> Yet, the users' concerns mainly revolve around 2 things :
> - package availability
> - package filtering, based on packages content
> The first part is already solved by the subdivision ( release, etc ). We
> need to split them for build reason. So we can't really avoid adding
> medias on this part.
> The second part is more tricky. And in fact, I think we can avoid creating media
> for this. Ie, do not let the concern of filtering appear on
> the BS and mirrors, and push this to end-user systems.
> Some people do not want firmware on their system, they do not really care about
> the firmware being in a separate directory on mirrors, as long as they can
> disable them easily from the list of package they can install ( at
> perl-urpm level, IMHO ).
> Same goes for non_free, or for unmaintained software. Or even games.
> So if we push the users issues on endusers system, we only have to manage the
> mirror admins issue on mirror.

> And so here is a proposal that starts with the size issue :
> - discuss with mirror admin, decide on a size that everybody would agree to mirror
> for core/ for the next release, or the 2 next one. Ie, every year or every 6 months,
> we do a survey of our mirrors, to see if everything goes well for them.
> - discuss also of the growth of core in term of size
> - decide on a limit size
> - if anything goes off limit for mirrors, add an overflow/ to hold the packages
> that will not be mirrored by everybody. Overflow will be treated like core, in all points.
> Only difference is that mirroring is optional ( but strongly encouraged )
> - put everything in core, except what goes to overflow.
> - let users filter on their system, with something urpmi side ( I suggest a filtering
> when we do urpmi.update, but the exact details of how to do it are not relevant now ).
> Overflow will be filled with packages that :
> 1) are not required by anything else ( thus games data would likely fit,
> but not only )
> 2) have triggered the limit of size
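
The two criteria above can be sketched in Python as follows. This is only one possible reading: the "largest leaf package first" ordering is an assumption not stated in the proposal, and the function and parameter names are invented for the example.

```python
def select_overflow(sizes, requires, core_limit):
    """Pick packages to move from core to overflow.

    sizes:      package name -> size (any consistent unit)
    requires:   package name -> set of package names it requires
    core_limit: agreed maximum size for core
    Returns (names moved to overflow, resulting core size).
    """
    # Criterion 1: only packages that nothing else requires are candidates.
    required = set()
    for deps in requires.values():
        required.update(deps)
    candidates = [name for name in sizes if name not in required]

    # Assumption: move the biggest candidates first (ties broken by name).
    candidates.sort(key=lambda name: (-sizes[name], name))

    # Criterion 2: only move packages while core is over the size limit.
    core_total = sum(sizes.values())
    overflow = []
    for name in candidates:
        if core_total <= core_limit:
            break
        overflow.append(name)
        core_total -= sizes[name]
    return overflow, core_total
```

The sketch does not re-check whether moving a package turns its own requirements into new candidates; a real tool would probably need to iterate.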
> After the limit of core size is raised ( ie after all mirrors have agreed ), we can re-add packages
> from overflow to core, based on
> criteria not defined yet ( first come first serve, try to make most useful first ?
> or some wild guesstimate based on some mirrors stats ? ). But being in core or
> overflow should not change anything for both enduser and packagers. This is
> a mirror only concern, and so should be kept there only.
> And this should avoid discussion about the location of packages by packagers.
> This means that both core and overflow should be enabled by default on users' systems.
> ( and I would not be against a better name, but I haven't found one )
I like extra, which would fit nicely with the approach I'm about to suggest.

I would take a similar but somewhat different approach, which would 
probably have at least as good results.
First, decide what is *essential* to a fully functioning desktop or 
server or development system.  That goes into core.
Then decide what would be *very useful* in a typical such system.  Add 
that to core.
Of course, only the free packages.  Those not free would remain in non-free.

Core should then have the various kernels, the usual Linux utilities and 
development tools, the drak* and associated utilities.  Various 
drivers.  The complete desktops, such as Gnome, KDE, LXDE, etc.  And 
certain common applications, such as LibreOffice, Firefox.
This leaves a lot of other packages, to go into extra.  (or overflow, if 
you prefer.)
Games would generally be in extra.  Or non-free.  (There may be a few 
small exceptions.)

This leaves many applications now in main, as well as virtually 
everything now in contrib, which would be in extra.
So core would be (probably much) smaller than main, and thus extra 
bigger than contrib.
And core would be just that : the core of Mageia.

Besides any advantage for space-limited mirrors which may exclude extra, 
we could collectively focus on ensuring first that everything in core 
works, to help ensure that users' systems would always be functional.  
Being reliable won't hurt our reputation.
> In order to reduce number of media, another question is :
> - should non_free have its own media ?
> Having them in core would simplify the BS, the upload and the mirroring.
> Having it separated would be better from various points of view ( political,
> communication, etc ). Maybe some people will refuse to help us if we don't,
> maybe there is some further restriction on some non-free software leading us
> to create another media whatever we do, I do not know.
> To me, as long as we can filter on user side, it would be ok.
> I cannot really tell what I prefer for that :/
I think it is better to keep non-free.  It makes it very obvious to 
everyone, and avoids the down sides you mention.
We still avoid adding 3 sets of repositories :)
> So the only important mirror issue left to solve is the greyzone area.
> And well, that's quite complex.
> So we can either :
> 1) decide to not care ( ie everything in core )
> 2) decide to not offer them at all ( aka offload to PLF )
> 3) decide to add a media ( aka the "codecs" media )
> 1 is the simplest. But maybe not really a good idea.
> If we care, then what indeed should be done is another media, and let admins
> choose to mirror it or not. I would even propose to revise the idea of
> separation every year, because if all mirrors have the
> 2 medias, no need to split in reality ( but I doubt it will happen, but
> at least, this would show that we try to revise our foundation on a regular
> basis ). And at least, we should revise the packages present in such medias.
> If there is some packages that can be moved to core,
> then they should.
> We could also simplify a bit the BS by placing non-free packages there
> ( instead of either having a non_free media, or the non_free packages in core ).
> It would sadden me a little to blur the line between "free with patents problems"
> from "non free", but my PLF experience showed that most people do not care, and that
> it requires more than a media separation.
> So, in the end, we would have :
> core/
>    release
>    updates
>    updates_testing
improved name.
>    backports
>    backports_testing
> "overflow"/<- big packages, just for mirroring issues

Which I would prefer to call "extra".  (Note that I suggest above 
somewhat different contents, which would probably make it larger, and 
core smaller.)
Thus a mirror dropping it would save more space.

> restricted/<- with non_free, firmware, "codecs"

It seems to me that totally free (of patents, etc) codecs would be 
better in core.
But the name allows containing packages that are nominally free, which 
is excellent.
> with the 5 directories under them, and with src, debug, binary.
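
To make the size of this layout concrete, the combinations can be enumerated with a short Python sketch. The `top/section/kind` path shape is purely an assumption for illustration (where src and debug actually sit on the mirror is left open in this thread); only the three lists come from the proposal above.

```python
from itertools import product

TOP_LEVEL = ["core", "overflow", "restricted"]
SECTIONS = ["release", "updates", "updates_testing", "backports", "backports_testing"]
KINDS = ["binary", "debug", "src"]


def media_names():
    """List every top-level / section / kind combination of the proposed layout."""
    return [
        f"{top}/{section}/{kind}"
        for top, section, kind in product(TOP_LEVEL, SECTIONS, KINDS)
    ]
```

With these lists `len(media_names())` is 3 * 5 * 3 = 45 medias per architecture.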
> Imho, 3 upper medias is the simplest we can have ( besides debug/src, that
> I would place also on the same level as the binaries, but my
> mail is already long enough :/ )

Long but very useful :)
To save additional space, maybe minimal mirrors could drop ISOs as well.
And maybe we could have other minimal mirrors with only all current ISOs.

In any case, at this point the important thing is to decide what repositories to 
Essentially I agree with your proposals.
>> For codecs either a extra_codecs or simply drop after a grace period.
>> but I guess codecs are important to people, so hopefully they wont
>> get orphaned...
> Unfortunately, there is not always a relation between "being important
> to users" and "someone wants to take the burden of maintaining it" :/
> For example, something like etherpad would be nice for users,
> yet no one will take time to maintain it.

The proposed Mageia-app-db will hopefully help Mageia respond better to 
users' needs/desires.

my 2 cents :)

- André
