[Mageia-dev] Mirror layout, round two

Maarten Vanraes maarten.vanraes at gmail.com
Mon Nov 29 02:33:29 CET 2010


On Monday 29 November 2010 01:24:42, Michael scherer wrote:
> On Sat, Nov 27, 2010 at 08:00:17PM +0200, Thomas Backlund wrote:
> > Michael scherer wrote on 27.11.2010 10:43:
> > >On Fri, Nov 26, 2010 at 10:29:14PM +0200, Thomas Backlund wrote:
> > [...]
> > 
> > > > Then we come to the "problematic" part:
> > >This part looks really too complex to me.
> > >
> > > > ------
> > > > /x86_64/
> > > > 
> > > >        /media/
> > > >        
> > > >              /codecs/ (disabled by default)
> > > 
> > > so, ogg, webm, being codec, should go there or not ?
> > > What about patents problem about something else than codec ?
> > > ( freetype, image such as gif, DRM stuff )
> > 
> > Actually this is the "maybe_legal_greyzone" repo,
> > but since flagging it as "codecs" would really make people
> > react, I named it so for now...
> 
> Sorry to be so direct, but that doesn't answer the question :/
> 
> > > >              /core/ (old main+contrib)
> > > >              
> > > >                   /backports/ (disabled by default)
> > > >                   /backports_testing/ (disabled by default)
> > > >                   /release/
> > > >                   /testing/ (disabled by default)
> 
> Shall I suggest naming this one "updates_testing", for consistency ?
> ( consistency with backports_testing, and because this explains what goes in
> more clearly. This also looks simpler ).
> 
> > > >                   /updates/
> > > >              
> > > >              /extra/ (unmaintained, disabled by default)
> > > 
> > > If used by people, then why does no one step up to maintain anything ?
> > 
> > Yeah, that's the problem.
> 
> If this is the problem, how does it help to have people to maintain
> the application ?
> 
> So far, the only way that really works is
> "someone take care or we shoot the do^W rpm".
> So maybe we could just be more active with cleaning ?
> 
> > And reality shows we have a lot of packages assigned to nomaintainer@ ...
> > 
> > > >              /firmware/ (disabled by default)
> > > 
> > > Why separate firmware from non_free ? What does it bring ?
> > > Since both of them are disabled by default, they can be simply merged.
> > 
> > Well, this suggestion is partly based on the fact that we have users
> > that want a firmware free install, which this would satisfy...
> 
> I do not think this warrants a full media, maybe just a way to filter
> packages.
> 
> Using a media seems overkill to me, since this brings complexity in dialog
> boxes, from easyurpmi to rpmdrake and installer, and since it brings
> complexity on mirrors, on BS and on our policy.
> 
> Maybe we could find a way to tag them "firmware", like a rpmgroup.
> 
> The benefit is the complexity will only be on rpmdrake side, not on
> mirroring and BS side.
> 
> Moreover, this would be much more flexible ( ie, see the games option I
> propose later ).
> 
> > But yes, if we ignore those suggestions, we split the firmwares in
> > GPL -> /core/ and the rest to /non-free/
> > 
> > > >              /games/ (disabled by default)
> > > 
> > > That's a simplification that makes no sense.
> > > Not all games are big, not all big packages are games ( tetex,
> > > openoffice ).
> > 
> > It's not only a size question, it's also a nice option for companies
> > to not mirror games ("employees should work, not play...")
> 
> Such companies likely already have admins to prevent users from installing
> games. Maybe we could add a feature in rpmdrake for that ( like "do not show
> packages that match such conditions : group =~ games/, maintainer =~
> nomaintainer@, requires =~ python" ).
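
To make the filtering idea above concrete, here is a minimal sketch in Python. It is not rpmdrake or perl-urpm code: the Package class, the rule format and the sample packages are all hypothetical, invented only to show what "hide packages matching these conditions" could look like on the user side.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Package:
    # Hypothetical, simplified view of package metadata.
    name: str
    group: str
    maintainer: str
    requires: list = field(default_factory=list)


def matches(pkg, rules):
    """Return True if any rule (attribute name -> regex) matches the package."""
    for attr, pattern in rules.items():
        value = getattr(pkg, attr)
        values = value if isinstance(value, list) else [value]
        if any(re.search(pattern, v, re.IGNORECASE) for v in values):
            return True
    return False


def visible(packages, rules):
    """Keep only the packages that no exclusion rule matches."""
    return [p for p in packages if not matches(p, rules)]


# Exclusion rules mirroring the three conditions quoted above.
RULES = {
    "group": r"games/",
    "maintainer": r"nomaintainer@",
    "requires": r"python",
}

# Made-up sample data, for illustration only.
packages = [
    Package("tetris-clone", "Games/Arcade", "alice@example.org"),
    Package("oldtool", "System/Base", "nomaintainer@example.org"),
    Package("editor", "Editors", "bob@example.org", ["libc", "python"]),
    Package("shell", "Shells", "carol@example.org", ["libc"]),
]

shown = [p.name for p in visible(packages, RULES)]
games_hidden = [p.name for p in visible(packages, {"group": r"games/"})]
```

The point of the sketch is only that such rules live on the client: the mirror layout and the BS never see them.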
> 
> The problem of private internal companies mirrors is really not our
> concern. And their software policy, even if they may decide to apply it on
> a public mirror, should not leak on our side.
> 
> > And we have some contributors that already have stated that they
> > plan to add all possible games so it will grow.
> > and we all know games are the fastest growing /space demanding...
> 
> Well, so either that will cause a problem on our side, in which case this
> will just be unhelpful on our primary mirrors, or it will only cause
> issues on some mirrors, and in this case, there are lots of other things
> that can take space that we do not take into account :
> - debug
> - source code ( except that's a GPL requirement )
> - adding another arch ( like arm/mips )
> - adding more iso ( something that is asked each time, like 64 bits one,
> etc )
> 
> So if we decide "mirrors will not handle the load, so we need to split
> games", then we should also say "mirrors will not handle the load, so we
> need to do less iso/offer to not mirror debug/offer to not mirror some
> architecture", and we end with an inconsistent network of mirrors, with
> lots of complexity on our side to handle the possible choices made by
> mirrors. I am not sure that users will truly benefit from this. And I am
> sure that we will not benefit from the complexity.
> 
> If space is an issue ( and I think that's one of the main ones ), then we
> should decide based on metrics. Ie, we plan to have no more than X% growth
> in mirror size for 1 year. If we hit some soft limit, then we investigate
> and decide ( ie, stop adding big backports, stop adding new packages, etc ).
> 
> And decide the metrics based on mirrors input, and based on packagers
> input. But so far, apart from Olivier and Wolfgang, we do not have much
> metrics and requirements :/
> 
> > > >              /non-free/ (disabled by default)
> > > >              /debug_*/ (disabled by default)
> > > 
> > > And what are the relations of requirements ?
> > > Ie, what can require non_free, codecs, games, etc ?
> > 
> > IMHO /core/ should be selfcontained.
> > We are promoting open source after all.
> 
> Yes, but what about the others ?
> Ie, can a game require a codec or not ? a package in extra ?
> If we remove a package from extra, do we remove everything
> that requires it ?
> 
> > > And what about something that can go in both media, ie a non_free
> > > game goes where ? An unmaintained codec goes where ?
> > 
> > Yeah, to be precise, that would need a games_non-free
> 
> another media ? Really, I think most users are already lost with the
> current media selection.
> For core, we have 15/20 medias ( src + debug + binary ( 1 or 2 ) *
> update/release/testing/backport/ backport testing ). Each media we add at
> the level of core will therefore add 15 to 20 medias too. So firmware,
> game, extras, codecs, non_free, that would make the total around 80 to 90
> medias for a single arch ( I assume that firmware may not have debug_* )
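
The count above can be reproduced with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes one binary media per subdivision (the low end of the "15/20" figure), the five subdivisions named in the paragraph, and no debug media for firmware. The names in the lists are illustrative labels, not actual directory names.

```python
# Subdivisions under each top-level media, as listed in the mail.
SUBDIVISIONS = ["release", "updates", "testing", "backports", "backports_testing"]

# Assumption: a single binary media (the mail says "1 or 2").
KINDS = ["binary", "debug", "src"]

# core plus the five extra top-level medias from the proposal under discussion.
TOP_LEVEL = ["core", "codecs", "extra", "firmware", "games", "non-free"]

# Assumption taken from the mail: firmware may not need debug_*.
NO_DEBUG = {"firmware"}


def count_media(top_level, kinds, subdivisions, no_debug=frozenset()):
    """Count medias as top-level x kind x subdivision, skipping excluded debug."""
    total = 0
    for top in top_level:
        for kind in kinds:
            if kind == "debug" and top in no_debug:
                continue
            total += len(subdivisions)
    return total


per_top_level = count_media(["core"], KINDS, SUBDIVISIONS)
total = count_media(TOP_LEVEL, KINDS, SUBDIVISIONS, NO_DEBUG)
print(per_top_level, total)
```

With these assumptions each top-level media contributes 15 medias and the whole tree comes to 85 for one arch, which sits inside the "80 to 90" estimate; with two binary medias per subdivision the figure would be higher.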
> 
> While it can be partially solved with a better interface for selecting
> media, we cannot do miracles if there are too many things :/
> 
> So let's try to think how we can reduce the number of media.
> 
> We have 2 kinds of issues we try to solve at mirror level :
> - the concern of mirror admins
> - the concern of users.
> with impact on BS and packagers
> 
> Mirror admins are concerned by :
> - size and growth ( see Wobo mail in the past thread )
> - content ( or at least, we think )
> 
> Content part is mainly a legal matter, but I haven't heard any admin
> saying "we can't do that", so that's my interpretation. The concern is
> mainly around DMCA and EUCD, even if lesser-known laws also exist around
> the world ( like the Paragraph 202C of German law, which bans "hacking tools"
> ). For DMCA, there is some protection for them :
> http://www.benedict.com/Digital/Internet/DMCA/DMCA-SafeHarbor.aspx .
> For EUCD and the rest, I do not know.
> 
> 
> Users are concerned with a wide range of issues, some contradictory :
> - some want newer stuff, some don't
> - some want stable stuff, some do not care as much
> - some want non_free, some don't want it
> - some want firmware, some don't
> - etc
> 
> Yet, the users' concerns mainly revolve around 2 things :
> - package availability
> - package filtering, based on packages content
> 
> The first part is already solved by the subdivision ( release, etc ). We
> need to split them for build reasons. So we can't really avoid adding
> medias on this part.
> 
> The second part is more tricky. And in fact, I think we can avoid creating
> media for this. Ie, do not let the concern of filtering appear on
> the BS and mirrors, and push this on endusers' systems.
> Some people do not want firmware on their system, they do not really care
> about the firmware being in a separate directory on mirrors, as long as
> they can disable them easily from the list of packages they can install (
> at perl-urpm level, IMHO ).
> 
> Same goes for non_free, or for unmaintained software. Or even games.
> 
> So if we push the users issues on endusers system, we only have to manage
> the mirror admins issue on mirror.
> 
> And so here is a proposal that starts with the size issue :
> 
> - discuss with mirror admins, decide on a size that everybody would agree to
> mirror for core/ for the next release, or the 2 next ones. Ie, every year
> or every 6 months, we do a survey of our mirrors, to see if everything
> goes well for them.
> - discuss also the growth of core in terms of size
> - decide on a size limit
> - if anything goes off limit for mirrors, add an overflow/ to hold the
> packages that will not be mirrored by everybody. Overflow will be treated
> like core, in all points. Only difference is that mirroring is optional (
> but strongly encouraged )
> - put everything in core, except what goes to overflow.
> - let users filter on their system, with something urpmi side ( I suggest a
> filtering when we do urpmi.update, but the exact details of how to do it
> are not relevant now ).
> 
> Overflow will be filled with packages that :
> 1) are not required by anything else ( thus games data would likely fit,
> but not only )
> 2) have triggered the limit of size
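
The two criteria above can be sketched as a small selection routine. This is a rough illustration only: the package names and sizes are invented, and moving the largest leaf packages first is an assumption made for the example, since the mail does not say in which order packages should move to overflow.

```python
def split_overflow(sizes, required_by, limit):
    """Split packages into (core, overflow) so that core fits under limit.

    sizes:       package name -> size (any unit, e.g. MB)
    required_by: package name -> list of packages that require it
    limit:       agreed maximum size for core

    Only packages that nothing else requires are eligible for overflow.
    Assumption: the largest eligible packages are moved first.
    """
    core = dict(sizes)
    overflow = {}
    leaves = [name for name in sizes if not required_by.get(name)]
    for name in sorted(leaves, key=lambda n: sizes[n], reverse=True):
        if sum(core.values()) <= limit:
            break
        overflow[name] = core.pop(name)
    return core, overflow


# Made-up example data.
sizes = {"libfoo": 50, "game-data": 400, "editor": 120, "big-suite": 300}
required_by = {"libfoo": ["editor"]}  # libfoo is required, so it stays in core

core, overflow = split_overflow(sizes, required_by, limit=500)
print(sorted(core), sorted(overflow))
```

If core still exceeds the limit after every leaf package has moved, the routine simply returns what it has; deciding what to do then is exactly the kind of policy question raised in this thread.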
> 
> After the limit of core size is raised ( ie after all mirrors have agreed ),
> we can re-add packages from overflow to core, based on criteria not defined
> yet ( first come first served, try to make most useful first ? or some wild
> guesstimate based on some mirrors stats ? ). But
> being in core or overflow should not change anything for both enduser and
> packagers. This is a mirror only concern, and so should be kept there
> only.
> And this should avoid discussion about the location of packages by
> packagers.
> 
> This means that both core and overflow should be by default on users' systems.
> ( and I would not be against a better name, but I haven't found one )
> 
> 
> 
> In order to reduce number of media, another question is :
> - should non_free have its own media ?
> 
> Having them in core would simplify the BS, the upload and the mirroring.
> 
> Having it separated would be better from various points of view (
> political, communication, etc ). Maybe some people will refuse to help us
> if we don't, maybe there is some further restriction on some non-free
> software leading us to create another media whatever we do, I do not know.
> To me, as long as we can filter on user side, it would be ok.
> 
> I cannot really tell what I prefer for that :/
> 
> 
> So the only important mirror issue left to solve is the greyzone area.
> And well, that's quite complex.
> 
> So we can either :
> 
> 1) decide to not care ( ie everything in core )
> 2) decide to not offer them at all ( aka offload to PLF )
> 3) decide to add a media ( aka the "codecs" media )
> 
> 1 is the simplest. But maybe not really a good idea.
> 
> If we care, then what indeed should be done is another media, and let
> admins choose to mirror it or not. I would even propose to revise the
> idea of separation every year, because if all mirrors have the
> 2 medias, no need to split in reality ( I doubt it will happen, but
> at least, this would show that we try to revise our foundation on a regular
> basis ). And at least, we should revise the packages present in such
> medias. If there are some packages that can be moved to core,
> then they should.
> 
> We could also simplify the BS a bit by placing non-free packages there
> ( instead of either having a non_free media, or the non_free packages in
> core ). It would sadden me a little to blur the line between "free with
> patents problems" and "non free", but my PLF experience showed that most
> people do not care, and that it requires more than a media separation.
> 
> So, in the end, we would have :
> 
> core/
>   release
>   updates
>   updates_testing
>   backports
>   backports_testing
> 
> "overflow"/    <- big packages, just for mirroring issues
> restricted/    <- with non_free, firmware, "codecs"
> 
> with the 5 directories under them, and with src, debug, binary.
> Imho, 3 upper medias is the simplest we can have ( besides debug/src, that
> I would place also on the same level as the binaries, but my
> mail is already long enough :/ )
> 
> > For codecs either a extra_codecs or simply drop after a grace period.
> > but I guess codecs are important to people, so hopefully they won't
> > get orphaned...
> 
> Unfortunately, there is not always a relation between "being important
> to users" and "someone wants to take the burden of maintaining it" :/
> For example, something like etherpad would be nice for users,
> yet no one will take time to maintain it.

I agree with you partly (mostly on the basis that mirror setup should be 
primarily for mirror admins), however:
 - some of those big packages are pretty much core
 - a big core repo also means big hdlists, and you should take into
consideration that some people are on a regular phone-line internet connection.
 - I'm not entirely sure that mirror admins would like the overflow idea:
    - if you're a small public mirror (ie: limited storage), you would not
mirror the overflow; however, some big packages would be pretty essential.
    - separating extra (unmaintained packages) and games would seem easier,
also for follow-up (ie: when problems arise).
    - another point: what about those big packages and their dependencies
(or rather, the other packages which depend on them)?
 - I don't believe unmaintained packages are something that can be avoided

