Re: Pristine source archive

Lists: spi-general
From: Glenn McGrath <bug1(at)optushome(dot)com(dot)au>
To: spi-general(at)lists(dot)spi-inc(dot)org
Subject: Pristine source archive
Date: 2002-04-13 07:24:16
Message-ID: 20020413172416.75406765.bug1@optushome.com.au

The free software community is wasting resources due to the fragmented
approach to the format of source code distribution.

The source packages of all distributions are based on the same upstream
source code; however, distributions combine their metadata and patches in
different, possibly incompatible ways.

The problems with this approach:
- Storage space: Mirrors contain the source code packaged in different
  ways for every distribution they mirror.
- Availability: If a user wants the pure source code of an application,
  it's easiest to go via the project's home page; distributions _may_
  have modified the source code, or the latest version may not be
  available yet.

I would like to see a co-operative distribution method between
distributions and the individual project release managers that:
- Allows upstream project managers to optionally sign, or in other ways
  authenticate, that the source in the archive is unmodified.
- Allows distributions to use the pristine source archive as the primary
  source of their upstream source code, leaving only the metadata and
  patches in their source package.
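
As a rough illustration of the split described above, here is a minimal
Python sketch in which a distribution's source package carries only
metadata and patches and points at the shared upstream file by digest.
The SourcePackage layout, its field names and the file names are
assumptions made for this example, and a plain SHA-256 digest stands in
for whatever signing scheme upstream release managers might actually use.

```python
import hashlib
from dataclasses import dataclass, field


@dataclass
class SourcePackage:
    """Distribution-side source package: metadata and patches only (hypothetical layout)."""
    name: str
    version: str
    upstream_file: str      # file name in the shared pristine archive
    upstream_sha256: str    # digest published for the upstream release
    patches: list = field(default_factory=list)


def sha256_hex(data: bytes) -> str:
    """Hex SHA-256 digest of a byte string."""
    return hashlib.sha256(data).hexdigest()


def is_pristine(pkg: SourcePackage, tarball: bytes) -> bool:
    """True if the fetched tarball matches the digest recorded in the package."""
    return sha256_hex(tarball) == pkg.upstream_sha256


# Stand-in bytes instead of a real tarball, to keep the sketch self-contained.
tarball = b"pretend this is foo-1.0.tar.gz"
pkg = SourcePackage("foo", "1.0", "foo-1.0.tar.gz",
                    sha256_hex(tarball), ["01-fix-paths.diff"])
print(is_pristine(pkg, tarball))         # True
print(is_pristine(pkg, tarball + b"x"))  # False
```

With a layout like this, anything a distribution changes has to live in
the patches list, since altering the upstream file itself would make the
digest check fail.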

I believe this approach would:
- Make distributions more accountable to their users, as it's easier to
  see how a project's source code has been modified by the distribution.
- Reduce the space requirements for mirroring free software, possibly
  leading to greater availability of source code.
- Make it easier for smaller, unrecognised distributions, as they can
  draw from the common pure source archive.
- Encourage greater co-operation between distributions... maybe even one
  day leading to a common metadata format.

I imagine the biggest problem with this idea is getting distributions to
use it, which would probably require modification of their archive and
package management tools.

Distributions that use it would have to be able to upload to the master
archive if the source package wasn't already there, and not be allowed to
remove a package unless it's not used by anyone... which may be hard to
determine.
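
One possible way to make the "not used by anyone" rule checkable is for
the archive to record which distributions reference each file. The sketch
below is a toy model only: the SourcePool class and its method names are
invented for illustration, and a real archive would also need
authentication and persistent storage.

```python
class SourcePool:
    """Toy model of the shared archive: tracks which distributions use each file."""

    def __init__(self):
        self._users = {}  # file name -> set of distribution names

    def upload(self, filename, distro):
        """Register a file (if new) and record that `distro` depends on it."""
        self._users.setdefault(filename, set()).add(distro)

    def release(self, filename, distro):
        """Record that `distro` no longer ships anything built from this file."""
        self._users.get(filename, set()).discard(distro)

    def remove(self, filename):
        """Delete the file only if no distribution still references it."""
        if self._users.get(filename):
            return False
        self._users.pop(filename, None)
        return True


pool = SourcePool()
pool.upload("sysvinit-2.84.tar.gz", "debian")
pool.upload("sysvinit-2.84.tar.gz", "redhat")
pool.release("sysvinit-2.84.tar.gz", "debian")
print(pool.remove("sysvinit-2.84.tar.gz"))  # False: redhat still uses it
pool.release("sysvinit-2.84.tar.gz", "redhat")
print(pool.remove("sysvinit-2.84.tar.gz"))  # True: no users left
```

This only works if every participating distribution reliably reports when
it stops using a file, which is the hard part in practice.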

I mentioned it here as I think such an effort would have to be independent
of any one distribution.

What do you all think? Is it a good or bad idea, and would it have any
chance of being accepted?

Glenn


From: "M(dot) Drew Streib" <dtype(at)dtype(dot)org>
To: Glenn McGrath <bug1(at)optushome(dot)com(dot)au>
Cc: spi-general(at)lists(dot)spi-inc(dot)org
Subject: Re: Pristine source archive
Date: 2002-04-13 08:06:19
Message-ID: 20020413080619.GR21792@dtype.org

On Sat, Apr 13, 2002 at 05:24:16PM +1000, Glenn McGrath wrote:
> I imagine the biggest problem with this idea is getting distributions to
> use it, which would probably require modification of their archive and
> package management tools.

Actually, the biggest obstacle would be the loss of control of
a distribution to be able to fully test a controlled set of binaries.

In a sense, both Red Hat and Debian already do this (as do many others)
in that the SRPMS or source dpkgs do include the original upstream
source (usually) and patches made by the distribution. The binary
release can obviously not do this.

I agree in philosophy, but might argue that in practice, distributions
already do this for the majority of packages now. It isn't that hard
to see what a distribution has changed on any given piece of software,
as patches are available.

Don't underestimate the need for controlled testing of a full binary
distribution, something which a source-only distribution can unfortunately
not accomplish. (Not that I don't think there is use for these,
but there are certainly big disadvantages.)

-drew

--
M. Drew Streib <dtype(at)dtype(dot)org>, Free Standards Group (freestandards.org)
co-founder, SourceForge.net | core team, freedb | sysadmin, Linux Intl.
creator, keyanalyze report | maintnr, *.us.pgp.net | other freedom/law


From: Glenn McGrath <bug1(at)optushome(dot)com(dot)au>
To: "M(dot) Drew Streib" <dtype(at)dtype(dot)org>
Cc: spi-general(at)lists(dot)spi-inc(dot)org
Subject: Re: Pristine source archive
Date: 2002-04-13 08:32:57
Message-ID: 20020413183257.6da12937.bug1@optushome.com.au

On Sat, 13 Apr 2002 08:06:19 +0000
"M. Drew Streib" <dtype(at)dtype(dot)org> wrote:

> On Sat, Apr 13, 2002 at 05:24:16PM +1000, Glenn McGrath wrote:
> > I imagine the biggest problem with this idea is getting distributions
> > to use it, which would probably require modification of their archive
> > and package management tools.
>
> Actually, the biggest obstacle would be the loss of control of
> a distribution to be able to fully test a controlled set of binaries.
>
I don't follow. If the upstream source is in a different location from the
remaining parts of the source package (metadata/patches), the binary can
still be built and tested.

> In a sense, both Red Hat and Debian already do this (as do many others)
> in that the SRPMS or source dpkgs do include the original upstream
> source (usually) and patches made by the distribution. The binary
> release can obviously not do this.
>

Yes, SRPMS and the Debian and Slackware source packages contain what is
most likely pristine upstream source. My point is: why not separate the
common source from each of these into one common pool, instead of having
the same source in over a dozen locations on most free software mirrors?
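
To get a feel for the possible saving, the pooling idea can be sketched
as content-addressed deduplication: identical upstream files are stored
once, keyed by digest. The mirror contents below are made-up stand-in
byte strings, not measurements of any real mirror.

```python
import hashlib


def pooled_size(mirror):
    """Return (bytes stored per distribution, bytes stored in one shared pool)."""
    total = 0
    unique = {}  # digest -> size, so identical files are counted once
    for files in mirror.values():
        for data in files:
            total += len(data)
            unique[hashlib.sha256(data).digest()] = len(data)
    return total, sum(unique.values())


# Hypothetical mirror: each distribution carries its own copy of file "A".
mirror = {
    "debian": [b"A" * 1000, b"B" * 500],
    "redhat": [b"A" * 1000, b"C" * 200],
    "slackware": [b"A" * 1000],
}
print(pooled_size(mirror))  # (3700, 1700)
```

The saving depends entirely on how often distributions ship byte-identical
upstream files; repacked or recompressed tarballs would not deduplicate
this way.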

> I agree in philosophy, but might argue that in practice, distributions
> already do this for the majority of packages now. It isn't that hard
> to see what a distribution has changed on any given piece of software,
> as patches are available.
>
Well, with Debian there are situations where the "upstream source" has
been modified. Debian only handles .tar.gz, so if the upstream source is in
.tar.bz2, or if the package maintainer wants to handle multiple patches
separately, then the real upstream source is hidden inside a false upstream
source package. I don't know how pure SRPM and Slackware sources are.

> Don't underestimate the need for controlled testing of a full binary
> distribution, something which a source-only distribution can
> unfortunately not accomplish. (Not that I don't think there is use for
> these, but there are certainly big disadvantages.)
>

I'm not proposing a new distribution as such; it would just be something
that existing distributions could use as the mirror for the upstream
source of their packages.

Glenn


From: Peter S Galbraith <GalbraithP(at)dfo-mpo(dot)gc(dot)ca>
To: Glenn McGrath <bug1(at)optushome(dot)com(dot)au>
Cc: spi-general(at)lists(dot)spi-inc(dot)org
Subject: Re: Pristine source archive
Date: 2002-04-15 13:25:01
Message-ID: 20020415132501.2B01B2981F@mixing.qc.dfo.ca

> The free software community is wasting resources due to the fragmented
> approach to the format of source code distribution.

> I would like to see a co-operative distribution method between
> distributions and the individual project release managers that
> - Allows distributions to use the pristine source archive as the primary
>   source of their upstream source code, leaving only the metadata and
>   patches in their source package.

Does this respect the letter of the GPL and not just the spirit?
We distribute binaries and rely on a third party to distribute source?
Or are we a member of that third party, making that okay?

Just wondering...

Peter


From: "Dale E Martin" <dmartin(at)cliftonlabs(dot)com>
To: spi-general(at)lists(dot)spi-inc(dot)org
Subject: Re: Pristine source archive
Date: 2002-04-15 15:03:47
Message-ID: 20020415150347.GA926@cliftonlabs.com

> Does this respect the letter of the GPL and not just the spirit?
> We distribute binaries and rely on a third party to distribute source?
> Or are we a member of that third party, making that okay?
>
> Just wondering...

The GPL says that the source code has to be available. It doesn't say
anything about who or where from. So if the source is truly pristine,
(centralized or not) upstream availability qualifies. Distro-specific
patches would need to be available, which was part of the suggestion.

My thoughts about this proposal in general:

1) Distros won't want to upgrade simultaneously, so you'll end up with many
versions of each application in the upstream repository. I.e. the union of
all of the current archives (minus the duplication, of course, which is the
current "problem" in the proposer's view.)

2) Not every distro uses the same set of tools, so you might end up with a
bunch of different upstreams of the same applications. Certain tools (like
"procps") seem like they have wide variance between distros - perhaps
even being totally different upstream.

3) The upstream repository would need more bandwidth than any current
distro's source repository, since it would be getting mauled by the users
of all of the distros.

4) The source repository is a critical bit of infrastructure to any distro,
and you'd be taking it out of their control. I'm thinking most of the
distros would not like that, particularly the commercial ones.

5) The current distributed nature is a benefit in many ways - redundancy
being one of them...

One of the things that would be cool about the proposal would be that the
baseline tools common to all distros might be agreed upon, and then
security auditing might be easier. Basically if everyone agreed that
"sysvinit" version 2.84 was golden within some time period, then each
distro could have some resources dedicated to security audits of the code.
The proposed arrangement might make it easier to see the common codebases
and track the usage...

Later,
Dale
--
Dale E. Martin, Clifton Labs, Inc.
Senior Computer Engineer
dmartin(at)cliftonlabs(dot)com
http://www.cliftonlabs.com
pgp key available


From: Antti-Juhani Kaijanaho <gaia(at)iki(dot)fi>
To: spi-general(at)lists(dot)spi-inc(dot)org
Subject: Re: Pristine source archive
Date: 2002-04-15 15:34:27
Message-ID: 20020415153427.GA5203@kukkaruukku.keltti.jyu.fi

On 20020415T110347-0400, Dale E Martin wrote:
> The GPL says that the source code has to be available. It doesn't say
> anything about who or where from.

Have you read the GPL? This is _not_ what it says.

The GPL says that if you provide binaries, you have to provide source,
or promise (in writing) to provide source to anybody for at least three
years. The only exception is that if you've received such a promise,
you may forward it along with the binaries, if you distribute the
binaries noncommercially.

--
Antti-Juhani Kaijanaho, LuK (BSc) * http://www.iki.fi/gaia/ * gaia(at)iki(dot)fi


From: Dale E Martin <dmartin(at)cliftonlabs(dot)com>
To: spi-general(at)lists(dot)spi-inc(dot)org
Subject: Re: Pristine source archive
Date: 2002-04-15 16:46:25
Message-ID: 20020415124625.A13520@clifton-labs.com

> > The GPL says that the source code has to be available. It doesn't say
> > anything about who or where from.
>
> Have you read the GPL? This is _not_ what it says.

Let me apologize for my oversimplification.

Within the context of the original question, my point was simply that the
GPL doesn't say "you have to keep the source on your own site". You _are_
responsible for making sure people can get the source; since simplifying
that very process is one of the key features of the proposal, I glossed
over that.

I do maintain that, as a distributor of binaries, I technically don't
have to be the one physically storing source. I can distribute (from
pristine source) binaries and if someone wants source, I can say "I got
them from http://www.giantsourcearchive.org, go get them there." If the
binaries aren't from pristine source, I _do_ have to provide access to the
patches as well - which I did point out. (This argument assumes you
consider the Internet "a medium customarily used for software
interchange.")

Perhaps your point is that if those sources disappear then I would be in
violation of the GPL - I agree. Hence my point that losing control of the
source archive would be a problem for most distros.

IANAL, and don't even pretend to be one on email lists. Disregard
everything I say freely ;-)

Take care,
Dale
--
Dale E. Martin, Clifton Labs, Inc.
Senior Computer Engineer
dmartin(at)cliftonlabs(dot)com
http://www.cliftonlabs.com
pgp key available


From: Antti-Juhani Kaijanaho <gaia(at)iki(dot)fi>
To: spi-general(at)lists(dot)spi-inc(dot)org
Subject: Re: Pristine source archive
Date: 2002-04-15 17:38:53
Message-ID: 20020415173853.GE5203@kukkaruukku.keltti.jyu.fi

On 20020415T124625-0400, Dale E Martin wrote:
> Within the context of the original question, my point was simply that the
> GPL doesn't say "you have to keep the source on your own site".

Yes it does.

"If distribution of executable or object code is made by offering
access to copy from a designated place, then offering equivalent
access to copy the source code from the same place counts as
distribution of the source code, even though third parties are not
compelled to copy the source along with the object code."

(Or rather, it does not allow not having the source on your own site.)

> I can distribute (from
> pristine source) binaries and if someone wants source, I can say "I got
> them from http://www.giantsourcearchive.org, go get them there."

No, you can't, at least according to my reading of the GPL.

> IANAL, and don't even pretend to be one on email lists. Disregard
> everything I say freely ;-)

Neither am I AL.

You should still read and understand the license. That does not require
being AL.

--
Antti-Juhani Kaijanaho, LuK (BSc) * http://www.iki.fi/gaia/ * gaia(at)iki(dot)fi


From: Dale E Martin <dmartin(at)cliftonlabs(dot)com>
To: spi-general(at)lists(dot)spi-inc(dot)org
Subject: Re: Pristine source archive
Date: 2002-04-15 18:16:59
Message-ID: 20020415141659.A16462@clifton-labs.com

> "If distribution of executable or object code is made by offering access
> to copy from a designated place, then offering equivalent access to copy
> the source code from the same place counts as distribution of the source
> code, even though third parties are not compelled to copy the source
> along with the object code."

This says to me "if you put the binary on an ftp site, putting the source
there counts as a distribution" - i.e. you don't have to do MORE than that,
it's sufficient. You don't have to mail people CDs on demand or something
else beyond simply having the source on the ftp site.

But it doesn't say I'm _compelled_ to do it that way. Before that, in part
B of Section 3, it tells me what I am compelled to do - make a written
offer that you can get the source in some way compatible with Section 1 &
2, charging no more than the cost of the distribution itself. How I
provide you with the source is simply stated as "on a medium customarily
used for software interchange". Pointing the end user at a 3rd party URL
counts (in my opinion) as long as the source is really there. I'd be
foolish to do it that way (and only that way) if there was some reasonable
chance that the third party URL was the only source, and that it would
disappear.

AND, if I'm actually distributing modified versions of anything, I do have
to supply my mods too. This is a very important point! For instance, my
reading says that if Lindows is based on Debian, then they could simply
distribute their own diffs and point at Debian's ftp site for the rest. If
Debian shut off their ftp site, then Lindows would be in violation of the
GPL until they found a new source of it.

The practical consequence of these terms means that I would be wise to keep
a copy of the source on my own site in case it disappears upstream. Then I
could distribute the source myself if I could no longer point people at the
available 3rd party source. The context of this discussion, however,
assumes a giant site whose purpose in life is to hold upstream source of
the Linux distribution universe. If that was the case, perhaps relying on
this site would be OK, just for the sake of argument.

> No, you can't, at least according to my reading of the GPL.

Our interpretations differ then. Since we both admit to not being lawyers,
it's time to drop this thread IMHO.

> You should still read and understand the license. That does not require
> being AL.

I have read the GPL and LGPL several times, and reread the pertinent parts
of the GPL before sending my reply to you. Just because our
interpretations differ on some of the fine points doesn't mean I have not
read it; please stop implying that I have not.

Take care,
Dale
--
Dale E. Martin, Clifton Labs, Inc.
Senior Computer Engineer
dmartin(at)cliftonlabs(dot)com
http://www.cliftonlabs.com
pgp key available


From: Antti-Juhani Kaijanaho <gaia(at)iki(dot)fi>
To: spi-general(at)lists(dot)spi-inc(dot)org
Subject: Re: Pristine source archive
Date: 2002-04-15 18:28:41
Message-ID: 20020415182841.GF5203@kukkaruukku.keltti.jyu.fi

On 20020415T141659-0400, Dale E Martin wrote:
> I have read the GPL and LGPL several times, and reread the pertinent parts
> of the GPL before sending my reply to you. Just because our
> interpretations differ on some of the fine points doesn't mean I have not
> read it; please stop implying that I have not.

It certainly looked like you hadn't, as your interpretation (now it
seems) relies on a certain reading of a certain phrase, which you did
not even mention in your first two mails. Now that you've elaborated on
your interpretation I see that you have read the license.

BTW, if, as you seem to imply, distribution of A on the Internet at one
site and B on a different site and telling people who ask for B that
it's on that site, counts as "accompanying" B with A, then I don't see
why the license has to contain the last paragraph of section 3.

--
Antti-Juhani Kaijanaho, LuK (BSc) * http://www.iki.fi/gaia/ * gaia(at)iki(dot)fi


From: Dale E Martin <dmartin(at)cliftonlabs(dot)com>
To: spi-general(at)lists(dot)spi-inc(dot)org
Subject: Re: Pristine source archive
Date: 2002-04-15 19:12:21
Message-ID: 20020415151221.A19435@clifton-labs.com

> It certainly looked like you hadn't, as your interpretation (now it
> seems) relies on a certain reading of a certain phrase, which you did
> not even mention in your first two mails. Now that you've elaborated on
> your interpretation I see that you have read the license.

> BTW, if, as you seem to imply, distribution of A on the Internet at one
> site and B on a different site and telling people who ask for B that it's
> on that site, counts as "accompanying" B with A, then I don't see why the
> license has to contain the last paragraph of section 3.

If the license said that source had to accompany the binaries explicitly, I
would agree with you. It does not, however, say that source has to
"accompany" the executables. It says I have to "Accompany it with a written
offer ... to give any third party ... a complete machine-readable copy of
the corresponding source code ... on a medium customarily used for
software interchange." Your interpretation of that goes beyond what it
actually says in my opinion. For the sake of argument, what is the
difference in me saying to you "get it from
http://www.dalessite.com/source" vs "get it from
http://www.upstream.org/source", as long as:

1) It's really available where I say it is.
2) AND I really didn't modify it, OR I did give you patches.

For that matter I could release binaries on an ftp site and then have a
webpage where you could request CDs of the source for $5.

Read the part about the intent of the license in the Preamble:

"Our General Public Licenses are designed to make sure that you have the
freedom to distribute copies of free software (and charge for this service
if you wish), that you receive source code or can get it if you want it,
that you can change the software or use pieces of it in new free programs;
and that you know you can do these things."

If I really was to point you at the source that _I_ got it from, it does
not inhibit you on any of these principles, right?

As far as the last paragraph goes, it's (in my opinion) to protect someone
distributing GPL code from having to do things at the whim of some
"crazed" user. Let's say Microsoft ftps a Debian binary image from the ftp
server, or obtains it in some other way. (I'm sure that's happened.) Now
imagine they decide they want the source. So they read the README that
came with the CD that says where they can ftp it from. They come back and
say "we cancelled our Internet access and now we only have floppy drives;
if you don't send me floppies of the source (at cost of distribution),
you'll be in violation of the GPL". This clause says that Debian does NOT
have to do that; having the source on the ftp site where the binaries came
is enough.

That's different than saying that this is the only way to satisfy the
requirement, though, IMHO.

I don't advocate ANY of these approaches (aside from the simple publishing
of source code), btw. I'm just trying to understand/explain the GPL within
the context of the original proposal and the resulting question.

Later,
Dale
--
Dale E. Martin, Clifton Labs, Inc.
Senior Computer Engineer
dmartin(at)cliftonlabs(dot)com
http://www.cliftonlabs.com
pgp key available


From: Glenn McGrath <bug1(at)optushome(dot)com(dot)au>
To: spi-general(at)lists(dot)spi-inc(dot)org
Subject: Re: Pristine source archive
Date: 2002-04-16 13:25:27
Message-ID: 20020416232527.1945edaf.bug1@optushome.com.au

On Mon, 15 Apr 2002 11:03:47 -0400
"Dale E Martin" <dmartin(at)cliftonlabs(dot)com> wrote:

> My thoughts about this proposal in general:
>
> 1) Distros won't want to upgrade simultaneously, so you'll end up with
> many versions of each application in the upstream repository. I.e. the
> union of all of the current archives (minus the duplication, of course,
> which is the current "problem" in the proposer's view.)
>
> 2) Not every distro uses the same set of tools, so you might end up with
> a bunch of different upstreams of the same applications. Certain tools
> (like "procps") seem like they have wide variance between distros -
> perhaps even being totally different upstream.
>
Good point, that could be a problem. I hadn't thought of that.

> 3) The upstream repository would need more bandwidth than any current
> distro's source repository, since it would be getting mauled by the
> users of all of the distros.
>
For a site that was already mirroring the source of those distros, it
shouldn't have a major effect on bandwidth.

> 4) The source repository is a critical bit of infrastructure to any
> distro, and you'd be taking it out of their control. I'm thinking most
> of the distros would not like that, particularly the commercial ones.
>
Distros that are participating would have to have upload rights to the
master site. Deciding when to remove an app would be more of a problem;
there would have to be some automated way of determining when the source
is no longer required.

> 5) The current distributed nature is a benefit in many ways - redundancy
> being one of them...
>
> One of the things that would be cool about the proposal would be that
> the baseline tools common to all distros might be agreed upon, and then
> security auditing might be easier. Basically if everyone agreed that
> "sysvinit" version 2.84 was golden within some time period, then each
> distro could have some resources dedicated to security audits of the
> code. The proposed arrangement might make it easier to see the common
> codebases and track the usage...
>

A cleaner separation between upstream source and the distribution-modified
source would also make it easier to audit the patches distributions add,
which is possibly a more dangerous place for an exploit to reside.

You raise some good points. I will need to look into it in more depth and
try to work out more closely what the composition of such an archive would
end up being.

Thanks

Glenn