
SourceForge: the "Hotel California" of open source projects?

You can check out any time you like, but you can never leave

SourceForge (SF) provides a valuable service to the free and open source software communities, but it is not without its flaws. It is quite common for projects to move away from SF, for a variety of reasons, as they mature and gain popularity. Unfortunately, because of a well-intentioned data retention policy at SF, this can lead to projects being held hostage by the high regard search engines have for SF.

SF is one of the earliest providers of free project hosting, claiming over 100,000 projects and over one million registered users. It provides source code repositories, mailing lists, bug tracking, and download space for releases, and has recently added wikis for the projects hosted there. For many small projects it has been an essential part of the infrastructure. It provides a way to draw developers' attention and it is a place for users to get information and releases.

At least partially because of its popularity, SourceForge has its share of problems. Complaints about the tools chosen, user interface, number and type of advertisements, etc. are commonly heard. Perhaps the biggest issue for most projects is the availability of the site. Development grinds to a halt if the SF server goes down; communication disappears without the mailing lists and, because it uses centralized source code management, no code can be checked in or out. SF becomes the single point of failure for the entire project.

If a project gets unhappy enough with SourceForge, it can, of course, just pick up and move elsewhere. There are other project hosting sites available, some geared towards particular kinds of projects. It is likely that other sites suffer many of the same shortcomings as SF, so projects often find their own host, where they can control the tools and advertising policies. They can also reduce the reliability problems by choosing tools that are less centralized. To its credit, SF does nothing to discourage projects from moving, but it does have a policy regarding what happens to the project's data and, ultimately, to the project's SF entry itself.

A weblog entry by kernel hacker Dave Jones gives his opinion, rather forcefully, about the retention policy. It seems he had tried to have his x86info project removed from SF, but was foiled by the policy. This rubbed him the wrong way:

My biggest beef is that of ownership. I feel I've effectively been forced to fork my own project. As I understand their policies, the terms mention that they won't remove projects that have released code just in case someone wants to fork an earlier version, or see the older history. In my case, I have a complete preservation of history in the git tree imported from the original CVS, along with tarballs of all releases. Should someone wish to fork my project, they'd be far better served by grabbing either of those than the 4 year old code stagnating in the CVS attic at sourceforge.

Search engine ranking plays a big role in his annoyance as well. A page at SF with a particular project name attached to it will be at or near the top of any search engine results. Anyone looking for the project is likely to end up at the SF site and will need another hop to get to the active site, if they see the link at all. As Jones puts it:

So now I'm left with one line of text forwarding to the new site, amongst a sea of commercials for sourceforge's "services".

The policy is for the protection of the code and the project, so that a loose cannon project administrator cannot, in a fit of pique, get the project and all of its files deleted. It also protects against data loss when projects move, but then disappear from their new site. There is certainly nothing wrong with the policy per se, but it has some, probably unintended, side effects.

SF has built up a well-deserved reputation as a solid, if a bit annoying, home for projects, and it certainly cannot be faulted for the trust that search engines have in it. There is also nothing wrong with providing a repository for old releases of open source software. It would just be nice if they could provide what Jones calls the "yes, I really know what I'm doing, and I understand your reasons, but please kill this project" option. In some ways like the trademark issue described on this page last week, this adds another decision that a project leader may need to consider in the early stages of a project.




SourceForge: the "Hotel California" of open source projects?

Posted Jun 14, 2007 1:44 UTC (Thu) by jamienk (guest, #1144) [Link] (6 responses)

If there is just a small bit of text saying "This project is now actively developed at site X now," but this might not be seen in the sea of ads, then why not advocate for SF to have a "New Project Location Link" feature, where the devs can indicate that the project has moved and SF can try to ensure that the link is very visible...?

SourceForge: the "Hotel California" of open source projects?

Posted Jun 14, 2007 6:08 UTC (Thu) by zooko (guest, #2589) [Link] (5 responses)

It already has that feature, e.g.:

http://sourceforge.net/projects/pyutil

SourceForge: the "Hotel California" of open source projects?

Posted Jun 14, 2007 19:55 UTC (Thu) by giraffedata (guest, #1954) [Link] (4 responses)

That page looks like just what Dave is complaining about: the two lines that say "this is not the page you're looking for" are drowned out by all the other stuff on the page. Some people read a page from top to bottom, but many approach a page quite differently and would in fact miss those lines.

A redirect page would look like a redirect page, not a normal SF project page with two lines in the middle saying not to use it. However, to solve the Google problem, SF would also have to obliterate the old project page while somehow still giving seekers of the old project access to it. That's probably hard.

SourceForge: the "Hotel California" of open source projects?

Posted Jun 14, 2007 21:04 UTC (Thu) by kamil (guest, #3802) [Link] (2 responses)

Couldn't robots.txt be used to distinguish between Google and ordinary people?

In the Alternative...

Posted Jun 14, 2007 22:26 UTC (Thu) by GreyWizard (guest, #1026) [Link]

Good point. Alternately, SourceForge could change the URL for the old code, making it findable but giving the new project page a chance to pull ahead in search rankings (or not).

SourceForge: the "Hotel California" of open source projects?

Posted Jun 16, 2007 2:29 UTC (Sat) by giraffedata (guest, #1954) [Link]

Well, that isn't really how robots.txt is supposed to be used. Archived project pages are exactly the kind of thing Google's spider is interested in.

I think Dave's issue is really with Google (or search technology in general), not with Sourceforge.

SourceForge: the "Hotel California" of open source projects?

Posted Jun 15, 2007 9:00 UTC (Fri) by jospoortvliet (guest, #33164) [Link]

The page has ads, true, but the first thing I did see was the mention that the page has moved. Second, I noticed the ads. It's not that terrible, imho. It's really not like you have to search for the link to the new page or something.

I do however have problems with the fact ALL ppl searching for this project on google will end up on the sourceforge site. That's just not right, they should end up on the right page. If Sourceforge doesn't want to remove the page, they should redirect immediately, or at least automatically after 5 seconds.
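The delayed redirect suggested above can be sketched with a standard HTML meta refresh. This is only an illustration: the function name `forwarding_page`, the project name, and the example.org URL are made up here, and nothing in it reflects how SourceForge actually builds its pages.

```python
from html import escape


def forwarding_page(project: str, new_url: str, delay_seconds: int = 5) -> str:
    """Build a minimal 'this project has moved' page with a timed redirect.

    Hypothetical helper for illustration; not a SourceForge feature.
    """
    safe_url = escape(new_url, quote=True)
    safe_name = escape(project)
    return (
        "<!DOCTYPE html>\n"
        "<html>\n<head>\n"
        f"<title>{safe_name} has moved</title>\n"
        # Browsers follow the refresh after delay_seconds; 0 redirects at once.
        f'<meta http-equiv="refresh" content="{delay_seconds}; url={safe_url}">\n'
        "</head>\n<body>\n"
        f'<p>{safe_name} is now developed at <a href="{safe_url}">{safe_url}</a>.</p>\n'
        "</body>\n</html>\n"
    )


if __name__ == "__main__":
    # Placeholder project name and URL.
    print(forwarding_page("exampleproject", "http://www.example.org/", 5))
```

A server-side HTTP 301 ("Moved Permanently") response would be the more conventional way to tell crawlers that a page has moved, but that needs control of the web server rather than of the page content.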

SourceForge: the "Hotel California" of open source projects?

Posted Jun 14, 2007 9:27 UTC (Thu) by brwk (guest, #6849) [Link] (2 responses)

What I don't understand about this concern is that sourceforge is a very good advertising location for a project, because people naturally look at it when searching for open source projects. It would seem to me little more than "good practice" for a project that may have moved its development archive elsewhere to continue to upload its package releases to sourceforge as a distribution channel. It's a useful, visible and distributed way to make a project's release tarballs more widely available and the mirroring of them is basically under the control of the project administrator. Put it up, it'll get mirrored to a lot of different archives around the world. Seems to me keeping a presence has a purpose way beyond just using its CVS/SVN servers.

Regards, Bevis.

SourceForge: the "Hotel California" of open source projects?

Posted Jun 14, 2007 16:00 UTC (Thu) by JohnNilsson (guest, #41242) [Link] (1 responses)

"It would seem to me little more than "good practice" for a project that may have moved its development archive elsewhere to continue to upload its package releases to sourceforge as a distribution channel."

Wouldn't it be even better if SF could pull code and releases automatically?

SourceForge: the "Hotel California" of open source projects?

Posted Jun 15, 2007 9:01 UTC (Fri) by jospoortvliet (guest, #33164) [Link]

Indeed.

But the projects should have a choice in this, and they don't. I think that sucks.

SourceForge: the "Hotel California" of open source projects?

Posted Jun 14, 2007 10:20 UTC (Thu) by gouyou (guest, #30290) [Link] (1 responses)

I was wondering: SF is effectively distributing software under the GPL license, so don't they have to continue providing the source ?

SourceForge: the "Hotel California" of open source projects?

Posted Jun 14, 2007 14:34 UTC (Thu) by branden (guest, #7029) [Link]

Only as long as they provide binaries corresponding to that source (which won't apply for projects that generate no object code).

You might be thinking of the 3-year requirement (clause 3b), but that only applies if one distributes binaries *without* the corresponding source code. As far as I know, SourceForge has never done this. It's more common in a "push" model of distribution, whereas SF is pull-oriented. That is, they don't mail you a binary-only CD/DVD of code. You come to their website and take what you want, source or binary. They incur no obligation under the GNU GPL to keep source around for three years if you came by and elected to take only the binary when you had an opportunity to simultaneously acquire the source.

Getting out of SF.

Posted Jun 14, 2007 11:29 UTC (Thu) by dion (guest, #2764) [Link] (2 responses)

It's relatively simple to get your project taken off SF: Simply read the terms of use and find the easiest and most civil way to violate them.

It might not be very nice, but it works and perhaps it will induce the SF people to provide a proper EOL procedure.

Getting out of SF.

Posted Jun 15, 2007 9:02 UTC (Fri) by jospoortvliet (guest, #33164) [Link]

Very good!!!!

Getting out of SF.

Posted Jun 23, 2007 10:38 UTC (Sat) by zotz (guest, #26117) [Link]

"It's relatively simple to get your project taken off SF: Simply read the terms of use and find the easiest and most civil way to violate them."

Yes, but what they should do for this is to remove the offender's control over the project, not remove the project from the public.

If their code retention policies mean anything.

all the best,

drew

SourceForge: the "Hotel California" of open source projects?

Posted Jun 14, 2007 14:28 UTC (Thu) by amikins (guest, #451) [Link] (2 responses)

Personally, I agree with the intent of Sourceforge's policy. I think if you use them to publish open source/free software, then there's some responsibility on their part to make sure it remains available indefinitely. Having such a "no, really, I said delete it" option violates both their intent and one of their critical roles. As mentioned in the article, it is all too common that a project has useful code on sourceforge, then moves on, only to vanish into the nether.
If someone is looking for code that may solve a problem they're having, or looking for a base on which to build a project, having the sourceforge repository available is indispensable, even if you don't like sourceforge itself.
I think a proper solution to respect the wishes of developers while still maintaining this role would be to have a method of 'archiving' a project, such that it isn't in the normal list of projects. Instead, it goes to a separate section of the site, and an attempt to access the project page yields a "This isn't actively maintained anymore. More current development can be found at <link>. For the Sourceforge archival entry, click <link>." If you -must- visit a separate page that's clearly archived to get the code from sourceforge, that underscores the "we don't live here anymore" point FAR more than a little piddly link like they currently allow.
This would more effectively solve the stated problem without compromising Sourceforge's intent.

Requirements gathering

Posted Jun 14, 2007 17:23 UTC (Thu) by JLCdjinn (guest, #1905) [Link]

I agree with this. SourceForge should do its best to maintain projects' histories, but it should also provide a comprehensive set of redirect features when active development has moved elsewhere. The comments on this article have highlighted some requirements for these features:

  • switchboard for allowing a project administrator to indicate that the project should be forwarded
  • separate project lists or visual indicators for active and forwarded projects
  • strongly overt visual indication for forwarded projects, with the forwarding location
  • explicit search engine exclusion (e.g. with robots.txt) for forwarded projects
  • all project tools (e.g. version control, mailing lists, and bug trackers) become read-only
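The search engine exclusion item in the list above can be illustrated with Python's standard `urllib.robotparser`, which reads robots.txt rules the way a well-behaved crawler would. This is a rough sketch: the `/projects/oldproject/` path and the rules themselves are invented for the example and are not SourceForge's actual robots.txt.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical rules for a forwarded project; the path is invented.
ROBOTS_TXT = """\
User-agent: *
Disallow: /projects/oldproject/
"""

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())


def crawlable(url: str, agent: str = "Googlebot") -> bool:
    """Return True if the rules above let the given crawler fetch the URL."""
    return parser.can_fetch(agent, url)


if __name__ == "__main__":
    for url in (
        "http://sourceforge.net/projects/oldproject/",
        "http://sourceforge.net/projects/otherproject/",
    ):
        print(url, "crawlable" if crawlable(url) else "excluded")
```

Note that robots.txt is only a request to crawlers. Human visitors who already have the URL still reach the page, so the archived code stays available while compliant search engines stop fetching it.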

Others?

These features should be implemented by other project hosting services, as well.

SourceForge: the "Hotel California" of open source projects?

Posted Jun 14, 2007 20:02 UTC (Thu) by giraffedata (guest, #1954) [Link]

"I think a proper solution to respect the wishes of developers while still maintaining this role would be to have a method of 'archiving' a project, such that it isn't in the normal list of projects. Instead, it goes to a separate section of the site, and an attempt to access the project page yields a "This isn't actively maintained anymore. More current development can be found at ... For the Sourceforge archival entry, click ...""

The problem isn't the Sourceforge project list. Hardly anyone uses that. The problem is Google. Google will index that archive page as well as -- probably better than -- the "Alice doesn't live here anymore" page.

To solve the Google problem, Dave wants all mention of his project obliterated from the site. I don't see how to do that while keeping the archival stuff available to the public.

SourceForge not advisable for other reasons, like running non-free software

Posted Jun 14, 2007 14:35 UTC (Thu) by ber (subscriber, #2142) [Link]

Just for completeness, the Free Software community has had issues with SourceForge for quite a while; here is an FSFE article from 2001: SourceForge drifting.

SourceForge: the "Hotel California" of open source projects?

Posted Jun 14, 2007 15:09 UTC (Thu) by dw (subscriber, #12017) [Link] (2 responses)

I really don't understand Dave's complaint: he seems to be saying that SF would be (shock!) helping anyone who wanted to fork his project by giving them access to CVS. My understanding of this is, anyone using an open source license should be gulping hard and accepting this fact before licensing their work open source.

As pointed out by others, it's not like SF is trying to "steal" his work, he is still in full control of the project there, including having a full ability to set redirects etc.

What Dave seems to be complaining about, is trying to apply restrictions on work already licensed to allow this kind of behaviour.

SourceForge: the "Hotel California" of open source projects?

Posted Jun 14, 2007 17:59 UTC (Thu) by vmole (guest, #111) [Link]

Mr. Jones's complaint is not that SF provides access to old code, but that he has no good way to redirect people from SF to the new website for x86info. In particular, SF has such a good rep with the google algorithm that it will show up as the top link for a long time.

That said, having looked at the top link (sf.net / projects / x86info /) I think it's pretty clear that they've moved. You'd have to be pretty oblivious to think SF was current.

SourceForge: the "Hotel California" of open source projects?

Posted Jun 14, 2007 18:25 UTC (Thu) by martinfick (subscriber, #4455) [Link]

"I really don't understand Dave's complaint: he seems to be saying that SF would be (shock!) helping anyone who wanted to fork his project by giving them access to CVS."

No, if you read the article, he says that he does not have a problem with them helping someone fork, but:

"Should someone wish to fork my project, they'd be far better served by grabbing either of those than the 4 year old code stagnating in the CVS attic at sourceforge"

Where "those" refers to: "I have a complete preservation of history in the git tree imported from the original CVS, along with tarballs of all releases."

SourceForge: the "Hotel California" of open source projects?

Posted Jun 14, 2007 15:35 UTC (Thu) by trutkin (guest, #3919) [Link] (2 responses)

The problem we (Jack) have had with sourceforge is you can't delete the mailing lists once they've been created. So even though we've moved away, people sign up for the old sf.net hosted mailing lists. We've also had trouble transitioning existing users away from the old mailing lists.

SourceForge: the "Hotel California" of open source projects?

Posted Jun 14, 2007 19:40 UTC (Thu) by jengelh (guest, #33263) [Link] (1 responses)

You can deactivate lists so they at least do not show up.

SourceForge: the "Hotel California" of open source projects?

Posted Jun 16, 2007 5:04 UTC (Sat) by roelofs (guest, #2599) [Link]

"You can deactivate lists so they at least do not show up."

You can also set the welcome message(s) so it says "this is a dead list, go over yonder to subscribe to the current one." It may even be possible to set subscriptions so moderator approval is required (and never granted, obviously).

Greg

With all due respect...

Posted Jun 16, 2007 0:37 UTC (Sat) by kena (subscriber, #2735) [Link]

While I understand Dave's concerns, I can't help but think that there's a bigger picture, here: "save the source." Imagine, if you will, a developer that wishes to harness the power of OSS. They use Sourceforge, release their code under the GPL or somesuch, have a modicum of success, and, just when it's getting interesting, decide to pull their code, and re-license it under a commercial license. Sure, some folks have probably downloaded the code, but, unless it's become quite popular, there probably aren't any fellow developers, and it may be damn hard to get momentum again if there's no real community in-place yet. (For the record, I've seen this happen. There was a NetApp-like snapshot project for the 2.0 kernel that did this exact thing.) I think, instead of being able to delete the code, a "deprecated" flag should be able to be toggled that
a) points out that the code is no longer maintained, and
b) gives further contact info (e.g., a URL) to the current site.

$.02, etc.

Stupid idea?

Posted Jun 23, 2007 10:53 UTC (Sat) by zotz (guest, #26117) [Link]

OK, I have just read the whole thread. Would something like this help:

A meta tag that tells search engines to rank an alternate site higher...

<META name="highersite" content="http://www.newsite.com">

or

<META name="newsite" content="http://www.newsite.com">

Then when search engines crawl a page with this tag, they could ensure that the new site shows up higher than the old.
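To make the idea concrete, here is a rough sketch of how a crawler might read such a tag, using Python's standard `html.parser`. The `newsite` and `highersite` meta names are the hypothetical ones proposed in this comment; as far as this thread shows, no search engine recognizes them, and the class and function names below are likewise invented.

```python
from html.parser import HTMLParser


class NewSiteFinder(HTMLParser):
    """Collect the content of the hypothetical <meta name="newsite"> tag."""

    def __init__(self) -> None:
        super().__init__()
        self.new_site = None

    def handle_starttag(self, tag, attrs):
        # HTMLParser lowercases tag and attribute names, so <META> matches.
        if tag != "meta":
            return
        attributes = dict(attrs)
        name = (attributes.get("name") or "").lower()
        if name in ("newsite", "highersite"):
            self.new_site = attributes.get("content")


def find_new_site(html_text: str):
    """Return the forwarding URL declared in the page, or None if absent."""
    finder = NewSiteFinder()
    finder.feed(html_text)
    return finder.new_site


if __name__ == "__main__":
    page = '<html><head><META name="newsite" content="http://www.newsite.com"></head></html>'
    print(find_new_site(page))
```

Reading the tag is the easy part. Whether to trust it when ranking is the hard part, since any page could claim that some other site is the "newer" one.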

Would this help the issue at hand?

Could doing it this way result in other problems? (gaming it somehow for other purposes)

all the best,

drew

SourceForge: the "Hotel California" of open source projects?

Posted Jun 27, 2007 0:52 UTC (Wed) by pfalcon (guest, #45953) [Link]

I find the gentleman's logic non-monotonic. Reading his blog: "google for x86info. Note which hit gets returned first."

Aha, so it's SourceForge that sucks. Google, well, it's just there. And of course it's SF.net that should drop "his" stuff, not Google that should boost the score for *his* site. Well, he asked SF.net and got a definitive response, though maybe not one to his liking. I wonder, did he contact Google about the scoring, and what was the reply?

Also, he speaks about "decentralized source control", but he would prefer to control that decentralization. And of course, next time he gets change in mood (going commercial, maybe?), he won't grep logs and send cease'n'desists to fetchers. Obviously no, he just doesn't want a big site to distribute his code now, that's all.

While he for sure won't be changing his mind so radically, some projects do. Many projects of those would just send DMCA cease and desist, and have their GPL code removed. Yet some apparently understand both legal and moral binding of a license, so don't go that far. They remove their own source mirrors, deactivate downloads, kill docs, but at least leave project in such state. And interested parties at least somehow may dig out something out there. Good that or bad? Well, I don't envy those "interested parties" to dig in such dumps. But those caring about morals of OpenSource, not just pragmatics ("it's easier to maintain project as open source because many services are free, and when we want, we just close it down"), may find it amusing that OpenSource license is enforced even in such strange ways.

Morals of the last paragraph towards the specific case in question? It's not possible to know what are intentions of project admins who want to close down projects. That's why there's formal retention policy.


Copyright © 2007, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds