Snellman: On open sourcing existing code

[Posted March 20, 2015 by n8willis]

Juho Snellman has an interesting treatise on the oft-overlooked challenges that face developers attempting to release an existing, proprietary codebase under open-source terms. "As soon as you get outside of the 'one self-contained file or directory' level of complexity, the threshold for releasing code becomes much higher. And likewise every change to a program that was made in order to open source it will make it less likely that the two versions can really be kept in sync in the long term. In this case the core code is maybe 2k-3k lines and won't require much work. It's all the support infrastructure that's going to be an issue." Snellman also reflects on possible strategies for writing internal code that may some day be released to the public.

Snellman: On open sourcing existing code

Posted Mar 21, 2015 12:15 UTC (Sat) by ledow (guest, #11753) [Link] (7 responses)

Is it just me, or is the only impression to take away from this:

We code like junk when nobody is looking.

If ever there's an argument for open-source, that has to be it. It's not the be-all-and-end-all, however (OpenSSL is currently proving that, but it's feasible that "nobody is looking" applied in that case too).

Snellman: On open sourcing existing code

Posted Mar 21, 2015 13:33 UTC (Sat) by zyga (subscriber, #81533) [Link]

I code open source for a living. I'm doing my best but there are days where the code I make is not what I want to make because deadlines and other events are coming up.

I think it's hard to say which code is better. I've seen some terrible proprietary code. I've seen some terrible free software code (not that it was anything in the center of the spotlight though).

Still, I must agree there is something to your argument. It's in our inner desires to show ourselves from the best possible view. Knowing that the code we write is out in the view for everyone to see and will stick with our name forever is always a motivating factor to do the best possible thing at the time.

Snellman: On open sourcing existing code

Posted Mar 21, 2015 18:33 UTC (Sat) by dskoll (subscriber, #1630) [Link]

> We code like junk when nobody is looking.

No, I don't think it's as simple as that. I work on both free and proprietary software and it really is a different environment. I'm very proud of our actual proprietary software and I would have no issues whatsoever with anyone seeing the source. In fact, we ship the source, which is buildable on pretty much any UNIX-like platform.

However, because we have a pretty constrained build and test environment, I'm not quite so proud of our build infrastructure and test scripts. To generalize those and clean them up would be an awful lot of work and we simply can't justify the effort.

Usually in a proprietary development project, you have the luxury of these sorts of constrained environments, so your infrastructure code isn't as clean and portable as it could be. It's far easier to accumulate technical debt in such an environment than in an open-source environment, and business concerns rather than aesthetics dictate if and when the technical debt is repaid.

Snellman: On open sourcing existing code

Posted Mar 22, 2015 20:09 UTC (Sun) by NAR (subscriber, #1313) [Link]

It's usually not the coding, but the build and test environment. When there's only one build server, there's not much point in making that configurable. Also, if the version and the path to the 3rd-party libraries are fixed, no one's gonna pay to write scripts to detect, configure or check them.

Snellman: On open sourcing existing code

Posted Mar 23, 2015 9:26 UTC (Mon) by fb (guest, #53265) [Link] (1 responses)

> We code like junk when nobody is looking.

Sorry but that just isn't true.

I've worked for both proprietary and FOSS projects as a full time job.

FOSS projects tend to be better organized than proprietary projects of similar size. It's not a question of code quality but of the infrastructure around the code (e.g. instructions on how to do this or that, build files, the amount of work/configuration it takes to actually build it, etc.).

I don't think this is because of more or less altruism or concern with aesthetics. The core reason IMO is that a FOSS project (any large-ish successful one) **has to** put some effort into cleaning up the infrastructure because otherwise:
- devs will get pestered over and over again on IRC with simple questions. Often by people that think that a 'dev on IRC' is a 'development help desk at their disposal'
- you lose any chance of having any 'accidental contributors', who come by, fix something, send a pull request and go away.
- new full time contributors are often located a continent away

So cleaning up has a real, concrete payoff other than 'the project is neater now'.

On a proprietary code base you don't have 'accidental' contributors. New developers are assigned to work on the project, often sitting two desks away. So the payoff from having perfectly clean build files and perfectly documented README files is, in practice, a lot lower.

Snellman: On open sourcing existing code

Posted Mar 23, 2015 14:14 UTC (Mon) by pboddie (guest, #50784) [Link]

I was going to comment on this much earlier, but many of the helpful commentators have made some of the same points on my behalf, and in more detail, too!

What comes into play here is what Brooks describes in the classic "The Mythical Man-Month", which despite discussing large software projects in the main, touches on some of the same problems when you make a software project for more general consumption. You can certainly get away with developing something with minimal documentation, implicit build dependencies, and with a good-enough but infrastructure-specific build system, but as soon as you have other people wanting to use that software (either as actual users or other developers, either within or beyond your organisation), the software becomes a product and demands multiples of the original effort put in to make something that is functional.

Something that is openly developed often has to maintain this readiness for external usage at the cost of time spent on things like documentation, although some Free Software projects manage to muddle through by making new users and contributors "learn the ropes" the hard way. But aside from exceptional cases where the motivation to adopt the software overrides deficiencies in the materials of the "product" and where people will use it no matter what, many projects will struggle to get the attention of their target audience. The last thing potential adopters want to see is a dependency on arcane technologies or a menagerie of weird programs and libraries (or worse: network services) just to even evaluate whether some software does what it is claimed to do.

So, projects developed in the open may very well be oriented towards being some kind of consumable "product", although many of them may also be looking for contributors to help them get to that state, and being exposed to broader requirements and demands (running on different systems and in different environments) by potentially attracting outside help to meet such needs probably keeps the applicability at a higher level than some internal software that will eventually be thrown over the wall.

In short, something without a broader audience in mind probably won't have had much extra effort put in to make it a "product", regardless of the licensing applied to that software, but openly-developed - not merely freely-licensed - software may well have been incurring the costs of building a "product" all along in order to grow the project and get people to use it.

Code Reuse vs Reduced Dependencies

Posted Mar 24, 2015 10:50 UTC (Tue) by gmatht (guest, #58961) [Link]

There were examples of coding practices that were better for internal use, but less so for external release. For example, code reuse is good, so using convenience libraries is "good", but in an open source application reducing dependencies can be more important if you only use a little bit of a library.

Snellman: On open sourcing existing code

Posted Mar 26, 2015 23:00 UTC (Thu) by deepfire (guest, #26138) [Link]

Time pressure is what you're looking for -- it cannot fail to affect the outcome... sadly.