
The importance of free software to science

June 4, 2025

This article was contributed by Lee Phillips

Free software plays a critical role in science, both in research and in disseminating it. Aspects of software freedom are directly relevant to simulation, analysis, document preparation and preservation, security, reproducibility, and usability. Free software brings practical and specific advantages, beyond just its ideological roots, to science, while proprietary software comes with equally specific risks. As a practicing scientist, I would like to help others—scientists or not—see the benefits of free software to science.

Although there is an implicit philosophical stance here—that reproducibility and openness in science are desirable, for instance—it is simply a fact that a working scientist will use the best tools for the job, even if those might not strictly conform to the laudable goals of the free-software movement. It turns out that free software, by virtue of its freedom, is often the best tool for the job.

Reproducing results

Scientific progress depends, at its core, on reproducibility. Traditionally, this referred to the results of experiments: it should be possible to attempt their replication by following the procedures described in papers. In the case of a failure to replicate the results, there should be enough information in the paper to make that finding meaningful.

The use of computers in science adds some extra dimensions to this concept. If the conclusions depend on some complex data massaging using a computer program, another researcher should be able to run the same program on the original or new data. Simulations should be reproducible by running the identical simulation code. In both cases this implies access to, and the right to distribute, the relevant source code. A mere description of the algorithms used, or a mention of the name of a commercial software product, is not good enough to satisfy the demands of a meaningful attempt at replication.

The source code alone is sometimes not enough. Since the details of the results of a calculation can depend on the compiler, the entire chain from source to machine code needs to be free to ensure reproducibility. This condition is automatically met for languages like Julia, Python, and R, whose interpreters and compilers are free software. For C, C++, and Fortran, the other currently popular languages for simulation and analysis, this is only sometimes the case. To get the best performance from Fortran simulations, for example, scientists often use commercial compilers provided by chip manufacturers.
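
As a small illustration (a sketch, not taken from any real project), consider a C program that naively sums a slowly converging series in single precision. Floating-point addition is not associative, so a compiler that reorders or vectorizes the loop is allowed to change the low-order digits of the result:

    /* sum.c: naive single-precision summation of the harmonic series.
       Any compiler or option (e.g. GCC's or Clang's -ffast-math) that
       reorders or vectorizes this loop can change the low-order digits
       of the printed result. */
    #include <stdio.h>

    int main(void)
    {
        float sum = 0.0f;
        for (int i = 1; i <= 10000000; i++)
            sum += 1.0f / (float)i;
        printf("%.8f\n", sum);
        return 0;
    }

Comparing the output of "gcc -O2 sum.c" with that of "gcc -O2 -ffast-math sum.c" on the same machine is an easy way to see the effect for yourself.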

Document preparation and preservation

The forward march of science is recorded in papers, which are collected on preprint servers (such as arXiv) and on the home pages of scientists, and published in journals. It's obviously bad for science if future generations can't read these papers, or if a researcher can no longer open a manuscript after upgrading their word-processing software. Fortunately, the future readability of published papers is enabled by the adoption, by journals and preprint servers, of PDF as the universal standard format for the distribution of published work. This has been the case even with journals that request Microsoft Word files for manuscript submission.

PDF files are based on an open, versioned standard and will be readable into the foreseeable future with all of the formatting details preserved. This is essential in science, where communication is not merely through words but depends on figures, captions, typography, tables, and equations. Outside the world of scientific papers, HTML is by far the dominant markup language used for online communication. It has advantages over PDF in that simple documents take less bandwidth, HTML is more easily machine-readable and human-editable, and by default text flows to fit the reader's viewport. But this last advantage is an example of why HTML is not ideal for scientific communication: its flexibility means that documents can appear differently on different devices.

The final rendering of a web document is the result of interpretation of HTML and CSS by the browser. The display of mathematics typically depends on evolving JavaScript libraries, as well, so the author does not know whether the reader is seeing what was intended. The "P" in PDF stands for "portable": every reader sees the same thing, on every device, using the same fonts, which should be embedded into the file. The archival demands of the scientific record, combined with the typographic complexity often inherent to research papers, require a permanent and portable electronic format that sets their appearance in stone.

To aid collaboration and to ensure that their work is widely readable now and in the future, scientists should distribute their articles in the form of PDF files, ideally alongside text-based source files. In mathematics and computer science, and to some extent in physics, LaTeX is the norm, so researchers in these fields will have the editable versions of their papers available as a matter of course. Biology and medicine have not embraced the culture of LaTeX; their journals encourage Word files (but often accept RTF output). Biologists working in Word should create copies of their drafts in one of Word's text-based formats, such as .docx or .odt; though these files may not be openable by future versions of Word, their contents will remain readable. Preservation of text-based, editable source files is essential for scientists, who often revise and repurpose their work, sometimes years after its initial creation.

Licensing problems

Commercial software practically always comes with some form of restrictive license. In contrast with free-software licenses, commercial ones typically interfere with the use of programs, which often throws a wrench into the daily work of scientists. The consequences can be severe; software that comes with a per-seat or similar type of license should be avoided unless there is no alternative.

One sad but common situation is that of a graduate student who becomes accustomed to a piece of expensive commercial analytical software (such as a symbolic-mathematics program), enjoying it either through a generous student discount or because it's paid for by the department. Then the freshly-minted PhD discovers the real price of the software, and can't afford it on their postdoc salary. They have to learn new ways of doing things, and have probably lost access to their past work, which is locked up in proprietary binary files.

A few months ago, an Elsevier engineering journal retracted two papers because their authors had used a commercial fluid-dynamics program without purchasing a license for it. The company behind the program regularly scans publications looking for mentions of its product in order to extract license fees from authors. In these cases, the papers had already been cited, so their retraction is disruptive to scholarship. Cases such as these are particularly clear examples of the potential damage to science (and to the careers of scientists) that can be caused by using commercial software.

In addition, certain commercial software products with per-seat licensing "call home" so that the companies that sell them can keep track of how many copies of their programs are in use. The security implications of this should be obvious to anyone, yet government organizations, while adhering minutely to security rituals with questionable efficacy, permit their installation. While working at a US Department of Defense (DoD) lab, I was an occasional witness to the semi-comical sight of someone running around knocking on office doors, trying to find out who was using (or had left running) a copy of the program that they desperately needed to use to meet some deadline—but were locked out of.

Software rot

Ideally scientists would only use free software, and would certainly avoid "black box" commercial software for the various reasons mentioned in this article. But there is another category that's less often spoken of: commercial software that provides access to its source code.

When I joined a new project at my DoD job, the engineer that I was supposed to work with was at a loss because a key software product had stopped working after he upgraded the operating system (OS) on his workstation. The OS couldn't be downgraded and the company was no longer supporting the product. I got a thick binder from him with the manual and noticed a few floppy disks included. These contained the source code. Right at the top of the main program was a line that checked the version of the OS and exited if it was not within the range that the program was tested on. I figured we had nothing to lose, so I edited this line to accept the current OS version. The program ran fine and we were back in business.

The point of this anecdote is to illustrate the practical value of access to source code. Such proprietary but source-available software occupies an intermediate position between free software and the black boxes that should be strictly avoided. Source-available software, although more transparent, practical, and useful than black boxes, still fails to satisfy the reproducibility criterion, because the scientist who uses it can't publish or distribute the source; therefore other scientists can't repeat the calculations.

Software recommendations

The following specific recommendations are for free software that's potentially of use to any scientist or engineer.

Scientists should, when practical, test their code using free compilers, and use these in preference to proprietary options when performance is acceptable. For the C family, GCC is the venerable standard, and produces performant code. A more recent but now equally capable option is Clang.

For Fortran, GFortran (which is a front-end for GCC) is a high-quality compiler and the standard free-software choice. Several more recently developed alternatives are built, as is Clang, on LLVM. Adding to the potential confusion, two of these are called "Flang". Those interested in investigating an LLVM option should follow the project usually called "LLVM Flang", which is written from scratch in C++ and was renamed to "Flang" when it became part of the LLVM project in 2020. Its GitHub page warns that it is "not ready yet for production usage", but this is probably the LLVM Fortran compiler of the future. Another option to keep an eye on is the LFortran compiler. Although still in alpha, this project (also built on LLVM) is unique in providing a read-eval-print loop (REPL) for Fortran.

For those scientists not tied to an existing project in a legacy language, Julia is likely the best choice for simulation and analysis. It's an interactive, LLVM-based, high-level expressive language that provides the speed of Fortran. Its interfaces to R, gnuplot and Python mean that those who've put time into crafting data-analysis routines in those languages can continue to use their work.

Although LaTeX is beloved for the quality of its typesetting, especially for mathematics, it is less universally admired for the inscrutability of its error messages, the difficulty of customizing its behavior using its arcane macro language, and its ability to occasionally make simple things diabolically difficult. Recently a competitor to LaTeX has arisen that approaches that venerable program in the quality of its typography (it uses some of the same critical algorithms) while being far easier to hack on: Typst. Like LaTeX, Typst is free software that uses text files for its source format, though Typst does also have a non-free-software web application. Typst is still in alpha, and so far only one journal accepts manuscripts using its markup language, but its early adopters are enthusiastic.

A superb solution for the preparation of documents of all types is Pandoc, a Haskell program that converts among a huge variety of file formats and markup languages. Pandoc allows the author to write everything in its version of Markdown and convert into LaTeX, PDF, HTML, various Word formats, and more. Raw LaTeX, HTML, and others can be added into the Markdown source, so the fact that Markdown has no markup for mathematics (for example) is not an obstacle. The ability to have one source and automatically create a PDF and a web page, or to produce a Word file for a publication that insists on it without having to touch a "what you see is what you get" (WYSIWYG) abomination, greatly simplifies the life of the writer/scientist. Pandoc can even output Typst files, so those who use it are ready for that revolution if it comes.
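
As a sketch of that workflow (the file name and the equation here are invented for illustration), Pandoc's extended Markdown accepts TeX-style mathematics between dollar signs, so a manuscript fragment can look like this:

    # Growth rate

    The linear growth rate of the instability is $\gamma = \sqrt{gk}$,
    in agreement with the simulations described below.

A single such source can then be converted with "pandoc paper.md -o paper.pdf" (given an installed LaTeX engine) for a preprint server, or "pandoc paper.md -o paper.docx" for a publisher that insists on Word.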

Conclusion

The goals of the free-software movement include ensuring the ability of all users of software to form a community enriched and liberated by the right to study, modify, and redistribute code. The specific needs of the scientific community bring the benefits of free software into clear focus; those benefits are critical to the health and continued progress of science.

The free-software movement has an echo in the "open-access movement", which is centered around scientific publication and began in the early 1990s. It has its origins in the desire of scientists to break free of the stranglehold of the commercial scientific publishers. Traditionally, those publishers have interfered with the free exchange of ideas, while extracting reviewer labor without compensation and charging exorbitant fees for access to scientific knowledge. Working scientists are aware of the movement, and most support its aims of providing free access to papers while preserving the curation and quality control inherited from traditional publishing. It is important to also continue to nourish awareness of the crucial role that free software plays throughout the scientific world.


Index entries for this article
GuestArticles: Phillips, Lee



Engineers need to do better

Posted Jun 4, 2025 14:26 UTC (Wed) by willy (subscriber, #9762) [Link] (3 responses)

Most working scientists are not computer experts (those who are frequently get recruited into industry and stop publishing). This means that most software written by scientists is of poor quality and does not use best practices which will let it run on future substrates.

I found some code which had been written against Python 2.6 and would have needed substantial changes to make it work with 2.7. Part of that was using a library which was no longer available. And I'm no Python expert, so I just gave up.

You're right that open source software gives us an advantage, but we have to have a better legacy story than this! Whether that's preserving digital artifacts better or having a better backward compatibility story or something else ...

Engineers need to do better

Posted Jun 4, 2025 16:07 UTC (Wed) by fraetor (subscriber, #161147) [Link] (1 responses)

This is a notable issue in research, but one where progress is being made, albeit gradually. The role of a Research Software Engineer (RSE) is becoming more common to support technical development in science, and there is a growing body of practice aiming to improve the state of software. This practice includes traditional development "best practice" as well as training of scientists and pushing for the software lifecycle to be included in grants and proposals. [1]

Software is an area where university-based researchers often struggle more than their industry counterparts, largely due to the short-term nature of university funding and contracts, and a focus on publication output for promotion, etc. Over the past few years a number of UK universities have established a central pool of RSEs, often employed on a permanent basis, to mitigate this.

However, while there is a lot of focus around reproducibility, especially in the context of FAIR [2], it does seem that a lot of the effort is going towards freezing all the dependencies and effectively reproducing the original environment, whether through conda environments, containers, or VMs. I guess it is the difference between single-paper analytics and creating a reusable analytics tool.

Software is more essential to science than ever before, so this is definitely an area to keep on improving.

[1]: J. Cohen, D. S. Katz, M. Barker, N. Chue Hong, R. Haines and C. Jay, "The Four Pillars of Research Software Engineering," in IEEE Software, vol. 38, no. 1, pp. 97-105, Jan.-Feb. 2021, https://doi.org/10.1109/MS.2020.2973362
[2]: Wilkinson, M., Dumontier, M., Aalbersberg, I. et al. The FAIR Guiding Principles for scientific data management and stewardship. Sci Data 3, 160018 (2016). https://doi.org/10.1038/sdata.2016.18

Engineers need to do better

Posted Jun 4, 2025 22:01 UTC (Wed) by kleptog (subscriber, #1183) [Link]

So there's a name for that now: Research Software Engineer.

The projects at $DAYJOB that I find the most fun are the ones where I'm given a pile of code, written by some researcher or analyst to solve a problem, and asked to turn it into something usable. The joy on their faces when you spend an afternoon restructuring their code to produce something that is more readable, more reliable, and that completes in a fraction of the time is priceless.

Engineers need to do better

Posted Jun 4, 2025 17:21 UTC (Wed) by fenncruz (subscriber, #81417) [Link]

> This means that most software written by scientists is of poor quality and does not use best practices which will let it run on future substrates.

As someone who has made the move between academia and industry, I can assure you that people can and do write bad code in any industry.

LyX

Posted Jun 4, 2025 16:24 UTC (Wed) by joib (subscriber, #8541) [Link] (4 responses)

One application that is, for some reason, relatively little known is LyX, which is a kind-of-semi-WYSIWYG editor for LaTeX. In particular, it has a very good equation editor. I largely wrote my PhD thesis with it (including the journal articles). And the LaTeX it generates is fairly readable, so if there's some final tweaking you need to do that LyX doesn't support natively, you can drop down to LaTeX to do that stuff before submitting.

LyX

Posted Jun 6, 2025 18:38 UTC (Fri) by parametricpoly (subscriber, #143903) [Link] (3 responses)

Unfortunately LyX is pretty crash-prone, probably due to the fact that it was written in C++, which makes it horribly difficult to write non-crashing code. I've contributed to the project and used it for over 10 years. Unfortunately, the last time I suggested that a colleague using Windows try it out, it crashed multiple times during the first 15 minutes. It has been comparatively stable on my Linux system, but still typically crashes 1-2 times per day.

LyX

Posted Jun 6, 2025 20:05 UTC (Fri) by joib (subscriber, #8541) [Link] (1 responses)

That's too bad. I used LyX roughly from the late 1990s to the early 2010s, and I didn't have any issues with crashing (not saying it never crashed, but I don't remember it being any worse than other complex GUI applications like, say, Firefox). Maybe quality has gone down since then; if so, too bad, as I really liked it.

LyX

Posted Jun 6, 2025 22:05 UTC (Fri) by sfeam (subscriber, #2841) [Link]

I have used, and continue to use, LyX when writing papers for publication [biochemistry/computational biology]. I have never had a problem with it crashing. The biggest problem has been deciding "is it worth the time to write a LyX template to match the LaTeX template this journal will accept for submission?". It has been a pleasant surprise, going back 20 years or so, how many biology journals are willing to accept submissions in LaTeX, even though it may require bypassing the default final-stage submission process and corresponding directly with the print office rather than the editorial office.

Monkey testing LyX

Posted Jun 7, 2025 17:37 UTC (Sat) by gmatht (guest, #58961) [Link]

Heh, I have submitted 244 bug reports to LyX. Most of these are crashes/aborts found by my Jankey monkey-testing tool. Unfortunately, they couldn't fix bugs as quickly as I could find them. I bisected the crashes to find the regressing commits, but working on Coverity reports (as they do now) looks like a more effective use of developer time.

I also imagine LyX would be more reliable if you use the last release of the previous version (i.e., 2.3.8 rather than the latest 2.4.3).

Reproducibility

Posted Jun 4, 2025 22:08 UTC (Wed) by randomguy3 (subscriber, #71063) [Link] (5 responses)

I'm suspicious of the idea that reproducibility means being able to run exactly the same software, compiled in exactly the same way. It doesn't look good for the robustness of the result if changing the software, or the implementation of the algorithm, breaks it.

I think the bigger win for using free software (along with open data) is making it harder to hide flawed analysis. When different experiments appear to disagree, it can provide a way of investigating why, and whether something underhanded (or incompetent) has been going on.

Reproducibility

Posted Jun 5, 2025 0:20 UTC (Thu) by pizza (subscriber, #46) [Link] (3 responses)

> I'm suspicious of the idea that reproducibility means being able to run exactly the same software, compiled in exactly the same way. It doesn't look good for the robustness of the result if changing the software, or the implementation of the algorithm, breaks it.

Without the former, there is no point in even attempting the latter.

Let's say your new implementation produces different results with the same data. Maybe the problem is with your implementation, maybe the problem is with the algorithm, or maybe the problem is actually in the original, casting the original conclusions into question.

You often (usually?) don't know what differences from the original may turn out to be material, so ideally you'd try to precisely recreate the original results (i.e., with the same input and exactly-as-described procedure) before changing anything.

Reproducibility

Posted Jun 5, 2025 10:40 UTC (Thu) by farnz (subscriber, #17727) [Link] (2 responses)

The other critical point IME is that it's not unknown for algorithms to have external dependencies that aren't documented because "everyone" uses the same setup.

For example, I've had to work with code that started failing when recompiled for an SGI machine, rather than the Windows NT box it had been written for, and the Debian x86 boxes it had been ported to successfully. We instrumented the code base, and determined that the SGI machine was underflowing intermediates in floating point calculations, where the same code compiled for Windows or Debian was not - because intermediates were being kept in extended precision x87 stack slots, where the SGI was using double precision FPU registers.

Without the ability to compare, I'd just have a chunk of C code that the original author claimed worked "just fine" (and was backed by other users, who also found it working "just fine" on Linux and Windows NT), but which failed on my target system. With the ability to compare, I could find the problem, determine that the hidden assumption is that all FPUs are x87-compatible, and then work with the originator on a fix.
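
A miniature version of the same effect (a sketch, using overflow rather than the underflow in that code; what you actually see depends on the compiler, options, and FPU):

    /* With x87-style extended-precision evaluation, the intermediate
       a*b below can be held in an 80-bit register and survive; with
       strict 64-bit double arithmetic (SSE2, and most non-x86 FPUs)
       it overflows to infinity before the division. */
    #include <stdio.h>

    int main(void)
    {
        volatile double a = 1e308, b = 1e308;
        double x = (a * b) / 1e308;
        printf("%g\n", x);  /* may print "1e+308" or "inf" */
        return 0;
    }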

Reproducibility - NixOS

Posted Jun 5, 2025 17:51 UTC (Thu) by gmatht (guest, #58961) [Link] (1 responses)

NixOS (and Docker) could be useful to make it explicit what software you think "everybody" has; though, people who know about Nix may not be the ones who make this mistake.

Reproducibility - NixOS doesn't fix hardware assumptions

Posted Jun 5, 2025 18:38 UTC (Thu) by farnz (subscriber, #17727) [Link]

Historically, it's not just been software (which Docker helps with - since you're now running a static userspace, and NixOS can extend across the kernel) - it's been hardware and firmware as well. Nowadays, x87 is dead, but we're not at a stage where all CPUs behave identically for a given source code.

For example, there's still differences in memory consistency models between CPUs, so code that appears to be reproducible on one CPU model may fail on a different system, simply because it relied on hardware implementation details that aren't true of all implementations.

More insidiously, I've encountered algorithms that contain "harmless" race conditions that happen to be not harmless if you swap, or implementations that have problems if there's more than 256 CPU threads available to them (since you can fit the "number of available threads" counter in a byte, right?), or implementations that "know" that you can't have more than 48 virtual address bits (since that's the limit on the CPU they have today), or otherwise embed false assumptions about hardware.

Similarly, I've seen implementations that "know" that PCIe devices must have their BARs allocated below 4 GiB (since the firmware on their machine did that), or that the CPU numbers are contiguous, or that the CPU numbers convert to NUMA node numbers with a simple mask-and-shift, or that odd numbered CPUs are the SMT siblings of the even numbered CPUs.

All of these are details that are often consistent if you're comparing to a bunch of machines your institution bought - but they're details that can change once you change hardware vendors.

Reproducibility

Posted Jun 5, 2025 9:42 UTC (Thu) by magi (subscriber, #4051) [Link]

I tend to agree, although the strict reproducibility (using the same environment and setup) clearly has to hold.

I think that in order to check reproducibility you need automatic tests. Chances are that you will end up with results that are not strictly the same (as in binary equality). The question is whether the results are significantly different, whatever that might mean in this context. I think this is really one of the big differences between scientific software and "normal" software. In addition, what is the correct result anyway, given that simulations are used to explore a system that cannot be tackled otherwise? Another issue with testing scientific software is that the tests might require huge resources (think weather model) or closed data (think medical data).

Having good tests will allow people to convince themselves the software is doing what it is supposed to. It will also help port the software to new systems and deal with the software rot.
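
A minimal sketch of such a test in C (the names, values, and tolerances are illustrative): compare against a stored known-good value with a tolerance, instead of demanding binary equality.

    /* check.c: regression test that accepts "close enough" results
       rather than bit-for-bit equality. */
    #include <math.h>
    #include <stdio.h>

    /* 1 if a and b agree to relative tolerance rtol, with an absolute
       floor atol for values near zero. */
    static int close_enough(double a, double b, double rtol, double atol)
    {
        return fabs(a - b) <= atol + rtol * fmax(fabs(a), fabs(b));
    }

    int main(void)
    {
        double reference = 1.23456789012345;  /* stored known-good output */
        double current   = 1.23456789012411;  /* this run's output */
        int ok = close_enough(reference, current, 1e-9, 1e-12);
        printf("%s\n", ok ? "PASS" : "FAIL");
        return ok ? 0 : 1;
    }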

paper production

Posted Jun 5, 2025 9:51 UTC (Thu) by magi (subscriber, #4051) [Link]

Using markdown for writing papers is great. This works particularly well when used with gitlab/github for collaboration and CI/CD pipelines to automate spell checking and pdf production. Obviously LaTeX works equally well but the syntax is slightly heavier.

I missed any mention in the article of reproducible documents. I think the idea comes from the R world, where the document contains the code that can be run to produce the outputs for the document. Quarto supports many languages, including Python and R, uses Pandoc to produce the output, and can be edited using Jupyter notebooks.

Scheduling influences on simulation outcomes?

Posted Jun 5, 2025 10:37 UTC (Thu) by taladar (subscriber, #68407) [Link] (3 responses)

Wouldn't the simulation programs need to take some additional precautions for full reproducibility of the results similar to the way compilers do for compiling reproducible code or games do for reproducible seed-based procedural generation?

I am thinking of things like not using shared PRNGs from multiple threads where the scheduling order might then give each thread different parts of its (otherwise deterministic) output depending on which thread is scheduled first.

But beyond that, some auto-configuration of the program might also be a problem, e.g. detecting the RAM size or CPU core count and scaling operations accordingly by spawning more threads or processing larger batches at a time.

Forwards-compatibility is hard

Posted Jun 5, 2025 11:20 UTC (Thu) by farnz (subscriber, #17727) [Link]

You also get into having to think about what might be different in the future; for example, I've seen problems with an algorithm in the late 1990s that assumed x87 FPUs and was reproducible on all x86 CPUs of the era, but not on future x86 family CPUs (using SSE2 instead of x87), or on MIPS CPUs.

And then there's things like using 32 bit counters because you can't overflow them in reasonable time; this is true when things are slow enough, but as they get faster, it can become false. For example, an Ethernet packet is a minimum of 672 bit times on the wire; in 1995, a 32 bit packet counter represented over 8 hours of packets at the maximum standardised rate. However, today's maximum standardised rate (from 2024) is 800 Gbit/s, or overflow in a bit over 3.6 seconds.

The best we can reasonably ask for is that it's possible to follow your documentation and reproduce your results - that can include documenting the hardware, OS and other details of the system you produced the result on. That way, it becomes possible for a future reimplementation of your algorithm to do the A/B comparison between your system, and their new system, even if they've had to get help from a museum to build up the required hardware to reproduce your results.

Scheduling influences on simulation outcomes?

Posted Jun 5, 2025 13:01 UTC (Thu) by fenncruz (subscriber, #81417) [Link]

Yes, these are problems that can be solved if people care enough. It takes work and dedication to get it working with an existing codebase, and to keep it that way as people add new code (ask me how I know :) ).

On your last point, you actually want fixed-size batches for reproducibility. Then distribute each batch to a thread as the thread becomes free. That way you always do the same ordering of your floating-point numbers ((a+b)+c ≠ a+(b+c) in floating-point math). Think about summing the elements of an array broken into chunks per thread: naively, more threads would mean more intermediate values that need to get summed up. With fixed-size blocks it doesn't matter whether someone runs your code with 1 thread or 100; the number of intermediate values is the same, so you get the same answer when the intermediate values get added up at the end (see the sketch below).

On random numbers, you would need each thread to have its own stream, plus some way to initialize each block of work with its own seed (per block, not per thread, again because the number of threads might vary between users).
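
A minimal sketch of the fixed-size-block summation in C with OpenMP (illustrative, not production code); the set of partial sums, and therefore the final result, is the same for 1 thread or 100:

    /* Deterministic parallel sum: fixed-size blocks, fixed reduction
       order. */
    #include <stdio.h>
    #include <stdlib.h>

    #define N     1000000
    #define BLOCK 4096

    int main(void)
    {
        double *x = malloc(N * sizeof *x);
        for (long i = 0; i < N; i++)
            x[i] = 1.0 / (double)(i + 1);

        long nblocks = (N + BLOCK - 1) / BLOCK;
        double *partial = calloc(nblocks, sizeof *partial);

        /* Each block is summed independently, in index order, so any
           number of threads may process the blocks in any order
           without changing any partial sum. */
        #pragma omp parallel for
        for (long b = 0; b < nblocks; b++) {
            long end = (b + 1) * BLOCK < N ? (b + 1) * BLOCK : N;
            double s = 0.0;
            for (long i = b * BLOCK; i < end; i++)
                s += x[i];
            partial[b] = s;
        }

        /* The final reduction is serial and in block order, so the
           rounding sequence is always identical. */
        double sum = 0.0;
        for (long b = 0; b < nblocks; b++)
            sum += partial[b];

        printf("%.17g\n", sum);
        free(partial);
        free(x);
        return 0;
    }

Compiled with or without -fopenmp, this prints the same value.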

Scheduling influences on simulation outcomes?

Posted Jun 5, 2025 16:46 UTC (Thu) by fraetor (subscriber, #161147) [Link]

In HPC land you often have a slightly different way of thinking.

As supercomputers are expensive you are often thinking in terms of The Computer rather than a computer. Porting to a different machine is usually a significant task, and to maintain maximum performance you are often using lots of non-portable tricks that will need adjusting. Things like memory size, core count, or core binding are often configured relatively statically.

Because of this a supercomputer port is usually accompanied by an evaluation phase where subject matter experts look at the outputs and decide if it is close enough.

Once you have completed a port, Known Good Outputs (KGOs) are very commonly used within that single machine, to ensure that any output change, even to the most insignificant of bits, can be explained.

.odt

Posted Jun 5, 2025 10:52 UTC (Thu) by grawity (subscriber, #80596) [Link]

> Biologists working in Word should create copies of their drafts in one of Word's text-based formats, such as .docx or .odt; though these files may not be openable by future versions of Word, their contents will remain readable

In particular because .odt isn't even a Word format – it's the OpenOffice/LibreOffice format that Word only has secondary support for.

Though compatibility of LibreOffice with Word-produced .docx vs Word-produced .odt might well be the same these days?

Accelerators, parallelism, and determinism

Posted Jun 5, 2025 18:33 UTC (Thu) by DemiMarie (subscriber, #164188) [Link] (4 responses)

Are deterministic results in HPC practical? My understanding is that parallel algorithms generally assume that they can reorder floating point operations, because not doing so is too constraining. Also, I believe GPUs don’t provide the same guarantees CPUs do when it comes to deterministic floating-point results.

Determinism, reproducibility, and numerical analysis

Posted Jun 6, 2025 10:44 UTC (Fri) by farnz (subscriber, #17727) [Link] (3 responses)

GPUs do provide the same IEEE 754 guarantees on determinism when it comes to floats as CPUs do; they're fully deterministic machines in that regard.

The reason people talk about GPUs as non-deterministic is all in arbitration logic; arbitration logic is often deliberately non-deterministic between requesters of the same priority, and in GPUs that happens during memory access and in hardware scheduling of GPU work items.

Also note that having internal non-determinism (e.g. because you're reordering FP operations) does not imply that your result is non-deterministic. You can do things with numerical analysis that prove that, for a given allowed set of possible orders of operation, the output is always the same; if you then ensure that the allowed set of possible orders of operation in your analysis is a non-strict superset of the possible orders of operation due to non-determinism in your implementation, you have a proof that you have a deterministic output from a non-deterministic machine.

And numerical analysis can extend beyond purely numerical results; if your prediction is that a statistical measure of the simulation's output falls in a range, I can do numerical analysis to show that the simulation's output's error bars are such that the statistical measure must fall inside that range if it falls inside an analytically determined range for a single run of the simulation.

Finally, note that if your result is not reproducible, what you're doing is arguably not science; reproducibility is necessary because otherwise you can claim results from your simulation and insist that your hypothesis is correct, even though my results are different and falsify your hypothesis, simply by saying that my run of the simulation is wrong.

Determinism, reproducibility, and numerical analysis

Posted Jun 6, 2025 13:11 UTC (Fri) by Wol (subscriber, #4433) [Link]

> Finally, note that if your result is not reproducible, what you're doing is arguably not science; reproducibility is necessary because otherwise you can claim results from your simulation and insist that your hypothesis is correct, even though my results are different and falsify your hypothesis, simply by saying that my run of the simulation is wrong.

Not "arguably", it *cannot* be science.

Science is accurately predicting the results of your experiments, not just doing experiments and "seeing what happens" - that's called playing.

That's why whenever you see "a new experiment has proven that ...", you know either they don't know what they're doing, or they do know what they're doing and it's called propaganda/lying.

It's not Science until you do the exact same experiment and get the result you predicted. Nothing wrong with the prediction being vague, as long as it is correct as far as it goes - you can always refine it afterwards. And then do another experiment, of course!

Cheers,
Wol

Determinism, reproducibility, and numerical analysis

Posted Jun 7, 2025 16:48 UTC (Sat) by DemiMarie (subscriber, #164188) [Link] (1 responses)

At least CUDA explicitly does not guarantee reproducibility unless the hardware and software are unchanged. In fact, you have to opt-in to determinism in some cases.

Determinism, reproducibility, and numerical analysis

Posted Jun 9, 2025 9:54 UTC (Mon) by farnz (subscriber, #17727) [Link]

CUDA's non-determinism is in scheduling and memory access, not in the computation itself. These are the same guarantees as you get with a CPU: the computation is deterministic, but how work is scheduled across CPU cores is non-deterministic.

However non-determinism is not the same as non-reproducibility. There exist plenty of non-deterministic algorithms that have a deterministically reproducible output; it's just extra analysis steps to confirm that your result is reproducible from a set of non-deterministic intermediate steps.

And note that for science purposes, we don't necessarily even need a deterministic output; if your prediction is "this simulation will never have an output whose mean is below 64.0", you can do an analysis that confirms that the algorithm's error bars on the mean are (say) ±31.5 due to non-determinism. Then, if the simulation gets an output of 127.0, you know that, while you can't reproduce the exact value, you can deterministically answer the question "is the mean of the output greater than or equal to 64.0" with "yes, because the worst case an attempted reproducer will see is a mean of 95.5, and the error bar on that will tell them that the algorithm's output must be above 64.0".

PDF is an open standard, except when it's not

Posted Jun 7, 2025 18:15 UTC (Sat) by marcH (subscriber, #57642) [Link] (10 responses)

> PDF files are based on an open, versioned standard and will be readable into the foreseeable future with all of the formatting details preserved.

Yes and no. There is indeed a standard, which was a gift from Adobe, and we should be grateful.

More recently, Adobe has been adding proprietary extensions to the PDF format, and these can only be opened using Adobe software. A famous example is XFA forms produced by LiveCycle Designer.

Ironically, this was deprecated by Adobe itself, probably because people don't buy computers anymore: they only have a smartphone, and for some reason XFA never worked on smartphones. But Adobe LiveCycle Designer is still around and still producing unusable PDFs.

PDF is an open standard, except when it's not

Posted Jun 7, 2025 18:24 UTC (Sat) by marcH (subscriber, #57642) [Link] (4 responses)

To be more precise: it sounds possible to read static XFA with Adobe software, even on Linux.

Dynamic XFA is the "real deal" https://kbdeveloper.qoppa.com/livecycle-dynamic-xfa-forms/
> There are very few PDF viewers that support XFA Dynamic Forms, one can count them on the fingers of one hand.

Both static and dynamic XFA have been deprecated in PDF 2.0?

(That entire confusion and lack of standardization is the problem)

PDF is an open standard, except when it's not

Posted Jun 9, 2025 5:07 UTC (Mon) by DemiMarie (subscriber, #164188) [Link] (3 responses)

PDF.js can handle XFA just fine.

PDF is an open standard, except when it's not

Posted Jun 9, 2025 14:05 UTC (Mon) by marcH (subscriber, #57642) [Link] (2 responses)

I tried opening a dynamic XFA form on pdf.is and it went much further than anything I tried before, thanks for the recommendation! I can at least see and fill the form; that's great progress. However, the form also shows:

> JavaScript has been disabled, the form requires JavaScript to validate properly.
>
> Please enable JavaScript through Preferences under the Edit menu and reopen the form.

The "Validate" and "Clear Form" have no effect.

Maybe it would work better with a higher subscription tier? But even if it does, we're straying away from "open", "portable", and "reproducible"...

I also tried to convert it to PDF/A and the output still shows "Please Wait..."

I admit dynamic XFA is unlikely to be used by a scientist. But: 1) you never know, and 2) there could be other proprietary extensions.

tl;dr: beware proprietary extensions.

PDF is an open standard, except when it's not

Posted Jun 9, 2025 14:10 UTC (Mon) by DemiMarie (subscriber, #164188) [Link] (1 responses)

Try opening the PDF in Firefox’s or Tor Browser’s built-in PDF viewer. Those might have JavaScript enabled.

PDF is an open standard, except when it's not

Posted Jun 9, 2025 16:18 UTC (Mon) by marcH (subscriber, #57642) [Link]

No difference in Firefox, I'm curious why you expected some?

PDF is an open standard, except when it's not

Posted Jun 7, 2025 18:30 UTC (Sat) by leephillips (subscriber, #100450) [Link] (4 responses)

Implicit in my recommendation is that we should only produce PDFs that adhere to the open standard, avoiding the use of any proprietary extensions. This will be the case as a matter of course when using LaTeX.

PDF is an open standard, except when it's not

Posted Jun 7, 2025 19:34 UTC (Sat) by marcH (subscriber, #57642) [Link] (2 responses)

> Implicit in my recommendation is that we should only produce PDFs that adhere to the open standard, avoiding the use of any proprietary extensions

This should not be implicit. People should not believe that "PDF" automatically means "good". Usually: yes. Always: no. I think the article as it is now gives that wrong impression.

PDF is an open standard, except when it's not

Posted Jun 7, 2025 19:56 UTC (Sat) by leephillips (subscriber, #100450) [Link] (1 responses)

Which free software tools produce PDFs making use of Adobe’s proprietary extensions?

PDF is an open standard, except when it's not

Posted Jun 8, 2025 0:09 UTC (Sun) by marcH (subscriber, #57642) [Link]

No idea; that's totally beside my point.

PDF is an open standard, except when it's not

Posted Jun 8, 2025 2:24 UTC (Sun) by spigot (subscriber, #50709) [Link]

At a previous job we had to save legal documents (e.g. contracts), and the requirement was to use PDF/A, which standardizes a subset of PDF features. It's intended for archival purposes.

Agree on the reproducibility aspect, but not on the static medium of PDF

Posted Jun 8, 2025 5:09 UTC (Sun) by rsidd (subscriber, #2582) [Link] (1 responses)

Here's a good blog post (from 2018) by Nobel-winning economist Paul Romer, on "Jupyter, Mathematica and the future of the research paper". He makes many of the same points about open science and reproducibility, and not being locked down to a proprietary system. But much more.

The main point is, in 2018 and even more in 2025, a static PDF is a very limited way of communicating science. We can do much better.

A Jupyter notebook is a significant game-changer here. Though it is not accepted as a medium of publication by journals, there is a lot of useful supplementary data out there in such formats. And it is of course important that it is not proprietary, unlike Mathematica. But its dynamic nature is also important.

Agree on the reproducibility aspect, but not on the static medium of PDF

Posted Jun 9, 2025 19:04 UTC (Mon) by fraetor (subscriber, #161147) [Link]

The big issue with notebooks is reproducibility; I often have difficulty getting a notebook running that was written by a colleague at the same organisation a few months ago.

Given their expanded capabilities, notebooks are inherently going to be harder to reproduce in their original fidelity than a PDF, which is fully described by a self-contained specification, though how one is meant to read a specification distributed as a PDF leads to other questions. Being open makes reproduction possible, but it is still a significant hurdle.

I think static documents such as PDFs still have a role to play, even if it is just as a medium to determine one's interest in some work before investigating the code.

The web is an interesting publishing medium, forming a sort of middle ground between static PDFs and the fully dynamic notebook. I see more journals embracing it as a distribution mechanism; some even allow for interactive plots powered by JavaScript. So perhaps that will be the future direction.

Overleaf

Posted Jun 9, 2025 14:35 UTC (Mon) by Klaasjan (subscriber, #4951) [Link] (10 responses)

With the emphasis in this article on the venerable and excellent (La)TeX, on free software, and on the collaborative nature of modern science, it may also be relevant to point out the usefulness of Overleaf (see, e.g., wikipedia).

Overleaf

Posted Jun 9, 2025 16:01 UTC (Mon) by rsidd (subscriber, #2582) [Link] (9 responses)

Overleaf is useful. It is not free, either libre or (except in a crippled form) gratis. Recently (14 May) an outage of a few hours inconvenienced many colleagues. I do use it but am careful to keep local copies. I don't think it deserves mention in the context of this article.

Overleaf

Posted Jun 9, 2025 17:29 UTC (Mon) by leephillips (subscriber, #100450) [Link] (8 responses)

It does not. I was forced to use Overleaf to write a book, and the experience is far worse in every way than using a locally installed LaTeX with Vim: glacially slow, and, instead of useful diffs, a clumsy interface to track changes.

Like GitHub in relation to Git, it’s a thin layer of mercenary slime befouling a limpid core of free software.

Overleaf

Posted Jun 9, 2025 20:58 UTC (Mon) by Klaasjan (subscriber, #4951) [Link] (1 responses)

I’m sorry to hear Overleaf left such a bad impression. I should say I do find it useful, though.
And I do agree having local backups is important and that working locally in one’s preferred editor can be more pleasant.
Perhaps the Wikipedia entry is overly positive about the free nature of the software?

Overleaf

Posted Jun 10, 2025 1:33 UTC (Tue) by rsidd (subscriber, #2582) [Link]

My bad, they do seem to have an open source version that you can run locally. I don't know how popular that is.

I don't think it is as bad as Lee says. Both Overleaf and GitHub have their places. But it's not really relevant to this article -- Overleaf is widely used by scientists, as is macOS, but as a proprietary tool for creating open science.

Github

Posted Jun 11, 2025 16:40 UTC (Wed) by Klaasjan (subscriber, #4951) [Link] (5 responses)

Since you disqualify GitHub and Overleaf in the same sentence, this makes me wonder what alternatives to GitHub you would recommend, in particular to users (scientists) that want to collaboratively develop software using (mainline) git from their local machines.

Github

Posted Jun 11, 2025 19:07 UTC (Wed) by dskoll (subscriber, #1630) [Link] (4 responses)

I migrated off GitHub to two other solutions:

  1. Codeberg
  2. A self-hosted Forgejo installation.

I also mirror my repos on salsa.debian.org, but that's not freely available to just anyone; you need to apply for an account and (AFAIK) be developing free software. That runs the GitLab software, which you can use as a service or self-host.

Github alternatives

Posted Jun 12, 2025 9:34 UTC (Thu) by Klaasjan (subscriber, #4951) [Link] (3 responses)

Thanks. I should have mentioned that the request was for a non-self-hosted solution. I'll look into Codeberg.

Github alternatives

Posted Jun 12, 2025 12:22 UTC (Thu) by dskoll (subscriber, #1630) [Link]

GitLab.com also offers a free tier, though it says that it's for "individuals working on personal projects and open source contributions."

Github alternatives

Posted Jun 12, 2025 12:55 UTC (Thu) by jzb (editor, #7867) [Link] (1 responses)

I like Codeberg. You might also look at sourcehut. It is a bit more... baroque? All its features are supposed to work without JS (it works well with Nyxt, for instance), and it is 100% free software.

Github alternatives

Posted Jun 12, 2025 14:05 UTC (Thu) by liw (subscriber, #6379) [Link]

While we're listing alternatives: I work on, and use, Radicle, which is distributed, is fully free and open source software. There's also Tangled, another take on distributed, but built on ATproto, which underlies Bluesky. (For personal reasons, I only care about distributed systems for this.)


Copyright © 2025, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds