LWN.net Logo

OpenOffice.org 3.0 released

OpenOffice.org 3.0 released

Posted Oct 13, 2008 15:06 UTC (Mon) by endecotp (guest, #36428)
Parent article: OpenOffice.org 3.0 released

> The Calc spreadsheet has been given another increase in capacity -
> now up to 1024 columns per sheet.

Why on earth did they have an arbitrary limit in the first place?
And why increase it to 1024, rather than removing it entirely?

(Gnumeric seems to allow only 256 columns, which has hit me when I've tried to read in data generated by a script.)


(Log in to post comments)

OpenOffice.org 3.0 released

Posted Oct 13, 2008 15:15 UTC (Mon) by jreiser (subscriber, #11027) [Link]

Why on earth did they have an arbitrary limit in the first place? There are step functions (and/or steep slopes) in costs for complexity, space, and execution time. Nearly all spreadsheets are "small", and most of those users don't want to pay the costs (space, time, time-to-market, maintenance, ...) for "unlimited" capacity.

OpenOffice.org 3.0 released

Posted Oct 13, 2008 15:41 UTC (Mon) by cine (guest, #5597) [Link]

Not really. The whole worksheet is virtualized anyway, only the cells that actually is written something in exists in memory.
I would be good for them to have to think of it as unbounded, so they can get rid of the last few bugs that trivially allows you to crash your scalc, just because it tries to e.g. paint all the cells.

OpenOffice.org 3.0 released

Posted Oct 14, 2008 3:36 UTC (Tue) by efexis (guest, #26355) [Link]

only the cells that actually is written something in exists in memory

Yes but they have to be addressed. A value of 0-1023 requires 10bits to address, and it's position can quickly and simply be resolved in just a few bytes of code. "Unlimited" requires scalable values, which means loops, conditional jumps, using memory rather than registers, everwhere the value is used (so a function the finds the left or rightmost cell of two or more cells becomes a lot more complex). Libraries like gmp do make doing this easier, but it's still nowhere as quick as a bit fixed width field where number of bits <= register size.

Alex

OpenOffice.org 3.0 released

Posted Oct 14, 2008 5:17 UTC (Tue) by k8to (subscriber, #15413) [Link]

That problem was solved long ago. Pre-written code exists to use the machine values until you overflow and then trap the processor exception and special case from there on out.

Even if that wasn't true, the cost of doing slightly more math to figure out what memory to pull in is trivial compared to the slowness of pulling it in. The only downside is consuming more memory in the addressing.

The ONLY downside? Ha!

Posted Oct 14, 2008 9:48 UTC (Tue) by khim (subscriber, #9252) [Link]

The REAL downside is neither speed nor complexity. The REAL downside is the need to rewrite millions of already existing code which is designed to use fixed width numbers. I presume 256=>1024 was just a change in #define plus some testing, "unlimited" will be HUGE undertaking...

OpenOffice.org 3.0 released

Posted Oct 14, 2008 10:40 UTC (Tue) by efexis (guest, #26355) [Link]

"Pre-written code exists to use the machine values until you overflow and then trap the processor exception and special case from there on out"

Yep, that's the conditional loop bit... you loop loading/comparing each byte/word at a time, until the condition (overflow) hits and says to stop. There might be prewritten code (like the gmp library I mentioned) but it's still orders of magnitude slower than what it would be for a fixed bit width. which can be aligned to fit many into cachelines etc.
This wouldn't just need to be resolved for loading cells in, but all reference pointers (think loop variables that iterate through cells updating them) would need to be as this too.
Quite how much noticable difference this makes on a modern processor though... couldn't say without some profiling info.

OpenOffice.org 3.0 released

Posted Oct 13, 2008 15:44 UTC (Mon) by PO8 (guest, #41661) [Link]

I'm having a hard time imagining a (sane) data representation for spreadsheets that has these unfortunate properties. It would be great if you could elaborate a bit on oocalc's internal data structures.

OpenOffice.org 3.0 released

Posted Oct 13, 2008 15:56 UTC (Mon) by endecotp (guest, #36428) [Link]

> Nearly all spreadsheets are "small", and most of those users don't
> want to pay the costs for "unlimited" capacity.

If they have a fixed-size array in there somewhere, then those users with "small" spreadsheets are now paying for ~1000 columns that they're not using. If they're using anything more complex than a fixed-size array - and I imagine that they are - then they're already paying those costs. So I don't think this is a space/time/complexity thing.

My guess is that they are trying to save some bits somewhere by packing row and column number, plus some other bits, into a word. (I have a vague recollection that Excell has a 64k row limit because it uses a 16-bit row number somewhere.)

OpenOffice.org 3.0 released

Posted Oct 13, 2008 15:58 UTC (Mon) by dkite (guest, #4577) [Link]

Nonsense.

Spreadsheets are a class of application that drive users to the limits of
it's capability. It either fills the need, or is used as you described
'nearly all spreadsheets are "small"'.

If it is only good for small stuff, serious users don't/can't use it.

They use Excel.

Derek

OpenOffice.org 3.0 released

Posted Oct 14, 2008 9:57 UTC (Tue) by nowster (subscriber, #67) [Link]

> They use Excel.

Would that be Excel 2003 with a 256 column limit? (Excel 2007 is said to have a 16384 column limit. This larger limit is irrelevant if you wish to share spreadsheets with someone who has the older version of Excel, assuming the file format hasn't changed between versions.)

OpenOffice.org 3.0 released

Posted Oct 14, 2008 9:57 UTC (Tue) by forthy (guest, #1525) [Link]

They also still have the 65536 row limit, which Excel 2007 has raised to 1M rows (and certainly also not taken away completely). This is my grief about OO.o: Too much design mistakes taken from Microsoft.

Copyright © 2012, Eklektix, Inc.
Comments and public postings are copyrighted by their creators.
Linux is a registered trademark of Linus Torvalds