LWN: Comments on "Google Linux servers hit with $5m patent infringement verdict (The Register)" https://lwn.net/Articles/439720/ This is a special feed containing comments posted to the individual LWN article titled "Google Linux servers hit with $5m patent infringement verdict (The Register)". en-us Thu, 18 Sep 2025 21:35:57 +0000 Thu, 18 Sep 2025 21:35:57 +0000 https://www.rssboard.org/rss-specification lwn@lwn.net Changing Hash Function without wholesale rehashing/rebuilding https://lwn.net/Articles/440560/ https://lwn.net/Articles/440560/ orcmid <div class="FormattedComment"> I thought about the need to retire hash(key) in favor of an incompatible hash'(key) some more (in the shower naturally), and I realized that even if hash'(key) doesn't produce the same low order bits, it is easier to migrate than wholesale rebuilding.<br> <p> Here is a brief sketch with lots of important details left out. <br> <p> I assume that N=1 and we use Mmin, M, LFmin and LFmax as I have come to be using them in this series of comments. This is all for simplicity.<br> <p> When it is necessary to migrate the current hash-index base to another hash-indexed base, do it this way.<br> <p> 1. Arrange, for the to-be-retired base, for a level to be exactly completed, either by shrinking or splitting. So M is at some 2^level - 1.<br> <p> 2. Set up the new (initially-empty) base, base'. If this is a linear hashing base too, then that's cool because it can be set to some Mmin' initial number of (empty) buckets and we will let it grow naturally. The technique we use will allow base' to grow gradually as the old base shrinks.<br> <p> 3. Stop inserting new records in the old base. The procedure will be such that new records not found in either base will be added to base' only.<br> <p> 4. Adjust the hashing bucket(key,level) algorithm in the old base so that whenever hash(key) &gt; M, we don't use the bf(key, level-1), instead we go to base' and look for it there. If it is not there, we insert it there. Likewise, if we do use bf(key, level-1) and we don't find the record, we go to the new base and insert it there. (It shouldn't be there already.)<br> <p> 5. When we are ready to decrease M (probably fixing LFmin to some large value and keeping LFmax always larger than that and what the current LF is, to encourage that, e.g., LFmax &gt; LFmin &gt; LF), instead of merging the same-hash values in bucket [M] back to the split buddy of [M], we instead insert those records into base' and then reduce M by 1. We will never look for those records in the old base any longer.<br> <p> 6. Now, at some point we will want to stop searching first in the old base but start searching in the new base. If the record is not in the new base, we should search the old base for it (hashing with the old hash of course). If it is in the old base, we are done looking although it *might* be worth removing it from the old base and inserting it in the new base. I don't have any intuition on when that trade-off is important. The easy way is simply to end the query. If the query in the old base doesn't find the record though, then the old base sends the record back to the new base to be inserted.<br> <p> 7. The bottom line is that the new base never returns a record/query that comes to it from the old base, but when it is accepting queries directly, it will forward to the old base if it can't find the record in its base.<br> <p> 8. 
This continues until, at some point, the old base is compacted enough that it is simply time to move everything to the new base. And it is simple enough to have that be when M = Mmin = 0 and old-base bucket [0] is all that's left (the mask has gone to 0, in effect).<br> </div> Fri, 29 Apr 2011 01:17:05 +0000 Linear Hashing Work-Around Potential - LoadFactor clarification https://lwn.net/Articles/440481/ https://lwn.net/Articles/440481/ orcmid <div class="FormattedComment"> If N, the MINIMUM.MODULUS, is a power of 2, it is possible to change it without consequence. It means that level numbers might seem to change, but there is a way out of that. Instead of using N as the minimum, just set some Mmin as the minimum number of buckets. Let N be any power of 2 not greater than Mmin. It might as well be 1. Now you can adjust Mmin all you want, although making it greater than the current M forces splitting. Reducing it simply allows the splitting algorithm to shrink the database if that is what is wanted. (There are probably some games with LFmin and LFmax to coax things along.)<br> <p> Now, if you decide to adjust Mmin for some practical reason, do so! In the creation of an initial hash-table, one wouldn't do it by splitting of arrivals but by creating Mmin empty buckets, and the current level being completed would be whatever it needs to be to allocate into those M = Mmin buckets properly.<br> <p> I think the bigger problem is when hash(key) needs to change because it doesn't provide enough good uniformly random bits for the level that has been reached, or there are not enough bits in it to increase the level at all. <br> <p> Changing hash(key) arbitrarily breaks the linearity scheme by which the hash-table grows and shrinks linearly at the end, where splitting depends on rehashing of one bucket to divide things between the current bucket and the new bucket at the end. <br> <p> However, you can safely change hash(key) to a new hash'(key) that still produces the same low order bits but provides more or better high-order bits, ones that aren't selected by the current biggest mask so there are no buckets that are selected by them in the current state of the system. <br> <p> The above arrangements work only if N hasn't been wired into other aspects of the implementation. It is important to make the design of the expandable hash-indexed directory to the buckets, and of any clustering of buckets, flexible enough to handle variation in N and, better, fluctuation in Mmin. It should be possible to adjust and tune that scheme independently. The scheme needs to be robust enough so that the base never has to be rebuilt by rehashing everything, since that is terrible for high-availability distributed cases where we are talking about records and buckets/overflows on disk.<br> <p> (If there is a problem at the level of the current mask, one can adjust LFmax and LFmin to force the system to shrink to a lower level, if the performance hit while waiting to get to the desired lower completed level is tolerable. The techniques for distributed operation of linear hashed bases might be relevant if it is necessary to do anything more drastic than that in order to institute an entirely different hash(key) in an alternate base that allows the original base to be switched off at an appropriate time.)<br> <p> </div> Thu, 28 Apr 2011 17:48:11 +0000 Linear Hashing Work-Around Potential - LoadFactor clarification https://lwn.net/Articles/440352/ https://lwn.net/Articles/440352/ orcmid <div class="FormattedComment"> Oh. 
NO, Nbits is just the number of bits in N, for N a power of 2, and which is constant for the hash table, regardless of level. (I may be off by one here). I think it is log2(N)+1.<br> <p> Level moves up and down as needed. However, at any point, we only have bf(key, level) and bf(key, level-1) in use.<br> <p> The initial conditions are level = 1, M = N-1, and buckets 0 to N-1 are ready and empty. So the LF is initially 0 but we never shrink below N-1 anyhow. With linear hashing, we always grow the buckets linearly at the end, so the first bucket to split is bucket [0] and it splits between itself and new bucket M+1. (Then M is then advanced by 1).<br> <p> At the time of the first split, presumably LF &gt; LFmax has happened and let us assume for simplicity that it is over just a hair and we can assume LF = LFmax for convenience.<br> <p> Then bucket [0] is expected to have LF records in it and after the split it will have kept an expected LF/2 of them with new bucket [N] getting the other expected LF/2. By the time the level is complete (so there are 2N buckets), each of them will again have an expected LF records in them (which is how the saw-tooth fits in). Then we start splitting again from bucket [0], M = 2N-1, and we work on building out the next level, which will end up with a total of 4N buckets if it is completed. Rinse. Repeat.<br> </div> Wed, 27 Apr 2011 19:18:34 +0000 Linear Hashing Work-Around Potential - LoadFactor clarification https://lwn.net/Articles/440348/ https://lwn.net/Articles/440348/ orcmid <div class="FormattedComment"> No, I never change the value of N which you remark is also known as MINIMUM.MODULUS. And level grows as necessary to keep the load factor in bounds. Every time the load factor gets too large, a bucket is split. That is, a new bucket M+1 shows up and it is a split of its "mirror" bucket farther back in the preceding buckets. Every time a level is filled, splitting starts over from bucket 0 moving forward through all of them to the end of the just-completed level. Each completed level has double the number of buckets that the previous level did. <br> <p> It is also possible for the number of used buckets to decrease when the load factor goes too small, and I allowed for that as well.<br> <p> I did some interesting back-of-the-envelope analysis and there is an amusing situation. At the time a level is completed, the expected occupancy of bucket [0] is the load factor. But the expected occupancy of the last bucket to be split in completing the level is double the load factor at that point in time. That's because it is still double the size of the already-split buckets, under the uniform random assumption. (It is still a (level)-1 bucket and splitting other buckets has no effect on its share of the load factor until it too is split.) Splitting then has there be load-factor-expected content in the shrunk bucket and its split partner, which is the instantaneous expectancy for all of the buckets in the just completed level. <br> <p> This saw-tooth effect may have to be taken into account when wanting to avoid bad congestion, especially if we are talking about the likelihood of some sort of disk or memory block overflow into another block at the 2*LFmax expectancy. <br> <p> Of course, even for uniform randomness one needs to account for variance as well. 
I haven't looked at that and I have no idea how important this observation is under practical conditions.<br> <p> PS: By masking I mean computing the equivalent of hash(key) mod (N*2**level) by simply extracting the low-order bits of hash(key) by a mask of level+log2(N) bits where N is a power of 2. It is trivial to keep a pair of current mask values, one for level, one for level-1 and implementing bucket(key,level) very quickly, adjusting them as level changes. And if we need the bucket number to be a multiple of a power of 2 (for pointer manipulation), we can handle that by adjusting the masks too. It's all about getting through the directory of buckets to the bucket slot as quickly as possible.<br> </div> Wed, 27 Apr 2011 18:59:51 +0000 TIFF https://lwn.net/Articles/440265/ https://lwn.net/Articles/440265/ tialaramex <div class="FormattedComment"> TIFF isn't a file format in the sense we're used to. In fact that was actually the point at the time. Dozens of vendors wanted a "standard" format but they refused to agree on anything (including even which corner of the image is the start of the image data, since all of the scanner vendors wanted to begin with whichever corner would mean least buffer RAM given their unique [probably patent-avoiding] scan order arrangements. So TIFF is a way to write metadata which enumerates all the decisions made in choosing the actual file format.<br> <p> Thus there's little point trying to support "TIFF". libtiff tries but it works only for whatever TIFF formats were common enough in the wild at the time the version you have was written. Over the years there has been JPEG-encoded TIFF (early references to a 'JPEG file' are to any of the various incompatible ways to store JPEG data in a TIFF, and not to the simple JFIF standard file format eventually used today), and more recently there are numerous high-bit-per-channel TIFF variants for professional photography, most of them mutually unintelligible of course...<br> </div> Wed, 27 Apr 2011 08:37:55 +0000 Linear Hashing Work-Around Potential - LoadFactor clarification https://lwn.net/Articles/440186/ https://lwn.net/Articles/440186/ Wol <div class="FormattedComment"> Trying to get my head round what you're saying ... but I think you're using "minimum buckets" completely differently to my MINIMUM.MODULUS. It looks to me like you're using Nbits to keep level always at 0 or 1.<br> <p> MINIMUM.MODULUS cannot change on the fly. As is shown by my comment about UniData :-) The recommended use for it is where a file can shrink, grow, or be cleared regularly, so its size is very variable, but it also has a "normal" size. By declaring a minimum size equal to the typical size, you avoid a lot of unnecessary split/merge activity. The other use is where you are loading a file with a lot of data - again by setting the minimum modulus to whatever it will be once the file is loaded, you're saving a load of activity while the file is built.<br> <p> Mind you, for my first implementation, to calculate the bucket I used the following pseudo-code<br> <p> unsigned int mask = -1 // yes I know :-)<br> bucket = hash;<br> while (bucket gt current-modulo)<br> mask = rightshift( mask)<br> bucket = and( bucket, mask)<br> repeat<br> <p> Assuming "int" is a short, that means I've got a valid bucket with less than 16 iterations. And the only operations I'm using are and, shift, and compare. 
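(In more conventional C, that pseudo-code might look something like the sketch below. It is purely illustrative; the names, and the assumption that current_modulo is the highest bucket number currently in use, are mine and not taken from any real implementation.)

    /* Sketch: shrink the mask until the masked hash falls inside the
       buckets currently in use (0 .. current_modulo). */
    static unsigned int find_bucket(unsigned int hash, unsigned int current_modulo)
    {
        unsigned int mask = ~0u;     /* all bits set, the "-1" above */
        unsigned int bucket = hash;

        while (bucket > current_modulo) {
            mask >>= 1;              /* rightshift( mask) */
            bucket &= mask;          /* and( bucket, mask) */
        }
        return bucket;
    }
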
All cheap compared with multiply, divide, and mod, I think.<br> <p> Even cheaper if rather than calculating mask each time, I cache the upper value :-) Unless you're on a Z80, it's quicker to get the lower mask by right shifting the upper, rather than going the other way by doubling and incrementing. (I gather the Z80, due to a bug in the silicon, didn't have a leftshift function. Then somebody tested the instruction that should be leftshift and discovered it set the new rightmost bit to 1. So what should have been leftshift became "left shift and increment", at first informally then Zilog actually documented it because everyone was using it :-)<br> <p> Cheers,<br> Wol<br> </div> Tue, 26 Apr 2011 13:11:28 +0000 Google Linux servers hit with $5m patent infringement verdict (The Register) https://lwn.net/Articles/440158/ https://lwn.net/Articles/440158/ ssmith32 <div class="FormattedComment"> I think not.<br> I'm not an expert, but I think Texas oil is pretty much dead. <br> <p> There's some big oil companies there, but they're drilling elsewhere. I believe the biggest find by far in the US (in recent years), is the Bakken formation in Montana. Last assessment was about 22 billion barrels of oil (2008 USGS survey) - but only about 3-5 billion are recoverable. And yes, this is mostly from the last New Yorker.<br> <p> But Wikipedia seems to agree:<br> <a href="http://en.wikipedia.org/wiki/Oil_reserves_in_the_United_States">http://en.wikipedia.org/wiki/Oil_reserves_in_the_United_S...</a><br> <p> So,yah, bye bye Texas :)<br> <p> Even better - with no say in our government, we could finally actually tax the oil companies, via tariffs, all the drilling they do elsewhere, etc.<br> </div> Tue, 26 Apr 2011 07:27:45 +0000 Linear Hashing Work-Around Potential - LoadFactor clarification https://lwn.net/Articles/440138/ https://lwn.net/Articles/440138/ orcmid <div class="FormattedComment"> For some reason, an edit I tried to make didn't get into the just-immediate post:<br> <p> Although caching hash(key) might be useful to avoid its recomputation when splitting a bucket, it is not needed at all when removing a bucket, since we know absolutely what bucket the removed-bucket's records go into. It is the bucket number (actually, M) masked by the smaller mask at the time of the merge-back and before M is decreased.<br> <p> Whether or not hash(key) is cached, it is a nice value to keep as the record itself in a fast simulation looking for where splitting doesn't catch up fast enough and what the headroom needs to be on load factor, etc. OK, that's more than enough speculation from me on one day.<br> </div> Tue, 26 Apr 2011 02:55:28 +0000 Linear Hashing Work-Around Potential - LoadFactor clarification https://lwn.net/Articles/440133/ https://lwn.net/Articles/440133/ orcmid <div class="FormattedComment"> I used the formulation given in the two now-classical papers on Linear Hashing. <br> <p> I was having the modulus always be a power of 2. Then the bit mask to compute it for a given bf(key,level) is 2^(level+Nbits)-1 where N = 2 ^ Nbits is the minimum number of buckets (which are then repeatedly doubled as new levels are completed). So the bucket algorithm can be optimized to use the two masks (one of which is double the other + 1) precomputed and it is easy to adjust them as level is expanded or contracted. If hash(key) is also cached in the linked-list entries for the same bucket, rehashing becomes trivial. 
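(A rough sketch of what that could look like, with the hash cached in each list entry. The field and function names are invented for illustration only, and this makes no claim about any actual implementation.)

    #include <stddef.h>

    struct entry {
        struct entry *next;
        unsigned long hash;   /* cached hash(key); splitting never recomputes it */
        /* key and payload would follow */
    };

    /* Split one bucket's chain between itself and the newly added bucket:
       entries whose cached hash differs under the larger mask move out. */
    static void split_chain(struct entry **old_head, struct entry **new_head,
                            unsigned long mask_lo, unsigned long mask_hi)
    {
        struct entry *e = *old_head, *next;

        *old_head = *new_head = NULL;
        for (; e != NULL; e = next) {
            next = e->next;
            if ((e->hash & mask_hi) != (e->hash & mask_lo)) {
                e->next = *new_head; *new_head = e;  /* goes to the new bucket at the end */
            } else {
                e->next = *old_head; *old_head = e;  /* stays where it is */
            }
        }
    }
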
(This could matter in-memory more than if disk accesses are involved, but there are the usual time-space trade-offs to ponder, including whatever the cost of computing hash(key) is.)<br> <p> For the linked-list case in memory, I used a bucket size of one, meaning exactly one linked-list cell (the head) per bucket. <br> <p> However, how buckets are organized into blocks/clusters of contiguous storage (or disk) and how overflows are kept near or in those blocks is also something interesting to explore, since that also matters when dealing with either disk blocks or virtual-memory allocations. Since the splitting algorithm runs sequentially through the buckets, I speculate that having clusters with location affinity can be important.<br> <p> I haven't looked at the time-space trade-offs enough to have formulated a simulation and determined the sensitivity to different bucket directory and block/cluster organizations. Too many opportunities, not enough time ...<br> <p> I think you and I are looking at the same sort of considerations, we are just cutting up the storage organization in different but more-or-less equivalent ways.<br> </div> Tue, 26 Apr 2011 02:43:14 +0000 Links to additional articles https://lwn.net/Articles/440128/ https://lwn.net/Articles/440128/ forlwn <div class="FormattedComment"> And why? Is it Florian the only intelligent human being here, able to look for and provide extra info to others? <br> Anyway, thanks Florian, and now bye bye. <br> </div> Tue, 26 Apr 2011 01:06:43 +0000 Linear Hashing Work-Around Potential - LoadFactor clarification https://lwn.net/Articles/440093/ https://lwn.net/Articles/440093/ Wol <div class="FormattedComment"> Actually, that explains something I didn't understand before ... in both UniData and UniVerse, N is called MINIMUM.MODULUS. Apparently in UniData (with which I am unfamiliar), changing this triggers a complete file rebuild, and I never understood why.<br> <p> Your version of linear hashing explains why ... :-)<br> <p> Cheers,<br> Wol<br> </div> Mon, 25 Apr 2011 19:52:15 +0000 Linear Hashing Work-Around Potential - LoadFactor clarification https://lwn.net/Articles/440092/ https://lwn.net/Articles/440092/ Wol <div class="FormattedComment"> In UniVerse (the current implementation with which I am familiar, Prime having gone bust) you can adjust the size of the bucket. In INFORMATION it was stuck at 2K (which gave you a maximum key size of 1600 bytes because it needed to fit in one bucket).<br> <p> In practice, strange as it may seem, I don't think records spilt out of the primary block that much - they couldn't have if my comment about needing only 1.05 reads average to find a record is correct ... :-)<br> <p> At worst, even if all records were oversized, you'd need the 1.05 reads to locate it, and then one more read to actually get it because the primary bucket would tell you where it was.<br> <p> btw, you said that "if N is a power of 2, masking can be used". I think you've got a bit confused, but you've also missed a trick. You can always use masking. First of all, I wouldn't use N at all in the hash function. It would only be used in split/merge. 
I had some trouble getting my head round your use of N, but it does appear to work :-)<br> <p> However, what I'd do is<br> <p> Given M buckets, calculate P such that 2^(P-1) &lt;= M &lt; 2^P<br> <p> mask = 2^P - 1<br> <p> If hash and mask &lt; M then that's our bucket, else hash and rightshift( mask) gives us our bucket.<br> <p> Then<br> <p> if loadfactor &gt; 80% then split<br> if loadfactor &lt; 50% AND M &gt; N then merge<br> <p> Cheers,<br> Wol<br> </div> Mon, 25 Apr 2011 19:48:33 +0000 Linear Hashing Work-Around Potential - LoadFactor clarification https://lwn.net/Articles/440080/ https://lwn.net/Articles/440080/ orcmid <div class="FormattedComment"> Thanks for this and the previous expansion on the linear hashing implementations you're experienced with.<br> <p> The load factor definition I use has to do with the average number of keys that have the same hash in the set of possible hashes (actually, same buckets).<br> <p> I should clarify what I mean by same-hash. Typically, the adjustable bucket function that works with linear hashing is something like<br> <p> bf(key,level) = hash(key) mod (N*2^level)<br> <p> where N is the minimum (starting) number of buckets and level is initially 0.<br> <p> Once splitting starts, there are at any time at most two of these functions being used.<br> <p> Let M (not less than N-1) be the highest bucket number currently in use and let N*2^level be the least value of that form greater than M.<br> <p> Then the rule for finding the correct bucket is<br> <p> bucket(key,level) = if bf(key,level) &gt; M then bf(key,level-1) else bf(key,level)<br> <p> Note that for the pathological case, I had hash(key) be what was constant. It doesn't matter what constant, but 0 is easiest to visualize.<br> <p> In the simplest linked-list in-memory case, a bucket holds one record directly (in the head entry of the linked list) and anything else is linked from that. An empty bucket has some sort of special NULL entry so it can be distinguished from a bucket holding one record. Note that if N is a power of 2, the bucket(key, level) algorithm becomes very simple since masking can be used instead of division, but there needs to be more attention to the quality of hash(key). I prefer that case because masking can be used to dramatically speed up the entire process of getting to a specific bucket through whatever the bucket directory structure is.<br> <p> If we are talking about disk spill, and on-disk buckets, the overflow case is different and different load factor values are more important. For example, one would like to know the average number of disk blocks taken for buckets corresponding to the same hash, on the assumption that one wants to minimize the number of buckets with overflows to the point where the extra hit has a negligible impact on performance. I imagine for tuning purposes, the load factor as I described it also matters because it gives a clue as to how much empty space is carried in the first/primary disk block of a bucket, since having too many disk blocks that are nearly empty has its own performance consequences. It looks like these load factors (being estimates) are derivable from each other assuming primary and overflow blocks are all of the same size. <br> <p> We've wandered a distance from workarounds for '120, which I think are more easily handled without going to linear hashing if it is not already being used in the Linux kernel code that infringes an essential claim. Thanks for the exploration though. 
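(For concreteness, a minimal C sketch of that bucket-selection rule for the power-of-2 case, using the two precomputed masks rather than division. Everything here is illustrative only: the names are invented and nothing is drawn from any actual implementation.)

    /* Sketch: N is a power of 2, M is the highest bucket number in use
       (M >= N-1).  mask_hi selects N*2^level buckets; mask_lo is the
       previous level's mask, i.e. mask_hi >> 1. */
    struct lh_table {
        unsigned long mask_hi;   /* (N << level) - 1       */
        unsigned long mask_lo;   /* (N << (level - 1)) - 1 */
        unsigned long M;         /* highest bucket currently in use */
    };

    /* bucket(key, level): fall back to bf(key, level-1) when bf(key, level) > M. */
    static unsigned long lh_bucket(const struct lh_table *t, unsigned long hash)
    {
        unsigned long b = hash & t->mask_hi;   /* bf(key, level)   */

        if (b > t->M)
            b = hash & t->mask_lo;             /* bf(key, level-1) */
        return b;
    }

One mask is double the other plus one, so adjusting the pair as the level expands or contracts is just a couple of shifts.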
<br> </div> Mon, 25 Apr 2011 18:01:02 +0000 Off topic https://lwn.net/Articles/440065/ https://lwn.net/Articles/440065/ corbet So can we agree that this conversation has gone far enough off topic that it's maybe best continued in a different venue? Thanks. Mon, 25 Apr 2011 15:41:03 +0000 Google Linux servers hit with $5m patent infringement verdict (The Register) https://lwn.net/Articles/440063/ https://lwn.net/Articles/440063/ Trelane <div class="FormattedComment"> <font class="QuotedText">&gt; I've been to SF and I was quite surprised by the number of USA flags which were displayed there, so apparently it's not only Texas..</font><br> And nationalism isn't a good thing in my book as each country has a lot of skeletons in its closets..<br> <p> So displaying a country's flag necessarily implies nationalism?<br> <p> Interesting.<br> <p> <font class="QuotedText">&gt; USA people still elected Ronald Reagan, Bush Jr (twice!!), elections which were quite surprising for a foreigner like me (not that us French did much better with Sarkozy..).</font><br> <p> If you don't understand something, generally I thought you were supposed to get to understand them, not hate them.<br> </div> Mon, 25 Apr 2011 15:38:50 +0000 Google Linux servers hit with $5m patent infringement verdict (The Register) https://lwn.net/Articles/440059/ https://lwn.net/Articles/440059/ renox <div class="FormattedComment"> <font class="QuotedText">&gt;They [Texans], quite literally, believe that they are the best people in the world.</font><br> <p> Well, I've been to SF and I was quite surprised by the number of USA flags which were displayed there, so apparently it's not only Texas..<br> And nationalism isn't a good thing in my book as each country has a lot of skeletons in its closets..<br> <p> <font class="QuotedText">&gt;USA government != USA people</font><br> <p> Well, USA people still elected Ronald Reagan, Bush Jr (twice!!), elections which were quite surprising for a foreigner like me (not that us French did much better with Sarkozy..).<br> <p> <p> <p> <p> </div> Mon, 25 Apr 2011 15:30:11 +0000 Linear Hashing Work-Around Potential - LoadFactor clarification https://lwn.net/Articles/440032/ https://lwn.net/Articles/440032/ Wol <div class="FormattedComment"> Whoops - that is NOT the definition of load factor that I am used to!<br> <p> Bearing in mind I'm describing a disk file, to me "load factor" is the average %age used of each primary bucket.<br> <p> Let's say I've got 50K of data spread across 40 buckets of 2k each. That's a load factor of 50/80, or 60%. And then that's modified to exclude any oversize records that aren't stored in primary buckets :-) A typical Pick file will grow when load-factor hits 80%, and shrink when it goes below 50%. In practice, even wasting disk space equal to 100% of your data, it seems to be more compact than MS SQL Server, from my personal experience :-)<br> <p> So "load factor" is actually quite a complex beast, depending on what you're doing :-)<br> <p> Cheers,<br> Wol<br> </div> Sun, 24 Apr 2011 22:18:13 +0000 Linear Hashing Work-Around Potential https://lwn.net/Articles/440030/ https://lwn.net/Articles/440030/ Wol <div class="FormattedComment"> Actually, Pick wouldn't run foul of the patent (aiui) because it doesn't store time-stamps with records, and so couldn't automatically expire records.<br> <p> I want to re-implement a pick-style datastore, actually, and storing time stamps and expiring data was something I planned to do - I thought it was blatantly obvious! 
However, I would have done it as part of writing a record, because that's an obvious point at which to compact a bucket.<br> <p> The original Pick model didn't include disk storage, it viewed it as a permanent virtual memory, and everything was based on hashed files, including RAM! This goes back to about 1967 - Pick is pretty much the same age as Unix :-)<br> <p> Going forward to Prime INFORMATION, that had 2k buckets which can easily hold several records, depending on how big they are (stuff that was large than - by default - 1600 bytes got pushed to a secondary store). So, every time I add a record to a bucket, it's possible - even probable - that I need to compact the bucket to make room inside the 2k. (Deleting records, I would probably just flag them - not worth compacting them at that point.)<br> <p> As for your surprise at the way of handling overflow buckets, you're thinking in terms of memory. And you badly misunderstand the hashing technique if you think expanding buckets will avoid collisions! :-) In a Prime Technical Paper I've got, they give the example of "number of buckets equals number of records", record id is sequential unique, BUT they add the records from 0 to 16 in random order. So at the end, with 17 records in seventeen buckets, there are no collisions. But if let's say the first two records to be added are both odd, 1 and 3 for example, they will both hash to bucket 1, and bucket 0 will be empty.<br> <p> Linear hashing doesn't guarantee your file won't be lumpy. But the maths I've seen says that - absent a pathological situation - given a known key you need to access on average less than 1.05 buckets to find your record (or, more importantly, to know that it doesn't exist!) If every bucket access is a disk read, then that's damn fast! I don't know of any non-hashing technique that can come close. That's why, for pretty much ALL database sizes, a Pick system will leave pretty much any other database in the dust for speed.<br> <p> My favourite war story is the company that ported their database from UniVerse (a Pick derivative) to Oracle. The Oracle consultants, after spending six months tweaking a particularly nasty SQL query, proudly announced to management that their new system was now ten percent faster than the system they were replacing. Unfortunately, they did it within earshot of the guy maintaining the old system, who scornfully responded "you're proud that your twin Xeon 800 is only 10% faster than my Pentium 90?" !!!<br> <p> And that speed discrepancy is typical. I've seen loads of comments from people who can't believe how fast Pick systems are. We ran 32 users on a MIPS 3000 (equivalent to a 386) with 16Mb of RAM. System response was pretty good, although I don't think we hammered the system the way some people did. And watching the hard disk light, that machine was thrashing like mad ... !!! 
(Management didn't want to pay for a ram upgrade - until a ram chip failed and I wangled an upgrade with the support people :-)<br> <p> Cheers,<br> Wol<br> </div> Sun, 24 Apr 2011 22:10:38 +0000 Liability of Users for Patent Infringement https://lwn.net/Articles/440031/ https://lwn.net/Articles/440031/ orcmid <div class="FormattedComment"> The link to justia.com provided by FOSSPatents earlier on this thread provides a lot of the soap opera: <a rel="nofollow" href="http://dockets.justia.com/docket/texas/txedce/6:2009cv00269/116887/">http://dockets.justia.com/docket/texas/txedce/6:2009cv002...</a><br> <p> I got tired looking through it, and I don't think one can find actual exhibits or expert testimony.<br> <p> What is interesting is that some of these companies have extensive web farms for their own operations. We know that Google's uses Linux. I presume these other users do also.<br> <p> It is interesting that hosting services are included. I assume because they are engaging in commerce by leasing the use of LAMP-based hosted sites, etc. So my hosting service could easily make it onto this list, not to mention sites where I am an user (Facebook, amazon.com, etc.). Ick.<br> <p> I suppose free cloud services (E.g., my Amazon Cloud Drive) may not be very free for long. And we all wait patiently to learn there has been a workaround that removes the infringing code from the Linux kernel.<br> </div> Sun, 24 Apr 2011 21:58:44 +0000 Reviewing the Patent https://lwn.net/Articles/440027/ https://lwn.net/Articles/440027/ wahern <div class="FormattedComment"> I didn't mean to imply that I thought the USPTO got this wrong according to their examination process. I didn't mean "obvious" in a legal sense, as in an articulable relation to identifiable prior art. I was speaking more in lay terms, as in this is yet another fscking absurd result of allowing patents on software.<br> <p> I'm not very adept at analyzing patent claims; e.g. means-plus-function and anticipation, both of which I would guess were fairly critical to this case.<br> <p> </div> Sun, 24 Apr 2011 20:06:00 +0000 Liability of Users for Patent Infringement https://lwn.net/Articles/440024/ https://lwn.net/Articles/440024/ wahern <p> Well, they didn't just sue Google. Here's a list of defendants from their Amended Complaint 2011-01-24. Don't infer anything about these defendants--other than Google--regarding the trial verdict. </p> <ol> <li>Softlayer Technologies, Inc., <li>CitiWare Technology Solutions, LLC, <li>Google Inc., <li>Yahoo! Inc., <li>MySpace Inc., <li>Amazon.com Inc., <li>PayPal Inc., <li>Match.com, Inc., <li>AOL LLC, <li>CME Group Inc. </ol> <p> Red Hat preemptively sued in 2009 for a declaratory judgment. Pacer charges per document so I can't figure out how out all of this unfolded (without paying), but here's the defendant list in Facebook's 2010-09-03 Amended Answer to Bedrock's crossclaim. I would guess these are all Red Hat customers impleaded by Bedrock. </p> <ol> <li>1 &amp; 1 Internet, Inc., <li>Rackspace Hosting, Inc., <li>Go Daddy Group, Inc., <li>Sungard Data Systems, Inc., <li>Whole Foods Market, Inc., <li>The Gap, Inc., <li>NYSE Euronext, <li>ConAgra Foods, Inc., <li>Nationwide Mutual Insurance Company, <li>Facebook, Inc., <li>ConocoPhillips Company, <li>Virgin America Inc., R.L. 
Polk &amp; Co., <li>The Planet.com Internet Services </ol> Sun, 24 Apr 2011 19:02:09 +0000 Reviewing the Patent https://lwn.net/Articles/440025/ https://lwn.net/Articles/440025/ orcmid <div class="FormattedComment"> IE 8 doesn't support TIFF either, but the free ActiveX plug-in that the Patent site mentions worked for me. Once I have copies on my machine, I can view them with any number of built-in utilities, of course.<br> <p> The amended claims do fit. Also, the revision to claims 3 and 5 close a loophole in the wordings that would have had the claims apply to access to the same-hash list for any purpose whatever, rather than only "accessing the linked list of records to search for a target record."<br> <p> Unless covered by some other patent, it would appear that having a separate process that scans buckets for the sole purpose of scavenging automatically-expired records is not covered, so long as it does not involve the "record search means" for finding a record by its key.<br> </div> Sun, 24 Apr 2011 18:52:20 +0000 Linear Hashing Work-Around Potential - LoadFactor clarification https://lwn.net/Articles/440023/ https://lwn.net/Articles/440023/ orcmid <div class="FormattedComment"> In the note above, I mentioned "load-factor (the average number of list items that are examined on a keyed search)."<br> <p> That's not what the load-factor is. The load factor is computed as the average length of the list in all current buckets, where a list is of length 0 for an empty bucket.<br> <p> The load factor is an *estimate* of the maximum number of list items that are examined, on average, for a single keyed search under the assumption that searching is uniformly random by hash-value. (The estimated average number of list items consulted under that assumption is half the load factor.) That, of course, is the appeal of hash-table lookups. But as we must be reminded, when that is not the access pattern, the experienced average moves toward the worst case.<br> <p> A pathological worst-case is easy to demonstrate. Suppose, through some bug or some peculiarity of the data being operated with, all keys have the hash code 0. That is, everything goes into bucket 0. As the load factor goes up, the linear hashing scheme will start splitting buckets and then rehashing their lists between the original bucket and its addition at the end of the used hash table. But bucket 0 still has everthing. So the actual search time gets longer and longer since it is a linear search of all of the records, the load factor keeps going down, and the hash-table expands for no useful purpose whatsoever. There are also cases that move the list from bucket to bucket while still having worst-case search behavior.<br> <p> Although one would presumably not use hashing where this case had any likelihood at all, any kind of clumping of records and in the access pattern will cause performance deviations, that persist long enough to be important. <br> <p> This is why it is valuable to test hash systems by throwing many distributions of data at them and carrying out other kinds of stressful simulations. It is also good to collect statistics in a running implementation so that degrading situations can be identified.<br> </div> Sun, 24 Apr 2011 18:40:25 +0000 Linear Hashing Work-Around Potential https://lwn.net/Articles/440022/ https://lwn.net/Articles/440022/ orcmid <div class="FormattedComment"> It is true that we may be talking about patents that may have expired already. 
And even '120 will have expired no later than 2018 (probably sooner).<br> <p> However, '120 does apply to linear hashing today if that linear hashing implementation deals with automatically-expiring records and it removes some or all of those on a list of same-hash records while carrying out an operation of the "record search means utilizing a search key to access a linked list of records having the same hash address."<br> <p> Considering how fruitful linear hashing has been seen to be, especially for distributed databases, I would still want to make sure I understood about applicable patents before distributing code using the technique (and especially if automatically-expired records are handled opportunistically as part of the split and condense process).<br> <p> Linear Hashing is used in Berkeley DB in both the free and commercial versions. I actually have no need for a commercial one myself and would use an open-source one if I distributed an application that needed a DB. <br> <p> I am surprised that Pick handles overflow buckets the way you summarized it. I see no reason to ever have collisions with the expanding buckets of the linear hash directory (but I am looking only at the in-memory linked-list form at the moment).<br> </div> Sun, 24 Apr 2011 18:08:59 +0000 Linear Hashing Work-Around Potential https://lwn.net/Articles/440009/ https://lwn.net/Articles/440009/ Wol <div class="FormattedComment"> If you're worried about patents, you shouldn't be. At least, no more than any other technology. You've seen the mention of the papers from 1970 - that's forty years ago!<br> <p> I first started working with a commercial linear-hashed product (Prime INFORMATION) in 1985 - that was rev5.3, and I don't know how long linear hashing had been in it. It's been very widely used since then - the entire Pick/MultiValue ecosystem is based on it!<br> <p> But if you want to know more about it, and its commercial use, investigate Pick. A good web-site is www.u2ug.org, which leads to the multi-value ring, and you'll find masses of good information. Maybe I'll see you on the u2ug mailing lists? ...<br> <p> Cheers,<br> Wol<br> </div> Sun, 24 Apr 2011 08:25:05 +0000 Google Linux servers hit with $5m patent infringement verdict (The Register) https://lwn.net/Articles/440008/ https://lwn.net/Articles/440008/ wahern <div class="FormattedComment"> Why does Firefox still not support TIFF natively after all these years? Chrome doesn't either, it appears.<br> <p> I did read the written description, though, and presuming that it's not totally fubar'd from the original application, and presuming the patent office and trial court reasonably stuck to the rule (IIRC) that amended claims must fit within the scope of the written description, then it still seems painfully obvious to me, even by my skeptical, biased, anti-patent standards.<br> <p> </div> Sun, 24 Apr 2011 05:37:03 +0000 Liability of Users for Patent Infringement https://lwn.net/Articles/439992/ https://lwn.net/Articles/439992/ ThinkRob <div class="FormattedComment"> <font class="QuotedText">&gt; Yes, it means that. If this is in an essential part of the kernel, every use of Linux distribution with such a kernel would constitute an infringing operation.</font><br> <p> <font class="QuotedText">&gt; In the commercial world, providers of hardware-software systems tended to hold their customers harmless from such suits and would defend them.</font><br> <p> Oh boy. 
This could be fun.<br> <p> I can't wait until the patent trolls go after IBM, probably reasoning that since almost all (or maybe all?) of IBM's big iron can/does run Linux they're a good target. The mental image that such a lawsuit evokes is that of a foolish man poking a large sleeping grizzly bear. Eventually, the bear will snort a few times, bat its ears, open one eye, look sleepily around, and then simply maul the annoying guy into a small patch of red pulp. <br> <p> If there's one company that I would expect to defend Linux pretty damn well, it's IBM; given their patent history I bet they could find numerous counts of infringement on their portfolio by any business, even if that business is just a shell company/patent troll. Even if they didn't have patents with which to fight back... well... what are the odds that a patent troll can outlast IBM's legal army?<br> <p> Of course IANG (I Am Not Groklaw)...<br> </div> Sat, 23 Apr 2011 23:51:28 +0000 Linear Hashing Work-Around Potential https://lwn.net/Articles/439989/ https://lwn.net/Articles/439989/ orcmid <div class="FormattedComment"> As part of digging into materials on hashed-index techniques, I wondered if there was anything about Linear Hashing that would provide an alternative in which a non-infringing method of deleting automatically-expired records could be incorporated.<br> <p> I did find such a place where the deletion of automatically-expired records could be handled opportunistically as part of a different operation that has a same-hash linked list be operated on, but not while searching for the presence or absence of a target record by its key (or the equivalent ways of speaking of "record search means" in the '120 patent).<br> <p> THE DOWN SIDE<br> <p> 1. It appears that the novel feature of linear hashing (the way that the number of hash buckets is dynamically expanded and contracted at very low cost and in a smooth manner) as well as extended uses of linear hashing may already be subject to patent protection. I cannot conduct a serious search, but finding one example of a patent with regard to distributed linear hashing set off my early-warning detector. Linear Hashing developer Witold Litwin and colleagues, including Marie-Anne Niemat and Donovan Schneider may be involved.<br> <p> 2. The use of Linear Hashing is very appealing because of how it allows for the dynamic management of load-factor (the average number of list items that are examined on a keyed search) and the way the array of buckets can grow smoothly (without requiring dynamic memory-reallocations of array storage or anything heavy-duty like that, especially in a kernel). Although the 1970 Litwin paper is a great source, the later distillation of linear hashing as an in-memory technique has been superbly described by Per-Åke (Paul) Larson ["Dynamic Hash Tables", Communications of ACM 31, 4 (April 1988), 446-457].<br> <p> 3. The problem of switching to Linear Hashing, apart from the red flag that (1) raises, is that any simple code that accomplishes opportunistic deletions within the "record search means" has to be removed, the hash function is a little trickier (but very clever), and the way arrays of buckets are organized and managed is going to be completely replaced, with inclusion of the special procedures that continually adjust the number of buckets, the list for each hash value, and the number of hash-values delivered by the hashing function at any time.<br> <p> MIXED BLESSING<br> <p> 4. 
An interesting facet of in-memory linear hashing is that it appears that the splitting and combination procedures by which buckets are added and removed can be handled concurrently with the "record search means" by which individual records are being searched for, inserted, accessed, or deleted by key. Keeping in mind consideration (1) of course.<br> <p> 5. It is trivial to add the deletion of automatically-expired records to the situations when a list is visited for the purpose of rehashing as part of the mechanism of (4), assuming that hasn't already been covered in the claims of a patent applicable to linear hashing.<br> <p> 6. It is not so trivial that the presence of automatically-expired records and their deletion (5) interferes with the management of load factor, and one can imagine pathological cases where there is chatter in the expansion-contraction of the hash-bucket array. Implementations require heuristics for avoiding that. [There are already pathological cases, this just adds to them and requires more-careful heuristics to avoid chatter without over-engineering the situation.]<br> <p> BOTTOM LINE<br> <p> Having dug around, I am now fascinated by Linear Hashing and its prospects for use in many situations, including the in-memory and the distributed database cases. <br> <p> Yet I hesitate to use it myself considering that it is likely that there are any manner of patents that apply to it and its extensions and that it may be prohibitive to determine what such patents might be.<br> </div> Sat, 23 Apr 2011 19:39:00 +0000 Google Linux servers hit with $5m patent infringement verdict (The Register) https://lwn.net/Articles/439988/ https://lwn.net/Articles/439988/ orcmid <div class="FormattedComment"> Make sure you read the entire patent in its TIFF form, not the HTML transcription, which is incomplete.<br> <p> Also, this patent has been re-examined and a reexamination certificate issued on 2011-04-12. The last few TIFF images are the pages from the reexamination.<br> <p> There are now other patents listed as well as more citations to documents and publications. The net effect was to amend the patent by modifying 4 of the original 8 claims and ADDING 4 MORE. <br> </div> Sat, 23 Apr 2011 18:37:36 +0000 Google Linux servers hit with $5m patent infringement verdict (The Register) https://lwn.net/Articles/439985/ https://lwn.net/Articles/439985/ orcmid <div class="FormattedComment"> I cannot respond to whether or not some speculative implementation would be an infringement.<br> <p> Note, however, that the essential claims of '120 all involve the access to a linked list of same-hash records as part of the by-key record access mechanism and cleaning out at least some of the expired records on that list while doing so.<br> <p> You should consult the text of U.S. Patent 5,893,120 to satisfy yourself one way or the other. It is found at <a rel="nofollow" href="http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO1&amp;Sect2=HITOFF&amp;d=PALL&amp;p=1&amp;u=%2Fnetahtml%2FPTO%2Fsrchnum.htm&amp;r=1&amp;f=G&amp;l=50&amp;s1=5,893,120.PN.&amp;OS=PN/5,893,120&amp;RS=PN/5,893,120">http://patft.uspto.gov/netacgi/nph-Parser?Sect1=PTO1&amp;...</a><br> <p> Also, you will need to look at the TIFF images for the code, flowcharts, and especially for the modifications made to the claims as part of a reexamination amendment that was issued on 2011-04-12. (The reexamination determination is on the last page. 
It modifies 4 of the 8 claims and adds 4 more.)<br> </div> Sat, 23 Apr 2011 18:01:03 +0000 Google Linux servers hit with $5m patent infringement verdict (The Register) https://lwn.net/Articles/439977/ https://lwn.net/Articles/439977/ Ad <div class="FormattedComment"> Most caches associated with networking (the DNS cache, HTTP servers, and maybe the browser) have an expiry time associated with each entry. The key-value map in such a cache can also use hashing. <br> <p> Also, network programs using asynchronous IO typically have only one thread. Tasks (including GC) can be done only when processing events. No separate GC thread, no separate GC pass. <br> <p> So if a cache implements a mechanism that, while inserting an entry, iterates over and also cleans up expired entries, does that count as a violation of this patent? <br> <p> Just curious. <br> <br> </div> Sat, 23 Apr 2011 13:38:32 +0000 Google Linux servers hit with $5m patent infringement verdict (The Register) https://lwn.net/Articles/439957/ https://lwn.net/Articles/439957/ wahern <div class="FormattedComment"> Wow. If I didn't know about the actual suit, after reading that entire patent I would be convinced that it was a hoax. It's as obvious as it sounds in the news summaries.<br> <p> Maybe it is a hoax and it's some kind of performance art project. It's like we live in some techno-absurdist reality. If there's a judgment on this verdict, or it doesn't get reversed on appeal... geez.<br> <p> <p> </div> Fri, 22 Apr 2011 23:32:56 +0000 Google Linux servers hit with $5m patent infringement verdict (The Register) https://lwn.net/Articles/439945/ https://lwn.net/Articles/439945/ phil42 <div class="FormattedComment"> here we go again :(<br> </div> Fri, 22 Apr 2011 19:43:41 +0000 Google Linux servers hit with $5m patent infringement verdict (The Register) https://lwn.net/Articles/439944/ https://lwn.net/Articles/439944/ xxiao <div class="FormattedComment"> heard this yesterday on radio, can't believe this!<br> </div> Fri, 22 Apr 2011 18:53:42 +0000 Opportunistic Garbage Collection Obviousness https://lwn.net/Articles/439943/ https://lwn.net/Articles/439943/ Wol <div class="FormattedComment"> In linear hashing (typically used in Pick-based systems) it's normal to have a bucket that can hold multiple records. Records with the same hash are chained in a bucket, and should the bucket overflow, the records are often chained on the end of the file. So let's say your "modulo" is 20 - buckets 0-19 - the first bucket to overflow chains into 20, the next into 21, etc.<br> <p> This then causes a bit of grief when the modulo increments, because you have to shove bucket 20 out of the way to create a new bucket 20 as non-overflow.<br> <p> At least one implementation (Prime Information) used a special file/directory type, so each SEGSAM file contained subfiles, one per bucket, and they could grow as big as required.<br> <p> Cheers,<br> Wol<br> </div> Fri, 22 Apr 2011 18:32:12 +0000 Places to make a hash of it https://lwn.net/Articles/439941/ https://lwn.net/Articles/439941/ orcmid <div class="FormattedComment"> As long as we're speculating on what sort of things might be handled with a hash-table mechanism that is covered by the patent we're concerned about, I should add the case of UUIDs. UUIDs (GUIDs to some of us) are quite heavily used these days and hash-table lookups for them are not uncommon.<br> <p> Whether and how that might figure in Linux use of the patented method is not anything I have a clue about. 
And without knowing what the actual cases are, it is difficult to speculate further about what would be a suitable work-around.<br> <p> I presume that the Linux kernel developers are doing what they can to figure that out.<br> </div> Fri, 22 Apr 2011 18:11:39 +0000 Google Linux servers hit with $5m patent infringement verdict (The Register) https://lwn.net/Articles/439939/ https://lwn.net/Articles/439939/ drag <div class="FormattedComment"> <font class="QuotedText">&gt; I have no penchant towards or against Texas, but your remarks come off rather bigoted to me. Replace Texas with any other locale and ask yourself if you would feel comfortable making the same generalizations about any other place, even if you felt them to be true, as you have in your statement?</font><br> <p> You've never been to Texas, have you? <br> <p> For people who are native residents, being Texan is often an extreme point of pride. They, quite literally, believe that they are the best people in the world. They will take it as a personal insult if you ask them whether they are from Texas or not... and will actually get upset about it. You should know instinctively!<br> <p> But really, Texas does kick ass in a lot of ways. I would actually be very happy to live there. It really is better than most other states.<br> <p> The patent issue, however, is a federal issue. Not state. Since USA law is based a lot on case precedent, it's often quite a bit by accident which jurisdictions are more patent-friendly or not. <br> <p> <font class="QuotedText">&gt; I am really disgusted how there seems to be more and more of a free-for-all in blatant anti-US bigotry displayed here on LWN. I don't care that it is against the US, I am just annoyed at how ugly the behavior is and how people seem to think ugly behavior is OK for some reason if directed against some US entity.</font><br> <p> They are ignorant. That is what makes them bigoted. They don't realize that the USA is an easy scapegoat that their media and governments use to divert attention away from their own shit policies and bad decisions. They don't realize that most of their civilization for the past 60 years has been propped up by American-style capitalism. <br> <p> They also don't realize that USA government != USA people. Also, states are separate entities from the federal government. Our two-party system is a pile of shit and we are paying the price for it with our failing economy. They don't understand that the people who are resentful and irritated at the USA government are the ones fighting against the badness that the USA produces in the world.<br> <p> One of these days they will start to fully realize that the only reason our government's bad decisions are hurting them so much is because their economies and governments have become so utterly dependent on the USA. <br> They know this partially already and it burns. It's natural that these dependencies lead to resentment. <br> </div> Fri, 22 Apr 2011 18:02:14 +0000 Liability of Users for Patent Infringement https://lwn.net/Articles/439935/ https://lwn.net/Articles/439935/ orcmid <div class="FormattedComment"> You misunderstood my observation about karma. Google has users of software that it distributes under the GPL (and based on Linux), and those users are being sued for patent infringement (not Google). 
I am referring to the Android business, of course.<br> <p> In this case, Google has lost a suit (obviously yet to be appealed) because it has used software (Linux) in its own business for which there is no source of indemnification.<br> <p> To argue that there should not be software patents (and I am very much aligned with that notion) has nothing to do with it. If that is the defense of Google's actions, I would say we are also seeing the consequences of hubris.<br> <p> However the software-patent story unfolds, it will be over an extensive length of time and very painful for those who enjoy pissing into the wind.<br> </div> Fri, 22 Apr 2011 17:20:31 +0000 Google Linux servers hit with $5m patent infringement verdict (The Register) https://lwn.net/Articles/439928/ https://lwn.net/Articles/439928/ orcmid <div class="FormattedComment"> Knuth addresses the case of a delete operation by key not actually deleting the record on the linked list but marking it deleted, with the record storage and list slot available for reclaiming at a later time, possibly by a new insertion having the same hash.<br> <p> Knuth also raises (in my 2d edition copy) a case where there might be pointers to records from other than the hash-table index and some sort of reference counting might be needed to avoid deleting or reusing the actual record until the record actually becomes unreachable. This is the closest I have found that is a kind of "expiration." I don't know if that was in the 1973 edition cited in the patent itself. I observe that this is a situation that can arise in operating-system work, of course, and then race-condition avoidance, locking, and such come into the picture as well.<br> <p> None of the other discussion of this kind of hash-table indexing, including in the exercises, touches any further on that, nor on the very specific claim of the patent.<br> <p> The patent deals with removals where the expiration of a record is not by an effort to delete it by key but by virtue of some other condition that arises independent of use of the hash-table index. As I said, what the patent applies to is methods of removal that discover such expired records on the linked list while it is being traversed on behalf of some other operation involving a record with the same hash code. The patent is quite specific about that, with some additional claims around details for limiting the number of such deletions done at one time.<br> <p> An earlier comment mentioned a network application where some connections or network probes might time out. Another case might be indexes to the active records for running processes, or shared libraries and their stickiness, or something like that. I don't see this method being essential to any of those, but if it has been so used in Linux (or in some other way), an alternative is called for.<br> </div> Fri, 22 Apr 2011 17:14:40 +0000 Google Linux servers hit with $5m patent infringement verdict (The Register) https://lwn.net/Articles/439907/ https://lwn.net/Articles/439907/ martinfick <div class="FormattedComment"> I have no penchant towards or against Texas, but your remarks come off rather bigoted to me. Replace Texas with any other locale and ask yourself if you would feel comfortable making the same generalizations about any other place, even if you felt them to be true, as you have in your statement?<br> <p> I am really disgusted how there seems to be more and more of a free-for-all in blatant anti-US bigotry displayed here on LWN. 
I don't care that it is against the US, I am just annoyed at how ugly the behavior is and how people seem to think ugly behavior is OK for some reason if directed against some US entity.<br> <p> <p> </div> Fri, 22 Apr 2011 16:23:33 +0000