Files with negative offsets
Posted Jun 2, 2005 7:09 UTC (Thu) by xoddam (subscriber, #2322)
Parent article: Files with negative offsets
A negative offset from the beginning of a file really is meaningless, and
a cleaner solution to this problem would be to make the offset type
unsigned so that addresses with the high bits set are simply high
positive integers. After all, very very very very large files might
conceivably have meaningful contents past offset 2^63 (not).
Unfortunately the semantics of the loff_t type require a signed integer
because it may also describe an offset from the current position in the
file, or from the end of the file.
Short of defining a new unsigned type for the current offset into a file,
Linus' solution is about as correct as can be in this situation.
Posted Jun 2, 2005 11:29 UTC (Thu) by brother_rat (subscriber, #1895)
Posted Jun 2, 2005 22:22 UTC (Thu) by giraffedata (guest, #1954)
Nobody said that negative numbers or negative addresses are entirely meaningless. The statement was that negative file offsets are meaningless, and it's hard to argue with that. Besides defying common sense, they defy POSIX. The goal of this interface, even for the special files, is to implement the POSIX standard and reap the benefits of presenting a uniform interface to application programs.
While it's nice that Linus and Al have saved users of all the normal files from this assault on common sense, it would have been nicer to spare the users of ALL files. There are plenty of interfaces the kernel could provide that access a memory location given an address, but this one is about making memory look like a file. A file doesn't have data before its beginning.
There are plenty of ways to map a 48 bit address space into a 63 bit file offset space. It's a kmem device driver problem, not a VFS problem. VFS was already as right as it could be.
Incidentally, another reason that loff_t has to be signed is that it is a return value from the lseek system call and C library function. As a positive number, it is a file offset; as a negative number, it indicates a failure. In the very special case of /dev/kmem, this turns out not to be critical because the particular negative numbers that would be used for failures are not addresses that Linux uses.
Posted Jun 3, 2005 12:23 UTC (Fri) by farnz (subscriber, #17727)
Posted Jun 3, 2005 15:09 UTC (Fri) by giraffedata (guest, #1954)
I think those aren't the only options. For a 64 bit signed address space, I'd probably go with two files -- /dev/kmem where offset = address and /dev/kmem_minus where offset = -address .
Note that the negative offset solution Linus chose is not compatible with the full architectural specification either, because it doesn't work for addresses -1 through -4095, and not only the architecture but actual existing CPUs have such addresses.
(Reminder: -1 through -4095 don't work because in the loff_t type, those are error codes. I'm told it's not an immediate issue because Linux doesn't use those addresses.)
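To make the reserved range concrete, here is a minimal sketch in C of the kind of check the comment describes. The helper name and the constant are hypothetical, chosen only for illustration; the only thing taken from the thread is that return values -1 through -4095 are read as error codes, so an offset in that range cannot be told apart from a failure.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumption for this sketch: the thread states that -1 through -4095
 * are reserved for error codes in a signed offset return value. */
#define SKETCH_MAX_ERRNO 4095

/* Hypothetical helper: true if a signed 64-bit return value falls in
 * the range that would be interpreted as an error code rather than as
 * a file offset. */
static bool looks_like_error(int64_t ret)
{
    return ret < 0 && ret >= -SKETCH_MAX_ERRNO;
}
```

Under this check an address such as -16 is indistinguishable from an error, while -4096 and anything further below zero is not, which is why the hole only matters for the topmost addresses.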
Posted Jun 7, 2005 7:20 UTC (Tue) by xoddam (subscriber, #2322)
Posted Jun 7, 2005 17:09 UTC (Tue) by giraffedata (guest, #1954)
Posted Jun 7, 2005 7:31 UTC (Tue) by farnz (subscriber, #17727)
Two files is dangerous in terms of future compatibility, as AMD64
sign-extends addresses; thus, at the moment, I can read from
0xffffxxxxxxxx, where x is any hex digit, and get kernel addresses. In
future, this becomes a userspace address (when the virtual space is
extended to 49 or more bits), and my code suddenly breaks on me.
Linus's current solution works fine for as long as AMD64 processors
have (or use) no more than 52 bits of virtual address space. I think the
only long-term solution is to switch to 128-bit offset types, but that's
going to be painful.
Posted Jun 7, 2005 17:31 UTC (Tue) by giraffedata (guest, #1954)
No one said any CPU currently has a 64-bit virtual address space.
I did say current AMD64 CPUs have addresses -1 through -4095. All of them do.
The reason this Linux architectural gap doesn't matter for the /dev/kmem case is that Linux chooses not to use the address range -1 through -2^20. (I'm not sure why -- I just looked up a memory map to verify that what I was told about Linus' solution working in spite of the -1 through -4095 hole is true.)
It's the other way around. The sign extension is what makes it work the same regardless of the virtual address space size. (In fact, it's evident to me that that's the whole point behind AMD64's negative address weirdness/innovation). The 48 bit address 0xffff fffffff0 is the address -16, which is identical to the 52 bit address 0xfffff fffffff0 or the 64 bit address 0xffffffff fffffff0. When you read address -16, you will never be reading user space.
In your C program, you should be using a pointer data type, which is 64 bits encoded in two's complement pure binary, which means that even today, your pointer is encoded with the 64 bits 0xffffffff fffffff0. Nothing would change.
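The sign-extension argument can be checked with a few lines of C. This is only a sketch: the helper name is hypothetical, and it assumes the usual two's complement conversion from uint64_t to int64_t that mainstream compilers provide. It extends the top implemented bit of an address upward, so the same address comes out as the same signed 64-bit value whatever the implemented width.

```c
#include <stdint.h>

/* Hypothetical helper: sign-extend the low `bits` bits of `raw` to a
 * signed 64-bit value by copying the top implemented bit into all
 * higher bits. `bits` must be in the range 1..64. */
static int64_t sign_extend(uint64_t raw, unsigned bits)
{
    uint64_t sign = UINT64_C(1) << (bits - 1); /* top implemented bit */
    uint64_t mask = (sign << 1) - 1;           /* low `bits` bits; all ones when bits == 64 */

    raw &= mask;
    /* (raw ^ sign) - sign propagates the top bit into the upper bits. */
    return (int64_t)((raw ^ sign) - sign);
}
```

With this helper, the 48 bit value 0xfffffffffff0, the 52 bit value 0xffffffffffff0 and the 64 bit value 0xfffffffffffffff0 all come out as -16, while a 48 bit value with the top bit clear is returned unchanged.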
I don't see 128 bit file offsets ever happening. /dev/kmem isn't what the POSIX file interface is for; what it is for won't need 128 bit offsets enough to justify the cost of the change.
In this case, I think negative numbers can be taken to mean "from the end of the address space", so to say they are "entirely meaningless" isn't really true.
The pain here is that AMD64 defines a 64-bit virtual address space; Opteron and Athlon64 only have a 48-bit virtual address space (40-bit physical), and define their own mappings from 48 bits to 64 bits. The choice is therefore to break the offset semantics and keep to AMD's documentation, or to define a new mapping for 48-bit to 63-bit, and have to rework it for a later processor with a 64-bit virtual address space.
> The choice is therefore to break the offset semantics and keep to AMD's documentation, or to define a new mapping for 48-bit to 63-bit, and have to rework it for a later processor with a 64-bit virtual address space.
gnikniht drawkcab
> /dev/kmem_minus where offset = -address .
Now that's a bit silly. You'd have to *reverse* the contents of the
memory in the device driver when reading and writing!
Well, OK. Then:
/dev/kmem_upper, where offset = 2^63 + address .
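As a worked illustration of the offset = 2^63 + address idea, here is a small C sketch. The function names are hypothetical, and the arithmetic is done modulo 2^64 on unsigned values so that nothing overflows. It assumes the address is a negative signed 64-bit value, as in the upper half of the address space discussed here.

```c
#include <stdint.h>

/* Hypothetical mapping for a /dev/kmem_upper-style file:
 * offset = 2^63 + address, computed modulo 2^64. For any negative
 * address the result is a non-negative offset. */
static int64_t kmem_upper_offset(int64_t address)
{
    return (int64_t)((uint64_t)address + (UINT64_C(1) << 63));
}

/* Inverse mapping: address = offset - 2^63, again modulo 2^64. */
static int64_t kmem_upper_address(int64_t offset)
{
    return (int64_t)((uint64_t)offset - (UINT64_C(1) << 63));
}
```

For example, address -16 maps to offset 2^63 - 16, and addresses -1 through -4095 all land at large positive offsets, so none of them collides with the range used for error codes. Unlike offset = -address, increasing addresses still map to increasing offsets, so the driver would not need to reverse anything.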
offset = -address silly
Which CPUs currently have a 64-bit virtual address space? To the best of
my knowledge, all AMD64 CPUs to date have a 48-bit virtual address space.
> Which CPUs currently have a 64-bit virtual address space?
> Two files is dangerous in terms of future compatibility, as AMD64 sign-extends addresses; thus, at the moment, I can read from 0xffffxxxxxxxx, where x is any hex digit, and get kernel addresses. In future, this becomes a userspace address (when the virtual space is extended to 49 or more bits), and my code suddenly breaks on me.
