why not use low bit instead ?

Posted Mar 30, 2022 16:00 UTC (Wed) by ballombe (subscriber, #9523)
Parent article: Pointer tagging for x86 systems

Why not use low bit instead ?
After all, unaligned accesses are not supported anymore, so the low 3 bits of every pointer are zero.

why not use low bit instead ?

Posted Mar 30, 2022 17:17 UTC (Wed) by mathstuf (subscriber, #69389)

That means you need to manually mask off the bits before any actual dereference, instead of having the hardware ignore the upper bits for you.
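For illustration, here is a minimal C sketch of what low-bit tagging asks of the programmer. It assumes 8-byte-aligned objects, so the low 3 bits are free; the helper names tag_ptr, ptr_tag and untag_ptr are made up for this example and do not come from any library.

```c
#include <assert.h>
#include <stdint.h>

/* Low 3 bits are free when every object is at least 8-byte aligned. */
#define TAG_MASK ((uintptr_t)0x7)

/* Hypothetical helper: store a small tag in the low bits of an aligned pointer. */
static void *tag_ptr(void *p, unsigned tag)
{
    assert(((uintptr_t)p & TAG_MASK) == 0);   /* pointer must be aligned */
    return (void *)((uintptr_t)p | ((uintptr_t)tag & TAG_MASK));
}

/* Hypothetical helper: read the tag back. */
static unsigned ptr_tag(const void *p)
{
    return (unsigned)((uintptr_t)p & TAG_MASK);
}

/* Hypothetical helper: recover the real address. This must be called
 * before every dereference; forgetting it is the bug discussed here. */
static void *untag_ptr(const void *p)
{
    return (void *)((uintptr_t)p & ~TAG_MASK);
}
```

Every access then has to be written as something like `*(int64_t *)untag_ptr(t)`, which is the step that is easy to forget.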

why not use low bit instead ?

Posted Apr 3, 2022 9:57 UTC (Sun) by dcoutts (guest, #5387)

This is exactly how GHC's pointer tagging works.

https://gitlab.haskell.org/ghc/ghc/-/wikis/commentary/rts...

why not use low bit instead ?

Posted Apr 3, 2022 12:08 UTC (Sun) by mathstuf (subscriber, #69389)

That's fine in a language which hides raw pointers from you in the first place. I think I'd prefer using the upper bits in C if the CPU had support for ignoring them for me rather than asking "umm, did I remember to mask this out properly?" before any pointer dereference.

why not use low bit instead ?

Posted Apr 4, 2022 12:01 UTC (Mon) by dcoutts (guest, #5387)

Indeed, it would be a nightmare in C. I guess it'd be doable in C++.

This hardware feature is almost certainly for performance though, not convenience. My guess is that it's primarily aimed at JVMs and similar.

I don't know for sure, but I'd guess that doing pointer tagging in software (and thus having to untag before dereferencing) is cheaper for the low bits than for the high bits. That is, cheaper in terms of the extra instructions and their sizes. But when doing it in hardware, an implementation can do it cheaply either way, and given that there are more bits available at the high end, it makes sense to use the high bits.
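As a rough sketch of the two software untagging operations being compared, assuming 3 low tag bits and (arbitrarily, for this example) 8 high tag bits on a user-space address whose untagged upper bits are zero:

```c
#include <stdint.h>

#define LOW_TAG_BITS  3   /* free because of 8-byte alignment */
#define HIGH_TAG_BITS 8   /* assumed width, chosen only for this sketch */

/* Clear the low tag bits. The mask ~7 is a small sign-extended constant,
 * so on x86-64 it can be encoded as a short immediate in an AND. */
static inline uint64_t untag_low(uint64_t p)
{
    return p & ~((UINT64_C(1) << LOW_TAG_BITS) - 1);
}

/* Clear the high tag bits. The mask 0x00ffffffffffffff does not fit in a
 * sign-extended 32-bit immediate, so on x86-64 a compiler typically needs
 * a 64-bit constant load or a shift pair instead. */
static inline uint64_t untag_high(uint64_t p)
{
    return p & (UINT64_MAX >> HIGH_TAG_BITS);
}
```

Whether the difference matters in practice depends on the compiler and the surrounding code, so this only shows where the size difference could come from.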

why not use low bit instead ?

Posted Apr 6, 2022 6:51 UTC (Wed) by anton (subscriber, #25547)

In many cases you know when using the address what the tag is, and then you can just use an offset at no or very low extra cost. E.g., if tag 3 means that we have a pointer to a cons cell, then car (aka head) accesses the machine word at offset -3, while cdr (tail) accesses the word at offset 5.
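The offset trick can be sketched in C as follows, assuming 8-byte machine words and tag 3 for cons cells as in the example above. The names cons_cell, make_cons_ref, car and cdr are illustrative only, and the integer-to-pointer conversions rely on the usual flat-address behaviour of mainstream compilers.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define CONS_TAG 3   /* tag value meaning "pointer to a cons cell" */

/* Two 8-byte machine words: head and tail. */
typedef struct {
    int64_t car;
    int64_t cdr;
} cons_cell;

/* Build a tagged reference; the cell must be 8-byte aligned. */
static uintptr_t make_cons_ref(cons_cell *c)
{
    assert(((uintptr_t)c & 7) == 0);
    return (uintptr_t)c + CONS_TAG;
}

/* car: the word at offset -3 from the tagged reference. No masking is
 * needed, because the known tag is folded into the constant offset. */
static int64_t car(uintptr_t ref)
{
    int64_t v;
    memcpy(&v, (const char *)ref - CONS_TAG, sizeof v);
    return v;
}

/* cdr: the word at offset 8 - 3 = 5 from the tagged reference. */
static int64_t cdr(uintptr_t ref)
{
    int64_t v;
    memcpy(&v, (const char *)ref - CONS_TAG + sizeof(int64_t), sizeof v);
    return v;
}
```

A compiler can usually turn each accessor into a single load with a constant displacement, which is why the untagging costs little or nothing when the tag is known.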

Low-bit tagging is used when 3, maybe 4 bits of tags are enough. If you need more, it becomes impractical, and you use high-bit tagging.

why not use low bit instead ?

Posted Mar 30, 2022 23:25 UTC (Wed) by neilbrown (subscriber, #359)

When accessing a single byte (e.g. an ASCII character), the low bit of the address is a meaningful bit.

why not use low bit instead ?

Posted Apr 1, 2022 8:13 UTC (Fri) by marcH (subscriber, #57642)

> After all unaligned accesses are not supported anymore

Says who?

> so all pointers start with 3 zero bits.

Yes, as long as you use only 64-bit values.