An ioctl() call to detect memory writes
The driving purpose for this feature, it seems, is to enable an efficient emulation of the Windows GetWriteWatch() system call, which is evidently useful for game developers who want to defend against certain kinds of cheating. A game player who is able to access (and modify) a game's memory can enhance that game's functionality in ways that are not welcomed by the developers — or by other players. Using GetWriteWatch(), the game is able to detect when crucial data structures have been modified by an external actor, put up the modern equivalent of a "Tilt" indicator, and bring the gaming session to a halt.
Linux actually provides this functionality now by way of the pagemap file in /proc. The current dirty state of a range of pages can be read from this file, and writing the associated clear_refs file will reset the dirty state (useful, for example, after the game itself has written to the memory of interest). Accessing this file from user space is slow, though, which runs counter to the needs of most games. The new ioctl() call is meant to implement this feature more efficiently. The Checkpoint/Restore In Userspace (CRIU) project would also be able to make use of a more efficient mechanism to detect writes; in this case, the purpose is to identify pages that have been modified after the checkpoint process has begun.
Soft dirty deemed insufficient
The kernel's "soft dirty" mechanism, which provides the pagemap file, would seem to be the appropriate base on which to build a feature like this. All that should be needed is a more efficient mechanism to query the data and to reset the soft-dirty information for a specific range of pages. According to the cover letter, though, this approach ended up not working well. There are various other operations, such as virtual memory area merging or mprotect() calls, that can cause pages to be reported as dirty even though they have not been written to. That, in turn, can lead to a game concluding, incorrectly, that its memory has been tampered with.
That can lead to undesirable results. One rarely sees the level of anger that can be reached by a game player who has been told, at the crux point of a quest, that cheating has been detected and the game is over.
Fixing this false-positive problem, evidently, is not an option, so the decision was made to work around it instead via an unexpected path. The userfaultfd() system call allows a process to take charge of its own page-fault handling for a given range of memory. The patch set adds a new operation (UFFD_FEATURE_WP_ASYNC) to userfaultfd() that changes how write-protect faults are handled. Rather than pass such faults to user space, the kernel will simply restore write permission to the page in question and allow the faulting process to continue.
Thus, userfaultfd(), which was designed to allow the handling of faults in user space, is now being used to modify the handling of faults directly within the kernel with no user-space involvement. This approach does have some advantages for the use case in question, though: it allows specific ranges of memory to be targeted, and the use of write protection to trap write operations provides for more reliable reporting of writes to memory. To see which pages have been written, it is sufficient to query the write-protect status; if the page has been made writable, it has been written to.
The ioctl() interface
With that piece in place, it is possible to create an interface to query the results. That is done by opening the pagemap file for the process in question, then issuing the new PAGEMAP_SCAN ioctl() call. This call takes a relatively complex structure as an argument:
    struct pm_scan_arg {
        __u64 size;
        __u64 flags;
        __u64 start;
        __u64 end;
        __u64 walk_end;
        __u64 vec;
        __u64 vec_len;
        __u64 max_pages;
        __u64 category_inverted;
        __u64 category_mask;
        __u64 category_anyof_mask;
        __u64 return_mask;
    };
The size argument contains the size of the structure itself; it is there to enable the backward-compatible addition of more fields later. There are two flags values that will be described below. The address range to be queried is specified by start and end; the walk_end field will be updated by the kernel to indicate where the page scan actually ended. vec points to an array holding vec_len structures (described below) to be filled in with the desired information.
The final four fields describe the information the caller is looking for. There are six "categories" of pages that can be reported on:
- PAGE_IS_WPALLOWED: the page protections allow writing.
- PAGE_IS_WRITTEN: the page has been written.
- PAGE_IS_FILE: the page is backed by a file.
- PAGE_IS_PRESENT: the page is present in RAM.
- PAGE_IS_SWAPPED: the (anonymous) page has been written to swap.
- PAGE_IS_PFNZERO: the page-table entry points to the zero page.
Every page will belong to some combination of categories; a calling program will want to learn about the pages in the range of interest that are (or are not) in a subset of those categories. The usage of the masks is described, to a point, in this patch, though it makes for difficult reading. The code itself does the following to determine if a given page, described by categories, is interesting:
    categories ^= p->arg.category_inverted;
    if ((categories & p->arg.category_mask) != p->arg.category_mask)
        return false;
    if (p->arg.category_anyof_mask &&
        !(categories & p->arg.category_anyof_mask))
        return false;
    return true;
Or, in English: first, the category_inverted mask is used to flip the sense of any of the selected categories; this is a way of selecting for pages that do not have a given category set. Then, the result must have all of the categories described by category_mask set and, if category_anyof_mask is not zero, at least one of those categories must be set as well. If all of those tests succeed, the page is of interest; otherwise it will be skipped.
After the scan, those pages are reported back to user space in vec, which is an array of these structures:
    struct page_region {
        __u64 start;
        __u64 end;
        __u64 categories;
    };
The return_mask field of the request is used to collapse the pages of interest down to regions with the same categories set; for each such region, the return structure describes the address range covered and the actual categories set.
Finally, to return to the flags argument, there are two possibilities. If PM_SCAN_WP_MATCHING is set, the call will write-protect all of the selected pages after noting their status; this is meant to allow checking for pages that have been written and resetting the status for the next check. If PM_SCAN_CHECK_WPASYNC is set, the whole operation will be aborted if the memory region has not been set up with userfaultfd() as described above.
Checking for cheaters
To achieve the initial goal of determining whether pages in a given range have been modified, the first step is to invoke userfaultfd() to set up the write-protect handling. Then, occasionally, the application can invoke this new ioctl() call with both flags described above set, and with both category_mask and return_mask set to PAGE_IS_WRITTEN. If no pages have been written, no results will be returned; otherwise the returned structures will point to where the modifications have taken place. Meanwhile, the pages that were written to will have their write-protect status reset, ready for the next scan.
This series is in an impressive 28th
revision as of this writing. It was initially proposed
in mid-2022 as a new system call named process_memwatch() before
eventually becoming an ioctl() call instead. There is clearly
some motivation to get this feature merged, but the level of interest in
the memory-management community is not entirely clear. The work has not
landed in linux-next, and recent posts have not generated a lot of
comments. There do appear to be use cases for the feature, though, so a
decision will need to be made at some point.
Index entries for this article:
- Kernel: Memory management
- Kernel: Releases/6.7
- Kernel: userfaultfd()
Posted Aug 10, 2023 14:41 UTC (Thu)
by delroth (subscriber, #110092)
[Link] (12 responses)
Posted Aug 10, 2023 15:53 UTC (Thu)
by DemiMarie (subscriber, #164188)
[Link]
Posted Aug 10, 2023 19:37 UTC (Thu)
by dullfire (guest, #111432)
[Link] (10 responses)
I can understand the frustration of a gamer whose game is malicious. However, I don't think a reasonable solution to that is to help game developers be more abusive[1].
However, the use for Dolphin sounds awesome.
[1] Authors of software have no business telling the owners of a computer system what memory regions/values they can write to. And to be honest, unless there is net play, one person cheating... is irrelevant. It's like cheating at solitaire.
If there is net play, and your game isn't designed such that the server alone is sufficient to detect and prevent cheating... then you have always lost.
Posted Aug 10, 2023 23:03 UTC (Thu)
by edeloget (subscriber, #88392)
[Link] (7 responses)
There's also the fact that not all cheating techniques rely on /checks/. Some only rely on information that must be displayed by the client. For example, when an enemy is visible on your screen, an auto-aim can lock onto it and help you get a kill. Auto-aiming might not be easy to catch on the server side: if your kill stats make a visible jump then there is a problem, but not if you're already a good player who wants to win more (for various reasons: fame, ranking...; for this, see the unusually high number of already very good streamers who were caught red-handed. At some point, competitive gaming is really... competitive).
(Frankly, if I ever had to devise an aimbot, I would market it as 'progressive', giving you better and better results as you use it in order to mimic a real player's progression. How would a server detect that without risking banning players who are really progressing fast?)
Moreover, some cheats /are/ playing with the network. A small variation in the rate at which your game client sends packets might not be detected by the server (because of natural jitter) but may provide important advantages in some situations. This was the case in Minecraft, for example, where a slight acceleration of the packet-sending speed would allow you to run a bit faster -- nothing dramatic, but since all players are running at the same speed... This can be detected if overused, but it's very difficult to detect if used in a clever way (for example, to get a small boost during a few seconds here and there) unless you are willing to accept a lot of false positives.
Expecting a server to always catch dubious things and rapidly act on them is not going to work well in practice. Game clients will always have to do some important checks in order to be usable, and of course they will always need to display information to the user.
Posted Aug 11, 2023 1:27 UTC (Fri)
by KJ7RRV (subscriber, #153595)
[Link] (4 responses)
Posted Aug 11, 2023 3:34 UTC (Fri)
by farnz (subscriber, #17727)
[Link]
This is the ultimate end state of a lot of game cheats; the cheat acts to make you suddenly get much better in a way that's humanly plausible. It's thus impossible to distinguish cheaters from humans who learn from their mistakes, since the cheat improves your play over time in exactly the same way as would happen if you learnt over time.
In the really extreme case, someone builds a machine that operates the same controls as you, in parallel to you, and that has machine vision and listening - this is now undetectable from within the game, since the inputs and outputs are the same as used by the human player.
Posted Aug 11, 2023 8:22 UTC (Fri)
by edeloget (subscriber, #88392)
[Link] (2 responses)
Yes, of course.
Anyone can have a gotcha moment when playing a video game, resulting in a detectable jump in their playing abilities. Some players might just change their glasses and suddenly become better. Or you might start playing a game while temporarily disabled (a broken arm, for example) and then become visibly better when that condition disappears.
There are tons of reasons why a player might legitimately become better, so the game server cannot act on this information alone (and in most cases, that's all it gets).
Posted Aug 11, 2023 10:22 UTC (Fri)
by paulj (subscriber, #341)
[Link] (1 responses)
Posted Aug 11, 2023 13:40 UTC (Fri)
by rincebrain (subscriber, #69638)
[Link]
You aren't going to stop everything - if there's an enormous market for your project, people are going to throw themselves at it and eventually some portion are going to find ways around your mitigations.
But if you, say, filter out 95% of the people trying to circumvent you by making the barrier high enough, that may result in a good enough result that the thing you're trying to optimize for (protecting the huge upfront hump of initial sales, preventing your multiplayer experience from having a reputation for being instakill aimbots with no stopping them) might be achievable.
Of course, unlike that kind of use-until-burned DRM, cheat detection becomes an ongoing cat and mouse game for the game's lifecycle even if you are doing deeply invasive monitoring, so at some point you're (probably) going to end up having a tradeoff between using it as a varyingly weak signal to get human attention to look at a player and go "...are they obviously doing impossible things" and playing whack-a-mole with even more weird heuristics (and then, if you scale enough that you can't justify humans to do a priori review, just ban on the signal and clean up reports of false positives...).
I'm not, to be clear, claiming that any of the above invasive monitoring is good, nor that even the arguments about using e.g. Denuvo for protecting week 1 sales are accurate, but that that is my understanding of the rationales and tradeoffs involved.
Posted Aug 11, 2023 12:07 UTC (Fri)
by dullfire (guest, #111432)
[Link] (1 responses)
I understand exactly why many modern games use client side anti-cheat. However I think my point was not conveyed. If anti-cheat has to be enforced on "not your hardware" then you have always lost.
For example, unless you're going to require that people never use USB devices (somehow... seems immensely impractical to me), it will always be trivial (especially $$$-cost wise, but also low in engineering effort) to simply stand up a BeagleBone Black (or any of the other oodles of commodity hardware out there) as a "normal"[1] input gadget. And that's just one way. There are endless ways to circumvent that... because the "hostile" (cheating) party controls the whole system of relevance.
My point is: if your game design requires some amount of anti-cheat be done client side, then your game is very fragile. You have no way to actually enforce that.
[1] Or a normal game pad, or microphone, or whatever else you are expecting that uses standard HID drivers.
Posted Aug 11, 2023 14:29 UTC (Fri)
by excors (subscriber, #95769)
[Link]
Sure, but there are different options for responding to that situation:
1) Stop making games like that.
2) Keep making the games but give up trying to prevent cheats.
3) Put a substantial amount of effort into trying to imperfectly reduce cheating.
The problem with option 1 is that it includes basically all competitive action games (since they're inherently susceptible to client-side aimbots etc), which are some of the most popular games, with hundreds of millions of players and tens of billions of dollars of revenue. Players want those games and developers want to make those games.
The problem with option 2 is that rampant cheating will kill a multiplayer game. If you're in a match with 12 or 24 or 100 players, and even one of them is cheating, it's usually no fun. A small percentage cheating can mean you'll encounter one in almost every match, and then you'll stop playing the game. And that makes the cheater-to-non-cheater ratio worse for the remaining non-cheaters, so they're increasingly likely to quit too. We don't want the games to die for the same reason as point 1.
In practice, option 3 works. There are popular competitive action games on PC where cheating is rare enough that typical players won't be bothered by it. They use a combination of server-side and client-side techniques, plus other design techniques (like requiring a substantial investment of money and/or time before an account is allowed into competitive modes, so people can't trivially start cheating on a new account whenever their old one gets banned), and lawsuits against people selling cheats, etc. It's fragile and messy but it seems to be good enough, and I don't think I've seen any better ideas.
Posted Aug 11, 2023 8:50 UTC (Fri)
by excors (subscriber, #95769)
[Link] (1 responses)
For many types of game, that's simply impossible. Of course the server can and should prevent the player teleporting or giving themselves infinite ammo (and some games even fail at that step, which is just bad design), but there's no way the server can prevent aimbots (where the client analyses the video output and simulates mouse movements to more accurately aim at enemies) or wall-hacks (where the client messes with the rendering so they can see enemies through solid walls), which can be a serious problem for competitive shooters.
(Well, the server could avoid telling the clients about enemies that are meant to be completely obscured - but if the enemy has a nose or a shadow that's barely visible around the wall, or is making footstep sounds, then the client really needs to know about their position and cheat developers can make that enemy unnaturally visible. Some games do that and it helps but it's not perfect. See e.g. https://technology.riotgames.com/news/demolishing-wallhac...)
Game developers can add player reporting mechanisms and server-side heuristics to detect unnatural movements, but there's a significant risk of false positives, and cheat developers can make their cheats behave much more subtly and barely distinguishable from a highly skilled player. After that, it's just an arms race between client-side cheat detection and cheat-detection-avoidance. Game developers can't win that race, but they don't need to, they just need to make it sufficiently expensive for cheat developers to keep up that they have to charge large amounts of money for their cheats, so the total number of cheaters remains low enough that regular players are able to tolerate it and still enjoy the game.
Posted Aug 11, 2023 12:04 UTC (Fri)
by excors (subscriber, #95769)
[Link]
To add some actual numbers here: there was a Destiny 2 cheat seller who charged $13-$19 per day or $105-$169 per month, justifying the prices based on "the complex anti-cheat this game has . . . which means that high-quality cheats are expensive to create and maintain". They made about $150K over two years from 5,848 transactions, while the game developer claims to have "spent more than $2,000,000 on cheat mitigation (including staffing and software)" (though the mitigations were for all cheat sellers, not just this one). (https://thegamepost.com/wp-content/uploads/2023/02/bungie...)
Another seller charged $90 per month or $500 for lifetime access to Valorant cheats, and was believed to have sold "tens or hundreds of thousands of dollars". (https://www.polygon.com/2021/1/11/22224696/riot-bungie-de...)
Those sound like quite high prices that will discourage many players from casually cheating, and reasonable but not huge incomes for cheat developers. It seems plausible that a modest increase in the difficulty of defeating anti-cheat techniques will change the economics enough for some of the cheat developers to give up and find something more valuable to spend their time on.
Posted Aug 10, 2023 19:27 UTC (Thu)
by KJ7RRV (subscriber, #153595)
[Link] (5 responses)
Posted Aug 10, 2023 19:39 UTC (Thu)
by dullfire (guest, #111432)
[Link]
Or worse case, a patched ld.so (and an strace run to identify raw call sites)
Posted Aug 10, 2023 21:28 UTC (Thu)
by comex (subscriber, #71521)
[Link] (3 responses)
The difference is that in the latter case it doesn’t matter how easy the mechanism is to bypass. It might even be acceptable to trivially implement the API to always report that memory has not changed – if that was enough to get games to work. Except it might not be. Some games might be specifically on the lookout for a dummy implementation – not because they had Wine in mind, but because they thought some cheat running natively on Windows might hook GetWriteWatch. So they might deliberately perform a write on some watched memory and verify that it’s reported.
And of course, some Windows applications may be relying on GetWriteWatch functioning correctly for purposes other than anti-cheat, so a dummy implementation might break them.
Posted Aug 11, 2023 13:48 UTC (Fri)
by stevie-oh (subscriber, #130795)
[Link] (2 responses)
Java and .NET are garbage-collected languages. This means that memory is not explicitly freed by code that executes in the JVM/CLR. Instead, whenever the process starts to run low on memory, everything is paused and the garbage collector is run in order to free memory no longer in use.
Determining which memory is in use is straightforward:
- First, there are all the "roots": the memory immediately accessible to any thread in the process. To access memory, (valid) code needs a reference to it. Such references will necessarily live inside the current CPU registers of threads, variables on threads' stacks, and global variables.
From there, that memory can contain references to other memory. The GC (garbage collector) traces through all of these references recursively and keeps track of all memory that it can reach. Any memory it _doesn't_ reach, therefore, is free.
One inefficient thing about this is the global variables. Those tend to be static data that was initialized at (or near) program startup; they never change. But every time memory reclaim is needed, the GC needs to check through all those objects again.
However, there's a trick here: old allocations can't refer to newer allocations unless they've been updated. So the .NET GC uses GetWriteWatch on older allocations as an optimization: if the old allocations haven't been written, then it knows these things:
- all memory that was reachable from those older allocations is still around; don't free it.
- no newer allocations can possibly be referenced and thus be kept alive by those older allocations.
Posted Aug 11, 2023 14:05 UTC (Fri)
by paulj (subscriber, #341)
[Link]
Each successive heap has a longer and longer scan time. I.e., X_0 < X_1 < ..., and X_i < X_(i+1) for all i in N_0. So young objects get checked quickly, longer lived objects get checked less and less.
Posted Aug 11, 2023 17:41 UTC (Fri)
by Cyberax (✭ supporter ✭, #52523)
[Link]
.NET used to use the dirty bits to track the changes, but they switched to card marking ( https://mattwarren.org/2016/02/04/learning-how-garbage-co... ) a while ago. It turns out that playing games with virtual memory is slow, and the page-level granularity is a bit too big.
They have never used GetWriteWatch though, but something different. There's a way in Windows to query a list of pages that have a "dirty" bit set, but the function name eludes me right now.
Posted Aug 17, 2023 2:52 UTC (Thu)
by irogers (subscriber, #121692)
[Link] (2 responses)
[1] https://developer.apple.com/documentation/apple-silicon/p...
Posted Aug 17, 2023 10:24 UTC (Thu)
by farnz (subscriber, #17727)
[Link] (1 responses)
Reading the page you linked suggests that there are no optimizations involved in MAP_JIT. Instead, it's about compulsory code signing; by default, macOS will not let you mark a page as PROT_EXEC unless it has a verified code signature that traces back to a suitable root of trust. MAP_JIT changes this behaviour, and says that a page marked with MAP_JIT can be either PROT_EXEC or PROT_WRITE, but not both at once, and that such pages do not need to have a code signature.
You still need to call the appropriate TLB + cache management code yourself (wrapped up in sys_icache_invalidate) even if you use MAP_JIT; that's unchanged from before. The benefit of MAP_JIT is that it hardens applications against attackers; if you're not flagged as using a JIT, or if the attacker tries to make an existing data page executable (as opposed to one that was allocated with MAP_JIT), the page table update needed to mark the page executable will not happen. This reduces the exploit surface to gadgets already present in the application binary (see also return-oriented programming), and increases the chances that Apple will see enough telemetry showing attempts to execute code in a non-executable page with a signed binary that they can revoke your signature until you fix the bug.
Posted Aug 23, 2023 17:19 UTC (Wed)
by anton (subscriber, #25547)
[Link]
Another "benefit" is that the development version of Gforth does not work on MacOS on Apple Silicon, and when I spend the time to work around this breakage, the workaround will result in Gforth running significantly slower on MacOS than on Linux (on the same hardware).
Posted Aug 24, 2023 9:31 UTC (Thu)
by jepsis (subscriber, #130218)
[Link]
Another "benefit" is that the development version of Gforth does not work on MacOS on Apple Silicon, and when I spend the time to work around this breakage, the workaround will result in Gforth running significantly slower on MacOS than on Linux (on the same hardware).