Mozilla on the coming version-100 apocalypse
Posted Feb 16, 2022 19:52 UTC (Wed) by nybble41 (subscriber, #55106)
In reply to: Mozilla on the coming version-100 apocalypse by flussence
Parent article: Mozilla on the coming version-100 apocalypse
A *really* badly coded web scraper would just send a User-Agent header matching a popular web browser, in which case the header doesn't add any value. (And the more you rely on the User-Agent header to determine your response, the more likely this scenario becomes, as scrapers are forced to make themselves look as much like regular browsers as possible.)
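To make that concrete, here is a rough sketch in Python of how little it takes to get past a filter keyed on the User-Agent header. Everything in it is made up for illustration: the `BOT_MARKERS` list, the `looks_like_bot` helper, the scraper name and the example URL are assumptions, not any real site's logic. No request is actually sent.

```python
from urllib.request import Request

# Hypothetical substrings a naive server-side filter might look for.
BOT_MARKERS = ("bot", "crawler", "spider", "python-urllib", "curl")


def looks_like_bot(user_agent: str) -> bool:
    """Naive filter: flag the client if its User-Agent contains a bot marker."""
    ua = user_agent.lower()
    return any(marker in ua for marker in BOT_MARKERS)


# An illustrative browser-style string; the exact value is an assumption.
BROWSER_UA = "Mozilla/5.0 (X11; Linux x86_64; rv:97.0) Gecko/20100101 Firefox/97.0"

# The client picks the header value, so both requests are equally easy to build.
honest = Request("https://example.com/", headers={"User-Agent": "ExampleScraper-bot/1.0"})
spoofed = Request("https://example.com/", headers={"User-Agent": BROWSER_UA})

# urllib stores header names capitalized, hence "User-agent" here.
honest_flagged = looks_like_bot(honest.get_header("User-agent"))
spoofed_flagged = looks_like_bot(spoofed.get_header("User-agent"))

print(honest_flagged, spoofed_flagged)
```

The only difference between the request that gets flagged and the one that doesn't is a single string chosen by the scraper's author.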
I can see a compatibility argument against removing the header entirely, but IMHO the actual agent string should be locked to a single value matching one of the popular browsers and never updated again. The same goes for JS APIs to probe the user agent. Servers and client-side code should treat all user agents equally.
Posted Feb 17, 2022 7:52 UTC (Thu)
by taladar (subscriber, #68407)
[Link] (1 responses)
Posted Feb 17, 2022 15:31 UTC (Thu)
by nybble41 (subscriber, #55106)
[Link]
That isn't actually a disagreement; you're just not seeing a lot of "*really* badly coded web scrapers". I never said that *most* bots do this today. The point was just that you can't rely on a client-selected User-Agent string to filter out bots reliably. Such a filter is easy to implement and works so long as it's not over-used, because scraper authors then have no reason to work around it. But if identifying as a bot (or an old browser) gets a scraper blocked or throttled, then correcting the problem will take a few minutes of the scraper developer's time at most. And in the meantime, for non-scrapers, we ought to be targeting web standards and not implementing workarounds for specific browsers. *That* is the point of freezing the User-Agent string: force sites to serve the same versions of their resources to everyone so that they don't break or degrade when someone comes along with a standards-compliant user agent the site simply can't identify.
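A toy illustration of the breakage described above, in Python. The allowlist, the function names and the "NewBrowser" agent are all hypothetical; this is only a sketch of the two approaches, not how any particular site works.

```python
# Hypothetical allowlist a UA-sniffing site might keep.
KNOWN_BROWSERS = ("Firefox", "Chrome", "Safari")


def sniffing_response(user_agent: str) -> str:
    """Serve the full page only to agents the site recognizes by name."""
    if any(name in user_agent for name in KNOWN_BROWSERS):
        return "full"
    return "degraded"


def agnostic_response(user_agent: str) -> str:
    """Ignore the header and serve the same standards-based page to everyone."""
    return "full"


# A made-up, standards-compliant agent the site has never heard of.
unknown_agent = "NewBrowser/1.0"

print(sniffing_response(unknown_agent))  # the sniffing site degrades it
print(agnostic_response(unknown_agent))  # the agnostic site does not
```

With a frozen User-Agent string the first function would have nothing to branch on, which leaves the second approach as the only workable one.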