LWN: Comments on "Enhancing screen-reader functionality in modern GNOME"

accessibility getting some resources

linuxrocks123 — Sat, 21 Jun 2025 02:56:15 +0000

> Just like the web, GTK has authoring practices that application developers have to follow

No, just like the W3C, GTK has written some words that almost nobody actually pays any attention to.

accessibility getting some resources

Cyberax — Fri, 20 Jun 2025 18:02:26 +0000

Windows GDI does not participate in accessibility; it's too low-level for that. For the standard controls (implemented in user32.dll), Windows has built-in a11y support. If you're making a custom control or don't use Windows controls, you need to implement the IAccessible interface, which lets Windows walk the control hierarchy.

accessibility getting some resources

ebassi — Fri, 20 Jun 2025 17:49:10 +0000

> Isn't Windows the gold standard for screen readers, and doesn't Windows accessibility almost exclusively rely on the GDI, which is their toolkit level? I'm not arguing; I'm asking.

The toolkit is involved, of course, and can fill in the basics for you; but GTK, or Qt, or any other toolkit cannot come up with a textual description for your own application's UI components. Or the relations between those components, especially once you write your own custom UI elements to group things, for instance. A toolkit cannot read your mind, or read the mind of the people using your application. We can have educated guesses about it, and encode them into the API, but failing an educated guess is worse than not putting anything in: the latter is, at least, invisible, while the former can lead to data loss.

Just like the web, GTK has authoring practices that application developers have to follow: https://docs.gtk.org/gtk4/section-accessibility.html#auth...

accessibility getting some resources

linuxrocks123 — Fri, 20 Jun 2025 15:41:49 +0000

> you're still facing the fact that application developers rarely spend time on making their projects accessible in the first place; toolkits can only do so much.

Is that true? Isn't Windows the gold standard for screen readers, and doesn't Windows accessibility almost exclusively rely on the GDI, which is their toolkit level? I'm not arguing; I'm asking. My impression when researching dish was that making the toolkits accessible by default was ingenious and was probably the only realistic option for making most of the Linux GUI accessible, precisely because only a very small number of application developers are going to bother to support or even think about accessibility.

If you do need application developers to care about something, my suggestion in this comment later in the thread may help with that: https://lwn.net/Articles/1026300/

Display Shell

linuxrocks123 — Fri, 20 Jun 2025 15:29:51 +0000

I've discovered that AT-SPI2 is useful for programmatically interacting with GUI applications, so I've implemented it as an alternative backend for my dish tool:

https://github.com/linuxrocks123/dish

See dish-atk-backend.c in that repo.
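
As a rough illustration of what "programmatically interacting" with an application through AT-SPI2 can look like (this is not taken from dish), here is a minimal Python sketch that walks an accessibility tree and looks up a widget by role and name. It assumes the libatspi GObject-introspection bindings, whose Atspi.Accessible objects provide get_child_count(), get_child_at_index(), get_name() and get_role_name(); the helper names walk, find and desktop_root are made up for this example.

```python
from typing import Iterator, Optional, Tuple


def walk(node, depth: int = 0) -> Iterator[Tuple[int, object]]:
    """Yield (depth, node) for node and all of its descendants, depth-first."""
    yield depth, node
    for i in range(node.get_child_count()):
        child = node.get_child_at_index(i)
        if child is not None:  # an application may report a child it cannot return
            yield from walk(child, depth + 1)


def find(root, role: str, name: str) -> Optional[object]:
    """Return the first node whose role name and accessible name both match."""
    for _, node in walk(root):
        if node.get_role_name() == role and node.get_name() == name:
            return node
    return None


def desktop_root():
    """Return the AT-SPI desktop object (needs PyGObject and a running a11y bus)."""
    import gi  # imported lazily so walk() and find() work without AT-SPI installed
    gi.require_version("Atspi", "2.0")
    from gi.repository import Atspi
    return Atspi.get_desktop(0)
```

Because walk() and find() only rely on those four methods, they can be exercised against stand-in objects in a test, and against desktop_root() on a real session. Role names differ between toolkits and libatspi versions, so any role string passed to find() should be checked against what the target application actually reports.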

I wrote dish as a way to script interactions with GUI applications -- start this web browser, log into this website, download this HTML, etc. -- and that's currently what I use it for. However, I also wrote it as a backend for accessibility software that allows control of the computer through the spoken word, for people whose hands don't like using the keyboard or mouse, or who don't have hands. sonic.cpp in that repo implements a very crude prototype of such software, taking as its input the output of voice2json transcribe-stream.

Unfortunately, the reason AT-SPI2 is an alternative backend to dish, rather than the primary backend, is that not every application supports ATK or AT-SPI2. I would like it very much if every application did support one of those interfaces. I am sure blind people would like that even more. But, since some applications do not register, the primary backend for dish is instead an AI OCR nightmare called paddleocr.

My suggestion for the accessibility community is to make the AT-SPI2 layer as useful for GUI scripting and UI testing as possible. Explicitly support and advocate for that use case in addition to the primary use case of providing data for screen readers and similar tools. Shell scripting the GUI is cool, but the real reason to promote it is that the number of Linux users who would like to shell script the GUI is probably about two orders of magnitude larger than the number of Linux users who are blind.

Right now, if you break something that blind people need to use Linux, you'll get a few people complaining, and maybe it'll get fixed next year. But, if you break Firefox's CI, that thing you broke is going to get fixed within the hour. Make the AT-SPI2 layer the foundation of every major Linux GUI application's CI, as well as the foundation for lots of hackers who want to write GUI shell scripts, and you'll get a much larger outcry when some project's AT-SPI2 support breaks or degrades, and a lot more people asking for AT-SPI2 support as a new feature for software that currently doesn't have it.

Strength in numbers.

accessibility getting some resources

raven667 — Fri, 20 Jun 2025 14:15:26 +0000

> the problem was that redesigning a whole accessibility stack—protocols, integration with toolkits, compositors, and assistive technologies—requires resources and time, and nobody was up for sponsoring this work.

I may have misunderstood or spoken poorly, but this is what I was trying to communicate: volunteers are out there burning themselves out, taking flak from angry users (and the internet peanut gallery), but there aren't enough resources being allocated to make this stuff first-class, especially when it needs more than minimal maintenance. It's great that Red Hat/IBM is sponsoring some work, and it's too bad that the kind of work needed doesn't fit the customer base of the Steam Deck very well, as Valve has done great work in the parts of the stack that affect their product, in the same way Sun did when they made a go of selling GNOME desktops.

accessibility getting some resources

ebassi — Thu, 19 Jun 2025 11:24:41 +0000

> the 2020 decision to base GTK4 accessibility API on X11 when the default output mode was already Wayland by that point

There was no such decision.

Back in early 2020, almost 10 months before the GTK 4.0 release, we were looking at ways to redesign the accessibility stack with various stakeholders, even before rewriting the implementation of ATSPI in GTK4. Of course, the problem was that redesigning a whole accessibility stack—protocols, integration with toolkits, compositors, and assistive technologies—requires resources and time, and nobody was up for sponsoring this work.

The closest thing we've got was the STF-sponsored exploratory work on Newton, a new accessibility protocol; it's still very much experimental, and as far as I know there is no grant to actually make it work.

The harsh truth is that the accessibility stack has been limping along for nearly 20 years, after the initial inception during the Sun days. We've had *some* changes, like rewriting ATSPI using D-Bus instead of CORBA in 2010-2011 (in time for GNOME 3) and the current work to move it towards a Wayland-only environment (mainly done, once again, by GNOME). Of course, even after you fix the protocol and its implementations and integration with toolkits and system components, you're still facing the fact that application developers rarely spend time on making their projects accessible in the first place; toolkits can only do so much.

accessibility getting some resources

khim — Thu, 19 Jun 2025 11:11:43 +0000

> I hope that desktop Linux is seen as worth investing time and effort in to do all the professional work which isn't hot and exciting and isn't the minimum necessary, such as accessibility, documentation, scientific UX testing, etc.

You can be absolutely sure that's the case! There is a lot of work around desktop Linux: accessibility, documentation, UX testing, everything…

Hardware vendors are really excited, too.

We will see how everything works out when Google-baked Android finally arrives on the desktop next year. I wonder whether any preview will be released this year, though.

eTests

sthibaul — Wed, 18 Jun 2025 11:24:01 +0000

The difficult part of whole-stack testing is that, while you can write test scenarios and run them, you have to update the tests each time the end-user application's interface changes, which is quite tedious. This is done to some extent in e.g. Firefox, but it is tedious to maintain.

Some systematic testing can be done; for instance, gla11y is run on LibreOffice to ensure a minimum level of accessibility in the interface.
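
To give a flavour of what such a systematic check can look like, here is a small Python sketch that flags interactive widgets without an accessible name. It is not how gla11y works; the Widget class, the role strings and the INTERACTIVE_ROLES set are assumptions made up for the example.

```python
from dataclasses import dataclass, field
from typing import List

# Roles treated as interactive here; this set is an illustrative assumption.
INTERACTIVE_ROLES = {"button", "check box", "entry", "combo box"}


@dataclass
class Widget:
    """Hypothetical stand-in for a node in a UI description."""
    role: str
    name: str = ""
    children: List["Widget"] = field(default_factory=list)


def unnamed_widgets(root: Widget, path: str = "") -> List[str]:
    """Return tree paths of interactive widgets that have no accessible name."""
    here = f"{path}/{root.role}"
    problems = []
    if root.role in INTERACTIVE_ROLES and not root.name.strip():
        problems.append(here)
    for index, child in enumerate(root.children):
        # The index in brackets records which child of the parent we descended into.
        problems.extend(unnamed_widgets(child, f"{here}[{index}]"))
    return problems
```

A check like this does not depend on any particular test scenario, so it does not need updating when the interface is rearranged; it only needs the list of interactive roles to stay sensible.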

I have ideas about some automatic keyboard-reachability testing that would be independent of the interface (essentially, check that all widgets are somehow reachable with some shortcut, and that e.g. tab-browsing is consistent), but I never managed to take the time to write something.
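
One way such a reachability check could be modelled, purely as a sketch: treat "pressing Tab" as a mapping from the focused widget to the next one, follow it from the initial focus, and report every focusable widget that is never visited. The widget identifiers and the next_focus mapping are hypothetical inputs; a real tool would have to obtain them from the running application.

```python
from typing import Dict, List, Optional, Set


def tab_order(start: str, next_focus: Dict[str, str]) -> List[str]:
    """Follow Tab from start until focus repeats or has nowhere to go."""
    order: List[str] = []
    seen: Set[str] = set()
    current: Optional[str] = start
    while current is not None and current not in seen:
        seen.add(current)
        order.append(current)
        current = next_focus.get(current)  # None means Tab leads nowhere from here
    return order


def unreachable(focusable: Set[str], start: str, next_focus: Dict[str, str]) -> Set[str]:
    """Return the focusable widgets that repeated Tab presses never land on."""
    return focusable - set(tab_order(start, next_focus))
```

The same traversal could be repeated with Shift+Tab to check that backward navigation visits the same widgets in reverse, which is one possible reading of "consistent" here.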

eTests

taladar — Wed, 18 Jun 2025 07:32:13 +0000

End-to-end might be hard to test automatically, but I wonder if we could at least automate the part up to the text generation that forms the input to the text-to-speech stage.
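
A sketch of what testing "up to the text generation" might mean: if the step that turns a widget into the string handed to speech is a plain function, it can be asserted on without any audio. The announcement() function and its "name, role, states" format are invented for illustration and do not reflect how any existing screen reader words things.

```python
from typing import Iterable


def announcement(name: str, role: str, states: Iterable[str] = ()) -> str:
    """Build the text that would be passed on to the speech backend."""
    parts = [name.strip() or "unlabeled", role]
    parts.extend(sorted(states))  # sorted so the output is stable across runs
    return ", ".join(parts)
```

A test then simply compares strings, for example checking that a checked, unnamed check box is announced as unlabeled rather than silently skipped.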

eTests

tyrylu — Wed, 18 Jun 2025 06:17:46 +0000

Making tests for the a11y stack is definitely possible, and, yes, some A.I. approaches will certainly be useful for these tests. Doing tests on the a11y object tree is certainly doable and is actually done, but it would seem that we also need end-to-end tests for the whole stack.

accessibility getting some resources

raven667 — Wed, 18 Jun 2025 04:18:50 +0000

Just the other day I was saying that infrastructure like accessibility needed resources beyond volunteers "scratching their own itch" to refactor it to fit the current system design, as it was clear that the resources which existed could only do the minimal amount of maintenance, e.g. the 2020 decision to base GTK4 accessibility API on X11 when the default output mode was already Wayland by that point, punting until now, when someone had the budget and priority to perform a more thorough reengineering and coordination effort. I hope that desktop Linux is seen as worth investing time and effort in to do all the professional work which isn't hot and exciting and isn't the minimum necessary, such as accessibility, documentation, scientific UX testing, etc.

How about the others?

willy — Tue, 17 Jun 2025 19:06:29 +0000

> He said that the same interface would be included with the KWin compositor in KDE 6.4 and hoped that other compositors would include it as well.

How about the others?

smurf — Tue, 17 Jun 2025 18:17:31 +0000

That's interesting, but it's limited to GNOME. Is there some resource that tells me how things work with KDE and/or wlroots-based compositors, and what happens when you run a KDE program within GNOME's a11y environment (or wlroots, and/or vice versa)?

Tests

SLi — Tue, 17 Jun 2025 18:00:07 +0000

I realize this is probably not the easiest thing to have automated tests for, but how much is actually possible? I'd assume the normal software engineering rule of "things that don't have a test break all the time" applies to this too.

Conversely, having tests documents some of the things that at least someone expects to hold, and applies that all-important subtle pressure to at least think about things and possibly ask someone if a change you make causes a test to fail and you are not sure of the implications.

Of course there are things to catch that very traditional tests do not manage well (someone adding text as an image, things like logical order of buttons). But I'd imagine it would be at least possible nowadays to flag changes for further inspection. There are modern ways to get an answer to the question of "is all the text in the screenshot present in this text".
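
As a minimal sketch of that last check, assuming the text recognised in the screenshot is already available from some OCR step (not shown) and the accessible text has been collected from the application, the comparison itself can be a simple set difference over normalised words. The function names are made up for the example.

```python
import re
from typing import Iterable, Set


def words(text: str) -> Set[str]:
    """Lower-case the text and split it into a set of word tokens."""
    return set(re.findall(r"\w+", text.lower()))


def missing_from_accessible_text(ocr_text: str, accessible_texts: Iterable[str]) -> Set[str]:
    """Return words seen in the screenshot that no accessible string contains."""
    exposed: Set[str] = set()
    for text in accessible_texts:
        exposed |= words(text)
    return words(ocr_text) - exposed
```

OCR output is noisy, so a non-empty result is better treated as a flag for further inspection than as a hard test failure.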