Hands free browsing – an interview with Kim Patch

I first got to know Kim Patch when we worked together on the W3C User Agent Accessibility Working Group. Kim has been a resident of the Web since its inception and is a long-term voice input / non-mouse user, so I was always fascinated to hear what she had to say about making web content, and how a browser interprets that content, accessible.

In this chat Kim discusses barriers of access and highlights what we as web designers and developers should be thinking about when building websites and apps for people who may not use a mouse.

Henny: Tell us a little about yourself and your involvement with the web both personally and professionally.

Kim: My first involvement with the web was as a journalist. I founded the Internet beat for PC Week back in 1993, and was at the second annual World Wide Web conference in 1994, when there were only 400 attendees.

My first look at the web was largely via e-mail queries, because it took some time for me to get a web account at work. It wasn’t obvious to everyone that the web was going to be a big deal.

Back then Eric Schmidt was at Sun Microsystems, and I was interviewing him for another story when he mentioned something about Sun experimenting with implementing an internal web and I ended up writing about it. This was before the word “Intranet” was coined.

Then disaster struck for me personally — I got a bad case of repetitive strain injuries, didn’t take time off right away, and ended up very injured. It took me several years to get better. You can read the gory details here.

Once I started working again, I used speech recognition software and was very disappointed with it. It was okay for getting words on the page, but very tedious for controlling the computer. I did a lot of customization to make computer command-and-control — things like moving a window, opening a folder or navigating around a file — tolerable. The first step was building the commands I needed to do the necessary programming using speech input. This eventually turned into an add-on for Dragon: Utter Command. [Watch a demo and transcript of Utter Command with Firefox here].

Working with speech input and with other people who need to use speech input made me painfully aware of how inaccessible a lot of software is. I was still following the web and very excited about all the possibilities it was opening up, but in terms of accessibility it was bittersweet. People with RSIs were tempted to hurt themselves trying to access the web, because the accessibility tools weren’t quite there.

Henny: It’s interesting you say that because I’ve found when recruiting for disabled user testing it’s really hard to find someone who is purely a keyboard or voice input user.

Kim: The mouse is very compelling – it’s hands-on. It also has a lower barrier to entry than the keyboard, where you have to memorize keystrokes.

Henny: I guess there are so many types of ergonomic mice and scratch pads out there now that it reduces the number of purely keyboard only users.

Kim: The sad thing about this is the mouse is generally less efficient. We could change the mix to more efficient input in general if we could lower the barriers to the other input methods. The accessibility community, which is more familiar with the other methods, could be leveraged to increase general efficiency.

Henny: Agreed.

What’s your typical set up for browsing the web?

Kim: I use Dragon NaturallySpeaking Professional with Utter Command to access everything including the web. I test a lot of browsers, but primarily use Firefox with a key extension: Rudolf Noe’s Mouseless Browsing extension, which puts tiny numbers on every link. This makes browsing the web using speech input very fast and safe.

A screen shot of Mouseless Browsing numbering links on a page from this blog.

The other way to browse the web by speech is to say the names of links, but there’s a nasty catch to this method – if you’re writing a paragraph in a web form or document and happen to say the name of a link, you’re whisked off to that link, and what’s worse, when you come back your data may be gone. Gotchas like this are disappointing in accessibility software.

I use several other extensions that make my computer more accessible simply because they make it more efficient: an ad blocker that cuts down on needless distractions, and Jorge Rumoroso’s HeadingsMap, a tool that allows me to navigate long documents more efficiently.

The key thing I need in order to use a browser regularly is a way to directly navigate everything by speech, and the Mouseless Browsing extension enables this.

Henny: How users navigate with a keyboard may vary depending on if they are sighted or if they are a screen reader user. Do you ever see a conflict between the two?

Kim: In general, I think the more we can get different types of users to share information about how they browse the web, the better it gets for everyone. We have some silos now – keyboard only, mouse, speech, and to some extent touch. And we are not always using those differences to our advantage. At the same time, this will be increasingly important as the different types of input are used and as gesture becomes a more important interface.

Henny: What are the main barriers for voice input users when browsing?

Kim: First, focus problems have gotten worse with the last couple of iterations of Windows. Sometimes it isn’t clear where the focus is, and sometimes the focus changes without warning. If you’re issuing a command that counts on the focus being in a particular place, something completely unexpected may happen. Even if you are aware of what happened, having to adjust the focus before issuing a command cuts your productivity in half.

Mouse users aren’t affected by inconsistent focus – when you click the mouse you automatically bring the focus to whatever you’re clicking on. This is fortunate for mouse users, but unfortunate for everyone else, because sites are often tested assuming the user can use the mouse, and focus issues that affect speech and keyboard users are completely missed.

Second, single-key shortcuts that the user cannot adjust. This is a huge problem that, fortunately, has a simple, elegant potential solution: give the user a facility to adjust single-key shortcuts.

If you’re a speech user using Gmail, for instance, which has a nice set of single key shortcuts, at some point you’ll not realize where the focus is or not realize your microphone is on or be surprised by someone walking into your office and you’ll say something — a word or two – and the letters in those words will execute all kinds of things. For instance, if I walk into the office and my colleague says “Hey Kim”, not realizing that the microphone is on, and the focus is on his Gmail inbox, the “y” will archive whatever he’s on. Then the “k” will move the cursor up one message, and the “m” will mute that conversation.

And no, it’s not a solution to turn off the shortcuts. Then you don’t have shortcuts. The solution is to allow the user to adjust the shortcuts. Gmail does have a basic facility to adjust shortcuts, but many programs do not. The Gmail facility could be much improved; I’ve written at length about getting Gmail working well with speech commands.

Adding ‘+’ before Gmail’s single key shortcuts via Settings prevents common words triggering unwanted behavior.
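The "Hey Kim" scenario above can be sketched in a few lines of TypeScript. This is only an illustrative model, not Gmail's actual implementation: the `shortcuts` table, the `Action` names and the `actionsTriggered` function are all hypothetical, and only the three keys mentioned in the interview are included.

```typescript
// Hypothetical shortcut table, loosely modelled on the three Gmail keys
// mentioned in the interview. Not Gmail's real code or full key set.
type Action = "archive" | "moveUp" | "mute";

const shortcuts: Record<string, Action> = {
  y: "archive",
  k: "moveUp",
  m: "mute",
};

// Returns the actions fired by a run of characters arriving while focus
// is on the message list. With an empty prefix, every matching character
// fires on its own. With a prefix such as "+", a key fires only when the
// prefix comes immediately before it.
function actionsTriggered(typed: string, prefix: string = ""): Action[] {
  const fired: Action[] = [];
  let i = 0;
  while (i < typed.length) {
    const prefixMatches = typed.slice(i, i + prefix.length) === prefix;
    const key = typed.charAt(i + prefix.length).toLowerCase();
    const action = shortcuts[key];
    if (prefixMatches && action !== undefined) {
      fired.push(action);
      i += prefix.length + 1; // consume the prefix and the key
    } else {
      i += 1; // not a shortcut: ignore this character
    }
  }
  return fired;
}

// Stray speech with plain single-key shortcuts: three actions fire.
const withoutPrefix = actionsTriggered("Hey Kim");
// The same stray speech when every shortcut needs a leading "+": nothing fires.
const withPrefix = actionsTriggered("Hey Kim", "+");
// A deliberate "+y" still works.
const deliberate = actionsTriggered("+y", "+");
```

In this model the stray phrase archives, moves up and mutes when shortcuts are single keys, and does nothing once a prefix is required, while the deliberate command still gets through.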

Henny: How useful are skip links to sighted keyboard only users on desktop?

Kim: Depends on how the user navigates, and how the skip links are implemented. They can be useful.

Henny: How useful are skip links to sighted keyboard only users on mobile?

Kim: Possibly even more useful than the desktop because they could save the user from scrolling, and on smaller screens this potential is larger.

Henny: How familiar are keyboard only users with the design patterns outlined in the Web Accessibility Initiative Accessible Rich Internet Applications specification? For example, when navigating a tab panel, is it more intuitive (1) to have only the active tab in the tab order and then use the arrow keys to reach the other tabs, or (2) to have all tabs in the tab order, with the content of the active tab following after the tab panel?

Kim: I think #1 is much more useful for both keyboard only and speech users. This is akin to organizing a document with two levels of headings rather than just one.
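For developers, option 1 is often called a "roving tabindex". The sketch below models it as plain data so the behaviour is easy to see. It is a minimal illustration under assumptions: the `TabState` shape and the `tabIndexes` and `onKey` helpers are hypothetical names, and wiring them to real DOM elements and focus calls is left out.

```typescript
// Hypothetical model of a tab list: the labels and which tab is active.
interface TabState {
  labels: string[];
  active: number;
}

// tabindex value for each tab. Only the active tab gets 0, so the Tab key
// lands on it once and the next Tab press moves on to the panel content.
function tabIndexes(state: TabState): number[] {
  return state.labels.map((_label, i) => (i === state.active ? 0 : -1));
}

// Arrow keys move between tabs and wrap at the ends. Home and End jump to
// the first and last tab. Any other key leaves the state unchanged.
// The key names follow the values of KeyboardEvent.key.
function onKey(state: TabState, key: string): TabState {
  const n = state.labels.length;
  if (n === 0) {
    return state;
  }
  switch (key) {
    case "ArrowRight":
      return { labels: state.labels, active: (state.active + 1) % n };
    case "ArrowLeft":
      return { labels: state.labels, active: (state.active - 1 + n) % n };
    case "Home":
      return { labels: state.labels, active: 0 };
    case "End":
      return { labels: state.labels, active: n - 1 };
    default:
      return state;
  }
}

const start: TabState = { labels: ["Overview", "Details", "Reviews"], active: 0 };
const afterLeft = onKey(start, "ArrowLeft");      // wraps to the last tab
const afterRight = onKey(afterLeft, "ArrowRight"); // wraps back to the first
```

However many tabs there are, the whole tab list costs a keyboard or speech user a single Tab stop, which is the two-level structure Kim compares to headings.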

Henny: Do you browse on mobile much, if so what do you use?

Kim: I use an iPhone and often use Siri for the initial search. Trouble is, there’s no way to browse by voice beyond the initial search. So I have the search results, but then I have to use touch to navigate through the results, activate the site I want, then navigate through the site. Once I get to a text field I can use the microphone button on the keyboard to switch to speech, but if I want to correct or change the text, I’m back to touch again.

Henny: What are the main barriers on mobile for you?

Kim: Speech is not implemented fully. For people with repetitive strain injuries, it’s a tease – you can save some keystrokes, but are still tempted to do too much.

Henny: If you could list the top 3 things you’d want a website to do to make your experience better, what would they be?

  1. Use space to save keystrokes. One of the things I spend a fair amount of time doing is convincing people who are hurt to get large screens so they can limit scrolling. Especially in this context, it’s very frustrating to see drop-down menus that only have a few lines in them. This not only adds unnecessary navigation steps, it’s also cognitively more difficult because you’re only seeing a part of the picture.
  2. This is for web apps that implement single-key shortcuts. Go the extra step of allowing users to change the shortcuts by adding characters. Single-key shortcuts are a disaster for speech users, but you can turn the disaster into a plus simply by providing a mechanism for the user to change the shortcuts. If I can put one more key in front of every shortcut (for instance, +a, +b, etc.), I can prevent stray words from making all kinds of unexpected things happen and at the same time allow speech users to use the shortcuts as well. Better yet, allow me many characters and I can simply change the shortcuts to words. And in addition, allow users to save and share the shortcuts. Think of this as a way to get free advertising. Gmail has a Labs extension (keyboard shortcuts) that goes the first step – lets you control all the shortcuts and gives you three characters to do so.
  3. Make it obvious where the focus is. And keep in mind that mouse and touch users don’t have to pay attention to focus because the focus is always where the mouse arrow or touch point of contact is. So in making sure it’s obvious where the focus is, make sure to test with keyboard only and speech users.
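Point 2 asks for three things: shortcuts the user can change, room for many characters so a shortcut can be a whole word, and a way to save and share the result. A minimal sketch of such a facility follows. Everything in it is an assumption made for illustration: the `ShortcutMap` type, the command names and the `rebind`, `exportShortcuts` and `importShortcuts` helpers are hypothetical and not taken from any real web app.

```typescript
// Maps a key sequence (one character, "+c", or a whole phrase) to a
// command name. The command names here are hypothetical.
type ShortcutMap = Record<string, string>;

const defaults: ShortcutMap = { a: "replyAll", c: "compose" };

// Returns a new map in which `command` is bound to `newKeys`. The old
// binding for that command, and anything already on `newKeys`, is dropped.
function rebind(map: ShortcutMap, command: string, newKeys: string): ShortcutMap {
  const next: ShortcutMap = {};
  for (const keys of Object.keys(map)) {
    const cmd = map[keys];
    if (cmd !== undefined && cmd !== command && keys !== newKeys) {
      next[keys] = cmd;
    }
  }
  next[newKeys] = command;
  return next;
}

// Saving and sharing: the map is plain data, so JSON is enough.
function exportShortcuts(map: ShortcutMap): string {
  return JSON.stringify(map);
}

// Loads a shared map, rejecting anything that is not a flat
// string-to-string object.
function importShortcuts(json: string): ShortcutMap {
  const parsed: unknown = JSON.parse(json);
  if (typeof parsed !== "object" || parsed === null || Array.isArray(parsed)) {
    throw new Error("expected a JSON object of shortcuts");
  }
  const source = parsed as Record<string, unknown>;
  const result: ShortcutMap = {};
  for (const keys of Object.keys(source)) {
    const cmd = source[keys];
    if (typeof cmd !== "string") {
      throw new Error("shortcut '" + keys + "' must map to a command name");
    }
    result[keys] = cmd;
  }
  return result;
}

// A speech user prefixes one shortcut and turns another into a phrase,
// then shares the result with a colleague.
const custom = rebind(rebind(defaults, "compose", "+c"), "replyAll", "reply all");
const shared = importShortcuts(exportShortcuts(custom));
```

Because the key sequence is an arbitrary string rather than a single character, the same structure covers the "+a" style prefix and full spoken words without any special cases.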

Henny: Kim, thank you so much, this has been really helpful as I worry that voice input and keyboard only users get lost in the crowd. I’m definitely going to think about customisable single-key shortcuts on future projects that have them, good food for thought.

Got a question for Kim? Then let us know.
