Advances in machine vision, like the astonishingly powerful image-recognition capabilities of modern A.I., are erasing even these human actors from the equation. This year, Be My Eyes released a beta version of a service called the Virtual Volunteer, which replaces the human at the other end of the line with A.I. (powered by OpenAI’s GPT-4 model). A blind beta tester pointed his camera at a frozen meal, and the A.I. read him the description of the contents on the package, including the expiration date and the size of the meal.
But the pitfalls of artificial intelligence are as present in the assistive-tech sphere as they are in the rest of society. As delighted as blind beta testers of OpenAI’s visual interpreter were, the tool also made some obvious mistakes: As Kashmir Hill recently reported in The New York Times, the A.I. confidently described a remote control for a blind user, including descriptions of buttons that weren’t there. When another beta tester showed the tool the contents of a fridge, asking for recipe ideas, it recommended “whipped cream soda” and a “creamy jalapeño sauce.” And OpenAI recently decided to blur people’s faces in the photos that the blind beta testers were uploading, severely limiting the Virtual Volunteer’s social utility for a blind user.
The visual world of information that is inaccessible to blind people is impossibly vast — think of every image and video and text that’s uploaded to the internet, let alone all the information that fills our offline world. (According to the World Blind Union, 95 percent of the world’s published knowledge is “locked” in inaccessible print formats.) This constantly refreshing storehouse of information, most of it difficult if not impossible for people with visual or print disabilities to access, makes a universal technological solution seem like the only path forward. But in spite of technology’s well-documented power to transform the lives of people with disabilities, it cannot be the only solution.
Machine-vision bots have begun to automatically describe images online, but the results are still wildly variable — on Facebook, when my screen reader encounters photos of my friends and family, it invariably offers howlers like “image may contain: fruit.” If people wrote their own image descriptions, I’d get a much clearer sense of what was going on, with far more context. Likewise, companies such as accessiBe and AudioEye have amassed millions of dollars offering “accessibility overlays” and widgets that claim to automatically fix websites that are broken for their disabled users (and thus help the sites avoid costly A.D.A. lawsuits) with a few lines of A.I.-generated code. But frequently, accessibility overlays have made websites even more difficult to navigate for blind users. The solution, many advocates suggest, is to rely less on A.I. and instead to hire human accessibility experts to design websites with disability in mind at the outset. Again, people must remain part of the process.
Waiting in line for dinner this summer, I felt unwilling to pull out my phone to use any of the cybernetic solutions available to help me decipher the menu. I decided to just ask my wife, Lily, to tell me about the taco options. Our son, Oscar, who’s 10, interrupted her: Let me do it! He proudly read the various taco descriptions to me, and we both set to discussing which ones sounded good. Relying on Oscar to read the menu didn’t feel anything like a loss of independence. It was a fun, affectionate dialogue — a shared experience with a loved one, which was, beyond basic sustenance, the real reason we were there. His eyes and ears and brain were far superior sensors to those of any assistive device out there, and he’s far more charming to interact with.