Today marks Global Accessibility Awareness Day (GAAD), and as in previous years, many tech companies are marking the occasion by announcing new accessibility features for their ecosystems. Apple kicked things off on Tuesday, and now Google has joined in. To start, the company has made TalkBack, the screen reader built into Android, more useful. Powered by one of Google’s Gemini models, TalkBack can now answer questions about images displayed on your phone, even when they lack alternative text describing them.
“This means that the next time a friend sends you a picture of their new guitar, you can get a description and ask follow-up questions about the make and color, or even what else is in the picture,” Google explains. Gemini can see and understand images thanks to the multimodal capabilities Google has built into the model. The Q&A feature also works across your entire screen: if you are shopping online, for example, you can first ask your phone to describe the color of a piece of clothing you are interested in, and then ask whether it is on sale.
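TalkBack’s integration is internal to Android, but the same kind of multimodal image Q&A can be reproduced with Google’s public Gemini API. The sketch below is purely illustrative, not Google’s implementation; the model name, image path, and question are assumptions chosen for the example.

```python
# Illustrative sketch of multimodal image Q&A with the public Gemini API.
# This is NOT TalkBack's internal code; model name and file are assumptions.
import google.generativeai as genai
from PIL import Image

genai.configure(api_key="YOUR_API_KEY")  # placeholder; supply your own key

model = genai.GenerativeModel("gemini-1.5-flash")
image = Image.open("guitar.jpg")  # e.g. the photo a friend sent you

# One request, two modalities: the image plus a natural-language question.
response = model.generate_content([image, "What make and color is this guitar?"])
print(response.text)
```

Because the model accepts the image and the question in a single request, follow-up questions about the same picture are just further text prompts, which is what makes the conversational "ask about anything on screen" pattern possible.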
Separately, Google is launching a new version of Expressive Captions. First announced at the end of last year, the feature generates captions that try to convey the emotion behind what is being said. For example, if you’re video chatting with your friends and one of them groans after you make a bad joke, your phone will not only caption their speech but also add a “[groans]” label to the transcription. With the new version of Expressive Captions, captions will also reflect when someone drags out the sound of a word. That means the next time you’re watching a live soccer match and the announcer shouts “goal,” their drawn-out excitement will come through in the caption. In addition, there are now more labels for sounds, such as someone clearing their throat.
The new version of Expressive Captions is available to English-speaking users in the US, UK, Canada, and Australia whose phones are running Android 15 or later.