Speech-to-text is a handy feature when you’re driving or when your hands aren’t free to type. How useful it is depends largely on the software processing your voice and how fast it is; in some cases it can take so long that you would be faster just typing the message out.
Alternatively, it could be highly inaccurate, which means you spend more time editing the result. If you’re a Gboard user who thinks the app’s speech recognition could do with some improvement, you’re in luck: Google has announced that it has started to roll out “end-to-end, all-neural, on-device” speech recognition to the app.
“Today, we’re happy to announce the rollout of an end-to-end, all-neural, on-device speech recognizer to power speech input in Gboard. In our recent paper, ‘Streaming End-to-End Speech Recognition for Mobile Devices’, we present a model trained using RNN transducer (RNN-T) technology that is compact enough to reside on a phone. This means no more network latency or spottiness — the new recognizer is always available, even when you are offline.”
Typically, when you use speech recognition, your audio is sent to remote servers that do all the processing to figure out what you’re trying to say. As we said, this can take a while depending on the server’s traffic and how fast your connection is, but the latest update to Gboard takes this entire process offline and runs it on your phone.
Ultimately this should reduce the time spent processing the data, and in turn allow for real-time, character-by-character output, which according to Google will feel as though someone were typing out what you’re saying as you’re saying it.
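To make the difference concrete, here is a toy Python sketch of the two output styles. It is not Google’s recognizer and does no speech processing at all: the `chunks` list stands in for text that has already been decoded, and the function names `batch_transcribe` and `streaming_transcribe` are hypothetical, made up for this illustration. It only shows what the user sees, either nothing until the whole result arrives or a transcript that grows one character at a time.

```python
from typing import Iterator, List


def batch_transcribe(chunks: List[str]) -> str:
    """Server-style output: the full text appears only once everything is processed."""
    return "".join(chunks)


def streaming_transcribe(chunks: List[str]) -> Iterator[str]:
    """Streaming-style output: yield the growing transcript after every character."""
    text = ""
    for chunk in chunks:
        for ch in chunk:
            text += ch
            yield text


# Hypothetical, already-decoded pieces of an utterance (an assumption for the demo).
chunks = ["hel", "lo ", "world"]

partials = list(streaming_transcribe(chunks))  # every intermediate state the user would see
final = batch_transcribe(chunks)               # the single result a batch system returns
```

Both approaches end with the same text; the streaming version simply has something to show after the very first character rather than only at the end.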
That being said, the bad news is that the feature is only available on Google’s Pixel devices for now, but hopefully it will find its way to other handsets in the near future.