Web Speech API: What Works, What Doesn’t, and How to Improve It by Linking It to a GPT Language Model
Part of a series on how modern AI and other technologies could enable more efficient human-computer interactions
I believe that modern technologies already enable much simpler and more natural human-computer interactions than current software actually offers. Indeed, I think these technologies are mature enough that we could do away with traditional interfaces and move toward a revolution in user experience.
Large language models have certainly triggered one stage of this revolution, particularly in how we ask for information. However, I think technology can still provide much more. For example, we are still largely stuck with flat screens despite the decreasing cost of VR headsets; we still operate devices with mouse, keyboard, and touch gestures despite how far eye tracking, speech recognition, and body and limb tracking have advanced; and we still read a lot of on-screen text despite great advances in speech synthesis.
I feel current technologies are mature enough to offer human-computer interactions almost like those in Star Trek (if you don’t know…