*Just Speak*: Controlling Android By Voice
1 JustSpeak: Controlling Android By Voice
JustSpeak is an Android Accessibility Service that enables voice control of your Android device. Once enabled, you can activate on-screen controls, launch installed applications, and trigger other commonly used Android actions using spoken commands.
1.1 Enabling Just Speak
Once installed, you can enable the Just Speak service under Settings->Accessibility on your Android device. Enabling the service is an action that only needs to be performed once as Just Speak will be automatically restarted when the phone is rebooted.
Just Speak is an Accessibility Service that uses Accessibility APIs on the platform to augment Android's user interface. Just Speak augments the Android user interface with voice-input control; other Accessibility Services like TalkBack provide spoken feedback. Note that Just Speak can be used either by itself, or in conjunction with other platform Accessibility Services such as TalkBack.
1.2 Initiating Voice Recognition
Once Just Speak is installed and enabled on your device, you can initiate voice commands by either performing an up-swipe from the Home button (if your device has soft keys) or by performing multiple taps on the Home button (if your device has hard keys). Note that this is the same gesture that activates Google Now on devices running the stock version of Android; when using Just Speak, you can get to Google Now by saying "Launch Google Now".
Successful invocation of Just Speak starts voice recognition; this is indicated by playing an auditory icon (accompanied by vibration if available) and a visual overlay. Depending on the settings enabled in Just Speak, as well as other Accessibility Services, received voice input may be spoken and/or displayed after voice recognition has completed. As an example, TalkBack can work in conjunction with Just Speak to speak the recognized commands. Just Speak supports both local and global commands as described in later sections of this document.
1.3 Cancelling Voice Recognition
Voice recognition can be stopped by performing the same action used to initiate it (either an up-swipe from the Home button or multiple taps on the Home button). Just Speak also stops listening if no voice input is received within a set amount of time. Finally, an overlay that covers the entire screen is displayed whenever voice recognition is active; tapping anywhere on this overlay dismisses it and terminates voice recognition.
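The three ways of ending voice recognition described above can be pictured as a small state machine. The following is only an illustrative sketch, not Just Speak's actual implementation: the class name, method names, and the 5-second timeout are all assumptions made for the example, and time is passed in explicitly so the behavior is easy to follow.

```python
class ListeningSession:
    """Toy model of one voice-recognition session (not Just Speak's real code)."""

    def __init__(self, timeout_seconds=5.0):
        # The 5-second default is an assumed value; the real timeout is not
        # stated in this document.
        self.timeout_seconds = timeout_seconds
        self.active = False
        self._last_input_at = None

    def on_home_gesture(self, now):
        """The same gesture both starts and stops listening."""
        if self.active:
            self.active = False
        else:
            self.active = True
            self._last_input_at = now

    def on_overlay_tap(self):
        """Tapping anywhere on the overlay ends the session."""
        self.active = False

    def on_voice_input(self, now):
        """Any received speech resets the silence timer."""
        if self.active:
            self._last_input_at = now

    def tick(self, now):
        """Stop listening once the silence has lasted for the full timeout."""
        if self.active and now - self._last_input_at >= self.timeout_seconds:
            self.active = False
```

In this model, calling `on_home_gesture` twice in a row starts and then cancels a session, while `tick` would be called periodically to enforce the silence timeout.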
1.4 Global Voice Commands
Just Speak supports a set of global commands that are available on any screen. These global commands include:
Command | Utterance | Synonym | Action |
Open | Open <Installed Application> | Launch, Run | Launch Application |
Recent | Recent Apps | Recent Applications | Recent Applications |
Quick Settings | Quick Settings | Open | Quick Settings |
Toggle WiFi | Switch WiFi (On/Off) | | Toggle WiFi |
Toggle Bluetooth | Switch Bluetooth (On/Off) | | Toggle Bluetooth |
Toggle Tethering | Switch Tethering (On/Off) | | Toggle Tethering |
Home | Go Home | | Return to home screen |
Back | Go Back | | Return to previous screen |
Notifications | Open Notifications | | Open notifications shade |
Easy Labels | Easy Labels | | Display Easy Labels |
Note that voice commands are flexible in that action words may be substituted with synonymous verbs (e.g., “launch” in place of “open”). In addition, voice commands can be formulated as complete sentences, e.g., "Please open Gmail".
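To make the idea of synonym substitution and full-sentence commands concrete, here is a minimal sketch of how an "open" command might be picked out of a spoken sentence. It is an assumption-laden illustration rather than Just Speak's actual recognizer: the function name and the simple word-scanning strategy are hypothetical, and only the "launch" and "run" synonyms come from the table above.

```python
import re

# Hypothetical synonym table: each spoken verb maps to a canonical command.
# "launch" and "run" are the synonyms listed for "open" in the table above.
SYNONYMS = {
    "open": "open",
    "launch": "open",
    "run": "open",
}


def parse_open_command(utterance):
    """Return ("open", target) if an open-style verb is found, else None.

    Words before the verb (such as "please") are ignored; everything after
    the verb is treated as the name of the application to launch.
    """
    words = re.findall(r"[a-z0-9']+", utterance.lower())
    for i, word in enumerate(words):
        command = SYNONYMS.get(word)
        if command and i + 1 < len(words):
            return command, " ".join(words[i + 1:])
    return None
```

With this sketch, "Please open Gmail" and "Launch Gmail" both resolve to the same command and target, which is the flexibility described above.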
In addition, Just Speak provides the following spoken aliases as a means of triggering commonly used applications:
Utterance | Action |
Browser | Launch default Web browser |
Web | Launch default Web browser |
OK Google Now | Launch Google Now |
Search | Launch Voice Search |
Voice Search | Launch Voice Search |
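One way to think about these aliases is as a simple lookup table from utterance to action. The sketch below is illustrative only: the function name is hypothetical, and the action strings are descriptive placeholders copied from the table rather than real Android intents.

```python
# Alias table copied from the list above. The values are descriptive
# placeholders, not actual Android intents or API calls.
ALIASES = {
    "browser": "Launch default Web browser",
    "web": "Launch default Web browser",
    "ok google now": "Launch Google Now",
    "search": "Launch Voice Search",
    "voice search": "Launch Voice Search",
}


def resolve_alias(utterance):
    """Return the action for a spoken alias, or None if it is not an alias."""
    # Lowercase and collapse whitespace so "Voice  Search" matches "voice search".
    key = " ".join(utterance.lower().split())
    return ALIASES.get(key)
```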
1.5 Local Voice Commands
In addition to the global commands that are available on all screens, Just Speak allows you to interact with on-screen controls in a variety of ways.
Command | Utterance | Synonym | Action |
Activate | Click <control name> | Click, Tap | Activate control by its on-screen name |
Scroll | Scroll Up/Down | Forward, Backward | Scroll content such as lists |
Switch | Switch On/Off | Toggle | Toggle switches |
Long Press | Long Press | Long Click, Long Tap | Long Press on on-screen controls |
Check | Check | Check, Uncheck | Toggle CheckBox values |
1.6 Labeling On-Screen Controls
When using Just Speak, the text labels that appear next to on-screen controls determine the set of available local commands. For many controls, such as images, checkboxes, and switches, there may be no visible text to associate with the control. In these instances, Just Speak uses underlying Accessibility Metadata provided by the application developer to construct relevant labels; note that this metadata is also used by Accessibility Services such as TalkBack to meaningfully speak on-screen controls.
We leverage the visual overlay that indicates that Just Speak is active to visually display this additional metadata — this serves as a hint as to what you can say to activate the available controls. These overlay labels take one of two forms:
- A control receives a centered label in the simple case
  where it has no actionable child controls that need additional labeling.
- Where the control itself has actionable children,
  Just Speak displays a labeled frame around the actionable children.
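The choice between the two overlay forms can be sketched as a check on a control's children. This is a simplified model under stated assumptions: the `Control` class, its `actionable` flag, and the `overlay_form` function are hypothetical names introduced for illustration, and "needs additional labeling" is approximated here as "is actionable".

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Control:
    """Hypothetical stand-in for an on-screen control."""
    label: str
    actionable: bool = True
    children: List["Control"] = field(default_factory=list)


def overlay_form(control):
    """Pick the overlay form for a control, following the two cases above."""
    if any(child.actionable for child in control.children):
        # The control has actionable children: frame them with a label.
        return "labeled frame"
    # Simple case: no actionable children, so a centered label is enough.
    return "centered label"
```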
1.7 Chaining Commands
Just Speak can be configured to take multiple voice commands at once and perform them sequentially. This chaining works with both local and global commands, performing each subsequent action only after the previous action has been executed. Commands are chained via simple connectives such as “and” and “then”. An example of this would be:
“Click confirm and then go home”.
In this case, Just Speak would click the “Confirm” control (assuming it’s present) and upon completion of that task, return to the Home screen.
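As a rough illustration of chaining, a recognized sentence could be split on its connectives into an ordered list of commands. The sketch below is an assumption, not Just Speak's actual parser: the function name is hypothetical, and a naive split like this would also break up a control name that happens to contain "and" or "then".

```python
import re

# Connectives named in the text above. "and then" is listed first so that it
# is consumed as a single separator rather than as "and" followed by "then".
_CONNECTIVE = re.compile(r"\s+(?:and\s+then|and|then)\s+", re.IGNORECASE)


def split_chain(utterance):
    """Split a chained utterance into individual commands, in spoken order."""
    parts = _CONNECTIVE.split(utterance.strip())
    return [part.strip() for part in parts if part.strip()]
```

Each resulting command would then be executed in order, with the next one starting only after the previous one has completed.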
1.8 Easy Labels
Another configurable setting in Just Speak is the ability to replace the labels associated with on-screen controls with easy labels. These are phonetic labels that are designed to be unambiguous and are displayed in the overlay over their respective controls, easily identifying what to say to interact with a specific control. Note that phonetic labels can be temporarily activated or deactivated via the Just Speak global command Easy Labels.
1.9 Persistent Overlay
To aid users with limited dexterity, Just Speak also provides the option to make the overlay persistent, capturing all touch events received by the Android device. This effectively removes all traditional interactions a user can have with their device, replacing them with Just Speak functionality. The benefit of this is that the entire device essentially becomes a button, toggling between initiating and terminating voice recognition. In addition, overlay labeling is always present, so you are constantly aware of what you can say to Just Speak.
1.10 Alternative Means Of Initiating Voice Recognition
We are continuing to experiment with alternative ways of initiating voice recognition. Toward this end, Just Speak enables you to configure an NFC tag that can then be tapped to initiate voice control; see “Just Speak -> Settings” for configuring an NFC tag.