
Mobile ICP 6 (Medical Assistant)

Name: Evan Wike (#32)


I. Introduction

For this ICP we were tasked with creating an Android app that uses Android's Text to Speech and Speech to Text capabilities. The app should be able to recognize, and respond to, several commands. As is often the case with Android, you also have to deal with permissions - i.e. checking whether permission to use the microphone has been granted and requesting it if it hasn't. Once you have permission to use the mic, you need to initialize a Text to Speech engine and make sure the device supports it and the language being used.
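A minimal sketch of that permission check, assuming the activity uses the AndroidX ContextCompat/ActivityCompat helpers (the request-code constant and helper name here are just illustrative, not from the original code):

// Sketch: runtime check for the RECORD_AUDIO permission.
// REQ_CODE_RECORD_AUDIO is an arbitrary request code chosen for illustration.
private static final int REQ_CODE_RECORD_AUDIO = 101;

private void checkAudioPermission() {
    if (ContextCompat.checkSelfPermission(this, Manifest.permission.RECORD_AUDIO)
            != PackageManager.PERMISSION_GRANTED) {
        // Not yet granted - ask the user; the answer comes back in onRequestPermissionsResult.
        ActivityCompat.requestPermissions(this,
                new String[]{Manifest.permission.RECORD_AUDIO},
                REQ_CODE_RECORD_AUDIO);
    }
}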

II. Objectives

To create a "Medical Assistant" Android app that can handle a few basic questions and generate appropriate responses:

Questions and Responses

  • "I am not feeling good. What should I do? - "I understand, what are your symptoms?"
  • "Thank you, Medical Assistant!" - "No, thank you, {name}. Take care!"
  • "What time is it?" - "It's {current time}"
  • "What medicines should I take?" - "I think you might have a fever, you should probably take Aspirin."

III. Methods

private void startVoiceInput() {
    // Build a speech-recognition intent using the device's default language.
    Intent intent = new Intent(RecognizerIntent.ACTION_RECOGNIZE_SPEECH);
    intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE_MODEL, RecognizerIntent.LANGUAGE_MODEL_FREE_FORM);
    intent.putExtra(RecognizerIntent.EXTRA_LANGUAGE, Locale.getDefault());
    intent.putExtra(RecognizerIntent.EXTRA_PROMPT, "Hello, How can I help you?");
    try {
        startActivityForResult(intent, REQ_CODE_SPEECH_INPUT);
    } catch (ActivityNotFoundException a) {
        // No activity on the device can handle speech recognition.
        Toast.makeText(this, "Activity not found", Toast.LENGTH_SHORT).show();
    }
}

This function creates a new intent to start receiving and processing speech from the user. It attempts to launch the speech-recognizer activity; when that activity finishes, the onActivityResult callback below is invoked with the result.
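In practice startVoiceInput() is triggered from a click listener on the mic button; a minimal sketch, where mMicBtn is a hypothetical field name for that button:

// Sketch: wire the mic button to startVoiceInput(); mMicBtn is an assumed field name.
mMicBtn.setOnClickListener(new View.OnClickListener() {
    @Override
    public void onClick(View v) {
        startVoiceInput();
    }
});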

@Override
public void onInit(int status) {
    switch (status) {
        case TextToSpeech.SUCCESS: {
            // Engine is ready - make sure it can speak US English.
            int result = mTts.setLanguage(Locale.US);

            if (result == TextToSpeech.LANG_MISSING_DATA || result == TextToSpeech.LANG_NOT_SUPPORTED)
                Toast.makeText(this, "English (US) not supported", Toast.LENGTH_SHORT).show();
            else
                mTts.speak("Hello", TextToSpeech.QUEUE_ADD, null);
            break;
        }
        case TextToSpeech.ERROR: {
            Toast.makeText(this, "TTS initialization failed", Toast.LENGTH_SHORT).show();
            break;
        }
    }
}

This callback is invoked once the Text to Speech engine has finished initializing. It checks that initialization succeeded and that the engine supports the language being used (US English here); if not, it reports the problem with a Toast.
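For onInit to be called at all, the activity has to create the engine and register itself as the listener - a minimal sketch, assuming the activity implements TextToSpeech.OnInitListener and that the layout resource is named activity_main:

@Override
protected void onCreate(Bundle savedInstanceState) {
    super.onCreate(savedInstanceState);
    setContentView(R.layout.activity_main); // assumed layout name

    // Create the TTS engine; since the activity is the OnInitListener,
    // onInit(int) above runs once the engine finishes initializing.
    mTts = new TextToSpeech(this, this);
}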

@Override
protected void onActivityResult(int requestCode, int resultCode, Intent data) {
    super.onActivityResult(requestCode, resultCode, data);

    if (requestCode == REQ_CODE_SPEECH_INPUT) {
        if (resultCode == RESULT_OK && data != null) {
            // The recognizer returns a list of candidate transcriptions, best match first.
            ArrayList<String> result = data.getStringArrayListExtra(RecognizerIntent.EXTRA_RESULTS);
            mVoiceInputTv.setText(result.get(0));
            respond(result.get(0));
        }
    }
}

This function is invoked each time the user's speech has been recorded and processed. If recognition succeeds, the result code is RESULT_OK and the recognized speech arrives as a list of candidate strings; the best match (index 0) is displayed on screen and passed to respond() for handling.

private void respond(String input) {
    mHintView.setText("Tap on mic to start");

    if (input.equals("hello")) {
        mTts.speak("What's your name?", TextToSpeech.QUEUE_FLUSH, null);
        mHintView.setText("Tap the mic and say \"My name is <name>\"");
    } else if (input.contains("my name is") || input.contains("my name's")) {
        // Take the last word so both "my name is <name>" and "my name's <name>" work.
        String[] words = input.split(" ");
        setName(words[words.length - 1]);
    } else if (input.contains("not feeling good")) {
        mTts.speak("I understand, what are your symptoms?", TextToSpeech.QUEUE_FLUSH, null);
    } else if (input.contains("thank you") || input.contains("thanks")) {
        mTts.speak(String.format("No problem, %s! Take care.", preferences.getString("name", "user")),
                TextToSpeech.QUEUE_FLUSH,
                null);
    } else if (input.contains("medicines")) {
        mTts.speak("I think you might have a fever... you should probably take some Aspirin.", TextToSpeech.QUEUE_FLUSH, null);
    } else if (input.contains("time")) {
        // Convert the 24-hour clock reading to a spoken 12-hour time.
        Calendar now = Calendar.getInstance();
        int hour24 = now.get(Calendar.HOUR_OF_DAY);
        String half = hour24 < 12 ? "AM" : "PM";
        int hour = hour24 % 12 == 0 ? 12 : hour24 % 12;
        String min = now.get(Calendar.MINUTE) == 0
                ? "o'clock"
                : String.format(Locale.US, "%02d", now.get(Calendar.MINUTE));
        mTts.speak(String.format(Locale.US, "It's %d:%s %s", hour, min, half), TextToSpeech.QUEUE_FLUSH, null);
    }
}

This code is the actual meat and potatoes of the program - where the questions and responses are handled. As far as I can tell, Android's built-in speech recognition doesn't support the more advanced features of the big Natural Language Processing (NLP) platforms, such as intents and utterances, so you have to handle them yourself by searching the returned string with a switch statement or a giant if - else if - else block.
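One way to keep that matching from sprawling - purely an illustrative alternative, not how the ICP code is written - is to put the fixed keyword/response pairs in a map and fall back to the if/else chain only for dynamic answers like the time:

// Sketch: keyword-to-response lookup as an alternative to a long if/else chain.
// The map contents mirror the canned answers above; respondFromMap is a hypothetical helper.
private final Map<String, String> mCannedResponses = new LinkedHashMap<String, String>() {{
    put("not feeling good", "I understand, what are your symptoms?");
    put("medicines", "I think you might have a fever... you should probably take some Aspirin.");
}};

private boolean respondFromMap(String input) {
    for (Map.Entry<String, String> entry : mCannedResponses.entrySet()) {
        if (input.contains(entry.getKey())) {
            mTts.speak(entry.getValue(), TextToSpeech.QUEUE_FLUSH, null);
            return true;
        }
    }
    return false; // Not a canned question - let the hand-written branches handle it.
}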

private void setName(String name) {
    // Persist the name in SharedPreferences so it survives app restarts.
    preferences = getSharedPreferences("prefs", 0);
    editor = preferences.edit();
    editor.putString("name", name).apply();

    // Show the greeting on screen and speak it back to the user.
    mNameView.setText(String.format("Hello, %s!", name));
    mNameView.setVisibility(View.VISIBLE);

    mTts.speak(String.format("Hello, %s!", name), TextToSpeech.QUEUE_ADD, null);
}

This function is invoked when the user says something along the lines of "My name's {name}." It stores the user's name in SharedPreferences and displays "Hello, {name}!" in the activity on screen.
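Since the name is stored in SharedPreferences, it can also be read back on later launches; a minimal sketch of that lookup (placement in onCreate is assumed, not shown in the original code):

// Sketch: restore a previously saved name on startup.
preferences = getSharedPreferences("prefs", 0);
String savedName = preferences.getString("name", null);
if (savedName != null) {
    mNameView.setText(String.format("Hello, %s!", savedName));
    mNameView.setVisibility(View.VISIBLE);
}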

V. Conclusion

This ICP was interesting, though a bit easy. It was fun to see how Android handles speech processing, though I ended up a bit disappointed, mainly because I've worked with other Natural Language Processors before (Amazon, Mycroft AI) and I loved how easy they were to use and how flexible they were. A large program, with hundreds of possible responses, would be nearly impossible with manual string matching, or would require you to be a regular expression ninja. I'm sure there's a library out there that supports this; I probably just haven't found it yet.

VI. Screenshots
