by Shaunie Shammass, speakingpal.com
I would like to tell you about our journey of innovation, creativity and implementation in creating a self-help mobile tool for teaching the speaking skill. I would like you to follow our thoughts and decisions as mobile app developers and see what the production of such a tool entails. I am a former teacher, and I know the dream list. But as a member of a team trying to build the dream, I have run into reality. It is this bump into reality that I would like to share with you here.
What’s the Problem?
In a global context, the asset of being able to communicate well in English, the international language, translates into tangible economic and societal benefits. There is such a strong need, and yet the speaking skill is one of the hardest skills to teach, monitor, and improve. Why? Well, how many of you have taken phonetics? How many of you have naturally good ‘ears’ that let you hear the difference between foreign accented sounds and native ones?
And if you can hear these differences, how many opportunities do you, as teachers, have to give your non-native students effective feedback? Do you stop your student at every mistake? How do you encourage them to practice? How can you motivate them to speak out loud and get over the feelings of embarrassment?
What’s the Dream List?
What would you like? Here is what I think is on the dream list. And please feel free to add some suggestions of your own.
- Motivate my students to talk openly and freely, in ways that feel less ‘contrived’
- Monitor my students’ speech output because I cannot possibly concentrate on every student, every sentence, and every word that is uttered
- Provide timely feedback
- Provide scores that are meaningful
- Show my students how to improve
- Monitor my students’ improvement and show progress in a meaningful and understandable way
- Provide customized teaching for each student
- Make pronunciation practice less drill-like
The Choices Made
We decided to build a self-help mobile tool that has a video simulation of talking to a person, where you actually talk to a video character in a role play situation and your voice is recognized automatically by automatic speech recognition (ASR). You are then given stars, like in a game, on your speaking performance, and you can see each word that you said color-coded (green=good, red=you can improve, black=ok) so that you can zero in on which words need to be improved. In some cases, you drive the dialog by having two choices to read, and the video character responds accordingly, based on the ASR.
You can review your recorded speech and compare it to a video of a character saying the same utterance. We upped the content by providing vocabulary help, quiz games, translations of the scripts and muppet-style video demos of how to make English sounds, performed by a pantomime artist!
Hello Real World!
We wished that we could put everything in the dream list in the first version, but realistically, there was so much content and technological development right from the outset that we were forced to make choices. So, here are some of our considerations:
- Get something out to market that works with a reasonable time frame and cost structure but is highly scalable
- Use technology to help, but look realistically at the constraints
- Create content that is made for mobile from the get-go
- You don’t have to copy what is already out there
- Think of the user experience at all times
- Keep up with the trends
1. Go to Market
We decided to go the simulation route because handling this problem using real teachers is simply not scalable. For the simulation, we decided to use videos of real actors with ASR in order to provide a highly interactive user experience. Videos needed to be compact without sacrificing quality. ASR needed to work within the limits of a natural conversation so that the user experience wasn’t compromised, and yet be scalable enough to handle all of the users’ interactions. Much work was done on maintaining the look and feel of a natural conversation. It was paramount that the app flow run smoothly. In addition, we had to cope with a fragmented mobile market by trying to support as many devices as possible.
2. Real Technology, Not a Wish List
With all this in mind, we decided to use set utterances rather than allow free speech. This eased the ASR task and allowed us to monitor the app more easily. So, scripts for the actors were based on forced choices, not open-ended questions. We thought of using text-to-speech with avatars, but decided against this – again thinking of how to recreate a simulation of a conversation that would feel more ‘natural’ and create a better and more interactive user experience.
We decided to provide feedback on every sentence and word, but not on every phoneme or sound. The UI for doing so would have been complicated, and we felt that color coding each letter or phonetic symbol would not be understandable to students. If anything, we wanted to create instant scoring that was as simple as possible. This meant creating a layer of scoring logic over the ASR output that reflected broader ranges of acceptable and non-acceptable utterances, rather than constraining users to speak with a pure native accent in order to get a good score. Our scoring logic was pre-tested on non-native speakers and tweaked to reward ‘intelligible’ speech.
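To make the idea of a scoring layer over the ASR output concrete, here is a minimal sketch in Python. It is not our production code: the per-word score in the range 0 to 1, the `WordResult` type, and both threshold values are assumptions made up for illustration. Only the color scheme (green=good, black=ok, red=you can improve) comes from the app itself.

```python
from dataclasses import dataclass

# Hypothetical thresholds, chosen only for illustration. In practice such
# values would be tuned on recordings of non-native speakers so that
# intelligible (not only native-like) speech earns a good score.
GOOD_THRESHOLD = 0.75
OK_THRESHOLD = 0.50


@dataclass
class WordResult:
    """One recognized word with an assumed ASR score between 0 and 1."""
    word: str
    score: float


def color_for(score: float) -> str:
    """Map a raw per-word score onto one of three broad feedback bands."""
    if score >= GOOD_THRESHOLD:
        return "green"  # good
    if score >= OK_THRESHOLD:
        return "black"  # ok
    return "red"        # you can improve


def color_code(words: list[WordResult]) -> list[tuple[str, str]]:
    """Return (word, color) pairs for display under the utterance."""
    return [(w.word, color_for(w.score)) for w in words]


# Made-up scores for a scripted utterance.
utterance = [
    WordResult("I", 0.92),
    WordResult("would", 0.61),
    WordResult("like", 0.88),
    WordResult("coffee", 0.34),
]
colored = color_code(utterance)
```

The point of the broad bands is that the student never sees a raw number, only which words deserve another try.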
3. Content – Bring on the Games, the Impatience Factor, and Give Me More
Bring on the Games!
We knew that a new type of content needed to be created. We are a video-driven society now. Multimedia is taking over print and gaming is taking over everything. Interactivity is key and user experience is king. Our thoughts on this as educational app developers were very clear – we can get students to talk openly and freely if they feel that they are in some kind of interactive, game-like scenario which is fun, has some humor and doesn’t take a long time. And we can leverage this for educational purposes. We used real actors in videos so that the user would feel as though he/she were talking to a person, and not simply interacting with content.
The Impatience Factor
Simply put, nobody wants to wait these days. Immediate responses, immediate feedback, immediate interaction – these are how our impatient users and learners want things to be. And heaven forbid it should take a long time. From a design point of view, this means that we needed to take into account several things:
- The lessons must be packaged so that they upload fairly quickly, and yet do not compromise on video/audio quality
- Each lesson must be able to be completed in a few minutes, but maintain flexibility to expand in ‘play-time’ as long as the student wishes to learn (just like in a game)
- The ASR must respond in time, no later than the natural response time of a real person
- The feedback must be provided quickly and be easily understood. Our first version used traffic-light colors (green=good, yellow=ok and red=you can improve) on each utterance and on each word. Our new version is even more game-like and awards one star (improve), two stars (ok) or three stars (good) for the utterance, but maintains the color-coding on each word.
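As a rough illustration of the last point, the sketch below turns per-word colors into a one-to-three star rating for the whole utterance. The point values and cut-offs are hypothetical, invented for this example; it does not describe how the app actually computes its stars, only how simple such a mapping can be.

```python
# Hypothetical point values per word color (green=good, black=ok,
# red=you can improve). These numbers are assumptions for illustration.
POINTS = {"green": 2, "black": 1, "red": 0}


def stars_for(word_colors: list[str]) -> int:
    """Return 1 (improve), 2 (ok) or 3 (good) stars for an utterance."""
    if not word_colors:
        raise ValueError("an utterance needs at least one word")
    average = sum(POINTS[color] for color in word_colors) / len(word_colors)
    if average >= 1.5:    # mostly green words
        return 3
    if average >= 0.75:   # a mix, but largely acceptable
        return 2
    return 1              # mostly red words


mostly_good = stars_for(["green", "green", "black", "green"])
mixed = stars_for(["green", "black", "red", "black"])
needs_work = stars_for(["red", "red", "black"])
```

Because the rating is computed from values that are already available per word, it adds no noticeable delay to the feedback.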
Give Me More
The more we give, the more you get. In the first version, there was no vocabulary help or translations. The second version provided both, while upping the number of quiz questions. In our third version, we tried to provide a more game-like experience. In the process of doing so, some of the quizzes needed to be postponed for later versions, where we want to add them ‘into the game’. For each of these developments, the tech staff needed to seriously revamp things, and each change that you may think is minor can be quite major in terms of the actual app development. So, while we want to keep adding, sometimes there are compromises to make, and in the end we do want to provide as much as possible given our own limitations. Of course, everyone wants it all for free, which is impossible. But we do provide some free lessons from each package, along with unlimited automatic feedback for a full year. And we are constantly thinking of what we should add to the app for the next version.
4. You Don’t have to be Copycats
It is hard to define how to be innovative. In a sense it is like being unruly, and never wanting to follow the rules. If all the other English learning courses do something in a certain way, you do not have to be like them. It is good to pick out the best points and use some of them, but we decided to ‘go our own way’ and basically define the product as we saw it without imitating other English language learning solutions.
Our focus is different – speaking. Our UX is different – interacting with the look and feel of a person, not content. Our approach is different – get right into speaking without first going through learning all of the words and sentences (but have aids to help you, like vocabulary and translations, if you need them). Finally, our teaching methods are different. For example, we never explicitly teach grammar, but have ample examples and quizzes where you can learn correct structures indirectly (like an immersion style).
And for heaven’s sake, don’t reuse things that hardly looked promising in books or haven’t worked well online. A good example of this is the repeated use of anatomically-based graphs (pictures or animations) of sagittal head sections for demonstrating phonetic sounds, or displaying acoustic waveforms that look neat but are undecipherable by students. We decided on a wholly new approach. Why not use a pantomime artist who uses his hands ‘muppet-style’, with a pink sock representing ‘the tongue’, to demonstrate how English sounds are made? It makes it fun, and as I observe people watching the videos, I notice that they immediately try to reproduce the sound. The play is everything.
5. UX, UX, UX
Our whole design is based on creating the best user experience. It is not the technology alone. It is not the content alone. It is the judicious coupling of the two with the user experience kept in mind at all times. Want a good UX? It better respond in time. Want a ‘sticky’ UX? It better be fun and easy to use. And a little humor added in never hurt. Want a motivating UX? It better seem natural and let you feel like you are playing a game. Want an educational UX? It better deliver real learning content that you can understand, practice and review, with feedback and a way to see your progress.
Like I said before, UX is king.
6. Trendy?
Ok, so education has had trends in the past. But now everything is trend based. And the new trend is for games. The new trend is to learn without feeling that you are actually learning. The new trend is to make your own motivation and share and pass it on to friends. Find out things yourself. Be social. Play. Communicate. I can’t say that we have been able to keep up with all the trends, but if things used to change every 10 years, and then every 5, now they are changing every few months or so. New apps. New ways. New competition. New designs.
Here is a simple example regarding design. Uploading used to be shown with a filling bar. Now it is a filling circle. It sounds like a small thing, but in this world of trends, if you don’t take note of these details, you’re not in touch. If the look and feel of your app seems even a few years old, with graphics and sounds that were popular then, you are not ‘in’. Where does that leave us in the educational context? We need to make careful choices. We need to play the game of going with the trends, but not at the expense of real educational and learning value.
And One Last Word
This whole process of defining, creating, designing, implementing, and getting a product out to market, while constantly adding features and creating new versions, is a fascinating journey. Many choices. Many compromises. Many decisions. But the end result is very satisfying and I can, in all honesty, say that it’s been a very enjoyable ride!
Dr. Shaunie Shammass is VP Linguistic Innovation at SpeakingPal Ltd and is responsible for content development and pedagogical innovation. As a trained phonetician, she has expertise in creating linguistic resources for automatic speech recognition and e-learning applications, along with many years of university teaching experience.