The smartphone is arguably one of the most successful electronic devices of modern times, and it is safe to say that its triumph owes much to the first iPhone, released on June 29, 2007, which completely reinvented the way people interact with their mobile devices. In fact, the combination of an index finger and a touchscreen is so intuitive and efficient that the basic design of the smartphone remains essentially unchanged today: a thin rectangular slab with most of its surface covered by a touchscreen.
Another efficient way of interacting with electronic devices that hasn't changed much is the use of (QWERTY) keyboards and mice, first invented in the 1860s and 1960s, respectively. Although many new concepts aim to replace these traditional input devices (e.g., Project Precog by ASUS or TAP, the wearable keyboard), keyboards and mice remain dominant for most computer users.
Video 1. TAP the wearable keyboard (source: TAP official YouTube channel).
Video 2. Project Precog by ASUS (source: Engadget official YouTube channel).
In a fast-changing field like the technology industry, such invariance in design has inevitably left many manufacturers and customers restless, wondering: what will the electronic devices of the future be like? With the development of new techniques (artificial intelligence in particular), the answer may be just around the corner.
In this article, we will break down the advantages and disadvantages of every interaction method we can think of. Then, building on that discussion, we will try to picture the relationship between users and their devices in the future. But first, we must talk about the factors that define a good interaction experience, and we will do so by learning from today's champions: keyboards, computer mice, and touchscreens.
What defines a good interaction experience?
So, what makes “touching with a finger (especially the index finger)”, “typing on a keyboard” and “clicking with a mouse” such preferable ways of interacting with our electronic devices? The answer lies in the following five aspects:
For starters, manipulating objects with our hands is a natural tendency, and touchscreens, keyboards and mice go along with that habit. This is especially true for touchscreens and mice, where all the pointing and clicking can be done in the most intuitive way. And although one must undergo some training to type on a keyboard rapidly and smoothly, the learning process is fairly straightforward (compared with, say, brainwave controllers that require users to learn to switch between states of “focus” and “relax”); even without any training, a keyboard is not so complicated that one cannot figure it out alone.
Second, they are very convenient. Touchscreens, once again, have a significant advantage over the others: users need nothing but their fingers, one of the most natural pointing devices a human has, to interact with them. Keyboards and mice do require an intermediary device, but that device can be obtained at a rather low price, and its design allows one to use it effortlessly and instinctively.
Third, they are very precise. Whatever you touch, type, or click is what you get; there is no room for fuzziness or speculation (i.e., the computer does not have to guess what your command actually means). This is not true for certain methods such as voice control, where the device must first transform the waveforms coming from your mouth into words and then find the correct interpretation of them; these steps rely on a degree of inference, which can reduce reliability since the inferences may go wrong.
Fourth, they are insensitive to noise. Since a touchscreen, keyboard, or mouse responds only when you physically touch it in a particular location, there is little chance of its input being seriously affected by outside disturbances. Methods such as BCI and voice recognition, on the other hand, take in both signal and noise from the outside world within a certain time window, which unavoidably lowers the chance that the machine interprets the user's command correctly.
Last but not least, they are very safe. You don’t have to implant anything into your body in order to use a touchscreen, a keyboard, or a mouse. But for certain approaches, like the brain computer interface, this may not necessarily be true.
Now that we have a better understanding of why touchscreens, keyboards, and mice are such powerful ideas, we can turn our attention to other potential methods for interacting with a device, including voice control, brainwave detection, the brain computer interface (BCI), eye-tracking, gesture control, and approaches based on muscular signals.
Voice Control – Very useful on special occasions
While manual input devices such as touchscreens, keyboards, and mice are convenient in most situations, they can become impossible to use in special cases; for example, when both of your hands are occupied with groceries, or when you're behind the wheel of a non-self-driving vehicle. For scenarios like these, you need an alternative way to interact with your device without using your hands, and that's where voice control comes in.
The biggest advantages of voice control are that, like using a touchscreen with your fingers, the approach is very intuitive (provided the device can process natural language), and that you can control your devices without interruption even while your hands are engaged in another task. Nevertheless, for the following reasons, voice control is significantly less convenient than touchscreens, keyboards, and mice in other situations (i.e., when the user's hands are free), and therefore it cannot fully replace them:
Number one, it's harder to keep your privacy with voice control. By definition, you must speak your command out loud in order to trigger an action or get information from your device. Thus, if you're in a public area, you may find it difficult to deal with private matters.
Number two, one may encounter trouble when dictating technical terms, people's names, and extremely infrequent words. You may have had the following experience: your device in voice-control mode got what you said right in the first place, but then replaced the result with a wrong but more common word or phrase. This happens because voice control is usually backed by a language model that records how frequent words are and how often they appear together. Such a model lets the device infer what the user actually said when the speech data isn't received or processed cleanly. And since the model is based on prevalence, one can expect it to work in most conditions, but not when rare or unusual words are involved.
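To make that idea concrete, here is a minimal sketch (our own illustration, not any vendor's actual recognizer) of how a frequency prior can override a rare but correct word: the recognizer proposes candidate words with acoustic confidences, and the final choice blends that confidence with how common each word is, so an uncommon proper noun can lose to a frequent look-alike. The word list and frequency values are hypothetical.

```python
import math

# Hypothetical word-frequency table (occurrences per million words).
# Real systems use far richer language models; this is only a sketch.
WORD_FREQUENCY = {"neuron": 12.0, "Nero": 1.5, "Neurozo": 0.001}

def rescore(candidates, acoustic_scores, lm_weight=0.5):
    """Blend acoustic confidence with a log-frequency prior.

    candidates      -- words proposed by the recognizer
    acoustic_scores -- confidence (0..1) that the audio matches each word
    lm_weight       -- how strongly prevalence overrides the audio evidence
    """
    best_word, best_score = None, float("-inf")
    for word, acoustic in zip(candidates, acoustic_scores):
        prior = math.log(WORD_FREQUENCY.get(word, 0.0001))
        score = (1 - lm_weight) * acoustic + lm_weight * prior
        if score > best_score:
            best_word, best_score = word, score
    return best_word

# The audio actually matched the rare brand name better, but the frequency
# prior drags the decision toward the common word -- the effect described above.
print(rescore(["Neurozo", "neuron"], [0.9, 0.7]))  # prints "neuron"
```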
Number three, voice control may not work well amid loud background noise, since it runs into the cocktail-party problem (a.k.a. multi-talker speech separation: extracting the speech of one particular individual from a cacophony), which is not an easy issue to conquer.
Number four, voice control may not perform well when the user has a very heavy accent or a unique way of pronouncing words. Granted, there are deep-learning models out there that manage this issue quite well (e.g., Google Duplex), but such models are very sophisticated, and most of today's voice recognition systems may not be at that level just yet.
Number five, since making sound is a must when using voice control, the technique is unsuitable for places where silence is preferred or required, such as libraries, offices, and some laboratories.
Finally, voice control may not be good at handling incorrect input. Imagine writing an article by voice and accidentally misspeaking a word (or the recognition algorithm misunderstanding your speech). It's very likely that you will have to turn to a keyboard to correct the mistake, since there is no easy way to do so with voice commands alone.
Brainwaves – Either a very good or very bad idea
As we know, our mental functions are realized by neurons exchanging electrical impulses within the brain, and this neuronal activity gives rise to what are called brainwaves, which can be captured by special sensors (i.e., electrodes) attached to the head.
Brainwaves come in different kinds. The five most common waveforms are named after Greek letters: gamma (related to an extremely active brain), beta (active brain), alpha (restful brain), theta (drowsy state), and delta (sleep). The simplest brainwave controllers take advantage of these distinct waves (alpha and beta in particular, because they correspond to a waking brain that is not in an extreme state), turning them into triggers for different device actions. For instance, whenever the controller detects alpha waves from the user's head it makes a remote-control car move forward, and whenever it detects beta waves it has the car stop.
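As a minimal sketch of such a go / no-go controller (our own illustration, not any commercial product's code), the snippet below estimates alpha (8–12 Hz) and beta (13–30 Hz) band power from a one-second window of a single EEG channel and maps whichever band dominates to a car command. The sampling rate is an assumption.

```python
import numpy as np

FS = 250  # sampling rate of the (hypothetical) EEG amplifier, in Hz

def band_power(signal, fs, low, high):
    """Average spectral power of `signal` between `low` and `high` Hz."""
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / fs)
    power = np.abs(np.fft.rfft(signal)) ** 2
    mask = (freqs >= low) & (freqs <= high)
    return power[mask].mean()

def command_from_eeg(window):
    """Map a one-second EEG window to a go / stop command."""
    alpha = band_power(window, FS, 8, 12)   # restful brain
    beta = band_power(window, FS, 13, 30)   # active brain
    return "FORWARD" if alpha > beta else "STOP"

# Fake one-second window dominated by a 10 Hz (alpha) oscillation plus noise.
t = np.arange(FS) / FS
window = np.sin(2 * np.pi * 10 * t) + 0.2 * np.random.randn(FS)
print(command_from_eeg(window))  # expected: FORWARD
```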
To us, whether or not brainwave control will become a dominant way for users to interact with their devices in the future depends on the following factors:
First of all, how sophisticated can the commands sent through brainwaves be? The go / no-go controller mentioned above is obviously too simple for most practical use cases. If the technique continues to operate by merely detecting distinct waveforms, then its applications, we believe, will be very limited. However, if one could send complex information (such as a sentence, a melody, or an image) using nothing but brainwaves, the technique might be even more promising than the brain computer interface because it is not invasive. Although this seems unrealistic, the idea is not totally far-fetched, since there is research indicating that reconstructing a person's mental image from his or her EEG data is possible.
Video 3. Researchers from University of Toronto Scarborough (UTSC) have developed a method to reconstruct mental images from one’s EEG recordings (source: UTSC official YouTube channel).
Second, can we make a marketable EEG recorder that one can comfortably wear? Note that we are not talking about the simple brainwave detectors that can only capture alpha or beta waves; we are talking about a fully functional EEG recording device similar to the one you see in Video 3. As you can see from the clip, the current EEG cap is not very aesthetically pleasing. Also, to ensure the quality of the recorded signals, researchers usually have to put conductive gel on the wearer's head, which we can assure you is not a very enjoyable experience. So, if brainwave-control technology is to become prevalent in the future, such issues will have to be solved first.
Third, how sensitive can the sensors that pick up one's brainwaves be? If the sensors, or electrodes, are not sensitive enough, the controller will be hard to use. On the other hand, if the electrodes were so sensitive that they could pick up one's thoughts without even making contact, they would raise privacy concerns.
If a brainwave-based I/O device can overcome the problems above, it could become a capable, non-invasive substitute for the brain computer interface, which will be introduced in the next section. Otherwise, its usage will be very limited, and it might soon be replaced completely by BCI.
Brain Computer Interface (BCI) – The method for true futurists
The brain computer interface is considered by many, including us, to be one of the most futuristic ways to interact with an electronic device thus far. With this technology, one can connect a device directly to one's cerebrum and control it solely by will. But for a technique like this, it's only natural that people will need some time to feel comfortable with it.
To us, the most important value of the brain computer interface, or BCI, is that it opens the possibility for flesh-and-blood human beings, such as ourselves, to significantly enhance our own abilities (both physical and mental) by merging with machines, which is an interesting way to keep artificial intelligence from hurting or supplanting mankind with whatever superintelligence it may acquire in the future. Of course, some may consider this dangerous or even “outrageous”. But don't forget that many inventions were deemed menacing or “evil” when they were first created, including Galileo's telescope (said to be an instrument of the Devil), trains (people at the time believed that traveling at speeds over 30 miles per hour would melt your body), and telephones (again, an instrument of the Devil according to some preachers). Needless to say, they all turned out to be beneficial for mankind.
It's likely that BCI will first take off in the medical field, as it has already helped many patients who have lost an arm or a leg (see Video 4; be warned that the clip contains a bloody scene). Then the ripple may spread to regular people, starting with the true futurists who would like to use the technology to “upgrade” their capabilities rather than treat existing conditions. That said, the fact that the technique is invasive can discourage most people, and that may become the biggest roadblock to BCI being widely accepted.
Video 4. The prosthesis developed by researchers from the California Institute of Technology (Caltech) is connected directly to its user's brain so that he can control it with his mind. Some prostheses even send feedback signals back to the brain, allowing the user to actually “feel” a tactile sense (source: Wall Street Journal).
Stylus – For people in creative industries
When Steve Jobs introduced the first iPhone in 2007, he used the word “yuck” to describe the stylus. The message was that our fingers are far more intuitive than a stylus (you're born with them), you can never lose them under normal circumstances, and unlike some styluses, they never need charging. Still, we must say that there are occasions where a stylus is a better choice than our own fingers: when extremely precise pointing is required (e.g., drawing), and when your hands are expected to be dirty (e.g., checking an online recipe while cooking). Unfortunately, outside those scenarios a stylus is entirely dispensable. Although some have tried to give the smartphone stylus extra functions to expand its usefulness (like Samsung's S Pen), the idea has not been strong enough to shift the paradigm, at least for now.
Eye-Tracking – From medical aid to subtle (yet potentially helpful) applications
In the past, if you were unable to carry out any voluntary action or produce any sound with your vocal cords due to certain medical conditions (e.g., locked-in syndrome, in which patients are fully conscious but unable to move any voluntary muscles except those controlling eye movement and blinking), an eye-tracking interface might have been your only hope of communicating with your surroundings. With the development of brainwave control, BCI, and the muscular-signal-based methods introduced below, however, eye-tracking may gradually fade from the stage of healthcare applications.
For the reasons below, eye-tracking is destined to have very limited usage as a general input method. First, it takes some time to get used to; for instance, most eye-tracking interfaces treat an eye blink as a mouse click, yet blinks can be very hard to control. Second, you may run into trouble when more than one person is within the detection range of the tracker. Third, eye-tracking requires the user's eyes to stay focused on the screen for long periods, which can cause eye fatigue very easily (and may harm your vision as well).
That being said, if used properly, an eye tracker can achieve an effect that feels like true “mind-reading”. For example, it has been demonstrated that our gaze subconsciously lingers on objects we are interested in. Thus, by letting a device record the user's gaze data, we might build an app that predicts which person in a group you are most fond of, simply by showing you a picture of that group and monitoring how your eyes examine each face.
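The sketch below is our own illustration of that dwell-time idea, under the assumption that the eye tracker delivers a stream of (x, y, duration) fixations and that we know a bounding box for each face in the picture; whichever region accumulates the most gaze time is reported as the likely favorite. All names and coordinates are hypothetical.

```python
# Hypothetical bounding boxes (x_min, y_min, x_max, y_max) for each face in the photo.
FACE_REGIONS = {
    "Alice": (0, 0, 200, 300),
    "Bob": (200, 0, 400, 300),
    "Carol": (400, 0, 600, 300),
}

def predict_favorite(fixations):
    """Sum gaze duration per face and return the face looked at longest.

    fixations -- iterable of (x, y, duration_ms) tuples from the eye tracker
    """
    dwell = {name: 0.0 for name in FACE_REGIONS}
    for x, y, duration in fixations:
        for name, (x0, y0, x1, y1) in FACE_REGIONS.items():
            if x0 <= x < x1 and y0 <= y < y1:
                dwell[name] += duration
                break
    return max(dwell, key=dwell.get)

# Fabricated gaze samples: the viewer's eyes keep drifting back to Bob.
samples = [(120, 150, 180), (250, 140, 420), (310, 160, 500), (450, 150, 200)]
print(predict_favorite(samples))  # expected: Bob
```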
Gesture Control – The touchscreen's brother, with more possibilities
Gesture control is a very interesting idea, and the movie Minority Report offers arguably one of the best demonstrations of how this technology could work:
Video 5. Gesture control interface using a pair of specially designed gloves in the 2002 movie, Minority Report.
Usage-wise, gesture recognition is very similar to a touchscreen: both let you interact with your device using your hands in the most natural way. However, the former clearly takes more information into account (e.g., the shape and motion of the hand, depth data, and so on; some gesture-control systems even detect the pose of your whole body), which enables more complex interaction between users and their devices. It's also important to realize that gesture control may give users more freedom by offering a huge three-dimensional space to work in instead of a tiny two-dimensional plane.
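As a toy example of how that extra information (here, 3-D fingertip positions rather than a single 2-D touch point) enables richer commands, the sketch below assumes a hypothetical hand tracker that reports thumb-tip and index-tip coordinates and separates a “pinch” from an “open hand” by their distance. It is only an illustration, not any real gesture-recognition pipeline.

```python
import math

def distance(p, q):
    """Euclidean distance between two 3-D points."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def classify_gesture(thumb_tip, index_tip, pinch_threshold=0.03):
    """Classify a simple gesture from two fingertip positions (in meters).

    A real recognizer would use many joints, depth maps, and motion over
    time; this sketch only separates "pinch" from "open hand".
    """
    return "PINCH" if distance(thumb_tip, index_tip) < pinch_threshold else "OPEN_HAND"

# Fabricated frames from a hypothetical hand tracker.
print(classify_gesture((0.10, 0.20, 0.45), (0.11, 0.21, 0.45)))  # close together -> PINCH
print(classify_gesture((0.10, 0.20, 0.45), (0.18, 0.28, 0.50)))  # far apart -> OPEN_HAND
```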
To us, one of the biggest values of gesture control is that it can be combined with virtual reality (VR) or augmented reality (AR), whose importance is expected to grow, thereby upgrading your virtual experiences (you would no longer need a controller, joystick, touchscreen, or any other input device).
Despite these benefits, gesture control shares a major drawback with the touchscreen-based approach: it's hard to use when your hands are occupied with another task. Also, the lack of tactile feedback when grabbing a virtual object in a VR environment can feel unnatural to some people. Although manufacturers such as HTC have tried to solve the problem with an additional device (e.g., the pair of gloves shown in Video 6), the solution doesn't seem very promising so far.
Video 6. A person attempts to interact with virtual apples using a VR headset and a pair of specially made gloves. As you can see, even though the gloves provide tactile feedback, the interaction still looks a little awkward.
Muscular Signal Detection – A potential (and powerful) alternative to BCI
This last method, based on the detection of muscular signals (i.e., the tiny electrical signals generated when your muscles contract), is often overlooked by the public. To us, however, it is arguably the most promising method out there, potentially achieving what BCI is capable of without being invasive at all.
The technique could become even more popular with the rise of wearable technologies, such as smart clothes or smart gloves, which can detect muscular signals from the user's whole body, creating more possibilities for how an individual can interact with his or her machines.
Here, we'd like to introduce two very impressive applications of this technique. In the first, a man who lost his left lower arm is able to control a robotic prosthesis using two armbands wrapped around his left upper arm. The armbands record the electrical signals generated by his muscle contractions, allowing him to control the prosthesis by “thinking about it with normal, intuitive thoughts”.
Video 7. A man who lost his left lower arm controls a robotic arm with his “thoughts”, using two armbands that detect the electrical signals generated by muscle contractions in his left upper arm.
The second application, a device called AlterEgo built by a group of researchers at the Massachusetts Institute of Technology (MIT), is even more astonishing. The idea is as follows: even when a person speaks silently to himself or herself without any observable motion, the muscles around the cheek and jaw still contract slightly, releasing tiny electrical signals in certain patterns. The device, which is attached to the user's face, picks up these signals, discerns the patterns, and translates them back into words. The user wearing AlterEgo can therefore send commands to a computer through internal verbalization alone; after the computer processes a command, the response, if any, is sent back through a bone-conduction headphone. The video below shows how the device looks in action:
Video 8. A demonstration of how MIT’s AlterEgo works (source: Fanatical Futurist official YouTube channel).
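To give a feel for the "patterns back into words" step, here is a deliberately crude sketch of our own; AlterEgo's real pipeline uses multi-channel signals and neural networks, whereas this toy simply matches a window of facial EMG against a tiny vocabulary of fabricated templates by nearest neighbor.

```python
import numpy as np

# Toy "vocabulary" of silently spoken words, each represented by the average
# per-channel amplitude of its facial-EMG recordings (entirely fabricated numbers).
TEMPLATES = {
    "yes": np.array([0.80, 0.10, 0.30, 0.55]),
    "no": np.array([0.20, 0.75, 0.60, 0.15]),
    "time": np.array([0.50, 0.40, 0.90, 0.35]),
}

def extract_features(emg_window):
    """Very crude features: mean absolute amplitude of each electrode channel."""
    return np.mean(np.abs(emg_window), axis=0)

def decode_word(emg_window):
    """Match a window of facial EMG to the template nearest in feature space."""
    features = extract_features(emg_window)
    return min(TEMPLATES, key=lambda w: np.linalg.norm(features - TEMPLATES[w]))

# Fabricated 100-sample, 4-channel window whose amplitudes resemble the "yes" template.
window = np.tile([0.78, 0.12, 0.28, 0.57], (100, 1)) * np.sign(np.random.randn(100, 4))
print(decode_word(window))  # expected: yes
```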
The reason we deem this method the most promising is that it can create the “control with your mind” effect without being invasive like BCI, while at the same time being more realistic than the brainwave-based methods. However, the question remains: will this approach replace touchscreens, keyboards, and mice and become the main way we interact with our devices in the future? We are staying optimistic, but the key, we believe, is whether the new method can provide significant extra value compared with touchscreens, keyboards, and mice. If it works merely as a substitute without providing any additional benefit, it may not be able to build enough momentum to shift the current paradigm.
Conclusions: What will a future device be like?
We have introduced many alternative approaches that one may use to interact with his or her devices in the future. But which one is the best? Well, we believe that is the wrong question to ask.
As we know, the ultimate goal most technology companies pursue is a device that gives its user the impression that it “understands” what he or she really wants. To achieve that goal, the key, we think, is not which I/O method we use but whether the device has enough intelligence to “sympathize” with its user. Note that this requires not only powerful artificial intelligence but also the ability to capture and process many kinds of signals from the user (e.g., eye movement and gaze, gesture, voice) instead of relying on a single interaction method: the more information a device can gather, the better it can learn what its user really needs, and the less difficulty the user will encounter (e.g., one can interact with a smartphone through the touchscreen under normal circumstances and resort to voice control when one's hands are full). So rather than asking which of the above approaches will dominate, it may be more reasonable to think about how to combine different I/O methods with AI to create a device that can seemingly empathize with its owner.
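As a closing illustration (purely our own hypothetical sketch, not any company's product design), here is a minimal dispatcher that picks whichever input modality is currently usable and hands its signal to a shared intent model; all modality names, checks, and signals are made up for the example.

```python
from dataclasses import dataclass
from typing import Callable, Optional

@dataclass
class Modality:
    """One input channel (touch, voice, gaze, ...) with an availability check."""
    name: str
    is_available: Callable[[], bool]   # e.g., "are the user's hands free?"
    read_signal: Callable[[], str]     # raw observation from that channel

def infer_intent(modality_name: str, signal: str) -> str:
    """Placeholder for the shared 'empathy' model that interprets any signal."""
    return f"intent inferred from {modality_name}: {signal!r}"

def handle_user(modalities: list[Modality]) -> Optional[str]:
    """Use the first modality that is currently usable; fall back to the next."""
    for m in modalities:
        if m.is_available():
            return infer_intent(m.name, m.read_signal())
    return None

# Hypothetical situation: the user's hands are full, so touch is unavailable and voice wins.
modalities = [
    Modality("touch", is_available=lambda: False, read_signal=lambda: "tap(120, 88)"),
    Modality("voice", is_available=lambda: True, read_signal=lambda: "set a timer for 10 minutes"),
    Modality("gaze", is_available=lambda: True, read_signal=lambda: "fixation on notification"),
]
print(handle_user(modalities))
```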
Footnote: Neurozo Innovation provides viewpoints, knowledge and strategies to help you succeed in your quest. If you have any questions for us, please feel free to leave a comment below, or e-mail us at: neville@neurozo-innovation.com. For more articles like this, please join our free membership. Thank you very much for your time, and we wish you a wonderful day!