
Multimodal Interaction / Eye Movement / Gesture / Body Motion / Haptic Feedback

Beyond interface

multimodal interaction design projects 

Why beyond interface?

After the VR 3D Sketch project, I realized that the current GUI interaction system (finger-level interaction with visual feedback returned by small-screen devices) can't satisfy the interaction needs that come with emerging technologies like VR and AR. I don't mean it is wrong to keep using these finger-level, vision-reliant interaction techniques in VR, AR, or MR. What I claim is that we can find better interaction techniques for enjoying these technologies, because VR/AR/MR and tangible interfaces have broken the wall between the digital world and the physical one. They have extended the interface from the desktop to the whole world. The interaction limitations imposed by a small screen no longer exist, so we can go back to our most natural ways of interacting with digital content.

 

For the past two years, I did several projects around the topic of going beyond the interface, to explore a natural interactive system in which we can return to our nature when interacting with digital content and virtual worlds. In these projects, I stepped out of the box of current interfaces and their interaction systems, and designed and developed several HCI products that support humans' rich bodily skills, from eye movement to hand movement to whole-body movement. I also tried out different tracking technologies (audio, inertial, and visual tracking) to explore the viability of these interaction techniques. Besides the input side, I joined the Alibaba NHCI Lab in summer 2018 to explore the power of mid-air touch and the corresponding haptic feedback.

 

PROJECT

 

2017 VR 3D modeling application (eye tracker + HTC VIVE)

2018 Alibaba DAMO NHCI Lab

2018 (UIST) Gaze-controlled Fetching Robot 

2019 Multisensory Interfaces With Prof. Sile O’Modhrain

 

Eye Movement

Eye Movement interaction

 

Vision is the most important way for us humans to access information.

Around 89%-90% of the information we receive about the outside world comes through our eyes.

Vision data could therefore be extremely helpful for computers to better understand users' intentions.

 

In HCI, Eye-movement-based interaction has the following pros and cons:

Pros

  • Natural and Efficient. What You Look at is What You Get.

  • Easy to learn.

  • Private

Cons

  • Randomness

  • Low tracking accuracy

  • High-level semantic analysis is difficult

 

Typical eye movements are blinks, gazes (fixations), saccades (discrete), and smooth pursuit (continuous). Based on these types of eye movement, we have three basic interaction techniques: Blink Input, Gaze Input, and Gaze Gesture. The following diagram shows how they differ from one another.

gaze interaction.png

As you can see from the diagram, the most popular technique is Gaze Input, and the one with the biggest potential is Gaze Gesture. To understand the meaning of eye movement in real HCI products, I did two projects, one on Gaze Input and one on Gaze Gesture:

  • 2017 VR 3D Sketch application

  • 2018 (UIST) Gaze-controlled Fetching Robot 

 

VR 3D Sketch application

ROLE

Researcher

Designer

Developer

TOOL

HTC VIVE

Tobii Eye Tracker

Unity3D ​

INTERACTION

Gaze to select + Gesture to confirm

In this VR project, I explored the value of Gaze Input for selection tasks. Instead of relying purely on gazing to finish the whole selection task, which would impose too heavy a cognitive load, I used the Gaze Added approach (Ken Pfeuffer, 2016): gaze to select + gesture to confirm (see the sketch after the list).

  • Gaze provides the location information

  • Gesture reduces the selection time
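A minimal sketch of this gaze-added selection loop, written as Python-style pseudocode rather than the project's actual Unity/C# code; get_gaze_target() and pinch_detected() are hypothetical stand-ins for the Tobii eye tracker and VIVE controller inputs.

    def selection_loop(scene_objects, get_gaze_target, pinch_detected):
        # Gaze continuously provides the candidate object (location information);
        # a quick hand gesture confirms it, which cuts down the selection time.
        while True:
            highlighted = get_gaze_target(scene_objects)   # object currently looked at, or None
            if highlighted is not None and pinch_detected():
                return highlighted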

gaze vr.gif

The problem with this technique is that if gaze detection is always on, the user may feel overloaded, because not all eye movements relate to a selection behavior. A potential solution is to provide a gate that controls the gaze-detection mode. In the next project, I added an initial gaze state to trigger the gaze-detection mode.

 

HERE to see more details of the project.

 

Gaze-controlled Fetching Robot

ROLE

Researcher

Designer

Developer

TOOL

ARKit

iPhone Camera

Arduino Module

Makeblock Robot

Xcode

INTERACTION

Using eye movement to control the movement of a robot.

In this project, we used the iPhone camera to gather the direction of the user's eye movements. Freed from the limitation of the small screen, users can use gaze gestures to control a robot that fetches a snack in the physical world. The following images show how the interactions work:

interaction.png
gaze state.png

To avoid endless gaze detection, which could be overwhelming for users, this interaction defines three gaze states, as shown in the gaze-state figure above. A minimal sketch of this gating idea follows.
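As an illustration of how gating the detection might look in code, here is a small Python-style state machine; the state names, the dwell threshold, and the command vocabulary are my assumptions for illustration, not necessarily the exact states shown in the figure.

    DWELL_TO_TRIGGER = 1.0   # assumed seconds of sustained gaze needed to enter command mode

    def update_gaze_state(state, looking_at_robot, dwell_time, gaze_direction):
        # Ordinary eye movements are ignored until the user deliberately looks at
        # the robot long enough; only then is gaze interpreted as a driving command.
        if state == "idle":
            if looking_at_robot and dwell_time >= DWELL_TO_TRIGGER:
                return "command", None
            return "idle", None
        if state == "command":
            if gaze_direction is None and not looking_at_robot:
                return "idle", None                 # user looked away: stop listening
            return "command", gaze_direction        # e.g. "left", "right", "forward"
        return "idle", None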

One of the takeaways from this project is that such gaze interaction not only improves the functional part of the interaction but also adds a touch of humanity to the intelligent devices we interact with, by empowering them to understand human eye contact.

HERE to see more details of the project.

Body Movement

Body Movement interaction

 

Even though vision is the most direct way for the computer to understand our intentions, our current HCI system relies too heavily on the visual channel and thus neglects the power of our natural bodily movement. This may be caused by the following reasons: 

  1. Most devices don't give us much freedom to move.

  2. Tracking technologies have not been powerful enough to let computers understand such rich bodily expression. 

However, the development of tracking and machine learning technologies is speeding up the computer's ability to understand us, and the advent of VR/AR has triggered the desire of digital users to converse with the computer through different parts of their bodies.

 

In the following projects, I explored different body-involved interaction techniques using different tracking technologies, to figure out how we can engage with the virtual world through functional, semantic, and aesthetic body movements. These applications made me realize that learning body movements is somewhat challenging compared with pointing, touching, or scrolling on a touchpad. But the payoff of such interactions is high, because they don't require high accuracy and can become almost subconscious after proper training. What's more, there are some methods to smooth the learning curve of body-involved interactions: 

  1. Provide a good metaphor that helps users connect the new movement to their existing mental model, so they can understand it quickly.

  2. Provide sufficient feedback and instruction. Interaction design is like designing a conversation. The computer we interact with should not be a passive message receiver; it is better if it can actively give us feedback or take action. By analogy, you will learn Yoga movements more quickly in a Yoga class than by watching tutorials on YouTube.

This system uses shape-drawing behaviors to trigger different sounds by capturing the distinct audio signals produced by a stick scratching on the surface.

 

Human input / Feedback: Drawing a line triggers a cello sound, while drawing a triangle causes a bell to ring. The mapping matches the feeling of the shape with the vibe of the sound: a line feels mild while a triangle feels tinkling.

Metaphor: Drawing simple shapes on a surface is a movement we are all familiar with, so users can easily learn and replicate the interaction. 
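The actual detection was built in Max with a microphone. As a rough Python sketch of the idea only, one could count scratch strokes from the signal's amplitude envelope (a line is roughly one continuous stroke, a triangle roughly three) and map the result to a sound; the frame size, thresholds, and the play_sound() helper are assumptions, not the real patch.

    import numpy as np

    def count_strokes(samples, frame=1024, threshold=0.02):
        # A stroke shows up as a run of loud frames preceded by silence in the
        # microphone signal's RMS envelope.
        frames = samples[: len(samples) // frame * frame].reshape(-1, frame)
        loud = np.sqrt((frames ** 2).mean(axis=1)) > threshold
        return int(np.sum(loud[1:] & ~loud[:-1]) + loud[0])

    def classify_and_play(samples, play_sound):
        strokes = count_strokes(samples)
        if strokes <= 1:
            play_sound("cello")   # line: one continuous scratch, mild sound
        elif strokes >= 3:
            play_sound("bell")    # triangle: three scratches, tinkling sound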

Shape-aware surface interaction

ROLE

Designer

Developer

TOOL

Max

Microphone

INTERACTION

Drawing different shapes on the table surface causes different sound feedback

drawing metaphor.png
shape.gif

HERE to see more details of the project.

Mid-air drawing

ROLE

Designer

Developer

TOOL

Unity 3D

IMU of phone

INTERACTION

Drawing on a digital canvas via natural drawing behaviors in mid-air

The system supports the natural movement of drawing with a flat brush. I used the IMU of a mobile phone to capture the movement of the human hand. The phone is attached to a flat brush, which provides a good affordance so that users effortlessly know how to use this tool to paint on a digital canvas. 
 

Human input/Feedback: Move the brush to draw. The system simultaneously renders the movements of the user's hand on a digital canvas, so it feels like real painting. The system also provides sound feedback to remind users of the direction of movement.

Metaphor: With existing knowledge of how to use a flat brush on a real canvas, users can easily learn how to draw in mid-air with natural body movement, free from the limitation of a small screen. 
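A minimal sketch of how phone IMU data could drive the brush; the real project was built in Unity with the phone's sensors, so the naive Euler integration, the brush length, and the canvas projection below are assumptions for illustration.

    import math

    BRUSH_LENGTH = 0.3   # assumed distance from the phone to the bristle tip, in meters

    def integrate_orientation(orientation, gyro_rate, dt):
        # Naive Euler integration of the gyroscope's angular rates (rad/s) into
        # (roll, pitch, yaw) angles; a real implementation would also fuse the accelerometer.
        return tuple(angle + rate * dt for angle, rate in zip(orientation, gyro_rate))

    def brush_tip_on_canvas(orientation):
        # Project the brush tip onto a vertical canvas in front of the user.
        roll, pitch, _ = orientation
        x = BRUSH_LENGTH * math.sin(roll)
        y = BRUSH_LENGTH * math.sin(pitch)
        return x, y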

mid-air.gif

HERE to see more details of the project.

Gesture controlled concert

ROLE

Designer

Developer

TOOL

Unity 3D

Leap Motion

INTERACTION

Lighting Band is a gesture-controlled music and lighting generation application. In this project, I focused on natural gesture interaction design: the gestures are not only functional but also expressive.

One of the biggest challenges of building gesture interfaces is high-level interpretation. The interface does not necessarily have to be easy to learn, but it should be effortless to replicate once users have learned it. My approach to building a comprehensive gesture interface system is to find a proper gesture mental model and apply it to suitable tasks. Therefore, instead of directly transplanting conducting gestures into my application, I first did some research on the mental model behind the conducting gesture system. My questions were why conductors use these gestures and how I could abstract them and apply them to a general HCI scenario.

 

The core function of conducting gestures is to give musicians clear signals about the important points of real-time playing: the starting point, the ending point, and the beats. These gestures need to stand out among the sequential movements of the conductor's hands so that the musicians can distinguish and interpret them correctly, even from a long distance away. One trick conductors use to differentiate these gestures is hand shape, for example a palm facing forward indicating a start while a palm facing up means a stop. In the game's interaction design, I used this trick for the selection and on/off tasks: the user uses two different shapes of the left hand, palm facing down and palm facing up, to start and stop a beat clip. The two hand shapes are distinct enough that neither the user nor the computer easily confuses them. In addition, there are three beat clips located at three different positions, visually indicated by three spheres of different colors; the user points the left hand toward the one they want to play or stop. As a system, these gestures can still be learned naturally and performed by users in real time.

 

Besides these functional gestures, conductors perform expressive gestures while interacting with the ensemble. For example, beyond the basic beat-keeping hand movements, they vary the stress of each beat to communicate their understanding of the music to the musicians: whether the music is edgy (hitting hard between beat points) or smooth. Their hand movement range also indicates the overall feeling of the piece, such as vigorous or elegant. Similarly, in this music and lighting generation game, the right hand is in charge of speed control: the movement speed of the right hand changes the playback speed of the music and the intensity of the lights correspondingly. A larger hand movement range leads to a more intensive lighting change, creating a vibe that matches the player's emotional expression.
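A rough Python sketch of this gesture mapping; the real application ran on Leap Motion with Unity, so the simplified palm values, thresholds, and clip names below are stand-ins for the actual tracker data and assets rather than the real implementation.

    CLIPS = {"left": "clip_a", "center": "clip_b", "right": "clip_c"}   # placeholder names for the three colored spheres

    def left_hand_command(palm_normal_y, pointing_zone):
        # Palm facing down starts the selected beat clip, palm facing up stops it;
        # where the left hand points picks one of the three clips.
        clip = CLIPS[pointing_zone]
        if palm_normal_y < -0.5:
            return "play", clip
        if palm_normal_y > 0.5:
            return "stop", clip
        return None, clip

    def right_hand_expression(palm_speed_mm_s, movement_range_m):
        # Faster right-hand movement speeds up the music; a larger movement range
        # makes the lighting change more intense.
        tempo_scale = 0.5 + min(palm_speed_mm_s / 1000.0, 1.5)
        light_intensity = min(movement_range_m / 0.5, 1.0)
        return tempo_scale, light_intensity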

HERE to see more details of the project.

Birdman is a flying-simulation application built on an optical motion tracking system, which lets users imitate a bird's flying movements to control the virtual camera. A big projection canvas in front of the user displays a moving forest scene that creates the illusion of flying.

 

Human Input / Feedback: Users explore the virtual forest by controlling their flying speed, height, and orientation through the flapping speed and range of their arms and the orientation of their body. The application provides a first-person view that creates the illusion of flying: players can imagine themselves as a bird flying among the trees of a big forest. The realistic forest scene changes dynamically as users fly, which makes the game more enjoyable. 

Birdman

ROLE

Designer

Developer

TOOL

Unity 3D

Qualisys Motion Tracking

INTERACTION

Flying like a bird

bird.gif

Metaphor: To make the interaction as natural as possible, we fully considered our natural spatial and movement systems as well as the bird's flight mechanics, and designed the following interaction mechanism (a minimal mapping sketch follows the list): 

  • The speed of arm flapping controls the flying speed. The more quickly you move your arms up and down, the faster you fly.

  • The range of arm waving controls the flying height. A larger amplitude leads to a higher flight. 

  • The body direction controls the flying direction. Turning your body right (or left) turns the flying direction right (or left) simultaneously. 

Such a seamless mapping keeps the application's learning curve from being steep at all.
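A minimal sketch of the flapping-to-flight mapping; the real project used Qualisys motion capture with Unity, so the wrist-height samples and the scaling constants below are assumptions for illustration.

    def flight_from_flapping(wrist_heights, torso_yaw, dt):
        # wrist_heights: recent vertical wrist positions (meters), sampled every dt seconds;
        # torso_yaw: body orientation in radians. Scaling constants are assumptions.
        window = max(len(wrist_heights) - 1, 1) * dt
        flap_speed = sum(abs(b - a) for a, b in zip(wrist_heights, wrist_heights[1:])) / window
        flap_range = max(wrist_heights) - min(wrist_heights)
        forward_speed = 2.0 * flap_speed        # faster flapping, faster flight
        target_height = 1.0 + 4.0 * flap_range  # larger amplitude, higher flight
        heading = torso_yaw                     # body direction steers the flight
        return forward_speed, target_height, heading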

bird interaction.gif

HERE to see more details of the project.

 

Takeaway

As designers, we should respect users' mental models, many of which are inherited from the 2D flat-screen interaction system. But, more importantly and meaningfully, we should encourage and support users to explore their potential as creatures with whole bodies rather than only heads and hands. These projects showed me that a good metaphor and sufficient feedback are effective ways to help users overcome their resistance to whole-body interaction. 

 

In these projects, I also found that it's important to choose a tracking technology suited to the body movement being captured. For example, a microphone can handle shape detection, but inertial and visual tracking perform better for that kind of behavior detection. In the Birdman project, the motion tracking is great, but the system is tied to dedicated devices and a fixed location, so it can't be used in daily life. Compared with such a motion tracking system, the camera on our phone could be a great everyday alternative.

 

These body movements can be functional, semantic, and even aesthetic, and once you learn them, they release some of the cognitive load demanded by the current GUI system.

Haptic NHCI Lab

Haptic and Naked-eye 3D Display

Alibaba NHCI Lab Summer Intern

 

It is not enough for us humans to express our thoughts to the computer as naturally as possible. The computer should also be able to express what it wants us to do and what it understands, through proper feedback. 

 

"Interaction design is designing a conversation between users and computers."

 

In summer 2018, I worked at the Alibaba NHCI Lab in the Bay Area as an engineering and research intern, exploring the meaning of natural human-computer interaction (NHCI) for a shopping application. 

Screenshot 2019-04-10 8.48.58 AM.png

The system includes three parts (a rough sketch of how they work together follows the list):

  • SeeFront 3D monitor: an autostereoscopic 3D monitor. SeeFront tracks the user's eyes and projects different views to the left and right eye, so the user can see 3D objects pop out of the screen without wearing any glasses or a headset.

  • Leap Motion: supports gesture interaction.

  • Ultrahaptics: this device uses ultrasound to provide mid-air haptic feedback.
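A rough sketch of the touch-triggered haptic loop in this setup. The hand_tracker and haptics objects are hypothetical wrappers rather than the actual Leap Motion or Ultrahaptics SDK calls, and the contains() check stands in for whatever collision test the 3D scene on the SeeFront monitor uses.

    def haptic_loop(hand_tracker, haptics, virtual_objects):
        # hand_tracker and haptics are hypothetical wrappers, not real SDK calls;
        # virtual_objects come from the 3D scene rendered on the SeeFront monitor.
        while True:
            fingertip = hand_tracker.index_fingertip_position()   # (x, y, z) in meters
            touched = next((obj for obj in virtual_objects if obj.contains(fingertip)), None)
            if touched is not None:
                haptics.emit_focal_point(fingertip)   # focus ultrasound where the finger "touches"
            else:
                haptics.stop()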

 

From the study, we found that the haptic feedback the Ultrahaptics device provides when the user touches a virtual object improves their judgment of the object's position. However, users found the haptic feedback strange to interpret: it feels like wind rather than like touching something. So although this feedback may increase selection accuracy from a functional perspective, the interaction between the computational system and the human felt so unnatural that users didn't like it at all. What's more, although the system is powerful enough to let users interact with digital content using their hands and gaze, these features didn't make people feel natural either. 

 

This project raised some questions for me: 
What is a good interaction system? 

What is the aesthetic of the interaction system?

What is natural human-computer interaction?

 

For me, interaction design is designing a conversation between users and computers. A good conversation should fully consider the participants' needs, those of the computer as well as those of the user. I believe that in a realistic 6-DOF 3D world, we can go back to our nature, release it, and, more meaningfully, improve upon it.

 

To be continued...

 

To figure out these questions, let's go back to the origin of human-computer interaction, or human-product interaction.

⭐️2D Interface and Interaction Design

 
