Xerox Alto, 1973, Developed by Xerox PARC
During computer research at the Stanford Research Institute (SRI) in the 1960s, Douglas Engelbart discovered a need for the machine to understand his intentions on the screen.
Doug Engelbart at his famous demo "The Mother of all Demos" in 1968 https://www.youtube.com/watch?v=yJDv-zdhzMY
"I want to interact with THIS object on this graphical user interface."
This is how I imagine his thoughts while he was using the early GUI prototypes of the time.
His thought originated from a personal need to get something done. The solution became the famous combination of a point of reference on the screen and a natural, intuitive interaction method for moving that point of reference. Translating the input data allowed the physical movement of the hand to be interpreted as movement across the 2D matrix of the screen.
Imagine how bad it would have been if we had to move on the x-axis first, and afterwards on the y-axis, to reach a location.
And thus, the first computer mouse was born.
With AR and VR, we are facing the same communication gap between machine and human intention. This time, however, the interface is spatial.
"I want this object I am staring at to show me the hidden virtual information it has stored."
This, for example, should be possible without workaround interaction systems.
Could understanding the intention manifested in our gaze lead to an understanding of how hidden virtual information should be shown on the objects in the space around us?
The eye tracking cursor was set at the exact point where the eyes were assumed to be looking, and it appeared on the first surface the gaze vector touched.
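The "first surface it touched" rule can be sketched as a simple ray cast. This is a minimal illustration, not the code we used in the prototype: it assumes every surface is an infinite flat plane, and the names `Plane` and `first_hit` are made up for this example.

```python
from dataclasses import dataclass
from typing import Optional, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def dot(a: Vec3, b: Vec3) -> float:
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]


@dataclass
class Plane:
    """Hypothetical surface model: an infinite plane (a simplifying assumption)."""
    name: str
    point: Vec3   # any point on the surface
    normal: Vec3  # surface normal, does not need to be unit length


def first_hit(origin: Vec3, direction: Vec3,
              surfaces: Sequence[Plane]) -> Optional[Tuple[str, Vec3]]:
    """Return (surface name, cursor position) for the nearest surface in front of the viewer."""
    best = None
    for surface in surfaces:
        denom = dot(direction, surface.normal)
        if abs(denom) < 1e-9:
            continue  # ray runs parallel to the surface, no intersection
        to_surface = tuple(p - o for p, o in zip(surface.point, origin))
        t = dot(to_surface, surface.normal) / denom
        if t <= 0:
            continue  # surface lies behind the viewer
        if best is None or t < best[0]:
            hit_point = tuple(o + t * d for o, d in zip(origin, direction))
            best = (t, surface.name, hit_point)
    return None if best is None else (best[1], best[2])
```

Calling `first_hit` with the eye position, the assumed gaze direction, and a list of surfaces returns the closest surface along the gaze and the point where the cursor would be drawn, or `None` if the gaze hits nothing.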
When testing, the cursor just jumped all over the place, partly because the eye tracking isn't very precise and partly because we are continuously scanning the environment. Using an eye tracker as a cursor will lead to confusion and misinterpretations of intention. Well, at least in the way we have built and tested it.
We can assume that future AR applications will combine 2D interfaces with a spatial environment to interact with. Maybe not in 10 years, but definitely at some point in the not so distant future.
The problem with defining a cursor in 3D is that the vector from your eyes to your focus point passes through infinitely many positions on the way.
Where exactly is she looking?
In a perfect world, the exact gaze point could be determined, but at the moment the technical capabilities are not accurate enough. Furthermore, the distance in most situations isn't much greater than in the illustration. If a holographic 2D surface and a real surface both lie on the gaze vector (which happens very often), the user's intention cannot be understood from eye tracking without a high number of misinterpretations.
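To make the ambiguity concrete, here is a small sketch that lists every surface lying on the same gaze vector. It rests on assumptions of our own: the viewer sits at the origin looking along +z, and every surface is a rectangle centered on the z-axis and facing the viewer. `Panel`, `surfaces_on_gaze` and `is_ambiguous` are hypothetical names for illustration only.

```python
from dataclasses import dataclass
from typing import List, Sequence, Tuple


@dataclass
class Panel:
    """Hypothetical rectangle facing the viewer, centered on the z-axis."""
    name: str
    depth: float        # distance in front of the viewer, in meters
    half_width: float   # half of the horizontal extent, in meters
    half_height: float  # half of the vertical extent, in meters


def surfaces_on_gaze(direction: Tuple[float, float, float],
                     panels: Sequence[Panel]) -> List[str]:
    """Return the names of all panels crossed by the gaze ray, nearest first."""
    dx, dy, dz = direction
    if dz <= 0.0:
        return []  # looking away from every panel
    hits = []
    for panel in panels:
        scale = panel.depth / dz         # stretch the ray until it reaches the panel depth
        x, y = dx * scale, dy * scale    # where the ray crosses that depth
        if abs(x) <= panel.half_width and abs(y) <= panel.half_height:
            hits.append((panel.depth, panel.name))
    return [name for _, name in sorted(hits)]


def is_ambiguous(direction: Tuple[float, float, float],
                 panels: Sequence[Panel]) -> bool:
    """More than one surface on the ray means the gaze alone cannot tell them apart."""
    return len(surfaces_on_gaze(direction, panels)) > 1
```

With a small hologram one meter away and a wall three meters behind it, a straight-ahead gaze returns both surfaces. The direction of the gaze alone does not reveal which one the user means.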
A big advantage of eye tracking intention concepts is that they require no physical or mental effort from the user. If an intention can be understood without any explicit human interaction, this would make a fantastic component for a man-machine interface.
The head movement cursor was placed along the vector pointing straight out from the head. The cursor was designed to move with the head, and it also appeared on the first surface it touched.
The head rotation cursor appeared when the direct outgoing vector touched any real surface. A gyroscope provided the sensor data.
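As a rough sketch of how head orientation becomes an outgoing vector, the snippet below converts yaw and pitch angles into a unit direction. It assumes the gyroscope data has already been turned into yaw and pitch angles in degrees, and it uses a coordinate convention chosen for this example (+x to the right, +y up, +z straight ahead). The function name `head_forward` is hypothetical.

```python
import math
from typing import Tuple


def head_forward(yaw_deg: float, pitch_deg: float) -> Tuple[float, float, float]:
    """Unit vector pointing out of the head.

    Assumed convention: yaw turns the head to the right (toward +x),
    pitch tilts it upward (toward +y), and (0, 0) looks along +z.
    """
    yaw = math.radians(yaw_deg)
    pitch = math.radians(pitch_deg)
    x = math.cos(pitch) * math.sin(yaw)
    y = math.sin(pitch)
    z = math.cos(pitch) * math.cos(yaw)
    return (x, y, z)
```

The resulting vector can then be cast into the scene, and the cursor is drawn wherever it first meets a real surface.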
When testing, we as users immediately began to understand how the cursor behaves, and we adjusted our bodies to the cursor accordingly.
At first, such inadvertent reactive adjustment of behavior might sound like some sort of unnatural intervention, but in fact it is natural, and even empowering.
Such a cursor seems to work only spatially, and not with a 2D interface. It also requires a minimum distance of at least 2 meters and a maximum distance of 10 meters (or more, depending on the size of the object to interact with). The minimum distance is needed because the sensitivity and accuracy of the cursor movement depend on the distance to the object.
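The relation between distance, object size and cursor sensitivity can be illustrated with some basic trigonometry. This is a back-of-the-envelope sketch under our own assumptions (a flat target facing the user, head rotation around a fixed point), not a measurement from our tests. The function names are hypothetical.

```python
import math


def angular_size_deg(object_size_m: float, distance_m: float) -> float:
    """Angle in degrees that an object of the given size covers in the field of view."""
    return math.degrees(2.0 * math.atan(object_size_m / (2.0 * distance_m)))


def cursor_travel_m(distance_m: float, rotation_deg: float) -> float:
    """Distance in meters the cursor moves on a facing surface for a given head rotation."""
    return distance_m * math.tan(math.radians(rotation_deg))
```

Under these assumptions, a 0.5 meter object covers roughly 14 degrees at 2 meters but only about 3 degrees at 10 meters, while one degree of head rotation moves the cursor about 3.5 cm at 2 meters and about 17 cm at 10 meters. Far away, small objects demand very fine head control. Close up, the head has to rotate a lot to get from one object to the next.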
In conclusion, we suggest using a spatial cursor driven by head movement tracking, although it all depends on the actual use case: what can be done once the user's intention to interact with a nearby object is understood.
The potential for seamless interaction is very high, because relatively little effort is required: the user is already looking more or less in the “right” direction as a matter of behavioural habit. According to the Law of Laziness, this interaction method is highly recommended in the palette of spatial AR applications.
We noticed a difference in effort and precision between upwards and sideways head rotations when testing the head movement cursor.
For up and down nodding, the vertical movement is very even and precise. For upward head movement, the effort is high due to our physiology as humans. For sideways head movement, the effort is relatively low, but the horizontal movement is not even and requires correction based on the visual feedback of the cursor. The number of muscles that come into play for this activity is a lot higher than for an up-down movement of the head. This should be considered when setting expectations for head movement precision, and when moving the cursor in different directions and onto objects.
Once a valuable use case in spatial AR is defined, consider head movement cursors for a low-effort interaction.
The use of eye tracking as a spatial cursor is not recommended (for now).
AR interfaces can be a mix of spatial and 2D interfaces, and the cursor idea borrowed from 2D space seems to be less robust when it is confronted with this more complex setting. However, better alternatives are still rare.
Last edited on April 25, 2019, 8:00 PM. Published by Daniel Seiler, edited by Camilla Burchill.