How VR works – completed

Today, I completed University of California San Diego’s edX course „How Virtual Reality works“. I worked on all 6 project assignments – the last one was a very good one, as it combined many things we had worked on and offered the chance to explore and evaluate a current Google Cardboard app against a catalogue of criteria. I picked the Android app „Apollo 15 Moon landing VR“ and didn’t regret my choice.

Below are my notes from the second part of the MOOC – as a reminder for me later on and maybe helpful for others.

  • week 4: Travel and Wayfinding
  • week 5: Menus and Text Input
  • week 6: VR Design Principles

week 4 – Travel and Wayfinding

Navigation has two components: travel (the actual movement, whether searching for a specific target, maneuvering or exploring the environment) and wayfinding (the cognitive component).

In VR, how the user travels is a design decision (teleport, active or passive role, …); the steps are choosing a destination and choosing the travel technique (among them physical locomotion, steering a vehicle, instantaneous travel). Physical locomotion techniques are intuitive, but limited by the available space and by what the human body can do – because of their high level of immersion, the entertainment industry (The VOID, Modal VR via backpacks, treadmills) and phobia treatment make use of them. Steering techniques via physical devices can cause motion sickness, whereas target-based techniques (the user points to a location and is moved there, via markers on the floor or maps) are not intuitive and might be disorienting.

Wayfinding means defining a path from A to B. It draws on spatial knowledge (exocentric knowledge, for which the user’s location is irrelevant; typically map-based) and on visual cues (egocentric wayfinding cues).
To make wayfinding tasks easier, the environment design can include natural environment features (characteristic mountains etc.) and architectural objects.
The most important wayfinding aids are landmarks, signs and maps. Landmarks are features that can be seen from far away, such as tall structures or buildings; they can be man-made or natural (rivers, mountains). Landmarks can also be local and serve as visual cues for where to take a turn. Signs exist solely as wayfinding aids: they can look as they do in the real world (attached to a post), be directional arrows pointing to a destination, or be user-specific labels placed directly on a target, for example to mark an entrance. In VR, maps can be shown at varying levels of resolution or zoom; for orientation, the two approaches are north-up and forward-up (track-up).
Disadvantages in VR wayfinding are the limited resolution of the screen (therefore text is hard to read) and limited field of view (typical HMDs only have 90-100 degrees).
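
The difference between a north-up and a track-up map can be sketched in a few lines of Python. This is only my own illustration, not something from the course: I assume a flat 2D world with x pointing east and y pointing north, a heading measured in degrees clockwise from north, and a hypothetical helper called `world_to_map`.

```python
import math

def world_to_map(point, user_pos, heading_deg, track_up=False):
    """Return the map offset (right, up) of `point` relative to the user.

    Assumptions (mine): x = east, y = north, heading in degrees
    clockwise from north.
    north-up: the map axes stay aligned with the world.
    track-up: the map is rotated so the user's heading points up.
    """
    dx = point[0] - user_pos[0]
    dy = point[1] - user_pos[1]
    if not track_up:
        return (dx, dy)
    h = math.radians(heading_deg)
    # Project the offset onto the user's "right" and "forward" axes,
    # so whatever lies straight ahead ends up at the top of the map.
    right = dx * math.cos(h) - dy * math.sin(h)
    up = dx * math.sin(h) + dy * math.cos(h)
    return (right, up)
```

With the user facing east (heading 90), a landmark 10 units to the east sits to the right of the user on a north-up map, but straight up on a track-up map.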

week 5 – Menus and Text Input

VR interactions like selection, manipulation and navigation map to things we do in the real world, but the interaction types „menus“ and „text input“ are more abstract.

In VR we should always try to interact directly with the objects of the environment (no slider to change size; instead grab the object and pull). Menus are very often implemented as easy-to-learn flat 2D menus (vertical and oriented towards the user). They can be placed in the environment (the room the user is in, e.g. Oculus Home), in user space (on the user’s arm, …) or in object space (to resize, delete or move an object). 3D menus would be more natural (maybe as cubic menus arranged around an object), but 3D menu items may be occluded or lie outside the user’s view.

There are different kinds of interfaces and input options for VR: tangible interfaces (mostly physical objects with sensors, as in flight simulators, special control panels, …), gesture control interfaces (MS Kinect, Leap Motion, …; disadvantage: gesture commands have to be learned and practiced) and voice commands (systems like Amazon Echo, Google Home, Apple Siri and MS Cortana are already designed to recognize natural human speech – with the right software they could control a VR application). Text input in VR is difficult, but occasionally there is a need for it (passwords, saving things under a filename, numbers in 3D design and modeling apps, labels in engineering apps, …) – virtual keyboards are an active research area.

week 6 – VR Design Principles

Feedback: A user should get feedback after interactions in VR systems to his visual sense, auditory sense and haptic sense (the more senses the better). If the VR hardware has no haptic output, substitution can help, e.g. show visually what can’t be felt.
Temporal compliance (image updates keep up with head motion) is important, so latency (the delay between an input and the rendered image that reflects it) should be kept to a minimum – to avoid image judder, the programmer can reduce the amount of data that is displayed (less complex 3D models).
Spatial compliance (when a user moves an object, his finger motions need to comply with the position of the controller and its effect on the VR environment – dials instead of sliders are recommended) and nulling compliance (going back to original position and original value) are also important.

Constraints: Artificial constraints should be used to increase the usability of VR apps (limit DOFs / use physics engines like NVIDIA’s PhysX SDK in Unity to make apps comply with the laws of physics / use dynamic alignment tools such as snapping user input to a grid or limiting values).
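
To make the grid and limit idea more concrete, here is a minimal sketch in Python. The function names, the grid size of 0.25 and the limits of ±2.0 are my own assumptions for illustration; a real engine such as Unity has its own tools for this.

```python
def snap_to_grid(value, grid_size):
    """Snap a coordinate to the nearest multiple of grid_size."""
    return round(value / grid_size) * grid_size

def clamp(value, low, high):
    """Limit a value to the range [low, high]."""
    return max(low, min(high, value))

def constrain_position(pos, grid_size=0.25, low=-2.0, high=2.0):
    """Apply both constraints to each coordinate of an (x, y, z) position:
    first snap to the grid, then keep the result inside the allowed range."""
    return tuple(clamp(snap_to_grid(c, grid_size), low, high) for c in pos)
```

A slightly shaky hand position like (0.3, 1.9, 5.0) would then land cleanly on a grid point inside the allowed volume instead of wherever the controller happened to be.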

Human factors: It is important to think about the VR target audience: their previous VR experience (to avoid feeling overwhelmed or getting motion sick), age (maybe limited vision – adjust use of text), height (when moving things physically), VR hardware (are 1 or 2 hand controllers necessary? often, one controller is to hold an object which you manipulate with the other one) etc.

Isomorphic or non-isomorphic approaches: Whereas isomorphic approaches to VR application design are based on the real world and used for simulation apps (testing cars with a VR steering wheel etc.), non-isomorphic approaches let us overcome human limitations and the laws of physics („magic“). Non-isomorphic approaches often borrow ideas from literature and film, and are enjoyable and easy to understand when based on well-known ideas – on the other hand, they are difficult to create because of people’s expectations.

VR system evaluation needs planning and statistical analysis – the best way is to test with target users under controlled conditions. Tests of functionality would include: frame rate throughout all parts of the virtual world (today >=90 fps for high-end HMDs) / latency and lag (could cause motion sickness) / network latency, if it can impact the latency of the app’s rendering engine / user comfort (motion sickness, how well the HMD fits the head, how the app is used).
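
The frame rate requirement translates directly into a time budget per frame, which a test could check against recorded frame times. The sketch below is my own illustration in Python; the function names and the sample frame times are made up.

```python
def frame_budget_ms(target_fps):
    """Time available to render one frame, in milliseconds."""
    return 1000.0 / target_fps

def frames_over_budget(frame_times_ms, target_fps=90):
    """Return the recorded frame times that exceed the budget."""
    budget = frame_budget_ms(target_fps)
    return [t for t in frame_times_ms if t > budget]

# Hypothetical frame times (in ms) recorded while walking through a scene.
recorded = [10.0, 11.0, 12.5, 9.8, 20.1]
slow = frames_over_budget(recorded)
```

At 90 fps the budget is roughly 11.1 ms per frame, so in this made-up recording the frames that took 12.5 ms and 20.1 ms would be flagged.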

What could be future aspects of VR?
Whereas today’s VR apps can be written quickly with VR authoring tools (Unity 3D, Unreal Engine, Amazon Lumberyard – disadvantage: you are locked in to the provider) with powerful built-in functions, in the future, VR development might happen directly in VR. There will be massive changes in audio for VR (so far the tools cover the visual side, but not yet audio). Thermal issues with mobile VR based on smartphones might be solved. Maybe there will be „inside-out tracking“ from the HMD (sensors built into the HMD). Future tech specs (today’s devices render 1200×1080 pixels, 15 pixels per degree across a 90-degree field of view, with a fixed focal distance) might offer a new level of presence in VR. There are indications that VR and AR are going to merge. There are high revenue predictions for VR (half of it for software development for apps, games and theme park solutions) and AR.

Today, I also decided to upgrade to the verified track in order to get the course certificate for „How Virtual Reality works“: paying with credit card was easy but it remains to be seen if my (not so good) webcam photo in combination with the (not so good) webcam-photo of an ID (I tried my driver’s license) fits the requirements …

Back to my blog post: How VR works (week 1 – 3)

(Update July 2, 2017)
Verification worked, that’s my verified course certificate of achievement:

Street View 360° panorama shots

One of the tasks in the „How VR works“ MOOC included taking a 360° panorama shot with the Street View Smartphone App. At first I ignored the 360° version and just took a panorama shot by turning myself once around: the photo was quite good and done in seconds, but it wasn’t very immersive because the upper part and lower part were „missing“.
Doing it correctly wasn’t easy, as I was standing in the Palatinate Forest and, in the end, the trees consisted of many single pictures which had to fit together – otherwise you get ghost trees coming out of nowhere …
This is the best of my attempts which I already uploaded to Google, so it can be found via this short url ( or you click on the full url:,8.1245107,0a,82.2y,80.2h,90t/data=!3m4!1e1!3m2!1sAF1QipN0f86wmY0QEELdJKbx0iR1-62RdoYOiSavywLv!2e10?source=apiv3


How VR works

Virtual Reality is a topic which is very interesting for me and I have invested a fair amount of my spare time since last summer in learning more about the subject. Right at this moment, I’m in the middle of a 6-week edX-MOOC from UC San Diego „How Virtual Reality works“.
For sure, it would be nice to get a certificate – but since my last edX MOOC, edX has changed its policy, so that now there are only verified certificates for a fee that varies by course (in this case 99 dollars). In order to get it, there would be many assignments – among them quizzes, engagement tasks (watching videos and marking them as complete etc.), discussions and project tasks with peer review. The weekly short videos are very informative and don’t take a lot of time, whereas the project tasks did cost me a lot of time, and I’m no big fan of peer reviews. I would have preferred forums where I could choose which answers I would like to read and engage with, instead of getting 3 submissions with a lot of (formal) questions and points from 0 to 3.

What’s the content of this MOOC? In the beginning, I thought it was about WebVR, but so far WebVR has only appeared in the optional course sections. There has already been a lot of background theory relevant to VR, though:

  • week 1: Introduction to and History of VR & The Human Visual System and Display Methods
  • week 2: Input Devices: Degrees of Freedom and Tracking
  • week 3: Selection and Manipulation

Regarding a definition of VR, there are 3 defining elements: Virtual World (can resemble reality or be fictional), Interaction, Immersion. Today, VR is present in many different industries, but gaming is still the dominant driver. I was astonished to hear that the first HMD, the „Sword of Damocles“, goes back to the 1960s (it was so heavy that it had to be suspended from the ceiling by a mechanical arm). Until the 1970s, tracking was done mechanically with arms anchored in the floor or ceiling. In 1979, the first electromagnetic system (tracking over longer distances and without a mechanical arm) followed from the company Polhemus, and the first data gloves came in 1985. In the 1990s, walk-in VR systems (CAVEs) were invented; they mostly used electromagnetic or ultrasonic tracking. In 2002, optical tracking systems became available (thanks to Hollywood), and the development of smartphones led to their use as screens for consumer VR devices many years later. In 2016, there were HMDs like Oculus Rift CV1, HTC Vive, Sony PlayStation VR and Microsoft’s HoloLens.
The part „Human Visual System“ was a little bit harder to follow: monocular depth cues like occlusion, linear perspective, shadows, motion parallax and accommodation aren’t a topic you often hear or think about. Regarding human eye specs, some parameters of the eye are important for VR: color and field of view (220 degrees from both eyes, but binocular vision for 3D stereo covers just 120 degrees). We also heard about active stereo, passive stereo and autostereoscopic displays (wearing glasses not necessary).

Input devices for 3D VR environments are special (you can’t just use a mouse) and you have to think about Degrees of Freedom (DOF), i.e. the ways an object can move within a space. There are 6 DOF in a 3D space, which can be divided into 2 categories: translational (left/right, forward/backward, up/down) and rotational (pitch, yaw, roll) movements. You also have to distinguish between position and orientation and know about relative and absolute DOF – absolute positioning devices use a direct mapping between control space and virtual space.
There are different tracking systems for VR: mechanical tracking (can’t be used for walk-around VR applications), electromagnetic tracking (also with range limitations), ultrasonic tracking, inertial tracking (based on gyroscopes and accelerometers) and optical tracking (became popular for motion capture in Hollywood movies). The least expensive way to do VR is with smartphones, but that doesn’t allow for positional tracking (only the head orientation is tracked).
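
As a reminder for myself, the 6 DOF and the relative/absolute distinction can be written down as a small Python sketch. The class and function names are my own invention, and for simplicity the two update functions only touch the position; combining rotations properly takes more than adding angles.

```python
from dataclasses import dataclass, replace

@dataclass(frozen=True)
class Pose6DOF:
    # 3 translational DOF (position)
    x: float = 0.0      # left/right
    y: float = 0.0      # up/down
    z: float = 0.0      # forward/backward
    # 3 rotational DOF (orientation), in degrees
    pitch: float = 0.0
    yaw: float = 0.0
    roll: float = 0.0

def move_relative(pose, dx, dy, dz):
    """Relative device: the input is a change that is added to the pose."""
    return replace(pose, x=pose.x + dx, y=pose.y + dy, z=pose.z + dz)

def set_absolute(pose, x, y, z):
    """Absolute device: the measured position maps directly to the pose."""
    return replace(pose, x=x, y=y, z=z)
```

Two relative moves of 1 unit along x add up to x = 2, whereas an absolute device simply overwrites the position with whatever it measures; the orientation stays untouched in both cases.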

The course section „selection and manipulation“ was easier to follow for me because of personal experiences with VR. Selection means the option of picking one of many objects in VR, manipulation means the action afterwards which modifies an object or makes changes to the virtual world. Both depend heavily on the capabilities of the interaction device which is used.
Important questions for effective selection are: How far away is the object? What size is it? How dense are the objects (and are there obstacles) around the object to be selected? How accurate is the tracking device?
Manipulation of VR objects can mean positioning them (moving them around) or rotating them – in the real world we are used to 6 DOF, but few controllers are able to do that.
Selection and manipulation interaction techniques for VR can be isomorphic (mimic the real world) or non-isomorphic („magic“ – not limited by the laws of physics, but therefore possibly not intuitive). There is also the distinction between egocentric (user is at the center, real hand = virtual hand, Go-Go technique, laser pointer or flashlight technique etc.) and exocentric (world-in-miniature approach, voodoo doll approach etc.) interaction methods. As low-cost VR solutions like smartphone & Cardboard don’t have controllers, selection and manipulation rely on tracking the user’s head direction: the head gaze typically controls a cursor in the center of the screen, and by moving the head the user moves the cursor onto a predefined object to select it. Manipulation is often done simply by hovering the cursor over an object for a certain amount of time (2 seconds or so), which then triggers an action. Hand gestures would be the most natural way to interact in VR (with devices like Leap Motion or Microsoft Kinect, which can detect finger pinches), but there is still the issue that they provide no haptic feedback when touching an object.
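
The hover-to-trigger idea for Cardboard-style apps can be sketched as a small dwell timer. This is my own minimal Python illustration, not code from the course or from any SDK; the class name `DwellSelector` and the way the gazed-at object is passed in once per frame are assumptions.

```python
class DwellSelector:
    """Triggers an object once the gaze cursor has rested on it long enough."""

    def __init__(self, dwell_seconds=2.0):
        self.dwell_seconds = dwell_seconds
        self.target = None      # object currently under the cursor
        self.elapsed = 0.0      # time the cursor has rested on it
        self.fired = False      # avoid triggering the same object repeatedly

    def update(self, gazed_object, dt):
        """Call once per frame with the object under the cursor (or None)
        and the frame time dt in seconds. Returns the object to activate,
        or None if nothing should happen this frame."""
        if gazed_object != self.target:
            # The gaze moved to something else: restart the timer.
            self.target = gazed_object
            self.elapsed = 0.0
            self.fired = False
            return None
        if gazed_object is None or self.fired:
            return None
        self.elapsed += dt
        if self.elapsed >= self.dwell_seconds:
            self.fired = True
            return gazed_object
        return None
```

Looking away resets the timer, so glancing across an object on the way to another one doesn’t trigger it by accident.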

This covers my impressions of week 1 to week 3; hopefully there are no big mistakes in my notes.