How the Metaverse Masters Realism in Its World of Alternate Existence
You watch a movie rich in visual effects and computer graphics. What gets you so involved in the imaginary story world and its characters? The sense of realism they carry, isn’t it? The same applies to three-dimensional virtual worlds and the metaverse. For users to be totally immersed in a virtual world, it has to be that realistic and believable. And in the world of the metaverse, realism is concerned with a user’s psychological and emotional engagement with the environment. The extent to which a user feels transported into the environment, and how faithfully an avatar reflects the user’s actions, determine the believability of that environment. So let us see how a metaverse effectively carries out this task.
Sight – It comes as no surprise that sight is one of the most direct links to a metaverse. Users should be able to see the environment within a metaverse and relate to it as something they are used to. In fact, even the earliest virtual worlds were written as plain text, forming images in the mind’s eye. However, those words and symbols were indirect: they merely described the world and left the rest to the user’s imagination. A convincing virtual world eliminates this indirectness and makes the environment as information-rich to our eyes as the real world. This richness is obtained mostly through the efficient use of real-time computer graphics. With the growth of graphics hardware and algorithms, virtual worlds have moved on from flat polygons to smooth shading, texture mapping and programmable shaders.
In the earlier days, techniques such as denser meshes of smaller polygons, higher-resolution textures, and interpolation methods such as anti-aliasing and smooth shading were used to improve visual quality. But with the arrival of programmable shaders, and their subsequent adoption as the functional baseline for 3D libraries such as OpenGL, virtual worlds and computer graphics in general began to expand their potential. When programmable shaders became common, many features of 3D models gained an algorithmic expression, eliminating the need for ever more polygons or ever larger textures. At the same time, this expanded the variety and flexibility with which objects could be rendered.
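To make the idea of features gaining an "algorithmic expression" concrete, here is a minimal sketch, in plain Python rather than a real shader language, of the per-fragment diffuse (Lambertian) lighting term that a programmable fragment shader typically evaluates for every pixel. All names are illustrative; a real shader would receive these values from the graphics pipeline.

```python
import math

def normalize(v):
    """Return v scaled to unit length."""
    mag = math.sqrt(sum(c * c for c in v))
    return tuple(c / mag for c in v)

def lambert_diffuse(normal, light_dir, base_color):
    """Per-fragment diffuse term: intensity = max(N . L, 0).

    Instead of baking lighting into denser geometry or bigger
    textures, the shader computes it algorithmically per pixel."""
    n = normalize(normal)
    l = normalize(light_dir)
    intensity = max(sum(a * b for a, b in zip(n, l)), 0.0)
    return tuple(c * intensity for c in base_color)

# A surface facing the light is fully lit ...
print(lambert_diffuse((0, 0, 1), (0, 0, 1), (1.0, 0.5, 0.2)))
# ... while one facing away receives no diffuse light.
print(lambert_diffuse((0, 0, 1), (0, 0, -1), (1.0, 0.5, 0.2)))
```

The same dot-product runs unchanged for millions of fragments per frame on the GPU, which is why shading quality scaled with hardware rather than with polygon count.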
Thus, with the growth of processing power, display hardware and graphics algorithms, immersion has expanded from pure detail to specific visual elements. This matters because modern 3D and Virtual Reality techniques allow users to interact with, and even run simulations inside, a virtual world. 3D and Virtual Reality still have one hindrance: you need glasses and headsets to be completely immersed. Such gadgets are becoming less obtrusive day by day and may eventually reach a stage where they are unnecessary altogether!
Sound – Speaking and listening are our most common forms of verbal communication, and virtual worlds and the metaverse cannot exist without them. It is this sort of communication that instantly engages us with other residents of the virtual world, and it is even more appealing than reading a virtual face, posture or movement. But beyond such direct verbal sounds, there is another kind of sound that is absolutely important. Imagine you are watching a movie where the protagonist is chasing the villain across the city. Will you be satisfied with just the panting and screaming of the two main characters? No. You need to get the feel of the city: the sound of vehicles and shops, the chatter of people, everything. All the ambience. This plays a crucial role in the metaverse too. When a user is inside a pub, the environment cannot be mute, or he will exit at the first chance! It is such sounds that give us a feel of placement inside a particular situation.
For the first part, direct verbal communication, there are already several high-end voice chat technologies that accurately capture the nuances of spoken words. The sense of hearing plays a pivotal role in conveying an avatar’s frame of mind, which enables effective interaction. But different virtual worlds call for different kinds of interaction, and you want to feel like you blend into the virtual world as you interact. With modern voice masking and modulation technologies, you can even control your voice and tone and speak the way you want. So, while communicating, you can actually hint at whether you like the other person or not!
Ambient sound is a much broader concept, and the extremely pleasing news is that today most of us are aware of what quality sound can achieve. From our smartphones to games, PCs and home theatres, most of us demand a sound system that seamlessly involves us. Our longing for “surround sound” is a clear indicator of the desire to be completely immersed within an audio environment. Traditional stereo systems have already given way to 2.1, 5.1, 7.1 and beyond.

Today, Virtual Reality and gaming researchers are dedicated to studying three-dimensional spatialized sound, to understand how to replicate the paths sound takes before reaching our ears. Before a sound reaches our ears, its waves undergo a series of changes depending on the environment, the position of the source, the location of the listener and even the shape of the listener’s head. Researchers are developing algorithms that process sound sources so that these changes are reflected in what the listener hears. Such 3D sound research focuses largely on the shape of the listener’s head to analyse how the sound forms. This is studied mainly using a mathematical model called the Head-Related Transfer Function (HRTF), which is in turn part of a larger field known as binaural rendering: producing a kind of surround sound that comes not only from left and right but also from above and below. And just like fingerprints and prescription glasses, the HRTF varies from individual to individual. Today, most games and VR systems deliver a generic HRTF, but the problem is that people whose HRTF is not close to the generic one do not get a proper feeling of realism. To counter this, researchers are already working on sound propagation techniques that would enable an accurate, personalized HRTF.
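As a small illustration of the cues a spatial renderer must reproduce, the sketch below estimates the interaural time difference (ITD) for a source at a given azimuth using Woodworth’s classic spherical-head approximation, ITD = (r / c)(θ + sin θ). The level-difference term is a deliberately crude toy model, and the head radius and 6 dB ceiling are assumptions; a measured HRTF captures far more detail than either.

```python
import math

SPEED_OF_SOUND = 343.0   # m/s, air at room temperature
HEAD_RADIUS = 0.0875     # m, a commonly assumed average head radius

def interaural_cues(azimuth_deg):
    """Approximate interaural time and level differences for a
    source at the given azimuth (0 = straight ahead, 90 = right).

    ITD follows Woodworth's spherical-head formula; the ILD is a
    crude stand-in, not a measured head-related transfer function."""
    theta = math.radians(azimuth_deg)
    itd = (HEAD_RADIUS / SPEED_OF_SOUND) * (theta + math.sin(theta))
    ild_db = 6.0 * math.sin(theta)   # toy level difference in dB
    return itd, ild_db

itd, ild = interaural_cues(90)   # source directly to the right
print(f"ITD = {itd * 1e6:.0f} microseconds, ILD = {ild:.1f} dB")
```

For a source at 90 degrees the formula yields an ITD of roughly 650 microseconds, consistent with the maximum delay a human head introduces; delaying and attenuating one headphone channel by these amounts is the simplest possible binaural cue.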
Binaural rendering works best with headphones, as each side represents one ear in the sound simulation. Sound can be rendered through speakers too, but then the listener’s real environment outside the virtual world influences how the sound reaches the ears, spoiling the effect. Now, imagine a virtual world where every little interaction between avatars generates a unique and believable sound, which is then processed through sound propagation and binaural rendering to accurately depict its spatial relationship with the user’s avatar. This is where we are headed.
Touch – From walking to sitting to sleeping, you constantly use the sense of touch to control your movements so that you can complete the task at hand. This is why touch is an essential aspect of virtual-world realism. To be completely immersed in a virtual world, you need physical stimuli from the objects inside the environment. The most common touch technology in virtual worlds is haptic, or force, feedback. The idea behind haptic technology is to convert virtual contacts into physical ones; force feedback is a simpler form of haptics in which physical devices push against or resist the user’s body, especially the hands or arms.
Haptics are predominantly classified into two types: kinesthetic and tactile. Kinesthetic haptics covers what you feel through sensors in your muscles, joints and tendons. Imagine you have a mug of coffee in your hand: kinesthetic feedback informs your brain of the approximate size of the mug, its weight and how you hold it with respect to your body. Tactile haptics covers what you feel on your fingers or skin surface. The tissue in your fingers has a variety of sensors embedded in and underneath the skin, which help your brain perceive vibration, pressure, touch, texture and so on.
At the 2009 Electronic Entertainment Expo (E3), Microsoft unveiled ‘Project Natal’, a demonstration of controller-free gaming. The project later came to be known as Kinect, and it was the first popular “come-as-you-are” motion system. Users didn’t need a controller to direct the movement of characters or objects in a game; they could fight or drive a car purely with their body movements. Such controller-free motion systems became common in the following years, but they still lacked the quality of total immersion. For example, when colliding with objects or firing weapons, most of them produced a sensory mismatch that made the feedback hardly more realistic than a beep or an icon on the screen. This is why targeted haptic feedback, body sensations that integrate more naturally with the virtual environment, has become an interesting area of study in recent years.
Tactai, an American company, introduced a touch controller that clips onto your finger. The device can be worn on any or every finger, but only the index finger is required to feel things; how many you need depends on what you are doing. For example, shopping in Virtual Reality can be done with one index finger, while opening a virtual bottle of water requires a device on each index finger. What if you want to pat a virtual dog or play a virtual keyboard? You can wear five devices per hand, which gives you a remarkable feel for what you are interacting with. The Tactai software continuously reads the position of your finger in virtual space using a headset camera or an external measurement system. Each time it receives a new measurement, it determines where your finger is in the virtual space. If it is in free space, the software instructs the device’s platform to move away from your fingertip so that you feel nothing. And if your finger is on a virtual object, the software calculates a Dynamic Tactile Wave, which sends a vibration to your finger that closely matches what you would feel if you moved around the object in real space.
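The tracking loop described above can be sketched as a single decision per position update. Everything here is hypothetical: Tactai’s actual API and waveform model are not public, so the sphere stand-in for the scene, the function names, and the penetration-depth-to-amplitude mapping are all illustrative assumptions.

```python
import math

def haptic_step(finger_pos, obj_center=(0.0, 0.0, 0.0), obj_radius=0.05):
    """One iteration of a fingertip haptic loop (hypothetical names).

    A sphere stands in for the virtual scene's collision geometry.
    Returns the actuator command: retract when the fingertip is in
    free space, otherwise a vibration amplitude proportional to
    penetration depth, a crude stand-in for a tactile waveform."""
    depth = obj_radius - math.dist(finger_pos, obj_center)
    if depth <= 0:
        return ("retract", 0.0)              # free space: feel nothing
    return ("vibrate", min(depth / obj_radius, 1.0))

print(haptic_step((0.0, 0.0, 0.2)))    # fingertip outside the object
print(haptic_step((0.0, 0.0, 0.02)))   # fingertip penetrating the surface
```

In a real device this loop would run at the tracker’s update rate, with the amplitude replaced by a full vibration waveform shaped to the material being touched.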
Lamsaptics, a Los Angeles based startup and Maseeh Entrepreneurship Prize Competition team, uses ultrasound arrays and predictive-learning algorithms to generate touch sensations for VR and AR applications. Their research centres on a mobile tool that uses ultrasound transducers to shape air-pressure fields from high-frequency sound waves, triggering the neurons in your hand, much as ultrasound is used in medicine to create high-resolution images. In other words, it is a process similar to how bats echolocate using sound waves or how radars spot planes using electromagnetic waves. Their algorithms are designed to match human experience, that is, how humans perceive shapes, textures and so on. Lamsaptics aims to document all human sensations so that it can generate a richer experience for users.
Gestures and Expressions – The more natural and expressive an avatar is, the stronger the perception of reality in the virtual environment. Rudimentary avatars have already given way to broader cues such as poses, incidental sounds and facial expressions. Such cues extend to intricate details like blinking and even lip movement while talking. Anything that makes the avatar look alive contributes significantly to the sense of immersion. Jim Blascovich and Jeremy Bailenson, two of the pioneering experts in Virtual Reality, describe this quality as human social agency. They identify its main components as movement realism, which deals with gestures, expressions and postures; anthropometric realism, which deals with recognizable human body parts; and photographic realism, which deals with closeness to actual human appearance.
MindMaze introduced ‘Mask’, a VR headset that can track the wearer’s expressions and transfer them to an avatar. As a result, users can enact with their own face in VR, which improves personalization and humanization. Mask uses foam electrodes that detect electrical impulses from the face; these are then analysed by algorithms that generate a neural signature of the individual’s expression. MindMaze’s advanced machine learning and biosignal processing decode and translate expressions tens of milliseconds before they actually appear on the wearer’s face! This early detection is what makes real-time simulation of the expression on the avatar possible.
Veeso is another VR headset that reads your face and transmits expressions onto a virtual avatar in real time. This smartphone-based headset is equipped with two infrared cameras to capture the user’s facial expressions: one located between the eyes to track pupil movement, eyebrows and eyelids, and the other at the bottom of the unit to read the jaw, lips and mouth. The company aims to provide better emotional connections through social games and chat applications.
Beyond the senses, shared standards will be needed to tie individual virtual worlds into one metaverse. A locator standard helps in finding places and landmarks across virtual worlds. The internet is already familiar with this idea in the form of URLs, and the same approach can be adapted for virtual landmarks; Linden Lab has already used it for Second Life locations.
An identity standard gives a user unique credentials that can be used across virtual-world boundaries. This would be the equivalent of our real-life driving licences, social security numbers and passports.
A currency standard will define the value of virtual objects and creations, enabling their trade and exchange. Different virtual worlds already feature their own virtual currencies, and developments such as an Open Metaverse Currency could eventually serve as a universally accepted form of virtual money.
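As an illustration of how a locator standard can piggyback on ordinary URLs, the sketch below parses a Second Life style location URL (SLurl) of the form http://maps.secondlife.com/secondlife/&lt;region&gt;/&lt;x&gt;/&lt;y&gt;/&lt;z&gt;; the function name is my own, and a metaverse-wide locator could follow the same pattern with a different scheme or host.

```python
from urllib.parse import urlparse, unquote

def parse_location(url):
    """Split a Second Life style SLurl into (region, x, y, z).

    The path is expected to look like
    /secondlife/<region>/<x>/<y>/<z>, with the region name
    percent-encoded as in any URL."""
    parts = urlparse(url).path.strip("/").split("/")
    if len(parts) != 5 or parts[0] != "secondlife":
        raise ValueError("not a recognised location URL")
    _, region, x, y, z = parts
    return unquote(region), int(x), int(y), int(z)

print(parse_location(
    "http://maps.secondlife.com/secondlife/Da%20Boom/128/128/27"))
# → ('Da Boom', 128, 128, 27)
```

Because the format reuses URL machinery, existing web infrastructure (links, bookmarks, sharing) works for virtual locations with no changes, which is exactly the appeal of basing a locator standard on URLs.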