Posts Tagged VR
If you have read Michael Lewis book “Flash Boys”, I will have absolutely no problem convincing you that a few milliseconds improvement in transport time (i.e., already below 20 ms) of a valuable signal (e.g., containing financial information) can be of tremendous value. It is all about optimizing transport distances, super efficient & extremely fast computing and of course ultra-high availability. The ultra-low transport and process latencies is the backbone (together with the algorithms obviously) of the high frequency trading industry that takes a market share of between 30% (EU) and 50% (US) of the total equity trading volume.
In a recent study by The Boston Consulting Group (BCG) “Uncovering Real Mobile Data Usage and Drivers of Customer Satisfaction” (Nov. 2015) study it was found that latency had a significant impact on customer video viewing satisfaction. For latencies between 75 – 100 milliseconds 72% of users reported being satisfied. The user experience satisfaction level jumped to 83% when latency was below 50 milliseconds. We have most likely all experienced and been aggravated by long call setup times (> couple of seconds) forcing us to look at the screen to confirm that a call setup (dialing) is actually in progress.
Latency and reactiveness or responsiveness matters tremendously to the customers experience and whether it is a bad, good or excellent one.
The Tactile Internet idea is an integral part of the “NGMN 5G Vision” and part of what is characterized as Extreme Real-Time Communications. It has further been worked out in detail in the ITU-T Technology Watch Report “The Tactile Internet” from August 2014.
The word “Tactile” means perceptible by touch. It closely relates to the ambition of creating a haptic experience. Where haptic means a sense of touch. Although we will learn that the Tactile Internet vision is more than a “touchy-feeling” network vision, the idea of haptic feedback in real-time (~ sub-millisecond to low millisecond regime) is very important to the idea of a Tactile Network experience (e.g., remote surgery).
The Tactile Internet is characterized by
- Ultra-low latency; 1 ms and below latency (as in round-trip-time / round-trip delay).
- Ultra-high availability; 99.999% availability.
- Ultra-secure end-2-end communications.
- Persistent very high bandwidths capability; 1 Gbps and above.
The Tactile Internet is one of the corner stones of 5G. It promises ultra-low end-2-end latencies in the order of 1 millisecond at Giga bits per second speeds and with five 9’s of availability (translating into a 500 ms per day average un-availability).
Interestingly, network predictability and variation in latency have not been receiving too much focus within the Tactile Internet work. Clearly, a high degree of predictability as well as low jitter (or latency variation), could be very desirable property of a tactile network. Possibly even more so than absolute latency in its own right. A right sized round-trip-time with imposed managed latency, meaning a controlled variation of latency, is very essential to the 5G Tactile Internet experience.
It’s 5G on speed and steroids at the same time.
Let us talk about the elephant in the room.
We can understand Tactile latency requirements in the following way;
An Action including (possible) local Processing, followed by some Transport and Remote Processing of data representing the Action, results in a Re-action again including (possible) local Processing. According with Tactile Internet Vision, the time of this whole even from Action to Re-action has to have run its cause within 1 millisecond or one thousand of a second. In many use cases this process is looped as the Re-action feeds back, resulting in another action. Note in the illustration below, Action and Re-action could take place on the same device (or locality) or could be physically separated. The processes might represent cloud-based computations or manipulations of data or data manipulations local to the device of the user as well as remote devices. It needs to be considered that the latency time scale for one direction is not at all given to be the same in the other direction (even for transport).
The simplest example is the mouse click on a internet link or URL (i.e., the Action) resulting a translation of the URL to an IP address and the loading of the resulting content on your screen (i.e., part of the process) with the final page presented on the your device display (i.e., Re-action). From the moment the URL is mouse-clicked until the content is fully presented should take no longer than 1 ms.
A more complex use case might be remote surgery. In which a surgical robot is in one location and the surgeon operator is at another location manipulating the robot through an operation. This is illustrated in the above picture. Clearly, for a remote surgical procedure to be safe (i.e., within the margins of risk of not having the possibility of any medical assisted surgery) we would require a very reliable connection (99.999% availability), sufficient bandwidth to ensure adequate video resolution as required by the remote surgeon controlling the robot, as little as possible latency allowing the feel of instantaneous (or predictable) reaction to the actions of the controller (i.e., the surgeons) and of course as little variation in the latency (i.e., jitter) allowing system or human correction of the latency (i.e., high degree of network predictability).
The first Complete Trans-Atlantic Robotic Surgery happened in 2001. Surgeons in New York (USA) remotely operated on a patient in Strasbourg, France. Some 7,000 km away or equivalent to 70 ms in round-trip-time (i.e., 14,000 km in total) for light in fiber. The total procedural delay from hand motion (action) to remote surgical response (reaction) showed up on their video screen took 155 milliseconds. From trials on pigs any delay longer than 330 ms was thought to be associated with an unacceptable degree of risk for the patient. This system then did not offer any haptic feedback to the remote surgeon. This remains the case for most (if not all) remote robotic surgical systems in option today as the latency in most remote surgical scenarios render haptic feedback less than useful. An excellent account for robotic surgery systems (including the economics) can be found at this web site “All About Robotic Surgery”. According to experienced surgeons at 175 ms (and below) a remote robotic operation is perceived (by the surgeon) as imperceptible.
It should be clear that apart from offering long-distance surgical possibilities, robotic surgical systems offers many other benefits (less invasive, higher precision, faster patient recovery, lower overall operational risks, …). In fact most robotic surgeries are done with surgeon and robot being in close proximity.
Another example of coping with lag or latency is a Predator drone pilot. The plane is a so-called unmanned combat aerial vehicle and comes at a price of ca. 4 Million US$ (in 2010) per piece. Although this aerial platform can perform missions autonomously it will typically have two pilots on the ground monitoring and possible controlling it. The typical operational latency for the Predator can be as much as 2,000 milliseconds. For takeoff and landing, where this latency is most critical, typically the control is handed to to a local crew (either in Nevada or in the country of its mission). The Predator cruise speed is between 130 and 165 km per hour. Thus within the 2 seconds lag the plane will have move approximately 100 meters (i.e., obviously critical in landing & take off scenarios). Nevertheless, a very high degree of autonomy has been build into the Predator platform that also compensates for the very large latency between plane and mission control.
Back to the Tactile Internet latency requirements;
In LTE today, the minimum latency (internal to the network) is around 12 ms without re-transmission and with pre-allocated resources. However, the normal experienced latency (again internal to the network) would be more in the order of 20 ms including 10% likelihood of retransmission and assuming scheduling (which would be normal). However, this excludes any content fetching, processing, presentation on the end-user device and the transport path beyond the operators network (i.e., somewhere in the www). Transmission outside the operator network typically between 10 and 20 ms on-top of the internal latency. The fetching, processing and presentation of content can easily add hundreds of milliseconds to the experience. Below illustrations provides a high level view of the various latency components to be considered in LTE with the transport related latencies providing the floor level to be expected;
In 5G the vision is to achieve a factor 20 better end-2-end (within the operators own network) round-trip-time compared to LTE; thus 1 millisecond.
So … what happens in 1 millisecond?
Light will have travelled ca. 200 km in fiber or 300 km in free-space. A car driving (or the fastest baseball flying) 160 km per hour will have moved 4 cm. A steel ball falling to the ground (on Earth) would have moved 5 micro meter (that’s 5 millionth of a meter). In a 1Gbps data stream, 1 ms correspond to ca. 125 Kilo Bytes worth of data. A human nerve impulse last just 1 ms (i.e., in a 100 millivolt pulse).
It should be clear that the 1 ms poses some very dramatic limitations;
- The useful distance over which a tactile applications would work (if 1 ms would really be the requirements that is!) will be short ( likely a lot less than 100 km for fiber-based transport)
- The air-interface (& number of control plane messages required) needs to reduce dramatically from milliseconds down to microseconds, i.e., factor 20 would require no more than 100 microseconds limiting the useful cell range).
- Compute & processing requirements, in terms of latency, for UE (incl. screen, drivers, local modem, …), Base Station and Core would require a substantial overhaul (likely limiting level of tactile sophistication).
- Require own controlled network infrastructure (at least a lot easier to manage latency within), avoiding any communication path leaving own network (walled garden is back with a vengeance?).
- Network is the sole responsible for latency and can be made arbitrarily small (by distance and access).
Very small cells, very close to compute & processing resources, would be most likely candidates for fulfilling the tactile internet requirements.
Thus instead of moving functionality and compute up and towards the cloud data center we (might) have an opposing force that requires close proximity to the end-users application. Thus, the great promise of cloud-based economical efficiency is likely going to be dented in this scenario by requiring many more smaller data centers and maybe even micro-data centers moving closer to the access edge (i.e., cell site, aggregation site, …). Not surprisingly, Edge Cloud, Edge Data Center, Edge X is really the new Black …The curse of the edge!?
Looking at several network and compute design considerations a tactile application would require no more than 50 km (i.e., 100 km round-trip) effective round-trip distance or 0.5 ms fiber transport (including switching & routing) round-trip-time. Leaving another 0.5 ms for air-interface (in a cellular/wireless scenario), computing & processing. Furthermore, the very high degree of imposed availability (i.e., 99.999%) might likewise favor proximity between the Tactile Application and any remote Processing-Computing. Obviously,
So in all likelihood we need processing-computing as near as possible to the tactile application (at least if one believes in the 1 ms and about target).
One of the most epic (“in the Dutch coffee shop after a couple of hours category”) promises in “The Tactile Internet” vision paper is the following;
“Tomorrow, using advanced tele-diagnostic tools, it could be available anywhere, anytime; allowing remote physical examination even by palpation (examination by touch). The physician will be able to command the motion of a tele-robot at the patient’s location and receive not only audio-visual information but also critical haptic feedback.” (page 6, section 3.5).
All true, if you limited the tele-robot and patient to a distance of no more than 50 km (and likely less!) from the remote medical doctor. In this setup and definition of the Tactile Internet, having a top eye surgeon placed in Delhi would not be able to operate child (near blindness) in a remote village in Madhya Pradesh (India) approx. 800+ km away. Note India has the largest blind population in the world (also by proportion) with 75% of cases avoidable by medical intervention. At best, these specifications allow the doctor not to be in the same room with the patient.
Markus Rank et al did systematic research on the perception of delay in haptic tele-presence systems (Presence, October 2010, MIT Press) and found haptic delay detection thresholds between 30 and 55 ms. Thus haptic feedback did not appear to be sensitive to delays below 30 ms, fairly close to the lowest reported threshold of 20 ms. This combined with experienced tele-robotic surgeons assessing that below 175 ms the remote procedure starts to be perceived as imperceptible, might indicate that the 1 ms, at least for this particular use case, is extremely limiting.
The extreme case would be to have the tactile-related computing done at the radio base station assuming that the tactile use case could be restricted to the covered cell and users supported by that cell. I name this the micro-DC (or micro-cloud or more like what some might call the cloudlet concept) idea. This would be totally back to the older days with lots of compute done at the cell site (and likely kill any traditional legacy cloud-based efficiency thinking … love to use legacy and cloud in same sentence). This would limit the round-trip-time to air-interface latency and compute/processing at the base station and the device supporting the tactile application.
It is normal to talk about the round-trip-time between an action and the subsequent reaction. It is also the time it takes a data or signal to travel from a specific source to a specific destination and back again (i.e., round trip). In case of light in fiber, a 1 millisecond limit on the round-trip-time would imply that the maximum distance that can be travelled (in the fiber) between source to destination and back to the source is 200 km. Limiting the destination to be no more than 100 km away from the source. In case of substantial processing overhead (e.g., computation) the distance between source and destination requires even less than 100 km to allow for the 1 ms target.
THE HUMAN SENSES AND THE TACTILE INTERNET.
The “touchy-feely” aspect, or human sensing in general, is clearly an inspiration to the authors of “The Tactile Internet” vision as can be seen from the following quote;
“We experience interaction with a technical system as intuitive and natural only if the feedback of the system is adapted to our human reaction time. Consequently, the requirements for technical systems enabling real-time interactions depend on the participating human senses.” (page 2, Section 1).
The following human-reaction times illustration shown below is included in “The Tactile Internet” vision paper. Although it originates from Fettweis and Alamouti’s paper titled “5G: Personal Mobile Internet beyond What Cellular Did to Telephony“. It should be noted that the description of the Table is order of magnitude of human reaction times; thus, 10 ms might also be 100 ms or 1 ms and so forth and therefor, as we shall see, it would be difficult to a given reaction time wrong within such a range.
The important point here is that the human perception or senses impact very significantly the user’s experience with a given application or use case.
The responsiveness of a given system or design is incredible important for how well a service or product will be perceived by the user. The responsiveness can be defined as a relative measure against our own sense or perception of time. The measure of responsiveness is clearly not unique but depends on what senses are being used as well as the user engaged.The human mind is not fond of waiting and waiting too long causes distraction, irritation and ultimate anger after which the customer is in all likelihood lost. A very good account of considering the human mind and it senses in design specifications (and of course development) can be found in Jeff Johnson’s 2010 book “Designing with the Mind in Mind”.
The understanding of human senses and the neurophysiological reactions to those senses are important for assessing a given design criteria’s impact on the user experience. For example, designing for 1 ms or lower system reaction times when the relevant neurophysiological timescale is measured in 10s or 100s of milliseconds is likely not resulting in any noticeable (and monetizable) improvement in customer experience. Of course there can be many very good non-human reasons for wanting low or very low latencies.
While you might get the impression, from the above table above from Fettweis et al and countless Tactile Internet and 5G publications referring back to this data, that those neurophysiological reactions are natural constants, it is unfortunately not the case. Modality matters hugely. There are fairly great variations in reactions time within the same neurophysiological response category depending on the individual human under test but often also depending on the underlying experimental setup. In some instances the reaction time deduced would be fairly useless as a design criteria for anything as the detection happens unconsciously and still require the relevant part of the brain to make sense of the event.
We have, based on vision, the surgeon controlling a remote surgical robot stating that anything below 175 ms latency is imperceptible. There is research showing that haptic feedback delay below 30 ms appears to be un-detectable.
John Carmack, CTO of Oculus VR Inc, based on in particular vision (in a fairly dynamic environment) that “.. when absolute delays are below approximately 20 milliseconds they are generally imperceptible.” particular as it relates to 3D systems and VR/AR user experience which is a lot more dynamic than watching content loading. Moreover, according to some recent user experience research specific to website response time indicates that anything below 100 ms wil be perceived as instantaneous. At 1 second users will sense the delay but would be perceived as seamless. If a web page loads in more than 2 seconds user satisfaction levels drops dramatically and a user would typically bounce.
Based on IAAF (International Athletic Association Federation) rules, an athlete is deemed to have had a false start if that athlete moves sooner than 100 milliseconds after the start signal. The neurophysiological process relevant here is the neuromuscular reaction to the sound heard (i.e., the big bang of the pistol) by the athlete. Research carried out by Paavo V. Komi et al has shown that the reaction time of a prepared (i.e., waiting for the bang!) athlete can be as low as 80 ms. This particular use case relates to the auditory reaction times and the subsequent physiological reaction. P.V. Komi et al also found a great variation in the neuromuscular reaction time to the sound (even far below the 80 ms!).
Neuromuscular reactions to unprepared events typically typically measures in several hundreds of milliseconds (up-to 700 ms) being somewhat faster if driven by auditory senses rather than vision. Note that reflex time scales are approximately 10 times faster or in the order of 80 – 100 ms.
The international Telecommunications Union (ITU) Recommendation G.114, defines for voice applications an upper acceptable one-way (i.e., its you talking you don’t want to be talked back to by yourself) delay of 150 ms. Delays below this limit would provide an acceptable degree of voice user experience in the sense that most users would not hear the delay. It should be understood that a great variation in voice delay sensitivity exist across humans. Voice conversations would be perceived as instantaneous by most below the 100 ms (thought the auditory perception would also depend on the intensity/volume of the voice being listened to).
Finally, let’s discuss human vision. Fettweis et al in my opinion mixes up several psychophysical concepts of vision and TV specifications. Alluding to 10 millisecond is the visual “reaction” time (whatever that now really means). More accurately they describe the phenomena of flicker fusion threshold which describes intermittent light stimulus (or flicker) is perceived as completely steady to an average viewer. This phenomena relates to persistence of vision where the visual system perceives multiple discrete images as a single image (both flicker and persistence of vision are well described in both by Wikipedia and in detail by Yhong-Lin Lu el al “Visual Psychophysics”). There, are other reasons why defining flicker fusion and persistence of vision as a human reaction reaction mechanism is unfortunate.
The 10 ms for vision reaction time, shown in the table above, is at the lowest limit of what researchers (see references 14, 15, 16 ..) find to be the early stages of vision can possible detect (i.e., as opposed to pure guessing ). Mary C. Potter of M.I.T.’s Dept. of Brain & Cognitive Sciences, seminal work on human perception in general and visual perception in particular shows that the human vision is capable very rapidly to make sense of pictures, and objects therein, on the timescale of 10 milliseconds (i.e., 13 ms actually is the lowest reported by Potter). From these studies it is also found that preparedness (i.e., knowing what to look for) helps the detection process although the overall detection results did not differ substantially from knowing the object of interest after the pictures were shown. Note that the setting of these visual reaction time experiments all happens in a controlled laboratory setting with the subject primed to being attentive (e.g., focus on screen with fixation cross for a given period, followed by blank screen for another shorter period, and then a sequence of pictures each presented for a (very) short time, followed again by a blank screen and finally a object name and the yes-no question whether the object was observed in the sequence of pictures). Often these experiments also includes a certain degree of training before the actual experiment took place. The relevant memory of the target object, In any case and unless re-enforced, will rapidly dissipates. in fact the shorter the viewing time, the quicker it will disappear … which might be a very healthy coping mechanism.
To call this visual reaction time of 10+ ms typical is in my opinion a bit of a stretch. It is typical for that particular experimental setup and very nicely provides important insights into the visual systems capabilities.
One of the more silly things used to demonstrate the importance of ultra-low latencies have been to time delay the video signal send to a wearer’s goggles and then throw a ball at him in the physical world … obviously, the subject will not catch the ball (might as well as thrown it at the back of his head instead). In the Tactile Internet vision paper it the following is stated; “But if a human is expecting speed, such as when manually controlling a visual scene and issuing commands that anticipate rapid response, 1-millisecond reaction time is required” (on page 3). And for the record spinning a basketball on your finger has more to do with physics than neurophysiology and human reaction times.
In more realistic settings it would appear that the (prepared) average reaction time of vision is around or below 40 ms. With this in mind, a baseball moving (when thrown by a power pitcher) at 160 km per hour (or ca. 4+ cm per ms) would take a approx. 415 ms to reach the batter (using an effective distance of 18.44 meters). Thus the batter has around 415 ms to visually process the ball coming and hit it at the right time. Given the latency involved in processing vision the ball would be at least 40 cm (@ 10 ms) closer to the batter than his latent visionary impression would imply. Assuming that the neuromuscular reaction time is around 100±20 ms, the batter would need to compensate not only for that but also for his vision process time in order to hit the ball. Based on batting statistics, clearly the brain does compensate for its internal latencies pretty well. In the paper “Human time perception and its illusions” D.M. Eaglerman that the visual system and the brain (note: visual system is an integral part of the brain) is highly adaptable in recalibrating its time perception below the sub-second level.
It is important to realize that in literature on human reaction times, there is a very wide range of numbers for supposedly similar reaction use cases and certainly a great deal of apparent contradictions (though the experimental frameworks often easily accounts for this).
The supporting data for the numbers shown in the above figure can be found via the hyperlink in the above text or in the references below.
Thus, in my opinion, also supported largely by empirical data, a good latency E2E design target for a Tactile network serving human needs, would be between 20 milliseconds and 10 milliseconds. With the latency budget covering the end user device (e.g., tablet, VR/AR goggles, IOT, …), air-interface, transport and processing (i.e., any computing, retrieval/storage, protocol handling, …). It would be unlikely to cover any connectivity out of the operator”s network unless such a connection is manageable from latency and jitter perspective though distance would count against such a strategy.
This would actually be quiet agreeable from a network perspective as the distance to data centers would be far more reasonable and likely reduce the aggressive need for many edge data centers using the below 10 ms target promoted in the Tactile Internet vision paper.
There is however one thing that we are assuming in all the above. It is assumed that the user’s local latency can be managed as well and made almost arbitrarily small (i.e., much below 1 ms). Hardly very reasonable even in the short run for human-relevant communications ecosystems (displays, goggles, drivers, etc..) as we shall see below.
For a gaming environment we would look at something like the below illustration;
Lets ignore the use case of local games (i.e., where the player only relies on his local computing environment) and focus on games that rely on a remote gaming architecture. This could either be relying on a client-server based architecture or cloud gaming architecture (e.g., typical SaaS setup). In general the the client-server based setup requires more performance of the users local environment (e.g., equipment) but also allows for more advanced latency compensating strategies enhancing the user perception of instantaneous game reactions. In the cloud game architecture, all game related computing including rendering/encoding (i.e., image synthesis) and video output generation happens in the cloud. The requirements to the end users infrastructure is modest in the cloud gaming setup. However, applying latency reduction strategies becomes much more challenging as such would require much more of the local computing environment that the cloud game architecture tries to get away from. In general the network transport related latency would be the same provide the dedicated game servers and the cloud gaming infrastructure would reside within the same premises. In Choy et al’s 2012 paper “The Brewing Storm in Cloud Gaming: A Measurement Study on Cloud to End-User Latency” , it is shown, through large scale measurements, that current commercial cloud infrastructure architecture is unable to deliver the latency performance for an acceptable (massive) multi-user experience. Partly simply due to such cloud data centers are too far away from the end user. Moreover, the traditional commercial cloud computing infrastructure is simply not optimized for online gaming requiring augmentation of stronger computing resources including GPUs and fast memory designs. Choy et al do propose to distribute the current cloud infrastructure targeting a shorter distance between end user and the relevant cloud game infrastructure. Similar to what is already happening today with content distribution networks (CDNs) being distributed more aggressively in metropolitan areas and thus closer to the end user.
A comprehensive treatment on latencies, or response time scales, in games and how these relates to user experience can be found in Kjetil Raaen’s Ph.D. thesis “Response time in games: Requirements and improvements” as well as in the comprehensive relevant literature list found in this thesis.
From the many studies (as found in Raaen’s work, the work of Mark Claypool and much cited 2002 study by Pantel et al) on gaming experience, including massive multi-user online game experience, shows that players starts to notice delay of about 100 ms of which ca. 20 ms comes from play-out and processing delay. Thus, quiet a far cry from the 1 millisecond. From the work, and not that surprising, sensitivity to gaming latency depends on the type of game played (see the work of Claypool) and how experienced a gamer is with the particular game (e.g., Pantel er al). It should also be noted that in a VR environment, you would want to the image that arrives at your visual system to be in synch with your heads movement and the directions of your vision. If there is a timing difference (or lag) between the direction of your vision and the image presented to your visual system, the user experience becomes rapidly poor causing discomfort by disorientation and confusion (possible leading to a physical reaction such as throwing up). It is also worth noting that in VR there is a substantially latency component simple from the image rendering (e.g., 60 MHz frame rate provides a new frame on average every 16.7 millisecond). Obviously chunking up the display frame rate will reduce the rendering related latency. However, several latency compensation strategies (to compensate for you head and eye movements) have been developed to cope with VR latency (e.g., time warping and prediction schemes).
Anyway, if you would be of the impression that VR is just about showing moving images on the inside of some awesome goggles … hmmm do think again and keep dreaming of 1 millisecond end-2end network centric VR delivery solutions (at least for the networks we have today). Of course 1 ms target is possible really a Proxima-Centauri shot as opposed to a just moonshot.
With a target of no more than 20 milliseconds lag or latency and taking into account the likely reaction time of the users VR system (future system!), that likely leaves no more (and likely less) than 10 milliseconds for transport and any remote server processing. Still this could allow for a data center to be 500 km (5 ms round.trip time in fiber) away from the user and allow another 5 ms for data center processing and possible routing delay along the way.
One might very well be concerned about the present Tactile Internet vision and it’s focus on network centric solutions to the very low latency target of 1 millisecond. The current vision and approach would force (fixed and mobile) network operators to add a considerable amount of data centers in order to get the physical transport time down below the 1 millisecond. This in turn drives the latest trend in telecommunication, the so-called edge data center or edge cloud. In the ultimate limit, such edge data centers (however small) might be placed at cell site locations or fixed network local exchanges or distribution cabinets.
Furthermore, the 1 millisecond as a goal might very well have very little return on user experience (UX) and substantial cost impact for telecom operators. A diligent research through academic literature and wealth of practical UX experiments indicates that this indeed might be the case.
Such a severe and restrictive target as the 1 millisecond is, it severely narrows the Tactile Internet to scenarios where sensing, acting, communication and processing happens in very close proximity of each other. In addition the restrictions to system design it imposes, further limits its relevance in my opinion. The danger is, with the expressed Tactile vision, that too little academic and industrious thinking goes into latency compensating strategies using the latest advances in machine learning, virtual reality development and computational neuroscience (to name a few areas of obvious relevance). Further network reliability and managed latency, in the sense of controlling the variation of the latency, might be of far bigger importance than latency itself below a certain limit.
So if 1 ms is no use to most men and beasts … why bother with this?
While very low latency system architectures might be of little relevance to human senses, it is of course very likely (as it is also pointed out in the Tactile Internet Vision paper) that industrial use cases could benefit from such specifications of latency, reliability and security.
For example in machine-to-machine or things-to-things communications between sensors, actuators, databases, and applications very short reaction times in the order of sub-milliseconds to low milliseconds could be relevant.
We will look at this next.
THE TACTILE INTERNET USE CASES & BUSINESS MODELS.
An open mind would hope that most of what we do strives to out perform human senses, improve how we deal with our environment and situations that are far beyond mere mortal capabilities. Alas I might have read too many Isaac Asimov novels as a kid and young adult.
In particular where 5G has its present emphasis of ultra-high frequencies (i.e., ultra small cells), ultra-wide spectral bandwidth (i.e., lots of Gbps) together with the current vision of the Tactile Internet (ultra-low latencies, ultra-high reliability and ultra-high security), seem to be screaming for being applied to Industrial facilities, logistic warehouses, campus solutions, stadiums, shopping malls, tele-, edge-cloud, networked robotics, etc… In other words, wherever we have a happy mix of sensors, actuators, processors, storage, databases and software based solutions across a relative confined area, 5G and the Tactile Internet vision appears to be a possible fit and opportunity.
In the following it is important to remember;
- 1 ms round-trip time ~ 100 km (in fiber) to 150 km (in free space) in 1-way distance from the relevant action if only transport distance mattered to the latency budget.
- Considering the total latency budget for a 1 ms Tactile application the transport distance is likely to be no more than 20 – 50 km or less (i.e., right at the RAN edge).
One of my absolute current favorite robotics use case that comes somewhat close to a 5G Tactile Internet vision, done with 4G technology, is the example of Ocado’s warehouse automation in UK. Ocado is the world’s largest online-only grocery retailer with ca. 50 thousand lines of goods, delivering more than 200,000 orders a week to customers around the United Kingdom. The 4G network build (by Cambridge Consultants) to support Ocado’s automation is based on LTE at unlicensed 5GHz band allowing Ocado to control 1,000 robots per base station. Each robot communicates with the Base Station and backend control systems every 100 ms on average as they traverses ca. 30 km journey across the warehouse 1,250 square meters. A total of 20 LTE base stations each with an effective range of 4 – 6 meters cover the warehouse area. The LTE technology was essential in order to bring latency down to an acceptable level by fine tuning LTE to perform under its lowest possible latency (<10 ms).
5G will bring lower latency, compared to an even optimized LTE system, that in a similar setup as the above described for Ocado, could further increase the performance. Obviously very high network reliability promised by 5G of such a logistic system needs to be very high to reduce the risk of disruption and subsequent customer dissatisfaction of late (or no) delivery as well as the exposure to grocery stock turning bad.
This all done within the confines of a warehouse building.
ROBOTICS AND TACTILE CONDITIONS
First of all lets limit the Robotics discussion to use cases related to networked robots. After all if the robot doesn’t need a network (pretty cool) it pretty much a singleton and not so relevant for the Tactile Internet discussion. In the following I am using the word Cloud in a fairly loose way and means any form of computing center resources either dedicated or virtualized. The cloud could reside near the networked robotic systems as well as far away depending on the overall system requirements to timing and delay (e.g., that might also depend on the level of robotic autonomy).
Getting networked robots to work well we need to solve a host of technical challenges, such as
- Jitter (i.e., variation of latency).
- Connection reliability.
- Network congestion.
- Robot-2-Robot communications.
- Robot-2-ROS (i.e., general robotics operations system).
- Computing architecture: distributed, centralized, elastic computing, etc…
- System stability.
- Power budget (e.g., power limitations, re-charging).
- Sensor & actuator fusion (e.g., consolidate & align data from distributed sources for example sensor-actuator network).
- Autonomy vs human control.
- Machine learning / machine intelligence.
- Safety (e.g., human and non-human).
- Security (e.g., against cyber threats).
- User Interface.
- System Architecture.
The network connection-part of the networked robotics system can be either wireless, wired, or a combination of wired & wireless. Connectivity could be either to a local computing cloud or data center, to an external cloud (on the internet) or a combination of internal computing for control and management for applications requiring very low-latency very-low jitter communications and external cloud for backup and latency-jitter uncritical applications and use cases.
For connection types we have Wired (e.g., LAN), Wireless (e.g., WLAN) and Cellular (e.g., LTE, 5G). There are (at least) three levels of connectivity we need to consider; inter-robot communications, robot-to-cloud communications (or operations and control systems residing in Frontend-Cloud or computing center), and possible Frontend-Cloud to Backend-Cloud (e..g, for backup, storage and latency-insensitive operations and control systems). Obviously, there might not be a need for a split in Frontend and Backend Clouds and pending on the use case requirements could be one and the same. Robots can be either stationary or mobile with a need for inter-robot communications or simply robot-cloud communications.
Various networked robot connectivity architectures are illustrated below;
I greatly acknowledge my wife Eva Varadi for her support, patience and understanding during the creative process of creating this Blog.
.WORTHY 5G & RELATED READS.
- “NGMN 5G White Paper” by R.El Hattachi & J. Erfanian (NGMN Alliance, February 2015).
- “The Tactile Internet” by ITU-T (August 2014). Note: in this Blog this paper is also referred to as the Tactile Internet Vision.
- “5G: Personal Mobile Internet beyond What Cellular Did to Telephony” by G. Fettweis & S. Alamouti, (Communications Magazine, IEEE , vol. 52, no. 2, pp. 140-145, February 2014).
- “The Tactile Internet: Vision, Recent Progress, and Open Challenges” by Martin Maier, Mahfuzulhoq Chowdhury, Bhaskar Prasad Rimal, and Dung Pham Van (IEEE Communications Magazine, May 2016).
- “John Carmack’s delivers some home truths on latency” by John Carmack, CTO Oculus VR.
- “All About Robotic Surgery” by The Official Medical Robotics News Center.
- “The surgeon who operates from 400km away” by BBC Future (2014).
- “The Case for VM-Based Cloudlets in Mobile Computing” by Mahadev Satyanarayanan et al. (Pervasive Computing 2009).
- “Perception of Delay in Haptic Telepresence Systems” by Markus Rank et al. (pp 389, Presence: Vol. 19, Number 5).
- “Neuroscience Exploring the Brain” by Mark F. Bear et al. (Fourth Edition, 2016 Wolters Kluwer).
- “Neurophysiology: A Conceptual Approach” by Roger Carpenter & Benjamin Reddi (Fifth Edition, 2013 CRC Press). Definitely a very worthy read by anyone who want to understand the underlying principles of sensory functions and basic neural mechanisms.
- “Designing with the Mind in Mind” by Jeff Johnson (2010, Morgan Kaufmann). Lots of cool information of how to design a meaningful user interface and of basic user expirence principles worth thinking about.
- “Vision How it works and what can go wrong” by John E. Dowling et al. (2016, The MIT Press).
- “Visual Psychophysics From Laboratory to Theory” by Yhong-Lin Lu and Barbera Dosher (2014, MIT Press).
- “The Time Delay in Human Vision” by D.A. Wardle (The Physics Teacher, Vol. 36, Oct. 1998).
- “What do we perceive in a glance of a real-world scene?” by Li Fei-Fei et al. (Journal of Vision (2007) 7(1); 10, 1-29).
- “Detecting meaning in RSVP at 13 ms per picture” by Mary C. Potter et al. (Attention, Perception, & Psychophysics, 76(2): 270–279).
- “Banana or fruit? Detection and recognition across categorical levels in RSVP” by Mary C. Potter & Carl Erick Hagmann (Psychonomic Bulletin & Review, 22(2), 578-585.).
- “Human time perception and its illusions” by David M. Eaglerman (Current Opinion in Neurobiology, Volume 18, Issue 2, Pages 131-136).
- “How Much Faster is Fast Enough? User Perception of Latency & Latency Improvements in Direct and Indirect Touch” by J. Deber, R. Jota, C. Forlines and D. Wigdor (CHI 2015, April 18 – 23, 2015, Seoul, Republic of Korea).
- “Response time in games: Requirements and improvements” by Kjetil Raaen (Ph.D., 2016, Department of Informatics, The Faculty of Mathematics and Natural Sciences, University of Oslo).
- “Latency and player actions in online games” by Mark Claypool & Kajal Claypool (Nov. 2006, Vol. 49, No. 11 Communications of the ACM).
- “The Brewing Storm in Cloud Gaming: A Measurement Study on Cloud to End-User Latency” by Sharon Choy et al. (2012, 11th Annual Workshop on Network and Systems Support for Games (NetGames), 1–6).
- “On the impact of delay on real-time multiplayer games” by Lothar Pantel and Lars C. Wolf (Proceedings of the 12th International Workshop on Network and Operating Systems Support for Digital Audio and Video, NOSSDAV ’02, New York, NY, USA, pp. 23–29. ACM.).
- “Oculus Rift’s time warping feature will make VR easier on your stomach” from ExtremeTech Grant Brunner on Oculus Rift Timewarping. Pretty good video included on the subject.
- “World first in radio design” by Cambridge Consultants. Describing the work Cambridge Consultants did with Ocado (UK-based) to design the worlds most automated technologically advanced warehouse based on 4G connected robotics. Please do see the video enclosed in page.
- “Ocado: next-generation warehouse automation” by Cambridge Consultants.
- “Ocado has a plan to replace humans with robots” by Business Insider UK (May 2015). Note that Ocado has filed more than 73 different patent applications across 32 distinct innovations.
- “The Robotic Grocery Store of the Future Is Here” by MIT Technology Review (December 201
- “Cloud Robotics: Architecture, Challenges and Applications.” by Guoqiang Hu et al (IEEE Network, May/June 2012).