“Figures often beguile me” leading to the statement that “There are three kinds of lies: lies, damned lies, and statistics.” (Mark Twain, 1906).
We are so used to averages … Read any blog or newspaper article trying to capture a complex issue and it’s more than likely that you are being told a story of averages … Adding to Mark Twain’s quote on lies: in our data-intense world, “The Average is often enough the road to an unintentional Lie” … or simply “The Average Lie”.
Imagine this! Having (at the same time) your feet in the oven at 80C and your head in the freezer at -6C … You would be perfectly OK! On average! Your average temperature would equal (80C + (-6C)) divided by 2, which is 37C, i.e., the normal and recommended body temperature for an adult human being. However, both your feet and your head are likely to suffer from such an experiment (which therefore really should not be tried out … or should be left to Finns used to Sauna and Icy water … though even the Finns seldom enjoy these simultaneously).
Try this! Add together the ages of the members of your household and divide by the number of members. This gives you the average age of your household … does the average age you calculated have any meaning? … If you have young children or grandparents living with you, I think there is a fairly high chance that the answer to that question is NO! … The average age of my family’s household is 28 years. However, this number is a meaningless average representation of my household. It is 20 times higher than my son’s age and about 40% lower than my own age.
Most numbers, most conclusions, most stories, most (average) analysis are based on an average representation of one or another Reality …. and as such can easily lead to Reality Distortion.
When we are presented with averages (or mean values, as they are also called in statistics), we tend to substitute Average with Normal and believe that the story represents most of us (statistically, about 68% of a normally distributed population lies within one standard deviation of the mean). More often than not we sit back with the funny feeling that if what we just read is “normal” then maybe we are not.
On mobile data consumption (I’ll come back to Smartphone data consumption a bit later) … There is one (non-average) truth about mobile data consumption that has widely (and correctly) been communicated …
Very few mobile customers (10%) consume the vast majority of the mobile data traffic (90%).
Let’s assume that a mobile operator claims an average monthly consumption of 200MB (source: http://gigaom.com/broadband/despite-critics-cisco-stands-by-its-data-deluge/), and that 10% of the customer base generates 90% of the traffic. It follows that the high usage segment has an average volumetric usage of 1,800MB and the low usage segment an average of only 22MB. In other words, 10% of the customer base has 80+ times higher consumption than the remaining 90%. The communicated average of 200MB (taken across the whole customer base) is actually 9 times higher than the average consumption of 90% of the customer base. It follows (with some use case exceptions) that the 10% high usage segment consumes a lot more Network Resources and Time. The time the high usage segment spends actively with their devices is likely to be a lot higher than that of the 90% low usage segment.
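The segment arithmetic above can be reproduced in a few lines. This is a minimal sketch, assuming only the two shares and the 200MB average quoted in the text; the function and variable names are my own.

```python
def segment_averages(overall_avg_mb, heavy_share_of_users, heavy_share_of_traffic):
    """Split an overall per-customer average into heavy- and light-segment averages."""
    # Traffic per customer in a segment = (segment's share of traffic) / (segment's share of users),
    # scaled by the overall average.
    heavy_avg = overall_avg_mb * heavy_share_of_traffic / heavy_share_of_users
    light_avg = overall_avg_mb * (1 - heavy_share_of_traffic) / (1 - heavy_share_of_users)
    return heavy_avg, light_avg


overall = 200  # MB per month, the average quoted in the text
heavy, light = segment_averages(overall, heavy_share_of_users=0.10, heavy_share_of_traffic=0.90)

print(f"heavy segment average: {heavy:.0f} MB")
print(f"light segment average: {light:.0f} MB")
print(f"heavy vs light:        {heavy / light:.0f}x")
print(f"overall vs light:      {overall / light:.0f}x")
```

Changing the two shares shows how quickly the overall average drifts away from what the large majority of customers actually consume.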
The 200MB is hardly normal! It is one of many averages that can be calculated. Obviously 200MB is a lot more “sexy” than to state that 90% of the customer base consumes typically 22MB.
Do Care about Measurement and Data Processing!
What further complicates consumptive values being quoted is how the underlying data have been measured, processed and calculated!
- Is the averaging done over the whole customer base?
- Is the averaging done over active customers only?
- Over a subset of active customers (e.g., 2G vs 3G, 3G vs HSPA+ vs LTE vs WiFi, smartphone vs basic phone, iPad vs iPhone vs Laptop, prepaid vs postpaid, etc.)?
- Over a smaller subset based on particular sample criteria (e.g., iOS, Android, iPad, iPhone, Galaxy, price plan, etc.) or availability (mobile Apps installed, customer approval, etc.)? Or …
Without knowing the basis of a given average number any bright analysis or cool conclusion might be little more than Conjecture or Clever Spin.
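To see how much the choice of averaging base matters, here is a small sketch. The eight customer records are entirely hypothetical, invented for illustration only (they are not operator data), and the helper name is my own.

```python
from statistics import mean

# Hypothetical monthly usage records (MB); invented for illustration only.
customers = [
    {"mb": 0,    "active": False, "smartphone": False},
    {"mb": 0,    "active": False, "smartphone": True},
    {"mb": 5,    "active": True,  "smartphone": False},
    {"mb": 10,   "active": True,  "smartphone": False},
    {"mb": 25,   "active": True,  "smartphone": True},
    {"mb": 60,   "active": True,  "smartphone": True},
    {"mb": 300,  "active": True,  "smartphone": True},
    {"mb": 1200, "active": True,  "smartphone": True},
]


def average_mb(records, keep=lambda r: True):
    """Average usage over the records selected by the `keep` predicate."""
    return mean(r["mb"] for r in records if keep(r))


whole_base = average_mb(customers)
active_only = average_mb(customers, lambda r: r["active"])
active_smartphones = average_mb(customers, lambda r: r["active"] and r["smartphone"])

print(f"whole customer base:  {whole_base:.0f} MB")
print(f"active customers:     {active_only:.0f} MB")
print(f"active smartphones:   {active_smartphones:.0f} MB")
```

The same underlying data yields three quite different "average consumption" figures, depending only on who is counted in the denominator.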
On Smartphone Usage
One of the most recently publicized studies on Smartphone usage comes from O2/Telefonica UK (Source: http://mediacentre.o2.co.uk/Press-Releases/Making-calls-has-become-fifth-most-frequent-use-for-a-Smartphone-for-newly-networked-generation-of-users-390.aspx). The O2 data provides an overview of average daily Smartphone usage across 10 use case categories.
O2’s Smartphone statistics have been broken down in detail by one of our industry’s brightest, Tomi Ahonen (a Must Read: http://www.communities-dominate.blogs.com/, though it is drowning in his Nokia/Mr. Elop “Howler Letters”). Tomi points out the Smartphone’s disruptive potential to replace many legacy consumer products (think: watch, alarm clock, camera, etc.).
The O2 Smartphone data is intuitive and exactly what one would expect! Boring really! Possibly with the exception of Tomi’s storytelling (see above reference)! The data was so boring that The Telegraph (source: http://www.telegraph.co.uk/technology/mobile-phones/9365085/Smartphones-hardly-used-for-calls.html) had to conclude that “Smartphones Hardly Used for Calls”. Relative to other uses, of course, not really an untruth.
The Telegraph did, however, miss (or did not care about) the fact that both Calls and SMS appeared to be at the level one would expect (and why would a Smartphone generate more Voice and SMS than Normal? … hmmmm). Obviously, the Smartphone is used for a lot of other stuff than calling and SMSing! The data tells us that an average Smartphone user (whatever that means) spends ca. 42 minutes on web browsing and social networking, while “only” 22 minutes on Calls and SMS (actually, 9 minutes of SMS sounds more like a teenager than a high-end smartphone user … but never mind that!). There is lots of other stuff going on with that Smartphone. In fact, out of the total daily usage of 128 minutes, only 17% of the time (i.e., 22 minutes) is used for Plain Old Mobile Telephony Services (The POMTS). We do, however, find that both voice minutes and legacy messaging consumption are declining faster in the Smartphone segment than for Basic Phones (which are declining rapidly as well), as OTT Mobile App alternatives substitute POMTS (see inserted chart from http://www.slideshare.net/KimKyllesbechLarsen/de-risking-the-broadband-business-model-kkl2411201108x).
I have no doubt that the O2 data represents an averaging across a given Smartphone sample. The question is how this data helps us to understand the Real Smartphone User and his behavior.
So how did O2 measure this data?
(1) To be reliable and reasonable, data collection should be done by an App residing in the O2 customer’s smartphone. An alternative (2) would be deep packet inspection (DPI), but this would only capture network usage, which can be (and in most cases will be) very different from the time the customer actively uses his Smartphone. (3) Obviously the data could also be collected by old-fashioned questionnaires being filled in. This would be notoriously unreliable and I cannot imagine this being the source.
Thus, I am making the reasonable guess that the Smartphone Data Collection is mobile App based.
“Thousand and 1 Questions”: Does the data collected represent a normal O2 Smartphone user, or a particular segment that doesn’t mind having a Software Sniffer (i.e., The Sniffer) on the device reporting on their behavior? Is “The Sniffer” a standard, already installed (and activated?) App on all Smartphone devices, or only on a certain segment? Or is it downloadable (which would require a certain effort from the customer)? Is the collection done for both prepaid & contract customers, and for both old and new smartphones (usage patterns depend on OS version/type and on device capabilities such as air interface speed DL & UL, CPU, memory management, etc.)? … Is WiFi included or excluded? What about Apps running in the background (are these included)? Etc. …
I should point out that it is always much easier to poke at somebody else’s data analysis than it is to collect, analyse and present such data. Still, depending on the answers to the above “1,000 + 1” questions, the O2 data either becomes a fair representation of an O2 Smartphone customer or “just” an interesting data point for one of their segments.
If the average Smartphone cellular (i.e., no WiFi blend) monthly consumption in the UK is ca. 450MB (+/-50MB), and if the consumer had an average cellular speed of 0.5Mbps (likely conservative, with the exception of streaming services, which could be lower), one would expect that the time spent consuming Network Resources would be no more than 120 minutes per month or 5 minutes per day (at R99 384kbps this would be ca. 6 minutes per day). If I were to choose a more sophisticated QoS distribution, the Network Consumption Time would in any case not change by an order of magnitude or more.
So we have 5 minutes of Mobile Data Network Time Consumption daily versus O2’s Smartphone usage time of 106 minutes (without Calls & SMS) … A factor of more than 20 in difference!
For every minute of mobile data network consumption the customer spends 20+ minutes actively with his device (i.e., reading, writing, playing, etc.).
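The back-of-the-envelope estimate can be checked with a short sketch. It assumes decimal units (1MB = 8 megabits) and a 30-day month, neither of which the text states explicitly; the names are my own. Under these assumptions 450MB works out to roughly 4 minutes per day at 0.5Mbps and a little over 5 minutes at 384kbps, so the 5 and 6 minutes quoted above read as rounded upper bounds. Either way the conclusion of a 20+ factor holds.

```python
DAYS_PER_MONTH = 30        # assumption, not stated in the text
USAGE_MIN_PER_DAY = 106    # O2 daily usage time excluding Calls & SMS (128 - 22)


def network_minutes_per_month(volume_mb, speed_mbps):
    """Time needed to transfer `volume_mb` megabytes at a constant `speed_mbps`."""
    megabits = volume_mb * 8           # 1 byte = 8 bits
    seconds = megabits / speed_mbps
    return seconds / 60


per_month = network_minutes_per_month(450, 0.5)
per_day = per_month / DAYS_PER_MONTH
per_day_r99 = network_minutes_per_month(450, 0.384) / DAYS_PER_MONTH

print(f"0.5 Mbps: {per_month:.0f} min/month, {per_day:.1f} min/day")
print(f"R99 384 kbps: {per_day_r99:.1f} min/day")
print(f"usage time vs network time: {USAGE_MIN_PER_DAY / per_day:.0f}x")
```

Varying the volume between 400MB and 500MB, or the assumed speed, moves the daily figure by a minute or so but not by anything close to an order of magnitude.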
So …. Can we trust the O2 Smartphone data?
Trend-wise the data certainly appears reasonable! Whether the data represents a majority of the O2 smartphone users or not … I somewhat doubt. However, without a more detailed explanation of data collection, sampling, and analysis, it is difficult to conclude how representative the O2 Smartphone data really is of their Smartphone customers.
Alas, this is the problem with most of the mobile data user and usage statistics being presented to the public as an average (I have had my share of this challenge as well).
Clearly we spend a lot more time with our device than the device spends actively on the mobile network. This trend has been known for a long time from the fixed internet. O2 points out that the Smartphone, with its mobile applications, has become the digital equivalent of a “Swiss Army Knife” and as a consequence (as Tomi also points out in his Blog) is already in the process of replacing a host of legacy consumer devices, such as the watch, alarm clock, camera (both still pictures and video), books, music players, radios, and of course, last but not least, substituting The POMTS.
I have argued, and shown by example, that the Average Numbers we are presented with are notoriously unreliable by nature. What other choices do we have? Would it be better to report the Median rather than the Average (or Mean)? The Median divides a given consumptive distribution in half (i.e., 50% of customers have a consumption below the Median and 50% above). Alternatively, we could report the Mode, which would give us the most frequent consumption across our consumer distribution.
Of course, if consumer usage were normally distributed (i.e., a symmetric bell shape), Mean, Median and Mode would be one and the same (and we would all be happy and bored). No such luck!
Most consumptive behaviors tend to be much more skewed and asymmetric (i.e., “the few take the most”) than the normal distribution (that most of us instinctively use when we are presented with figures). Most people are not likely to spend much thought on how a given number is calculated. However, it might be constructive to provide the percentage of customers whose usage is below the reported average. The reader should note that if this percentage differs from 50%, the consumptive distribution is skewed and the onset of Reality Distortion has occurred.
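As an illustration of how far Mean, Median and Mode can drift apart, here is a minimal sketch using Python's standard `statistics` module. The usage sample of 100 customers is hypothetical, constructed only so that a few heavy users carry most of the traffic; it is not measured data.

```python
from statistics import mean, median, mode

# Hypothetical monthly usage (MB) for 100 customers; invented for illustration only.
usage = [5] * 30 + [10] * 25 + [20] * 20 + [50] * 15 + [1000] * 6 + [3000] * 4

mean_mb = mean(usage)
median_mb = median(usage)
mode_mb = mode(usage)

# Share of customers whose usage lies below the reported average.
share_below_mean = sum(1 for u in usage if u < mean_mb) / len(usage)

print(f"mean:   {mean_mb:.1f} MB")
print(f"median: {median_mb:.1f} MB")
print(f"mode:   {mode_mb} MB")
print(f"customers below the mean: {share_below_mean:.0%}")
```

In this made-up sample the mean is close to 200MB while the median and mode sit at 10MB and 5MB, and 90% of the customers fall below the reported average, which is exactly the kind of percentage that would warn a reader that the average is not "normal".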