Average: I Do Not Think It Means What You Think It Means

Last month, I wrote a post titled “Statistics: I Do Not Think It Means What You Think It Means.” The title paid homage to one of my favorite sources of movie quotes, The Princess Bride (you can view this particular quote on YouTube). Since that article, I have read no fewer than a dozen more posts from bloggers who try to draw conclusions from social media data by calculating averages and medians. When it comes to using these calculations in social media, all I can say is, “You keep using that word. I do not think it means what you think it means.”

Primer (I promise this won’t hurt)

Most of the statistical jargon we use (like averages, medians, and standard deviations) is easiest to interpret for data populations that follow a normal distribution, popularly known as a “bell curve.” This describes data that is more or less symmetrically centered around an average (or “mean”) value. The median is a line of demarcation: half of the values fall above it and half fall below. In a normal distribution, the mean and the median coincide. The width of the bell is defined by the standard deviation. A high standard deviation results in a short, wide bell, while a low standard deviation results in a tall, narrow bell.

Graph depicting a normal distribution and its mean.

The key to normal distributions is randomness. Without a process that is truly random, you will not have a normal distribution, and terms such as average and median can no longer be read the usual way. This is a common pitfall in statistical process control, where processes controlled by machines are sometimes not as random as people think. The caution here is that if you’re going to rely on statistics like average, median, and standard deviation, you must be sure that the data population forms a normal distribution. Otherwise, the numbers will not mean what you think they mean.

Enter the power law distribution, also known as a Pareto distribution or the 80/20 rule. Unlike a normal distribution, it is not symmetrical: the values are heavily weighted toward one end of the graph, with a long tail trailing off toward the other. This type of distribution typically describes social and economic data, and it is especially common in social media.

Graph of a power law (or Pareto) distribution and its mean

A power law distribution can represent participation rates in social media; for example, the number of contributions (Y axis) per participant (X axis).  It can also be used to describe a demand curve; for example, the price of a given product (Y axis) versus the demand for it (X axis).
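
To make the contrast concrete, here is a minimal Python sketch that compares the mean and median of a simulated bell-curve sample with those of a simulated power-law sample. Everything numeric in it is an illustrative assumption rather than a figure from real social media data: the sample size, the seed, the normal parameters (mean 100, standard deviation 15), and the Pareto shape of 1.5. The helper name `summarize` is likewise made up for this sketch.

```python
import random
import statistics

random.seed(42)  # fixed seed so the sketch is repeatable
N = 10_000       # illustrative sample size (assumption)

# Bell-curve sample: mean 100, standard deviation 15 (assumed values).
normal_sample = [random.gauss(100, 15) for _ in range(N)]

# Power-law sample: Pareto distribution with shape 1.5 (assumed value).
pareto_sample = [random.paretovariate(1.5) for _ in range(N)]


def summarize(values):
    """Return the mean, the median, and the share of values below the mean."""
    mean = statistics.fmean(values)
    median = statistics.median(values)
    below_mean = sum(v < mean for v in values) / len(values)
    return mean, median, below_mean


for label, sample in (("normal", normal_sample), ("power law", pareto_sample)):
    mean, median, below_mean = summarize(sample)
    print(f"{label:>9}: mean={mean:.2f} median={median:.2f} "
          f"below mean={below_mean:.0%}")
```

In the bell-curve sample, the mean and median nearly coincide and about half of the values sit below the mean. In the power-law sample, a handful of very large values pull the mean well above the median, so most of the population falls below the “average.”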

What I Think It Means

This is very much a case of caveat emptor. I’m not saying that the word “average” is meaningless in power law distributions, just that it has a very different meaning than most people understand it to have. If you read, for example, an article that talks about the “average number of retweets” (or posts, or likes), you must first determine whether the data population from which it was calculated follows a normal distribution. This can be checked pretty easily with a mathematical calculation, but many of the people touting these statistics don’t know how, or even why, they should.
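
The post does not say which calculation it has in mind. One simple possibility, sketched here in Python as an assumption rather than as the intended method, is sample skewness: a number that stays near zero for symmetric, bell-shaped data and grows large when a few big values stretch one tail. The function name `skewness` and the retweet counts are hypothetical and exist only for illustration.

```python
import statistics


def skewness(values):
    """Sample skewness: the average cubed z-score (population form).

    Roughly 0 for symmetric, bell-shaped data; large and positive when
    a few very big values stretch the right-hand tail.
    """
    mean = statistics.fmean(values)
    stdev = statistics.pstdev(values)
    if stdev == 0:
        raise ValueError("skewness is undefined when all values are equal")
    return sum(((v - mean) / stdev) ** 3 for v in values) / len(values)


# Hypothetical retweet counts for ten accounts (made-up numbers).
retweets = [0, 0, 1, 1, 1, 2, 2, 3, 5, 250]

print(f"mean     = {statistics.fmean(retweets):.1f}")
print(f"median   = {statistics.median(retweets)}")
print(f"skewness = {skewness(retweets):.2f}")
```

With these made-up counts the mean is 26.5 while the median is 1.5, and the skewness comes out strongly positive. A result like that suggests the “average” is being dragged upward by one or two outliers, and that the median describes the typical account far better.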

So before you make any decisions based on what the “average” person does, make sure it means what you think it means.


Making membership free will not, in and of itself, build an effective army. First, members must be recruited. This is where the long tail comes into play. Next, they must be equipped with the latest technology, afforded competent and inspiring leaders, and trained in effective tactics.

Recruitment

The “Long Tail” is a phrase attributed to Wired magazine editor Chris Anderson, who wrote an article in 2004 about “Why the Future of Business is Selling Less of More.” Engineers and economists would already be familiar with the numerical component of this phenomenon, known as the power law distribution curve or, more colloquially, the “80/20” rule. The transaction costs historically associated with organizing people required that institutions focus scarce resources (people and money) on the top 20% of people who would contribute 80% of the work, value, donations, revenue, content, and so on.
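
As a rough illustration of the 80/20 arithmetic, the following Python sketch measures how much of the total output comes from the most active 20% of members. The function name `top_share` and the per-member post counts are hypothetical, chosen only to make the numbers easy to follow.

```python
def top_share(contributions, fraction=0.2):
    """Share of the total produced by the top `fraction` of contributors."""
    if not contributions:
        raise ValueError("need at least one contributor")
    ranked = sorted(contributions, reverse=True)
    # Always count at least one contributor in the "top" group.
    top_count = max(1, round(len(ranked) * fraction))
    total = sum(ranked)
    if total == 0:
        raise ValueError("total contribution is zero")
    return sum(ranked[:top_count]) / total


# Hypothetical posts per member for ten members (made-up numbers).
posts = [120, 40, 9, 7, 6, 5, 4, 4, 3, 2]

print(f"top 20% share:   {top_share(posts):.0%}")
print(f"long-tail share: {1 - top_share(posts):.0%}")
```

In this made-up example, two of the ten members account for 160 of the 200 posts, or 80%. The remaining eight members are the long tail: individually small, but together responsible for the other 20%.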

I think the current strategies are more focused on maintaining the 20% than pulling in the other 80%. Pulling them in is possible now because (assuming a freemium model) the transaction costs to the organization and the barriers to entry for the members are essentially nil. One example (though certainly not the only one) is the reluctance to embrace strategies that rely exclusively on Internet-based technologies. The reasoning is that not all members have broadband access or are active enough, which may or may not be true (show me the numbers). But the point is that it doesn’t matter. I submit that losing a few thousand members in order to gain a hundred thousand is a good trade. Sometimes you need to fire your customers and go find better ones.

The benefits of a long tail may be self-evident to anyone reading this, but in case they are not I will provide two anecdotes. The first is taken from Clay Shirky’s book “Here Comes Everybody” which is, incidentally, the most profound book you can read on the subject of the “power of organizing without organizations.” Shirky discusses Microsoft’s fundamental misunderstanding of open source software. Microsoft long fought against the concept of open source, calling it a myth that it was developed by thousands of programmers, because only a few hundred contributed more than a few lines of code. He goes on to say, “It’s easy to see, from McGrath’s [Microsoft U.K. executive] point of view, why the open source model is the wrong way to design an operating system: when you hire programmers, they drain your resources through everything from salary to health care to free Cokes in the break room.” Microsoft simply can’t afford to pay a programmer’s entire annual salary for a mere two dozen lines of code. But what if that code fixed a buffer overflow vulnerability that put millions of computers at risk? Borrowing from the cheesy world of infomercials, now how much would you pay? The point here is that when the transaction costs of organizing are free, so are the failures. You can afford the hundreds or thousands of failures in exchange for one game-changing success.

The second anecdote is simply an interesting description of Amazon.com’s business model. As one employee described it, “We sold more books today that didn’t sell at all yesterday than we sold today of all the books that did sell yesterday.” That one takes a few moments to digest, but it’s the quintessential differentiator between atoms and bits, scarcity and abundance, costly and free.

Equipment

Once free membership and the long tail begin filling the membership hopper, the next step is to “arm” the members with the latest technology. Just as a soldier’s weaponry depends on his or her mission, so too must our tool sets match the mission.

  • Listening – Our members must have state-of-the-art tools for listening to ISA activity in their preferred communication channels. These may include any combination of RSS, email, Twitter, FriendFeed, Facebook, and others. And by the way, these will change from year to year.
  • Sharing – When members have news or interesting ideas to share, there need to be easy and efficient ways to share that information. Again, there is no single tool or technology, but ISA must be connected with all of the common platforms. Automated social media channels could be set up to mash up mentions on various networks.
  • Collaborating – Every Department, Division, and standards committee must have its own shared workspace for effective collaboration. This could be done today for free using sites like Ning. These workspaces are currently being assembled ad hoc, which is obviously inefficient and alienates ISA from the process.
  • Publishing – In order to leverage the contributions of the long tail, members must have a platform to easily publish their work, accomplishments, opinions, and (perhaps) open letters to ISA leaders. There are a multitude of technologies available to do this in extremely interesting and profound ways. One of the simpler options is to enable a blogging platform where all members get to publish in a common area, perhaps organized by subject. This is not unlike traditional user forums, with a few exceptions: blogs allow complex content and multimedia (images, video, and presentations), they are more easily indexed by search engines, and they are an easier platform to monitor.

Leadership

It is important not to look at the membership as one big mob, but as a collection of interconnected networks. Member Sally may be a cybersecurity expert in HMI systems and wireless communication with zero interest in the finer points of flow measurement and calibration. Sam may be a pump designer who does care about flow. Dan is VP of marketing for a large automation distributor and needs to know a little bit about everything. With the proper tools in place as already described, these small networks will organize themselves and the leaders will naturally rise to the challenge. ISA’s job will transition out of the planning and organization business and into the coordination business. It will be important to provide these leaders with the tools, support, and motivation to succeed at leading their respective tribes.

Tactics

Some members will be right at home in this new paradigm (they will be the first generation of leaders). Others will need some training, best practices, guidelines, tips, and hacks. The better ISA can train the leaders in effective tactics, the more value the members will be able to provide. It’s not unlike training office staff in using Word and Excel.
