And not 3 or 13. Albert Einstein was known for distinguishing 5 different things at a glance (whereas most of us manage around 3). Yet the most successful number for distinguishing differences is 7. Take music, for example, the most widespread “language” in the world. Some 80% of all music is based on a 7-tone scale. There are other scales, like 5-tone (as in Irish music) or 12-tone (as in modern classical music), and in general the lower the number of tones, the more immediately appealing the music is. 5-tone is very popular. But the majority of listeners want more variation, modes, and melodic complexity. That is why 7-tone music is the most popular. Above 7 tones, however, the air gets a little thin. In 12-tone scales, very few listeners find an emotional core. Most simply switch off.
But if you write nice music using 7 tones, you can be sure you will not lack an audience. If, in addition, you play around with 7 different modes, you will be able to fulfil the musical dreams of nearly everyone.
It is kind of funny that this harmonizes with the results of our learning logic, as we use it in AI applications. Human perception goes like this: when you look for the most efficient set of data, you are not done until you have reached something around 7. You may even be able to drive it down to 5, but never lower; you will rarely find a set of 12, and never one above that.
Funny, isn’t it? It doesn’t tell us anything about the reality around us, but it does tell us something about the perception of the home computer that we all carry with us. It can do many things, but when it comes to distinguishing, it is short-sighted.
It also tells a story about Big Data. Dear CIO, if you haven’t identified your set of 7, you can’t manage your data – except for storing it. For instance, if you cannot decide what is trash (which, as everybody knows, is about 99% of data), you have to store it until someone decides. But because you need all your power to store the trash, you don’t find time to identify your value set of 7.
At this point, a friend of mine, who manages a data warehouse, interrupts: ‘All I can do is store the data that comes in. I am not paid to decide what to throw away. That is something others on a higher level need to take responsibility for.’ And who are they? There are usually many theories, but nobody knows for sure who is responsible. And in the end nobody takes a decision.
What sort of “creatures of the night” are hiding in the data? The non-walking undead, undiscovered and non-existent; the shadow world of trash in modern customer warehouses. A final comment from my data warehouse friend: ‘In 4000 years, archaeologists will dive into the data and take a look at us!’ Indeed, on the black market, few things are more appreciated than dinosaur droppings. Let’s produce more of them for the sake of generations to come!
Or let’s make peace with the fact that there is no sense where we don’t look for it; so we should be happy with our 7 because we simply can’t make sense of everything.