Sparse Coding in a Social System

July 2012

How do we understand our social environment? If you're like me, when you think about people you know, you probably think in terms of groups: immediate family, friends from high school, colleagues. Each group contains people who tend to show up together. If I try to remember, for instance, who attended my wedding, I first think of these groups. This type of organization makes sense in terms of what psychologists call "chunking": the idea that people tend to recall complex information by breaking it into logical parts. It is why you can more easily remember the letters of a long password if it consists of familiar words.

When I started collaborating with Jessica Flack and David Krakauer, then at the Santa Fe Institute, we wondered if similar ideas would be useful in the context of a social environment for which we had lots of detailed data: the outbreaks of conflict in a society of macaque monkeys. The fighting behavior of these 47 primates indicates that they have knowledge of an established social hierarchy and that they are able to make strategic decisions based on this knowledge. Given this, we wondered: If a monkey is trying to anticipate common patterns of conflict, would it make sense to memorize the groups that commonly appeared together? And would such a strategy work as well as, say, remembering how often each pair of individuals was simultaneously aggressive?

In answering these questions, we were inspired by the study of sparse coding in neuroscience. Like thinking in terms of groups at a wedding, it turns out that common visual scenes can be described more efficiently in terms of larger shapes and lines than by describing them pixel by pixel. Neuroscientists have discovered that neurons in the brain represent what you see in terms of this kind of sparse code. As you read this sentence, your visual cortex is responding to the lines that make up the text as one of the first steps in the process that allows you to recognize each word. Applying the same logic to the conflict data set, we used a machine learning algorithm to discover the groups of macaques that could most succinctly represent the fights that occurred.
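To make the idea of describing a fight by groups concrete, here is a minimal Python sketch. It is not the machine learning algorithm we used: it assumes the groups are already known and simply describes one fight with as few of them as it can, greedily. The function name `encode_fight`, the individuals, and the groups are all hypothetical toy values, not data from the study.

```python
from typing import FrozenSet, List, Set, Tuple

Group = FrozenSet[str]


def encode_fight(fight: Set[str], groups: List[Group]) -> Tuple[List[Group], Set[str]]:
    """Greedily describe one fight using known groups.

    Returns the groups used and any participants left unexplained.
    This is an illustrative stand-in, not the study's algorithm.
    """
    remaining = set(fight)
    used: List[Group] = []
    while remaining:
        # A group is usable only if all of its members are in the fight
        # and it explains at least one participant not yet covered.
        candidates = [g for g in groups if g <= fight and g & remaining]
        if not candidates:
            break
        # Pick the group that covers the most still-unexplained participants.
        best = max(candidates, key=lambda g: len(g & remaining))
        used.append(best)
        remaining -= best
    return used, remaining


# Hypothetical toy data: three made-up groups and one made-up fight.
groups = [frozenset({"A", "B"}), frozenset({"C", "D", "E"}), frozenset({"F"})]
fight = {"A", "B", "C", "D", "E"}

used, leftover = encode_fight(fight, groups)
print(len(used), sorted(leftover))
```

In this toy case, a fight with five participants is described by naming just two groups, which is the sense in which a group-based description is more succinct than listing individuals one by one.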

We found that this was a good way of representing the patterns of conflict: remembering just the groups found by sparse coding was enough to make predictions about who was likely to show up in a given fight, and the performance was as good as the alternative approach of remembering the relationships between every pair of individuals. The sparse groups were related to known categories, such as kin groups, and their size told us that most of the relevant social structure occurred on a scale of around 2 or 3 individuals.

Finally, in terms of information theory, these representations are a form of compression. Instead of having to remember every possible pattern of conflict, an individual can remember just the sparse groups and still make accurate predictions. In fact, we can estimate how much information an individual would have to remember. By measuring the effects of varying the degree of compression, we estimate that a monkey would have to remember about 1000 bits of social information to make optimal predictions about future fights.
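For a rough sense of scale, the short Python calculation below counts what the uncompressed alternatives would involve. It assumes each fight is recorded only as which of the 47 individuals took part, so every subset of individuals counts as a possible pattern. It does not reproduce our 1000-bit estimate; it only counts the possibilities that compression lets an individual avoid memorizing.

```python
from math import comb

N_MONKEYS = 47  # number of individuals in the society, as stated in the text

# Assumption: a pattern is any subset of individuals (each one is in or out).
possible_patterns = 2 ** N_MONKEYS

# Number of distinct pairs a pairwise memory would have to track.
pair_count = comb(N_MONKEYS, 2)

print(possible_patterns)  # 140737488355328
print(pair_count)         # 1081
```

Even the pairwise approach needs one remembered quantity for each of 1081 pairs, while the full space of patterns is far too large to memorize directly.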

For more information, check out the University of Wisconsin–Madison press release and our full published paper [pdf].