Saturday, 16 November 2019

Maximum Entropy

On discussing Rossini's music with Loet Leydesdorff a couple of weeks ago (after we had been to a great performance of the Barber of Seville), I mentioned the amount of redundancy in the music - the amount of repetition. "That increases the maximum entropy," he said. This has set me thinking, because there is a lot of confusion about entropy, variety, uncertainty and maximum entropy.

First of all, the relationship between redundancy and entropy is one of figure and ground. Entropy, in Shannon's sense, is a measure of the average surprisingness in a message. That surprisingness is partly produced because all messages are created within constraints - whether it is the constraints of grammar on words in a sentence, or the constraints of syntax and spelling in the words themselves. And there are multiple constraints - letters, words, grammar, structure, meaning, etc.

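Entropy is easy to calculate. There is a famous formula without which much on the internet wouldn't work: H = -Σ p_i log2(p_i), where p_i is the probability of the i-th symbol in the "alphabet" and the sum runs over every symbol.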
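To make "easy to calculate" concrete, here is a minimal sketch in Python. It assumes that the probabilities are estimated from symbol frequencies in the message itself and that the log is base 2 (so the result is in bits); the function name `shannon_entropy` is just an illustrative choice.

```python
from collections import Counter
from math import log2


def shannon_entropy(message):
    """Average surprisingness of `message` in bits per symbol.

    Assumption: each symbol's probability is its relative frequency
    in the message.
    """
    counts = Counter(message)
    total = len(message)
    # p * log2(1/p) is the same as -p * log2(p), summed over the symbols
    return sum((n / total) * log2(total / n) for n in counts.values())


print(shannon_entropy("aaaa"))  # one symbol only: nothing is surprising
print(shannon_entropy("abab"))  # two equally likely symbols
print(shannon_entropy("abcd"))  # four equally likely symbols
```

A message made of a single repeated symbol has zero entropy, and the value grows as the symbols become more numerous and more evenly spread.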



Of course, there are lots of questions to ask about this formula. Why is the log there, for example? Just to make the numbers smaller? Or to give weight to something? (Robert Ulanowicz takes this route when arguing that the log was there in Boltzmann in order to weight the stuff that wasn't there.)

Redundancy can be calculated from entropy, at least theoretically.

Shannon's formula implies that for any "alphabet" there is a maximum value of entropy, reached when every symbol is equally likely. It is called the maximum entropy. If the measured entropy is seen as a number between 0 and the maximum entropy, then to calculate the "ground", or the redundancy, we simply take the ratio of the measured entropy to the maximum entropy and subtract it from 1.
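Written out, and assuming an alphabet of N symbols with the log taken to base 2 (the symbols N and R are my labels here), the calculation looks like this, with a small worked case for four symbols with probabilities 1/2, 1/4, 1/8 and 1/8:

```latex
H_{\max} = \log_2 N, \qquad R = 1 - \frac{H}{H_{\max}}

% Worked case: N = 4, p = (1/2, 1/4, 1/8, 1/8)
H = \tfrac{1}{2}\cdot 1 + \tfrac{1}{4}\cdot 2 + \tfrac{1}{8}\cdot 3 + \tfrac{1}{8}\cdot 3 = 1.75
\qquad
H_{\max} = \log_2 4 = 2
\qquad
R = 1 - \frac{1.75}{2} = 0.125
```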

Now mathematically, if the redundancy (R) increases, then either the amount of information (H) decreases or the maximum entropy (Hmax) increases. If we simply repeat things, then you could argue that H goes down because the message becomes less surprising, and therefore R goes up. If by repeating things we generate new possibilities (which is also true in music), then we could say that Hmax goes up.
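The two routes can be seen in a toy calculation. This is only a sketch: it assumes R = 1 - H/Hmax with Hmax = log2 of the alphabet size, that probabilities are relative frequencies, and that the alphabet size is something we declare by hand (the parameter `alphabet_size` is hypothetical, and must be greater than 1).

```python
from collections import Counter
from math import log2


def entropy(seq):
    """Shannon entropy in bits, using relative frequencies as probabilities."""
    total = len(seq)
    return sum((n / total) * log2(total / n) for n in Counter(seq).values())


def redundancy(seq, alphabet_size):
    """R = 1 - H / Hmax, with Hmax = log2(alphabet_size).

    Assumption: alphabet_size is declared by the caller and is > 1.
    """
    h_max = log2(alphabet_size)
    return 1 - entropy(seq) / h_max


baseline = redundancy("abcd", alphabet_size=4)     # every option used equally
repetition = redundancy("aaab", alphabet_size=4)   # same options, H goes down
new_options = redundancy("abcd", alphabet_size=16) # same message, Hmax goes up

print(baseline, repetition, new_options)
```

In both the second and third cases R rises above the baseline, but for different reasons: in one the message became less surprising, in the other the space of things that could have been said got bigger.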

No composer, and no artist, ever literally repeats something. Everything is varied (the variation form in music being the classic example). Each new variation is an alternative description, and each introduces new possibilities. So I think it is legitimate to say the maximum entropy increases.

Now, away from music, what do new technologies do? Each of them introduces a new way of doing something. That too must be an increase in the maximum entropy. It's not an increase in entropy itself. So new technologies introduce redundant options which increase maximum entropy.

If maximum entropy is increased, then the complexity of messages also increases - or rather the potential for disorder and surprise. The important point is that in communicating and organising, one has to make a selection. Selection, in this sense, means to reduce the amount of entropy so that against however many options we have, we insist on saying "it's option x". Against the background of increasing maximum entropy, this selection gets harder. This is where "uncertainty" lies: it is the index of the selection problem within an environment of increasing maximum entropy.

However, there is another problem which is more difficult. Shannon's formula for entropy counts an "alphabet" of signals or events like a, b, c, etc. Each has a probability and each contributes a term to the sum. Is an increase in the maximum entropy an increase in the alphabet of countable events? Intuitively it feels like it must be. But how can a calculation be made when, at any given moment, the full alphabet is incomplete?
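One way to feel the difficulty is to recompute everything after each new event, treating the alphabet as "whatever has been seen so far". This is only an illustration of the problem, not a solution to it, and the function name `running_entropy` and the example sequence are made up for the purpose.

```python
from collections import Counter
from math import log2


def running_entropy(events):
    """Yield (step, alphabet_size, H, Hmax) after each event.

    Assumption: the alphabet at each step is just the set of symbols
    observed so far, so Hmax can only grow as new symbols turn up.
    """
    counts = Counter()
    for step, event in enumerate(events, start=1):
        counts[event] += 1
        h = sum((n / step) * log2(step / n) for n in counts.values())
        h_max = log2(len(counts))
        yield step, len(counts), h, h_max


for row in running_entropy("aabacabd"):
    print(row)
```

Each time a symbol appears for the first time, Hmax jumps, and every earlier calculation turns out to have been made against an alphabet that was too small.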

This is the problem of the non-ergodic nature of life processes. I've attempted a solution to this which examines the relative entropies over time, considering new events as unfolding patterns in these relations. It's a bit simplistic, but it's a start. The mechanism that seems to drive coherence is able, through the production of redundancies which increase maximum entropy, to construct over time a pattern which serves to make the selection and reduce the entropy to zero. This is wave-like in nature. So the process of increasing maximum entropy, which leads to a selection that reduces entropy to zero, is followed by another wave, building on the first, but basically doing the same thing.

In the end, everything is zero.
