Tuesday 7 August 2018

Learning Analytics and the Ergodic Fallacy

Ergodicity is a statistical property of a system whereby a measure of its behaviour over a particular time period is the same as a measure of its behaviour over a long period. In information theory it is often invoked when comparing the Shannon entropy of a system over one period of time with its entropy over another. For example, Shannon estimated the entropy of the English language from the distribution of its letters. This measurement is ergodic to the extent that a shorter sample is representative of the grammar and syntax of the language, so that over a larger sample the value of the entropy is no different.
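To make the idea concrete, here is a minimal sketch in Python. It is not Shannon's own procedure: the five-letter alphabet and its probabilities are invented for illustration, and the source is deliberately built to be stationary (fixed alphabet, fixed probabilities). Under those assumptions, the entropy of a short sample should land close to the entropy of a much longer one.

```python
import math
import random
from collections import Counter


def shannon_entropy(symbols):
    """Shannon entropy, in bits per symbol, of the empirical distribution."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


# Hypothetical stationary source: the alphabet and the weights never change.
ALPHABET = "etaon"
WEIGHTS = [0.35, 0.25, 0.20, 0.12, 0.08]

rng = random.Random(0)  # fixed seed so the run is repeatable
long_sample = rng.choices(ALPHABET, weights=WEIGHTS, k=200_000)
short_sample = long_sample[:2_000]

h_short = shannon_entropy(short_sample)
h_long = shannon_entropy(long_sample)
print(f"short sample: {h_short:.3f} bits, long sample: {h_long:.3f} bits")
```

The two figures agree closely only because the source was constructed to satisfy the assumptions discussed next.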

The principles of ergodicity rely on some assumptions:
  1. the "alphabet" of what is counted is made explicit at the beginning and does not change;
  2. the constraints which determine the distribution of symbols do not change.
Shannon's reluctance to apply information theory to the study of meaning, or to the more profound features of living systems, rested, I believe, on the fact that he knew that with living systems neither of these assumptions holds.

Living systems do not present a complete "alphabet" of countable things at the outset. Countable things are emergent: before you have fingers, you have cells. Partly this is because the second assumption also fails - the constraints within which living things grow and adapt are continually changing. Indeed, the very process of life is a process of auto-generating constraint - what the biologists call "autocatalysis".
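A toy contrast, again only a sketch: the `growing_source` function below is a hypothetical stand-in, not a model of any real living system. It breaks the first assumption by letting a new symbol become available every thousand steps. An entropy figure taken from an early window then says very little about a later one.

```python
import math
import random
import string
from collections import Counter


def shannon_entropy(symbols):
    """Shannon entropy, in bits per symbol, of the empirical distribution."""
    counts = Counter(symbols)
    total = sum(counts.values())
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def growing_source(steps, rng, grow_every=1_000):
    """Hypothetical non-stationary source whose alphabet grows over time.

    It starts with two symbols; one more becomes available every
    `grow_every` steps, up to the 26 lowercase letters.
    """
    letters = string.ascii_lowercase
    stream = []
    for t in range(steps):
        size = min(2 + t // grow_every, len(letters))
        stream.append(rng.choice(letters[:size]))
    return stream


rng = random.Random(0)  # fixed seed so the run is repeatable
stream = growing_source(20_000, rng)
early, late = stream[:1_000], stream[-1_000:]

h_early = shannon_entropy(early)
h_late = shannon_entropy(late)
print(f"early window: {len(set(early))} symbols, {h_early:.2f} bits")
print(f"late window:  {len(set(late))} symbols, {h_late:.2f} bits")
```

Even this is far too tidy: here the eventual alphabet is known in advance and only the timing of its appearance varies, whereas in living systems the things to be counted are not known at the outset at all.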

Learning is part of a living process. Learning conversations are rather like dances where nobody quite knows the rules, and indeed, the rules change as the dance goes along. But somehow the whole thing has coherence. This is a mystery which I believe the analysis of music can help illuminate. Both music and learning conversations develop by creating new constraints and giving rise to new significant (countable) things. The dynamics and effectiveness of a learning conversation, like those of a piece of music, rest on how those constraints interact. It turns out that each new expression is a redundant expression - it says something that has already been said before, but in a different way.

So if learning is non-ergodic, why the hell do we think we can count it using Shannon formulae and other statistical devices? This point was missed by Cathy O'Neil's otherwise excellent "Weapons of Math Destruction", where she instead focused on the latent (white male) agendas behind the analytics boom. She's right about the agendas, but the question is how they get there.

Repressive agendas arise when a wrong narrative about the way the world is gets asserted. Analytics is an excellent vehicle for this because the study of ergodicity and information theory is not widespread. But the agenda lies in asserting that an alphabet is "fixed" when it is neither fixed nor universal, or that the "rules" of learning are universal and not emergent, or that learning conversations are indicated by clicks on a screen whose coherence is unimportant.

The search for statistical techniques to examine non-ergodic phenomena is well underway - particularly in ecology (see for example https://www.amazon.co.uk/Third-Window-Natural-Beyond-Newton/dp/159947154X or https://papers.ssrn.com/sol3/papers.cfm?abstract_id=3030525).

In education, there simply isn't enough knowledge about complex systems. And in fact, nobody - yet - has an adequate statistical arsenal to shed light on them in any field. So why are we behaving as if we do? Who's setting the agenda?
