Tuesday 5 November 2019

Non-Linear Dynamics, Machine Learning and Physics Meet Education

In my recent talk about machine learning (in which I've been particularly focussing on convolutional neural networks, because they present such a compelling case for how the technology has improved), I explored the recursive algorithms which can be used to classify data, such as k-means. The similarity between the non-linear dynamics of agent-based modelling and the recursive optimisation of loss functions in convolutional neural network training is striking. It is hard for people new to machine learning to understand that we know very little of what is going on inside. The best demonstration of why we know so little is the non-linear dynamic emergent behaviour of an agent-based model. Are they actually the same thing in different guises? If so, then we have a way of thinking about their differences.
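To make the recursion concrete, here is a minimal sketch of the k-means loop (my own illustration, not from the talk; the two-blob data and the choice of two clusters are invented). Each pass feeds the previous assignment back into the computation of the centroids, and the loop runs until it reaches a fixed point:

```python
import numpy as np

def k_means(points, k, iterations=100, seed=0):
    """Iteratively re-assign points and re-compute centroids until nothing moves."""
    rng = np.random.default_rng(seed)
    centroids = points[rng.choice(len(points), size=k, replace=False)]
    for _ in range(iterations):
        # Assign each point to its nearest centroid...
        distances = np.linalg.norm(points[:, None] - centroids[None, :], axis=2)
        labels = distances.argmin(axis=1)
        # ...then feed that assignment back into the centroid calculation.
        new_centroids = np.array([points[labels == i].mean(axis=0) for i in range(k)])
        if np.allclose(new_centroids, centroids):
            break  # a fixed point: the recursion has settled
        centroids = new_centroids
    return labels, centroids

# Invented data: two fuzzy blobs in the plane.
rng = np.random.default_rng(1)
data = np.vstack([rng.normal(0, 1, (50, 2)), rng.normal(5, 1, (50, 2))])
labels, centroids = k_means(data, k=2)
```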

The obvious difference is time. A non-linear agent-based model's behaviour emerges over time. Some algorithms will settle on fixed points (if k-means didn't do this, it would be useless), while other models will continue to feed their outputs into their inputs, endlessly producing streams of emergent behaviour. The convolutional process appears to settle on fixed points, but in fact it rarely fully "settles": one can run the Python "model.fit()" call for ever, and no completely stable model emerges, although stability is established within a small fluctuating range.
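The logistic map makes the contrast vivid (again, my own toy example, not something from the talk): the same recursive rule settles on a fixed point for one parameter value and never settles for another.

```python
def logistic_map(r, x0=0.2, steps=200):
    """Feed the output back into the input: x -> r * x * (1 - x)."""
    xs = [x0]
    for _ in range(steps):
        xs.append(r * xs[-1] * (1 - xs[-1]))
    return xs

print(logistic_map(2.8)[-3:])  # settles on a fixed point near 0.643
print(logistic_map(3.9)[-3:])  # never settles: an endless stream of behaviour
```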

I discussed this fluctuation with the Belgian mathematician Daniel Dubois yesterday. Daniel's work is on anticipatory systems, and he built a mathematical representation of the dynamics originally described by the biologist Robert Rosen. Anticipation, in the work of Dubois, results from fractal structures. In a sense, this is obvious: to see the future, the world needs to be structured in such a way that patterns established in the past can be seen to relate to the future. If machine learning systems are anticipatory (and they appear to be able to predict categories of data they haven't seen before), then they too will contain a fractal structure.
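Dubois's signature example, as I understand it, is the "incursive" logistic map, in which the next state appears on both sides of its own equation: x(t+1) = a * x(t) * (1 - x(t+1)). Solving for x(t+1) gives a closed form, and the striking result is that the anticipatory version is stable at parameter values where the ordinary recursion is chaotic. A sketch (the parameter values are my own choice):

```python
def recursive_step(a, x):
    # Classical recursion: the future depends only on the past.
    return a * x * (1 - x)

def incursive_step(a, x):
    # Dubois's incursion: x(t+1) = a * x(t) * (1 - x(t+1)),
    # solved algebraically for x(t+1).
    return a * x / (1 + a * x)

x_rec = x_inc = 0.3
for _ in range(100):
    x_rec = recursive_step(3.9, x_rec)
    x_inc = incursive_step(3.9, x_inc)

print(x_rec)  # still wandering chaotically
print(x_inc)  # settled: anticipation has stabilised the system
```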

Now a fractal is produced through a recursive non-linear process which results in fixed points. This all seems to be about the same thing. So the next question (one which I was asking both Daniel Dubois and Loet Leydesdorff, whom I saw at the weekend) is: how deep does this go? For Loet, the fractal structures are in communication systems (Luhmann's social systems), and (importantly) they can be analysed using Shannon's information theory. Daniel (on whose work Loet has constructed his system) agrees. But when we met, he was more interested in talking about his work in physics on the Dirac equation and what he believes to be a deeper significance of Shannon. I don't fully understand this yet, but we both agreed that if there is a deeper significance to Shannon, then it was a complete accident, because Shannon only half-understood what he was doing... Half-understanding things can be a way forward!
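For anyone who wants Shannon's measure in front of them, it fits in a few lines. The distributions below are invented, but they show how the measure distinguishes an uncertain system of communications from a highly redundant one:

```python
import math

def shannon_entropy(probabilities):
    """H = -sum(p * log2(p)): the average information per message, in bits."""
    return -sum(p * math.log2(p) for p in probabilities if p > 0)

# Invented distributions over four categories of communication.
print(shannon_entropy([0.25, 0.25, 0.25, 0.25]))  # 2.0 bits: maximum uncertainty
print(shannon_entropy([0.97, 0.01, 0.01, 0.01]))  # ~0.24 bits: highly redundant
```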

Daniel's work on Dirac mirrors that of both Peter Rowlands in Liverpool and Lou Kauffman in Chicago (and now Novosibirsk). They all know each other very well. They all think that the physical world is basically "nothing". They all agree on the language of "nilpotents" (things which multiply to zero) and quaternions (a four-dimensional extension of the complex numbers which produces a rotational geometry) as the fundamental building blocks of nature. There is an extraordinary intellectual confluence emerging here which unites fundamental physics with technology and consciousness. Who could not find that exciting? It must have significance for education!
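For readers new to the vocabulary, a toy illustration may help (the matrix representations are my own choice of example, not the formalism of Rowlands, Kauffman or Dubois): a nilpotent is a non-zero object which multiplies with itself to give zero, and the quaternion units i, j, k generate a non-commutative, rotational algebra.

```python
import numpy as np

# A nilpotent: non-zero, yet it multiplies with itself to give zero.
N = np.array([[0, 1],
              [0, 0]])
print(N @ N)  # the zero matrix

# Quaternion units as 2x2 complex matrices: i^2 = j^2 = k^2 = -1.
I2 = np.eye(2)
i = np.array([[1j, 0], [0, -1j]])
j = np.array([[0, 1], [-1, 0]])
k = np.array([[0, 1j], [1j, 0]])
assert np.allclose(i @ i, -I2) and np.allclose(j @ j, -I2) and np.allclose(k @ k, -I2)
assert np.allclose(i @ j, k) and np.allclose(j @ i, -k)  # rotation: order matters
```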

What's it all about? The clue is probably in Shannon: information. And I think it is not so much the information involved in learning processes (which has always been the focus of cognitivism) as the way information is preserved in institutions: from the very small institutions of friendship and family, to larger ones like universities and countries.

Our technologies are technologies of categorisation, and they throw away information. Since the computer revolution, holes have appeared in our social institutions which have destabilised them. The anticipatory function, which is essential to all living things, was replaced with a categorising function. The way we use machine learning also tends to categorise: this threatens to make things worse. But if it is an anticipatory system, it can do other things: it can provide a stimulus for thought and conversation, and in the process put information back into the system.
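A concrete way to see the information being thrown away (my illustration, with invented numbers): a classifier produces a whole distribution of possibilities, but categorisation keeps only the winning label.

```python
import math

# A hypothetical classifier's output: a distribution over three categories.
probabilities = {"cat": 0.5, "dog": 0.3, "fox": 0.2}

# The distribution carries measurable information...
entropy = -sum(p * math.log2(p) for p in probabilities.values())
print(f"before categorisation: {entropy:.2f} bits of nuance")  # ~1.49 bits

# ...but categorisation collapses it to a single label and discards the rest.
print("after categorisation:", max(probabilities, key=probabilities.get))
```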

That is the hope. That is why we need to understand what this stuff does. And that is why, through understanding what our technology does, we might understand not only what we do, but what our institutions need to do to maintain their viability.

Education is not really about schools and universities. Those are examples of institutions which are now becoming unviable. Neither, I think, is it really about "learning" as such (as a psychological process, which is ultimately uninspectable). Education is about "institutions" in the broadest sense: families, friendships, coffee bars, businesses, hospitals... in fact, anywhere that maintains information. To understand education is to understand how the processes which maintain information really work, how they can be broken with technologies, and how they can be improved with a different approach to technology.
