Tuesday 2 May 2017

Relative Entropy in the Analysis of Educational Video

Relative entropy is a calculation much used by quantum physicists to measure degrees of entanglement between subatomic particles. Its formal form is the Kullback-Leibler equation:

$$D_{KL}(P \| Q) = \sum_i P(i) \log \frac{P(i)}{Q(i)}$$

It isn't as scary as it looks (information theory rarely is!) - it's basically a measure of the distance between a probability distribution P and a distribution Q. It isn't strictly a metric, since swapping P and Q generally gives a different value, but it is zero exactly when the two distributions are identical. So if two subatomic particles are entangled (in other words, their behaviour is coordinated), then the distance between the probability distributions of their behaviour (their expected states) will be zero.
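To make the Kullback-Leibler calculation concrete, here is a minimal Python sketch for two discrete distributions. The function name `kl_divergence` and the example numbers are my own assumptions for illustration; nothing here comes from the analysis itself.

```python
import math

def kl_divergence(p, q):
    """Relative entropy D_KL(P || Q) in bits: sum of P(i) * log2(P(i) / Q(i)).

    p and q are lists of probabilities over the same outcomes.
    """
    if len(p) != len(q):
        raise ValueError("p and q must cover the same outcomes")
    total = 0.0
    for pi, qi in zip(p, q):
        if pi == 0:
            continue  # terms with P(i) = 0 contribute nothing
        if qi == 0:
            return math.inf  # P puts weight where Q puts none
        total += pi * math.log2(pi / qi)
    return total

# Made-up distributions, purely for illustration.
p = [0.5, 0.5]
q = [0.9, 0.1]

same = kl_divergence(p, p)      # identical distributions give zero
forward = kl_divergence(p, q)   # D_KL(P || Q)
backward = kl_divergence(q, p)  # D_KL(Q || P), generally a different value

print(same, forward, backward)
```

The two directions give different numbers, which is why relative entropy is better thought of as a divergence than as a true distance.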

That quantum physics tells us something we already know about nature and social life is reflected in the various fluffy uses of "entanglement" (e.g. Latour, Barad, etc.) in the social science literature. But this is rarely done with any real insight into what it actually means. It basically seems to say "it's complex, innit!".

I'm grateful to Loet Leydesdorff for pointing me in the direction of Kullback-Leibler after I asked for some way of measuring the synergy between the entropy values of different variables. My inspiration for asking this was in thinking about music. Music presents many descriptions to us: rhythm, melody, harmony, timbre, dynamics, etc. Something happens in music when the change in any of these dimensions is accompanied by a similar change in another dimension: so the rhythm changes with melody, for example. At these moments, we often detect some new idea or motif - it's at these moments that things grow. Basically, I'm drawing on a musical experiment I did a few years ago: http://dailyimprovisation.blogspot.co.uk/2015/09/entropy-and-aesthetics-some-musical.html

The same kind of technique can be applied to the analysis of video. Like music, video presents many different descriptions of things. 

I've been looking at Vi Hart's wonderful video on Fibonacci numbers and spirals. 



There is a rich range of descriptions contained in this video, and I was wondering how the probability distribution of each description relates to the distributions of the other descriptions. So I've been doing some analysis, using Kinovea for video analysis, Pure Data for analysis of the pitch and rhythm of speech, and YouTube to produce a transcript of the video from which I can do some entropy calculations.

After munching on the data and converting it into a form I can deal with, I've imported it all into a Jupyter notebook using Python's pandas DataFrames, queried it using SQL (using the pysqldf library), and done entropy calculations on the whole thing.

My code is still a bit rubbish, but it's beginning to tell me things. For example, I can look at the changes in entropy of the transcribed text over window periods as the video progresses. So here is a list of the first 20 seconds in 5-second chunks:

0-5: -0.25206419825534054
5-10: -0.24292065819269668
10-15: -0.3868528072345415
15-20: -0.3333333333333334
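For anyone wanting to try the windowing idea, here is a rough sketch using only the standard library. It is not my notebook code: the function names and the timed words are invented for illustration, and it uses the textbook Shannon entropy in bits, which is never negative. The figures listed above are negative, which suggests my notebook uses a different sign convention or normalisation, so this sketch does not try to reproduce them.

```python
import math
from collections import Counter, defaultdict

def shannon_entropy(items):
    """Shannon entropy in bits of the empirical distribution of items."""
    counts = Counter(items)
    n = sum(counts.values())
    # p * log2(1 / p) is the same as -p * log2(p)
    return sum((c / n) * math.log2(n / c) for c in counts.values())

def windowed_entropy(timed_items, window=5.0):
    """Group (time_in_seconds, item) pairs into fixed windows and
    return {(start, end): entropy} for each window that has data."""
    buckets = defaultdict(list)
    for t, item in timed_items:
        buckets[int(t // window)].append(item)
    return {
        (int(k * window), int((k + 1) * window)): shannon_entropy(v)
        for k, v in sorted(buckets.items())
    }

# Invented (time, word) pairs, NOT the real transcript.
words = [
    (0.4, "the"), (1.1, "spiral"), (2.3, "the"), (3.8, "spiral"),
    (5.2, "one"), (6.0, "two"), (7.5, "three"), (9.1, "five"),
]

result = windowed_entropy(words)
for (start, end), h in result.items():
    print(f"{start}-{end}: {h}")
```

A window that repeats two words has lower entropy than a window of the same length in which every word is different.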

Now I can do the same for the 'events' which occur in the video. Here I was a bit stuck for how to describe things, so when she drew a spiral, I simply wrote "spiral". She draws a lot of spirals, so the entropy is uninteresting...

0-5: 0
5-10: 0
10-15: 0
15-20: 0

What? Well, maybe there's an error in my coding - I might go back and add some more detail to my analysis. She keeps on drawing spirals, and therefore the entropy is 0.
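The zero is what the standard Shannon formula gives whenever a window contains only one kind of event. As a quick worked check (assuming the usual definition of entropy, with "spiral" as the only label in the window):

```latex
H(X) = -\sum_{x} p(x) \log p(x),
\qquad
p(\text{spiral}) = 1
\;\Rightarrow\;
H = -1 \cdot \log 1 = 0
```

So the zeros need not mean a bug: they may just mean my event labels are too coarse to show any variety.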

What about the pitch of her voice? That's the interesting one... I did this in PD using fiddle~ (I first played with fiddle~ in PD years ago in improvisation: http://dailyimprovisation.blogspot.co.uk/2008/06/playing-with-pd-fiddle.html - it just goes to show the importance of documenting everything that we do!)
Now the pitches are more interesting than the video events:

0-5: -0.4533324434922346
5-10: -0.366932572935196
10-15: -0.5315857945285835
15-20: -0.6913119495075026

Is there a correlation there? Well, the range of pitches in the voice increases with the variety of vocabulary used in the text. Perhaps that isn't surprising. But it's not surprising for a reason which has everything to do with relative entropy: the entropy of the use of words is likely to be coupled with the pitch, because with more words, there are more syllables and potentially more opportunities for variety in the pitch. Over a more extended period of time, and taking into account that events do occur in the video which increase its entropy, we can start to examine the relationship between the different aspects of what happens. 
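One simple way to put a number on "is there a correlation?" is a Pearson coefficient over the per-window values. The sketch below uses the four text and pitch figures listed above; the `pearson` helper is my own illustrative function, and four windows is far too few to conclude anything, so treat it as a sanity check rather than a result.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    if sx == 0 or sy == 0:
        # A constant series (like the all-zero event entropies)
        # has no defined correlation with anything.
        raise ValueError("correlation is undefined for a constant series")
    return cov / (sx * sy)

# Per-window values copied from the lists above (0-5, 5-10, 10-15, 15-20).
text_entropy = [-0.25206419825534054, -0.24292065819269668,
                -0.3868528072345415, -0.3333333333333334]
pitch_entropy = [-0.4533324434922346, -0.366932572935196,
                 -0.5315857945285835, -0.6913119495075026]

r = pearson(text_entropy, pitch_entropy)
print(r)
```

The coefficient comes out positive for these four windows. The same helper refuses the event series, because a column of zeros cannot co-vary with anything.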

The fact that there is a kind of stable ritual of drawing spirals which runs alongside an increase in the variety of words spoken and pitches used suggests that the actions in the video are a kind of 'accompaniment' to the words that are spoken. To begin with, the ritual of drawing spirals is a kind of 'drone' against which other things happen. As in music, the drone maintains the coherence of the piece.

Imagine if she had started differently: if she had started by doing the maths straight away... then it would have a very different dynamic. The entropies would also be very different.
