Sunday 4 October 2015

Keynes and Hume on Probability - what would they make of Big Data?

Hume dedicates some attention to the problem of probability in his theory of scientific knowledge. One of the most penetrating commentaries on his approach, and on its relation to the thinking of Hume's contemporaries, was produced by John Maynard Keynes in his "Treatise on Probability" of 1921. Keynes's analysis is not often mentioned today, when probability plays an increasing role in underpinning the statistical approaches to big data and information theory. Keynes himself only had to worry about statistical inference in economic and social theory - what would he have said about Shannon's information theory?

Ernst Ulrich von Weizsäcker argues that Shannon's H measure conflates two concepts inherent in meaningful information, 'novelty' and 'confirmation' (see http://www.amazon.co.uk/gp/search?index=books&linkCode=qs&keywords=9783319036625 and Robert Ulanowicz's paper http://www.mdpi.com/2078-2489/2/4/624/pdf). This conflation of novelty and confirmation is something that Keynes also picks up on:
“Uninstructed commonsense seems to be specially unreliable in dealing with what are termed 'remarkable occurrences'. Unless a ‘remarkable occurrence’ is simply one which produces on us a particular psychological effect, that of surprise, we can only define it as an event which before its occurrence is very improbable on the available evidence. But it will often occur—whenever, in fact, our data leave open the possibility of a large number of alternatives and show no preference for any of them—that every possibility is exceedingly improbable à priori. It follows, therefore, that what actually occurs does not derive any peculiar significance merely from the fact of its being ‘remarkable’ in the above sense.”
Keynes builds on Hume's thinking about causes, which emphasises the role of confirmation in causal reasoning:
"All kinds of reasoning from causes or effects are founded on two particulars, viz. the constant conjunction of any two objects in all past experience, and the resemblance of a present object to any of them. Without some degree of resemblance, as well as union, ’tis impossible there can be any reasoning"
"When we are accustomed to see two impressions conjoined together, the appearance or idea of the one immediately carries us to the idea of the other.... Thus all probable reasoning is nothing but a species of sensation. ’Tis not solely in poetry and music, we must follow our taste and sentiment, but likewise in philosophy. When I am convinced of any principle, ’tis only an idea, which strikes more strongly upon me. When I give the preference to one set of arguments above another, I do nothing but decide from my feeling concerning the superiority of their influence.”
Unless scientists can produce event regularities, there is no ground for reasoning about causes. However, if all regularities simply confirmed each other, there would be nothing for each repetition of the confirmation to add. The basis of reasoning is repetition that produces some difference, as Keynes notes:
"The object of increasing the number of instances arises out of the fact that we are nearly always aware of some difference between the instances, and that even where the known difference is insignificant we may suspect, especially when our knowledge of the instances is very incomplete, that there may be more. Every new instance may diminish the unessential resemblances between the instances and by introducing a new difference increase the Negative Analogy. For this reason, and for this reason only, new instances are valuable. "
Keynes's starting point is Hume's thinking about the expectation of the taste of eggs. Here again, Hume indicates the need for balance between novelty and confirmation:
"Nothing so like as eggs; yet no one, on account of this apparent similarity, expects the same taste and relish in all of them. ’Tis only after a long course of uniform experiments in any kind, that we attain a firm reliance and security with regard to a particular event. Now where is that process of reasoning, which from one instance draws a conclusion, so different from that which it infers from a hundred instances, that are no way different from that single instance? This question I propose as much for the sake of information, as with any intention of raising difficulties. I cannot find, I cannot imagine any such reasoning. But I keep my mind still open to instruction, if any one will vouchsafe to bestow it on me."
Keynes argues that Hume's argument combines analogy with induction. There is analogy in the identification of the likeness of phenomena (eggs being alike), and there is induction in that, having experienced so many eggs, a supposition about their taste arises: "We argue from Analogy in so far as we depend upon the likeness of the eggs, and from Pure Induction when we trust the number of the experiments." Keynes also finds echoes of Hume's distinctions in Cournot's theory of probability:
“Cournot, [...] distinguishes between ‘subjective probability’ based on ignorance and ‘objective probability’ based on the calculation of ‘objective possibilities,’ an ‘objective possibility’ being a chance event brought about by the combination or convergence of phenomena belonging to independent series.”
Keynes points out that the balance between analogy and induction is incomplete in Hume's thinking, and that Hume's identification of the contribution of many identical experiments to induction loses sight of the fact that some variation in experiments is a necessary condition for the construction of knowledge:
"His argument could have been improved. His experiments should not have been too uniform, and ought to have differed from one another as much as possible in all respects save that of the likeness of the eggs. He should have tried eggs in the town and in the country, in January and in June. He might then have discovered that eggs could be good or bad, however like they looked. This principle of varying those of the characteristics of the instances, which we regard in the conditions of our generalisation as non-essential, may be termed Negative Analogy. It will be argued later on that an increase in the number of experiments is only valuable in so far as, by increasing, or possibly increasing, the variety found amongst the non-essential characteristics of the instances, it strengthens the Negative Analogy.
If Hume’s experiments had been absolutely uniform, he would have been right to raise doubts about the conclusion. There is no process of reasoning, which from one instance draws a conclusion different from that which it infers from a hundred instances, if the latter are known to be in no way different from the former."
It seems to me that Keynes's 'negative analogy' is a deliberate probing for the constraints of a general principle. The implication is that Hume's regularity theory does not really depend on strict regularities; it requires a certain degree of difference.

So what about probability and information? The striking thing about both Keynes and Hume is that the human psychological aspect of probability is clearly on display: this is not a mathematical abstraction; probability cannot escape the human realm of expectation. Shannon's 'engineering problem' of information based on probability loses sight of this - his 'novelty' and 'confirmation' appear as a single number indicating the degree of 'uncertainty' of a symbol's value. Behind it, however, lies the analogical and inductive reasoning which is deeply human.
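To make that concrete, here is a minimal sketch in Python (the three-symbol distribution is an invented illustration, not data): the surprisal of an individual symbol, -log2 p(x), captures something like 'novelty', while Shannon's H is simply the probability-weighted average of those surprisals, so the confirming common symbols and the surprising rare ones are folded into one expected value.

```python
import math

# Illustrative (invented) distribution over three symbols.
p = {'a': 0.7, 'b': 0.2, 'c': 0.1}

# Surprisal of a single symbol: -log2 p(x).
# Rare symbols ('c') carry high surprisal ("novelty");
# common symbols ('a') carry low surprisal ("confirmation").
surprisal = {x: -math.log2(px) for x, px in p.items()}

# Shannon's H is the expected surprisal: one number that
# averages novelty and confirmation over the whole alphabet.
H = sum(px * surprisal[x] for x, px in p.items())

print(surprisal)   # {'a': ~0.51, 'b': ~2.32, 'c': ~3.32} bits
print(H)           # ~1.16 bits per symbol
```

The single number says nothing about which instances confirm an expectation and which surprise it; that distinction only exists relative to the human expectations lying behind the probabilities.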

Information, however, creates its own reality. It can create its own realm of novelty and confirmation, to the point where what is confirmed to us is an artificial representation of some other reality, whose actual nature would not produce the same confirmation. Keynes's point about negative analogy would provide a corrective to this: we should explore our expectations against "a variety of non-essential characteristics" of instances.

Instead, the designers of big data algorithms want to show that they "work". They can exploit the creation of 'false confirmation' and argue their case. And yet regularities of any sort are hard to identify, let alone the varying of non-essential characteristics. How is this scientific? Human expectations on viewing the results of big data analysis are already framed by technologies which are underpinned by the same formulae that produce the analyses. Part of the problem lies in the subsumption of phenomena within Shannon's formulae, which, on the one hand, are blind to their human hinterland of "species of sensation", whilst on the other they create equivalences among phenomena which in reality are not equivalent. Unlike things become alike; everything becomes eggs!
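The 'everything becomes eggs' complaint can be illustrated with another small sketch (again in Python, using two made-up symbol streams): an irregular run of coin tosses and a strictly alternating machine log have identical symbol frequencies, and therefore identical H, even though only one of them displays the kind of regularity on which Hume's account of causal reasoning depends.

```python
from collections import Counter
import math

def H(seq):
    """Unigram Shannon entropy (bits per symbol) of a sequence."""
    counts = Counter(seq)
    n = len(seq)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())

# Two made-up 'phenomena': a random-looking run of coin tosses,
# and a strictly alternating machine log. Symbol frequencies are
# identical (50/50), so H treats them as equivalent...
coin = list("HTTHHTHTTHHTTHTH")          # irregular
log  = ["OK", "ALARM"] * 8               # perfectly regular

print(H(coin), H(log))    # both 1.0 bit per symbol

# ...even though one shows exactly the kind of strict regularity
# that grounds causal reasoning on Hume's account, and the other
# does not. The measure is blind to that difference.
```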

And yet there is something important in Shannon's work, though it lies not in blind application of the equations. Instead it lies in the negative analogies produced, and in the novelty and confirmation that arise between Shannon's powerful generative ideas and the encounter with the real world. It is in discovering the contours of fit between Shannon's abstractions and the human intersubjective world of expectations and surprises. And this may fit with Hume's own thinking about probabilities.
