Bayesianism and Properly Basic Belief

Xu and Tenenbaum, in Word Learning as Bayesian Inference (Psychological Review, 2007), develop a very simple Bayesian model of how children (and even adults) build semantic associations from accumulated evidence. In short, they find that hypothesis-elimination approaches as well as connectionist methods fail to explain the patterns that are observed. The most salient problem with these other methods is that they miss the rapid transition that is seen when three exemplars of an object class are presented for a word versus just one: adults and kids (the former even more so) converge on word meanings faster than those models can easily show. Moreover, a space of contending hypotheses weighted according to their Bayesian statistics provides an escape from the all-or-nothing character of hypothesis elimination while preserving some of the “soft” commitment properties that connectionist models provide.

The mathematical trick behind the rapid transition is rather interesting. They formulate a “size principle” that weights the likelihood of a given hypothesis (this object is a “feb,” for instance, rather than a member of the many other candidate object sets) in inverse proportion to the size of the hypothesis’s extension: for n consistent examples, P(data | h) = (1/|h|)^n, which is exponential in the number of exposures. Hence the rapid transition:

Hypotheses with smaller extensions assign greater probability than do larger hypotheses to the same data, and they assign exponentially greater probability as the number of consistent examples increases.
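That preference for smaller extensions is easy to see in a few lines of code. Here is a minimal sketch of the size-principle posterior; the hypothesis names, extension sizes, and priors below are illustrative assumptions, not values from the paper:

```python
# Sketch of the size principle from Xu & Tenenbaum (2007):
# P(data | h) = (1/|h|)^n for a hypothesis h consistent with n examples,
# so smaller extensions gain exponentially with each consistent example.

def posterior(hypotheses, examples):
    """Posterior over hypotheses given examples, via Bayes' rule
    with the size-principle likelihood."""
    scores = {}
    for name, (extension, prior) in hypotheses.items():
        if all(x in extension for x in examples):
            scores[name] = prior * (1.0 / len(extension)) ** len(examples)
        else:
            scores[name] = 0.0  # inconsistent hypotheses are eliminated
    total = sum(scores.values())
    return {name: s / total for name, s in scores.items()}

# "feb" could name Dalmatians (narrow) or dogs in general (broad).
hyps = {
    "dalmatian": ({"dal1", "dal2", "dal3"}, 0.5),              # |h| = 3
    "dog": ({"dal1", "dal2", "dal3", "lab1", "pug1", "pug2"}, 0.5),  # |h| = 6
}

one = posterior(hyps, ["dal1"])                  # dalmatian: 2/3
three = posterior(hyps, ["dal1", "dal2", "dal3"])  # dalmatian: 8/9
```

With one exemplar the narrow hypothesis is only mildly favored (2/3 vs. 1/3); with three consistent exemplars the odds shift by a factor of (6/3)^3 = 8, producing the sharp transition the paper describes.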

It should be noted that they don’t claim that the psychological or brain machinery implements exactly this algorithm. As is usual in these matters, it is instead likely that whatever machinery is involved simply has at least these properties. It may very well be that connectionist architectures can do the same, but that existing approaches to connectionism simply don’t do it quite the right way.

Entanglement and Information

Research can flow into interesting little eddies that cohere into larger circulations that become transformative phase shifts. That happened to me this morning between a morning drive in the Northern California hills and departing for lunch at one of our favorite restaurants in Danville.

The topic I’ve been working on since my retirement is whether there are preferential representations for optimal automated inference methods. We have this grab-bag of machine learning techniques that use differing data structures but that all implement some variation on fitting functions to data exemplars; at the most general they all look like some kind of gradient descent on an error surface. Getting the right mix of parameters, nodes, etc. falls to some kind of statistical regularization or bottlenecking for the algorithms. Or maybe you perform a grid search in the hyperparameter space, narrowing down the right mix. Or you can throw up your hands and try to evolve your way to a solution, suspecting that there may be local optima that are distracting the algorithms from global success.

Yet, algorithmic information theory (AIT) gives us, via Solomonoff, a framework for balancing the parameterization of an inference algorithm against its error rate on the training set. But, first, it’s all uncomputable and, second, the AIT framework uses binary strings as the coded Turing machines, so I would have to enumerate and test all 2^N programs of length N to get anywhere with the theory. Still, I and many others have had incremental success using variations on this framework, whether via the Minimum Description Length (MDL) principle, its first cousin Minimum Message Length (MML), or other statistical regularization approaches that serve as proxies for these techniques.
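The computable end of that trade-off can be sketched as a two-part code: bits to describe the model’s parameters plus bits to describe the residual errors under a noise model. The toy below is an MDL-style proxy, not the AIT formalism; the 32-bits-per-parameter cost, the Gaussian coding term, and the deterministic “noise” are all assumptions chosen for illustration:

```python
# Toy two-part MDL comparison: total description length =
# (bits for parameters) + (bits for residuals under a Gaussian code).
import math

def mdl_score(residual_sq_sum, n_points, n_params, bits_per_param=32):
    """Two-part code length in bits (smaller is better)."""
    model_bits = n_params * bits_per_param
    variance = max(residual_sq_sum / n_points, 1e-12)  # clamp to avoid log(0)
    data_bits = 0.5 * n_points * math.log2(2 * math.pi * math.e * variance)
    return model_bits + data_bits

# Synthetic linear data with deterministic alternating "noise".
xs = list(range(10))
ys = [2 * x + (0.3 if x % 2 else -0.3) for x in xs]

# Model A: constant mean (1 parameter).
mean_y = sum(ys) / len(ys)
rss_mean = sum((y - mean_y) ** 2 for y in ys)

# Model B: least-squares line (2 parameters).
mean_x = sum(xs) / len(xs)
slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
         / sum((x - mean_x) ** 2 for x in xs))
intercept = mean_y - slope * mean_x
rss_line = sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))

# Model C: memorize every point (n parameters, zero residual).
score_mean = mdl_score(rss_mean, len(ys), 1)
score_line = mdl_score(rss_line, len(ys), 2)
score_memo = mdl_score(0.0, len(ys), len(ys))
```

The line model pays two parameters’ worth of model bits but saves far more on the residual code, so it beats both the underfit mean and the overfit memorizer. That is the regularization behavior the MDL/MML proxies buy us without touching the uncomputable machinery.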

Artsy Women

A pervasive commitment to ambiguity. That’s the most compelling sentence I can think of to describe the best epistemological stance concerning the modern world. We have, at best, some fairly well-established local systems that are reliable. We have consistency that may, admittedly, only pertain to some local system that is relatively smooth or has a modicum of support for the most general hypotheses that we can generate.

It’s not nihilistic to believe these things. It’s prudent and, when carefully managed, it’s productive.

And with such prudence we can tear down the semantic drapery that commands attention at every turn, from the grotesqueries of the political sphere that seek to command us through emotive hyperbole to the witchdoctors of religious canons who want us to immanentize some silly Middle Eastern eschaton or shoot up a family-planning clinic.

It is all nonsense. We are perpetuating and inventing constructs that cling to our contingent neurologies like mold, impervious to the broadest implications and best thinking we can muster. That’s normal, I suppose, for that is the sub rosa history of our species. But only beneath the firmament, while there is hope above and inventiveness and the creation of a new honor that derives from fairness and not from reactive disgust.

In opposition to the structures that we know and live with—that we tolerate—there is both clarity in this cocksure target and a certainty that, at least, we can deconstruct the self-righteousness and build a new sensibility to (at least) equality if not some more grand vision.

I picked up Laura Marling’s Short Movie last week and propagated it to various cars. It is only OK, but it joins a rather large collection of recent female musicians in my music archive.