# Boredom and Being a Decider

Seth Lloyd and I have rarely converged (read: absolutely never) on a realization, but his remarkable 2013 paper on free will and halting problems does, in fact, converge on a paper I wrote around 1986 for an undergraduate Philosophy of Language course. I was, at the time, very taken by Gödel, Escher, Bach: An Eternal Golden Braid, Douglas Hofstadter’s poetic excursion around the topic of recursion, vertical structure in ricercars, and various other topics that stormed about in his book. For me, when combined with other musings on halting problems, it led to the conclusion that the halting problem could be probabilistically solved by an observer who decides when the recursion is too repetitive or too deep. This prescribes an overlay algorithm that estimates the odds that another algorithm will halt when subjected to a time or resource constraint. Thus we have a boredom algorithm.
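A minimal sketch of what such a boredom algorithm might look like; the monitor, the step budget, and the patience threshold are all invented here for illustration:

```python
# A sketch of the "boredom algorithm": an overseer steps another program
# and gives up when it sees too much repetition or exhausts a budget.
# All names and thresholds are illustrative, not from any real library.

def boredom_monitor(step, state, max_steps=10_000, patience=3):
    """Run `step` until it halts (returns None), revisits a state too
    often (boredom), or exceeds the step budget. Returns a verdict."""
    seen = {}
    for _ in range(max_steps):
        if state is None:
            return "halted"
        seen[state] = seen.get(state, 0) + 1
        if seen[state] > patience:
            return "bored: looks like a loop"
        state = step(state)
    return "bored: out of patience"

# A Collatz-style iteration halts; a simple 2-cycle triggers boredom.
collatz = lambda n: None if n == 1 else (n // 2 if n % 2 == 0 else 3 * n + 1)
print(boredom_monitor(collatz, 27))            # halted
print(boredom_monitor(lambda s: 1 - s, 0))     # bored: looks like a loop
```

The monitor never solves the halting problem; it just bets, under a budget, that repetition or exhaustion means non-halting.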

I thought this was rather brilliant at the time and I ended up having a one-on-one with my prof, who scoffed at GEB as a “serious” philosophical work. I had thought it was all psychedelically transcendent and had no deep understanding of more serious philosophical work beyond the papers by Kripke, Quine, and Davidson that we had been tasked to read. So I plead undergraduateness. Nevertheless, at that meeting we clashed over the concept of teleology and directedness in evolutionary theory. How we got there from the original decision trees of halting or non-halting algorithms I don’t recall.

But now we have an argument that essentially recapitulates that original form, though with the help of the Hartmanis-Stearns theorem to support it. Whatever algorithm runs in our heads, it needs to simulate possible outcomes and try to determine what the best course of action might be (or the worst course, or just some preference). That algorithm is in wetware and is therefore perfectly deterministic. And, importantly, quantum indeterminacy doesn’t rescue us from the free-will implications of that determinism at all; randomness is just random, not decision-making. Instead, the impossibility of assessing the possible outcomes comes from one algorithm monitoring another. In a few narrow cases, it may be possible to enumerate all the stopping results of the enclosed algorithm, but in general, all you can do is greedily terminate branches in the production tree based on some kind of temporal or resource-based criteria.
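The greedy branch termination can be sketched as a bounded tree search; the toy state space and payoffs below are invented, not anyone's actual model of cognition:

```python
# Expand a tree of possible futures depth-first, greedily cutting
# branches once a shared node budget runs out. The branching rule and
# payoff function are fabricated purely to show the mechanism.

def best_outcome(state, branches, payoff, budget):
    """Return the best payoff seen and the number of nodes explored,
    truncating exploration when the budget is spent."""
    explored = 0

    def walk(s):
        nonlocal budget, explored
        explored += 1
        budget -= 1
        kids = branches(s) if budget > 0 else []  # greedy cut when out of budget
        if not kids:
            return payoff(s)  # estimate the truncated branch and move on
        return max(walk(k) for k in kids)

    return walk(state), explored

# Toy state space: each state branches in two until it reaches 8.
branches = lambda s: [2 * s, 2 * s + 1] if s < 8 else []
full = best_outcome(1, branches, lambda s: s, budget=1000)   # (15, 15)
small = best_outcome(1, branches, lambda s: s, budget=3)     # truncated early
print(full, small)
```

With a generous budget the search enumerates every stopping result; with a tight one it settles for whatever the visited branches suggested, which is the narrow-versus-general distinction above.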

Free will is neither random nor classically deterministic, but arises from an algorithmic constraint on the processing power available to simulate reality in a conscious, but likely deterministic, head.

# Traitorous Reason, Facts, and Analysis

Obama’s post-election press conference was notable for its continued demonstration of adult discourse and values. Especially notable:

> This office is bigger than any one person and that’s why ensuring a smooth transition is so important. It’s not something that the constitution explicitly requires but it is one of those norms that are vital to a functioning democracy, similar to norms of civility and tolerance and a commitment to reason and facts and analysis.

But ideology in American politics (and elsewhere) has the traitorous habit of undermining every one of those norms. It always begins with undermining the facts in pursuit of manipulation. Just before the election, the wizardly Aron Ra took to YouTube to review VP-elect Mike Pence’s bizarre grandstanding in Congress in 2002:

And just today, Trump lashed out at the cast of Hamilton for lecturing Mike Pence on his anti-LGBTQ stands, also related to ideology and belief, at the end of a show.

Astonishing as this seems, we live in an imperfect world being drawn very slowly, and in fits and starts, away from tribal and xenophobic tendencies. My wife received a copy of a letter from now-deceased family that contained an editorial from the Shreveport Journal in the 1960s that (with its embedded The Worker editorial review) simultaneously attacked segregationist violence and the rhetoric of Alabama governor George Wallace, claimed that communists were influencing John F. Kennedy and the civil rights movement, demanded the jailing of communists, and suggested the federal government should take over Alabama:

The accompanying letter was also concerned, amazingly enough, about the fate of children raised as Unitarians and how they could possibly be moral people. It then concluded with a recommendation to vote for Goldwater.

Is it any wonder that the accompanying cultural revolutions might lead to the tearing down of the institutions that were used to justify the deviation away from “reason and facts and analysis?”

But I must veer to the positive here: this brief blip is a passing retrenchment of old tendencies that the Millennials and their children will look back on with fond amusement, the way I remember Ronald Reagan.

# A Big Data Jeremiad and the Moral Health of America

The average of polls was wrong. The past-performance-weighted, hyper-parameterized, stratified-sampled, Monte Carlo-ized collaborative predictions fell as critically short in the general election as they had in the Republican primary. There will be much soul searching to establish why that might have been; from ground-game engagement to voter turnout, from pollster bias to sampling defects, the hit list will continue to grow.
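For concreteness, a toy version of such an aggregation, with fabricated polls, ad hoc recency weights, and a crude sampling-error model standing in for the real machinery:

```python
# A toy poll aggregator: weight polls by recency, then Monte Carlo the
# sampling error into a win probability. The polls, weights, and error
# model are all invented for illustration.
import random

polls = [(48.0, 44.0, 900), (46.5, 45.0, 1100), (47.0, 46.0, 800)]  # (A%, B%, n)
weights = [0.5, 0.8, 1.0]  # more recent polls count more

def win_probability(polls, weights, trials=20_000):
    random.seed(0)
    wins = 0
    for _ in range(trials):
        margin = 0.0
        for (a, b, n), w in zip(polls, weights):
            noise = random.gauss(0, 100 / n ** 0.5)  # rough sampling error
            margin += w * (a - b + noise)
        if margin > 0:
            wins += 1
    return wins / trials

print(round(win_probability(polls, weights), 2))
```

The point of the toy is that every step (the weights, the noise model, the assumed independence) is a modeling choice, and each is a place where the real forecasts could have gone wrong.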

Things were less predictable than they seemed. During the 2008 and 2012 elections, the losing party’s proxies held that the polls were inherently flawed, though they were ultimately predictive. Now, in 2016, they were inherently flawed and not at all predictive.

But what the polls showed was instructive even if their numbers were not quite right. Specifically, there was a remarkable turnout for Trump among white, less-educated voters who long for radical change to their economic lives. The Democratic candidate was less clearly engaging.

Another difference emerged, however. Despite efforts to paint Hillary Clinton as corrupt or a liar, objective fact checkers concluded that she was, in fact, one of the most honest candidates in recent history, and that Donald Trump was one of the worst, approximated only by Michele Bachmann in utter mendacity. We can couple that with his race-baiting, misogyny, hostility, divorces, anti-immigrant scapegoating, and other childish antics. Yet these moral failures did not prevent his supporters from voting for him in large numbers.

But his moral failures may be precisely why his supporters found him appealing. Evangelicals decided for him because Clinton threatened their hope of overturning Roe v. Wade, while he was an unknown who had said a few contradictory things in opposition. His other moral issues were less important—even forgivable. In reality, though, this particular divide is an exemplar for a broader division in the moral fabric of America. The white working class has been struggling in post-industrial America for decades. Coal mining gives way to fracked, super-abundant natural gas. A freer labor market moves assembly overseas. The continuous rise in productivity shifts value away from labor in the service of innovation to disintermediated innovation itself.

The economic results are largely a consequence of freedom, a value that becomes suffused in a polarized economy where factories close amid egghead economic restructuring. Other values come into question as well. Charles Murray’s Coming Apart: The State of White America, 1960-2010, brought a controversial conservative lens to the loss of traditional values for working-class America. In this world, marriage, church, and hard work have dissolved under the influence of the pernicious counter-cultural deconstruction of the 1960s, which was revolutionary for the college-educated elite but destructive to the working class. What is left is a vacuum of virtues where the downtrodden lash out at the eggheads from the coasts. The moral failings of a scion of wealth are themselves recognizable and forgivable because at least there is a sense of change and some simple diagnostics about what is wrong with our precious state.

So we are left with pussy grabbing, with the Chinese hoax of climate change, with impossible border walls, with a fornicator-in-chief misogynist, with a gloomy jeremiad of divided America being exploited into oblivion. Even the statisticians were eggheaded speculators who were manipulating the world with their crazy polls. But at least it wasn’t her.

# Desire and Other Matters

“What matters?” is a surprisingly interesting question. I think about it constantly since it weighs in whenever I plot future choices, though often I seem to be more autopilot than consequentialist in these conceptions. It is an essential first consideration when trying to value one option versus another. I can narrow the question a bit to “what ideas matter?” This immediately externalizes the broad reality of actions that meaningfully improve lives, like helping others, but still leaves a solid core of concepts that are valued more abstractly. Does the traditional Western liberal tradition really matter? Do social theories? Are less intellectually embellished virtues like consistency and trust more relevant and applicable than notions like, well, consequentialism?

Maybe it amounts to how to value certain intellectual systems against others?

Some are obviously more true than others. So “dowsing belief systems” are less effective in a certain sense than “planetary science belief systems.” Yet there is a broader range of issues at work.

But there are some areas of the liberal arts that have a vexing relationship with the modern mind. Take linguistics. The field ranges from catalogers of disappearing languages to theorists concerned with how to structure syntactic trees. Among the latter are the linguists who have followed Noam Chomsky’s paradigm that explains language using a hierarchy of formal syntactic systems, all of which treat recursion as a central feature. What is interesting is how few practical impacts this theory has had. It is very simple at its surface: languages are all alike and involve phrasal groups that embed in deep hierarchies. The specific ways in which the phrases and their relative embeddings take place may differ among languages, but they are alike in this abstract way.
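The embedding claim can be made concrete with a toy context-free grammar; the rules below are invented and wildly simplified, but they show phrases recursively nesting inside phrases:

```python
# A toy phrase-structure grammar where noun phrases can embed clauses
# that themselves embed noun phrases. The grammar and vocabulary are
# invented; any context-free formalism illustrates the same point.
import random

grammar = {
    "S":  [["NP", "VP"]],
    "NP": [["the", "N"], ["the", "N", "that", "VP"]],   # NP can embed a clause
    "VP": [["V", "NP"], ["V"]],
    "N":  [["cat"], ["dog"], ["linguist"]],
    "V":  [["saw"], ["chased"], ["slept"]],
}

def generate(symbol, depth=0, max_depth=4):
    rules = grammar.get(symbol)
    if rules is None:
        return [symbol]                     # terminal word
    # past max depth, prefer the shortest rule so generation halts
    rule = min(rules, key=len) if depth >= max_depth else random.choice(rules)
    return [w for part in rule for w in generate(part, depth + 1, max_depth)]

random.seed(2)
print(" ".join(generate("S")))
```

Different languages would swap the rule shapes and orderings, but the recursive scaffolding stays the same, which is the abstract likeness the paradigm asserts.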

And likewise we have to ask what the impact is of scholarship like René Girard’s theory of mimesis. The theory has a Victorian feel about it: a Freudian/Jungian essential psychological tendency girds all that we know, experience, and see. Violence is the triangulation of wanton desire as we try to mimic one another. That triangulation was suppressed—sublimated, if you will—by sacrifice that refocused the urge to violence on the sacrificial object. It would be unusual for such a theory to rise above the speculative scholarship that only queasily embraces empiricism without some prodding.

But maybe it is enough that ideas are influential at some level. So we have Ayn Rand, liberally called out by American economic conservatives, at least until they are reminded of Rand’s staunch atheism. And we have Peter Thiel, from the PayPal mafia to the recent Gawker lawsuits, justifying his Facebook angel round based on Girard’s theory of mimesis. So we are all slaves of our desires to like, indirectly, a bunch of crap on the internet. But at least it is theoretically sound.

# Subtly Motivating Reasoning

Continuing on with the general theme of motivated reasoning, there are some rather interesting results reported in the New Republic, here. Specifically, Ian Anson from the University of Maryland, Baltimore County, found that political partisans reinforced their perspectives on the state of the U.S. economy more strongly when they were given “just the facts” rather than a strong partisan statement combined with the facts. Even when the partisan statements aligned with their own partisan perspectives, the effect held.

The author concludes that people, in constructing their views of the causal drivers of the economy, believe that they are unbiased in their understanding of the underlying mechanisms. The barefaced partisan statements interrupt that construction process, perhaps, or at least distract from it. Dr. Anson points out that subtly manufacturing consent therefore makes for better partisan fellow travelers.

There are a number of theories concerning how meanings must get incorporated into our semantic systems, and whether the idea of meaning itself is as good as or worse than simply discussing reference. More, we can rate or gauge the uncertainty we must have concerning complex systems. They seem to form a hierarchy, with actors in our daily lives and the motivations of those we have long histories with in the mostly-predictable camp. Next, we may have good knowledge about a field or area of interest that we have been trained in. When this framework has a scientific basis, we also rate our knowledge as largely reliable, but we also know the limits of that knowledge. It is in predictive futures and large-scale policy that we become subject to the difficulty of integrating complex signals into a cohesive framework. The partisans supply factoids and surround them with causal reasoning. We weigh those against alternatives and hold them as tentative. But then we have to exist in a political life as well, and it’s not enough just to proclaim our man or woman or party as great and worthy of our vote and love; we must also justify that consideration.

I speculate now that it may be possible to wage war against partisan bias by employing the exact methods described as effective by Dr. Anson. Specifically, if in any given presentation of economic data there was one fact presented that appeared to undermine the partisan position otherwise described by the data, would it lead to a general weakening of the mental model in the reader’s head? For instance, compare the following two paragraphs:

> The unemployment rate has decreased from a peak of 10% in 2009 to 4.7% in June of 2016. This rate doesn’t reflect the broader, U-6, rate of nearly 10% that includes the underemployed and others who are not seeking work. Wages have been down or stagnant over the same period.

Versus:

> The unemployment rate has decreased from a peak of 10% in 2009 to 4.7% in June of 2016. This rate doesn’t reflect the broader, U-6, rate of nearly 10% that includes the underemployed and others who are not seeking work. Wages have been down or stagnant over the same period even while consumer confidence and spending have risen to an 11-month high.

The second paragraph adds an accurate but upbeat and contradictory signal to the more subtle gloom of the first paragraph. Of course, partisan hacks will naturally avoid doing this kind of thing. Marketers and salespeople don’t let the negative signals creep in if they can avoid it, but I would guess that a subtle contradiction embedded in the signal would disrupt the conspiracy theorists and the bullshit artists alike.

# Euhemerus and the Bullshit Artist

Sailing down through the Middle East, past the monuments of Egypt and the wild African coast, and then on into the Indian Ocean, past Arabia Felix, Euhemerus came upon an island. Maybe he came upon it. Maybe he sailed. He was perhaps—yes, perhaps; who can say?—sailing for Cassander in deconstructing the memory of Alexander the Great. And that island, Panchaea, held a temple of Zeus with a written history of the deeds of men who became the Greek gods.

They were elevated, they became fixed in the freckled amber of ancient history, their deeds escalated into myths and legends. And, likewise, the ancient tribes of the Levant brought their El and Yah-Wah, and Asherah and Baal, and then the Zoroastrians influenced the diaspora in refuge in Babylon, until they returned having found dualism, elemental good and evil, and then reimagined their original pantheon, narrowing down through monolatry and into monotheism. These great men and women were reimagined into something transcendent and, ultimately, barely understandable.

Even the rational Yankee in Twain’s Connecticut Yankee in King Arthur’s Court realizes almost immediately why he would soon rule over the medieval world as he is declared a wild dragon when presented to the court. He waits for someone to point out that he doesn’t resemble a dragon, but the medieval mind does not seem to question the reasonableness of the mythic claims, even in the presence of evidence.

So it goes with the human mind.

And even today we have Fareed Zakaria justifying his use of the term “bullshit artist” for Donald Trump. Trump’s logorrhea is punctuated by so many incomprehensible and contradictory statements that it becomes a mythic whirlwind. He lets slip, now and again, that his method is deliberate:

> DT: Therefore, he was the founder of ISIS.
>
> HH: And that’s, I’d just use different language to communicate it, but let me close with this, because I know I’m keeping you long, and Hope’s going to kill me.

Bullshit artist is the modern way of saying what Euhemerus was trying to say in his fictional “Sacred History.” Yet we keep getting entranced by these coordinated maelstroms of utter crap, from World Net Daily to Infowars to Fox News to Rush Limbaugh. Only the old Stephen Colbert could contend with it through his own bullshit mythical inversion. Mockery seems the right approach, but it doesn’t seem to have a great deal of impact on the conspiratorial mind.

# Motivation, Boredom, and Problem Solving

In the New York Times’ The Stone column, James Blachowicz of Loyola challenges the assumption that the scientific method is uniquely distinguishable from other ways of thinking and problem solving we regularly employ. In his example, he lays out how writing poetry involves some kind of alignment of words that conforms to the requirements of the poem. Whether actively aware of the process or not, the poet is solving constraint satisfaction problems concerning formal requirements like meter and structure, linguistic problems like parts of speech and grammar, semantic problems concerning meaning, and pragmatic problems like referential extension and symbolism. Scientists do the same kinds of things in fitting a theory to data. And, in Blachowicz’s analysis, there is no special distinction between the scientific method and other creative methods like the composition of poetry.

We can easily see how this extends to ideas like musical composition and, indeed, extends with even more constraints that range from formal through to possibly the neuropsychology of sound. I say “possibly” because there remains uncertainty on how much nurture versus nature is involved in the brain’s reaction to sounds and music.

In terms of a computational model of this creative process, if we presume that there is an objective function that governs possible fits to the given problem constraints, then we can clearly optimize towards a maximum fit. For many of the constraints there are, however, discrete parameterizations (which part of speech? which word?) that are not like curve fitting to scientific data. In fairness, discrete parameters occur there, too, especially in meta-analyses of broad theoretical possibilities (loop quantum gravity vs. string theory? What will we tell the children?). The discrete parameterizations blow up the search space with their combinatorics, demonstrating on the one hand why we are so damned amazing, and on the other hand why a controlled randomization method like evolutionary epistemology’s blind search and selective retention gives us potential traction in the face of this curse of dimensionality. The purely blind search is likely weakened by active human engagement, though. Certainly the poet or the scientist would agree; they are using learned skills, maybe some intellectual talent of unknown origin, and experience on how to traverse the wells of improbability in finding the best fit for the problem. This certainly resembles pre-training in deep learning, though on a much more pervasive scale, including feedback from categorical model optimization into the generative basis model.
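Blind variation and selective retention over a discrete space can be sketched in a few lines; the vocabulary and the single syllable-count constraint are invented stand-ins for a poem's real constraint stack:

```python
# Blind variation and selective retention: randomly swap words in a
# line, keeping a change only when it does not worsen a crude constraint
# score. The vocabulary and scorer are fabricated for illustration.
import random, re

VOCAB = ["stone", "river", "under", "silver", "night", "slow", "burning", "sea"]

def syllables(word):
    return max(1, len(re.findall(r"[aeiouy]+", word)))  # rough vowel-group count

def score(line, target=8):
    return -abs(sum(syllables(w) for w in line) - target)  # 0 is a perfect fit

def evolve(n_words=4, steps=500, seed=1):
    random.seed(seed)
    line = [random.choice(VOCAB) for _ in range(n_words)]
    for _ in range(steps):
        variant = line[:]
        variant[random.randrange(n_words)] = random.choice(VOCAB)  # blind variation
        if score(variant) >= score(line):                          # selective retention
            line = variant
    return line, score(line)

print(evolve())
```

The combinatorics of real composition are vastly larger, and a human searcher is far from blind, but the retained-mutation loop is the core of the evolutionary-epistemology picture.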

But does this extend outwards to other ways in which we form ideas? We certainly know that motivated reasoning is involved in key aspects of our belief formation, which plays strongly into how we solve these constraint problems. We tend to actively look for confirmations and avoid disconfirmations of fit. We positively bias recency of information, or repeated exposures, and tend to only reconsider in much slower cycles.

Also, as the constraints of certain problem domains become, in turn, extensions that can result in change—where there is a dynamic interplay between belief and success—the fixity of the search space itself is no longer guaranteed. Broad human goals like the search for meaning are an example of that. In come complex human factors, like how boredom correlates with motivation and ideological extremism (overview, here, journal article, here).

This latter data point concerning boredom crosses from mere bias that might preclude certain parts of a search space into motivation that focuses it, and that optimizes for novelty seeking and other behaviors.

# Quantum Field Is-Oughts

Sean Carroll’s Oxford lecture on Poetic Naturalism is worth watching (below). In many ways it just reiterates several common themes. First, it reinforces the is-ought barrier between values and observations about the natural world. It does so with particular depth, though, by identifying how coarse-grained theories at different levels of explanation can be equally compatible with quantum field theory. Second, and related, he shows how entropy is an emergent property of atomic theory and the interactions of quantum fields (that we think of as particles much of the time) and, importantly, that we can project the same notion of boundary conditions that gives rise to entropy forward into the future, resulting in a kind of effective teleology. That is, there can be some boundary conditions for the evolution of large-scale particle systems that form into configurations that we can label purposeful or purposeful-like. I still like the term “teleonomy” to describe this alternative notion, but the language largely doesn’t matter except as an educational and distinguishing tool against the semantic embeddings of old scholastic monks.
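The coarse-graining point can be illustrated with a toy system: every microstate (an exact sequence of coin flips) is equally likely, yet the coarse-grained macrostate (the head count) is dominated by configurations with the most microstates, which is what a Boltzmann-style entropy measures. The choice of coins and of head count as the coarse variable is, of course, just an illustrative stand-in:

```python
# Microstates: exact sequences of N coin flips, all equally likely.
# Macrostates: the head count, a coarse-grained description.
from math import comb, log2

N = 20
micro = 2 ** N                                           # equally likely microstates
macro = {k: comb(N, k) / micro for k in range(N + 1)}    # P(head count = k)

# Boltzmann-style entropy of each macrostate: log of its microstate count
S = {k: log2(comb(N, k)) for k in range(N + 1)}

print(max(macro, key=macro.get))    # 10: the highest-entropy macrostate dominates
print(round(S[10], 1), S[0])
```

Entropy here is not a property of any single microstate; it only appears once we pick a coarse-grained level of description, which is exactly the compatibility-across-levels move in the lecture.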

Finally, the poetry aspect resolves in value theories of the world. Many are compatible with descriptive theories, and our resolution of them is through opinion, reason, communications, and, yes, violence and war. There is no monopoly of policy theories, religious claims, or idealizations that hold sway. Instead we have interests and collective movements, and the above, all working together to define our moral frontiers.

# Local Minima and Coatimundi

Even given the basic conundrum of how deep learning neural networks might cope with temporal presentations or linear sequences, there is another oddity to deep learning that only seems obvious in hindsight. One of the main enhancements to traditional artificial neural networks is a phase of unsupervised pre-training that forces each layer to try to create a generative model of the input pattern. The deep learning networks then learn a discriminative model after the initial pre-training is done, focusing on the error relative to classification versus simply recognizing the phrase or image per se.

Why this makes a difference has been the subject of some investigation. In general, there is an interplay between the smoothness of the error function and the ability of the optimization algorithms to cope with local minima. Visualize it this way: for any machine learning problem that needs to be solved, there are answers and better answers. Take visual classification. If the system (or you) gets shown an image of a coatimundi and a label that says coatimundi (heh, I’m running in New Mexico right now…), learning that image-label association involves adjusting weights assigned to different pixels in the presentation image down through multiple layers of the network that provide increasing abstractions about the features that define a coatimundi. And, importantly, that define a coatimundi versus all the other animals and non-animals.

These weight choices define an error function that is the optimization target for the network as a whole, and this error function can have many local minima. That is, by enhancing the weights supporting a coati versus a dog or a raccoon, the algorithm inadvertently leans towards a non-optimal assignment for all of them by focusing instead on a balance between them that is predestined by the previous dog and raccoon classifications (or, in general, the order of presentation).
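A one-dimensional caricature of the situation, with an invented bumpy error function: gradient descent settles into whichever basin the starting weights fall in, and a better initialization (the role pre-training appears to play) lands in a much deeper minimum:

```python
# Plain gradient descent on a bumpy 1-D "error function" stalls in
# whichever basin it starts in. The function itself is invented purely
# to illustrate local versus global minima.
import math

def error(w):                      # bumpy bowl with its deepest basin near w ~ 3.6
    return (w - 3) ** 2 + 4 * math.sin(3 * w)

def descend(w, lr=0.01, steps=2000):
    for _ in range(steps):
        grad = 2 * (w - 3) + 12 * math.cos(3 * w)   # derivative of error(w)
        w -= lr * grad
    return w

bad = descend(0.0)     # arbitrary init: stalls in a shallow local basin
good = descend(3.5)    # better init (pre-training's role): finds the deep basin
print(error(bad) > error(good))    # True
```

In real networks the error surface lives in millions of dimensions, but the moral carries over: pre-training does not change the surface so much as it changes where fine-tuning starts on it.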

Improvements require “escaping” these local optima in favor of a global solution that accords the best overall outcome to all the animals and a minimization of the global error. And pre-training seems to do that. It likely moves each discriminative category closer to the global possibilities because those global possibilities are initially encoded by the pre-training phase.

This has the added benefit of regularizing or smoothing out the noise that is inherent in any real data set. Indeed, the two approaches appear to be closely allied in their impact on the overall machine learning process.

# New Behaviorism and New Cognitivism

Deep Learning now dominates discussions of intelligent systems in Silicon Valley. Jeff Dean’s discussion of its role in the Alphabet product lines and initiatives shows the dominance of the methodology. Pushing the limits of what Artificial Neural Networks have been able to do has been driven by certain algorithmic enhancements and the ability to process weight training algorithms at much higher speeds and over much larger data sets. Google even developed specialized hardware to assist.

Broadly, though, we see mostly pattern recognition problems like image classification and automatic speech recognition being impacted by these advances. Natural language parsing has also recently had some improvements from Fernando Pereira’s team. The incremental improvements using these methods should not be minimized but, at the same time, the methods don’t emulate key aspects of what we observe in human cognition. For instance, the networks train incrementally and lack the kinds of rapid transitions that we observe in human learning and thinking.

In a strong sense, the models that Deep Learning uses can be considered Behaviorist in that they rely almost exclusively on feature presentation with a reward signal. The internal details of how modularity or specialization arise within the network layers are interesting but secondary to the broad use of back-propagation or Gibbs sampling combined with autoencoding. This is a critique that goes back to the early days of connectionism, of course, and why it was somewhat sidelined after an initial heyday in the late eighties. Then came statistical NLP, then came hybrid methods, then a resurgence of corpus methods, all the while with image processing getting more and more into the hand-crafted modular space.

But we can see some interesting developments that start to stir more Cognitivism into this stew. Recurrent Neural Networks provide interesting temporal behavior that might be lacking in some feedforward NNs, and Long Short-Term Memory (LSTM) NNs help to overcome some specific limitations of recurrent NNs like the disconnection between temporally-distant signals and the reward patterns.
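A minimal sketch of a single LSTM cell step, following the standard gate equations; the weights are random and untrained, purely to show the plumbing (the gated cell state) that lets a signal persist across long gaps:

```python
# One LSTM cell step in plain Python: input, forget, and output gates
# modulate a cell state that can carry information across many steps.
# Weights are random placeholders, not a trained model.
import math, random

random.seed(0)
N_IN, N_HID = 3, 4

def rand_matrix(rows, cols):
    return [[random.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]

# One weight matrix per gate: input (i), forget (f), output (o), candidate (g)
W = {g: rand_matrix(N_HID, N_IN + N_HID) for g in "ifog"}

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def matvec(M, v):
    return [sum(m * x for m, x in zip(row, v)) for row in M]

def lstm_step(x, h, c):
    z = x + h                                       # concatenated input and hidden
    i = [sigmoid(v) for v in matvec(W["i"], z)]     # input gate
    f = [sigmoid(v) for v in matvec(W["f"], z)]     # forget gate
    o = [sigmoid(v) for v in matvec(W["o"], z)]     # output gate
    g = [math.tanh(v) for v in matvec(W["g"], z)]   # candidate values
    c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]  # long-term track
    h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]               # emitted state
    return h, c

h, c = [0.0] * N_HID, [0.0] * N_HID
for _ in range(10):                                 # run over a toy input sequence
    h, c = lstm_step([random.random() for _ in range(N_IN)], h, c)
print(len(h), len(c))    # 4 4
```

The forget gate is the crucial piece for the long-gap problem: when it saturates near 1, the cell state passes through nearly unchanged, so a distant signal can survive until the reward arrives.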

Still, the modularity and rapid learning transitions elude us. While these methods are enhancing the ability to learn the contexts around specific events (and even the unique variability of contexts), that learning still requires many exposures to get right. We might consider our language or vision modules to be learned over evolutionary history and so not expect learning within a lifetime from scratch to result in similarly structured modules, but the differences remain not merely quantitative but significantly qualitative. A New Cognitivism requires more work to rise from this New Behaviorism.