Category: Philosophy

Zebras with Machine Guns

I was just rereading some of the literature on Plantinga’s Evolutionary Argument Against Naturalism (EAAN) as a distraction from trying to write too much on ¡Reconquista!, since it looks like I am on a much faster trajectory to finishing the book than I had thought. EAAN is a curious little argument that some have dismissed as a resurgent example of scholastic theology. It has some newer trappings that we see in modern historical method, however, especially in the use of Bayes’ Theorem to establish the warrant of beliefs by trying to cast those warrants as probabilities.

A critical part of Plantinga’s argument hinges on the notion that evolutionary processes select for behavior, not necessarily for belief. Therefore, it is plausible that an individual could hold false beliefs that are nonetheless adaptive. For instance, Plantinga gives the example of a man who desires to be eaten by tigers but always feels hopeless when confronted by a given tiger because he doesn’t feel worthy of that particular tiger, so he runs away and looks for another one. This may seem like a strange conjunction of beliefs and actions that happens to result in the man surviving, but we know from modern psychology that people can form elaborate justifications for perceived events and wild metaphysics to coordinate those justifications.

If that is the case, for Plantinga, the evolutionary consequence is that we should not trust our belief in our reasoning faculties because they are effectively arbitrary. There are dozens of responses to this argument that dissect it along many different dimensions. I’ve previously showcased Branden Fitelson and Elliott Sober’s Plantinga’s Probability Arguments Against Evolutionary Naturalism from 1998, which I think is one of the most complete examinations of the structure of the argument. There are two critical points that I think emerge from Fitelson and Sober. First, there is the sober reminder of the inherent frailty of scientific method that needs to be kept in mind. Science is an evolving work involving many minds operating, when at its best, in a social network that reduces biases and methodological overshoots. It should be seen as a tentative foothold against “global skepticism.”

The second and critical take-away from that response is more nuanced, however. The notion that our beliefs can be arbitrarily disconnected from adaptive behavior in an evolutionary setting, like the tiger survivor, requires a very different kind of evolution than we theorize. Fitelson and Sober point out that if anything were possible, zebras might have developed machine guns to defend against lions rather than just cryptic stripes. Instead, the sieve of possible solutions to adaptive problems is built on the genetic and phenotypic variants that came before. This limits the range of arbitrary, non-true beliefs that can be compatible with an adaptive solution. If the joint probability of true belief and adaptive behavior is much higher than the alternative, which we might guess is true, then there is a greater probability that our faculties are reliable. In fact, we could argue, using a parsimony argument that extends Bayesian analysis to the general case of optimal inductive models (Sober actually works on this issue extensively), that there are classes of inductive solutions that, by eliminating add-ons, predictively outperform solutions carrying extra assumptions and entities. So, P(not getting eaten | true belief that tigers are threats) >> P(not getting eaten | false beliefs about tigers), especially when updated over time. I would be remiss if I didn’t mention that William of Ockham, of Ockham’s Razor fame, was a scholastic theologian, so if Plantinga’s argument is revisiting those old angels-on-the-head-of-a-pin-style arguments, it might be opposed by a fellow scholastic.
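
To make the “updated over time” point concrete, here is a minimal numerical sketch in Python. All of the probabilities are invented for illustration (they are not from Fitelson and Sober); the point is only that even a modest per-encounter advantage for true belief compounds rapidly, and the same ratio acts as a Bayes factor favoring the hypothesis that our faculties are reliable.

```python
# Toy illustration: invented probabilities, not data.
p_true = 0.99   # P(not getting eaten | true belief that tigers are threats)
p_false = 0.90  # P(not getting eaten | adaptive-but-false tiger beliefs)

for n in (1, 10, 50, 100):
    survive_true = p_true ** n    # survival over n independent encounters
    survive_false = p_false ** n
    print(f"{n:3d} encounters: {survive_true:.4f} vs {survive_false:.6f}, "
          f"ratio {survive_true / survive_false:,.0f}x")

# The ratio is also the Bayes factor: starting from even odds that our
# faculties are reliable, observing long-run survival multiplies the odds
# in favor of reliability by this same factor.
```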

Brain Gibberish with a Convincing Heart

Elon Musk believes that direct brain interfaces will help people better transmit ideas to one another, in addition to just allowing thought-to-text generation. But there is a fundamental problem with this idea. Let’s take Hubert Dreyfus’ conception of the way meaning works as being tied to a more holistic view of our social interactions with others. Hilary Putnam would probably agree with this perspective, though now I am speaking for two dead philosophers of mind. We can certainly conclude that my mental states when thinking about the statement “snow is white” are, borrowing from Putnam who borrows from Quine, different from a German person’s when thinking “Schnee ist weiß.” The orthography, grammar, and pronunciation are different to begin with. Then there is what seems to transpire when I think about that statement: mild visualizations of white snow-laden rocks above a small stream, for instance, or, just now, Joni Mitchell’s “As snow gathers like bolts of lace/Waltzing on a ballroom girl.” Positing some kind of logical ground that merely asserts that such a statement is a propositional truth shared in a mind interlingua doesn’t do justice to the complexities of what such a statement entails.

Religious and political terminology is notoriously elastic. Indeed, for the former, it hardly even seems coherent to talk about the concept of supernatural things or events. If they are detectable by any sense other than some kind of unverifiable gnosis, then they are at least natural in that they are manifesting in the observable world. So “supernatural” imposes a barrier that seems to preclude any kind of discussion using ordinary language. The only thing left is a collection of metaphysical assumptions that, in lacking any sort of reference, must merely conform to the patterns of synonymy, metonymy, and other language games that we ordinarily reserve for discernible events and things. And, of course, where unverifiable gnosis holds sway, it is not public knowledge and therefore seems to mainly serve as a social mechanism for attracting attention to oneself.

Politics takes on a similar quality, with it often said to be a virtue if a leader can translate complex policies into simple sound bites. But, as we see in modern American politics, what happens instead is that abstract fear signaling becomes the primary currency for motivating (and manipulating) the voter. The elasticity of a concept like “freedom” is used to polarize the sides of a political negotiation that almost always involves the management of winners and losers and the dividing line between them. Fear mixes with complex nostalgia about times that never were, or were more nuanced than most recall, and jeremiads serve to poison the well of discourse.

So, if I were to have a brain interface, it might be trainable to write words for me by listening to the regular neural firing patterns that accompany my typing or speaking, but I doubt it would provide some kind of direct transmission or telepathy between people that carried any more content than those written or spoken forms. Instead, the inscrutable and non-referential abstractions about complex ideas would arrive tangled together, clashing with the receiver’s existing holistic meaning network. And that would just be gibberish to any other mind. Worse still, such a system might also be able to convey raw emotion from person to person, thus amplifying the fear or joy component of the idea without being able to transmit the specifics of the thoughts. And that would be worse than mere gibberish; it would be gibberish with a convincing heart.

Inclement Science

Found at 6,500 feet in New Mexico’s Organ Mountains this morning, driven into an old log, facing White Sands Missile Range: a mission pin.

Can’t help but think it is a statement on the threat to climate science and missions like Jason-3, but someone likely just lost it on the trail and a good soul pushed the pin into the wood for potential rediscovery.

The Obsessive Dreyfus-Hawking Conundrum

I’ve been obsessed lately. I was up at 5 A.M. yesterday and drove to Ruidoso to do some hiking (trails T93 to T92, if interested). The San Augustin Pass was desolate as the sun began breaking over, so I inched up into triple-digit speeds in the M6. Because that is what the machine is made for. Booming across White Sands Missile Range, I recalled watching base police work with National Park Rangers to chase oryx down the highway while early F-117s practiced touch-and-gos at Holloman in the background, and then driving my carpool truck out to the high energy laser site or desert ship to deliver documents.

I settled into Starbucks an hour and a half later and started writing on ¡Reconquista!, cranking out thousands of words before trying to track down the trailhead and starting on my hike. (I would have run the thing but wanted to go to lunch later and didn’t have access to a shower. Neither restaurants nor diners deserve an après-run moi.) And then I was on the trail and I kept stopping to take plot and dialogue notes, revisiting little vignettes and annotating enhancements that I would later salt into the main text over lunch. And I kept rummaging through the development of characters, refining and sifting the facts of their lives through different sets of sieves until they took on both a greater valence within the story arc and, often, more comedic value.

I was obsessed and remain so. It is a joyous thing to be in this state, comparable only to working on large-scale software systems when the hours melt away and meals slip as one cranks through problem after problem, building and modulating the subsystems until the units begin to sing together like a chorus. In English, the syntax and semantics are less constrained and the pragmatics more pronounced, but the emotional high is much the same.

With the recent death of Hubert Dreyfus at Berkeley it seems an opportune time to consider the uniquely human capabilities that are involved in each of these creative ventures. Uniquely, I suggest, because we can’t yet imagine what it would be like for a machine to do the same kinds of intelligent tasks. Yet, from Stephen Hawking through to Elon Musk, influential minds are worried about what might happen if we develop machines that rise to the level of human consciousness. This might be considered a science fiction-like speculation since we have little basis for conjecture beyond the works of pure imagination. We know that mechanization displaces workers, for instance, and think it will continue, but what about conscious machines?

For Dreyfus, the human mind is too embodied and situational to be considered an encodable thing representable by rules and algorithms. Much like the trajectory of a species through an evolutionary landscape, the mind is, in some sense, an encoded reflection of the world in which it lives. Taken further, the evolutionary parallel becomes even more relevant in that it is embodied in a sensory and physical identity, a product of a social universe, and an outgrowth of some evolutionary ping pong through contingencies that led to greater intelligence and self-awareness.

Obsession with whatever cultivars, whatever traits and tendencies, lead to this riot of wordplay and software refinement is a fine example of how this moves away from the fears of Hawking and towards the impossibilities of Dreyfus. We might imagine that we can simulate our way to the kernel of instinct and emotion that makes such things possible. We might also claim that we can disconnect the product of the effort from these internal states and the qualia that defy easy description, since the books and the new technologies have only a desultory correspondence to the process by which they are created. But I doubt it. It’s more likely that getting from great automatic speech recognition or image classification to the general AI that makes us fearful is a longer hike than we currently imagine.

Tweak, Memory

Artificial Neural Networks (ANNs) were, from early on in their formulation as Threshold Logic Units (TLUs) or Perceptrons, mostly focused on non-sequential decision-making tasks. With the invention of back-propagation training methods, the application to static presentations of data became somewhat fixed as a methodology. During the 90s Support Vector Machines became the rage and then Random Forests and other ensemble approaches held significant mindshare. ANNs receded into the distance as a quaint, historical approach that was fairly computationally expensive and opaque when compared to the other methods.

But Deep Learning has brought the ANN back through a combination of improvements, both minor and major. The most important enhancements include pre-training the networks as auto-encoders prior to pursuing error-based training using back-propagation or Contrastive Divergence with Gibbs Sampling. The other critical enhancement derives from the work of Schmidhuber and others in the 90s on managing temporal presentations to ANNs so they can effectively process sequences of signals. This latter development is critical for processing speech, written language, grammar, changes in video state, and so on. Back-propagation without some form of recurrent network structure or memory management washes out the error signal that is needed for adjusting the weights of the networks. And it should be noted that increased compute firepower using GPUs and custom chips has accelerated training performance enough that experimental cycles are within the range of doable.
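
As a rough sketch of what that memory management looks like, here is a minimal LSTM step in Python with NumPy (the weight shapes and initialization are my own choices for illustration, not any canonical implementation). The key detail is the additive cell-state update, which gives the error signal a path through time that plain back-propagation lacks.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x, h, c, W, b):
    """One LSTM step. W has shape (4*n_hid, n_hid + n_in);
    gates are stacked as [forget, input, candidate, output]."""
    n = h.shape[0]
    z = W @ np.concatenate([h, x]) + b
    f = sigmoid(z[0:n])          # forget gate: what to keep of old memory
    i = sigmoid(z[n:2*n])        # input gate: what to write
    g = np.tanh(z[2*n:3*n])      # candidate values to write
    o = sigmoid(z[3*n:4*n])      # output gate: what to expose
    c_new = f * c + i * g        # additive memory update: the path that
                                 # lets error flow across time steps
    h_new = o * np.tanh(c_new)
    return h_new, c_new

# Tiny usage example with random weights (illustration only).
rng = np.random.default_rng(0)
n_in, n_hid = 3, 5
W = rng.normal(0, 0.1, (4 * n_hid, n_hid + n_in))
b = np.zeros(4 * n_hid)
h, c = np.zeros(n_hid), np.zeros(n_hid)
for x in rng.normal(size=(10, n_in)):   # a 10-step input sequence
    h, c = lstm_step(x, h, c, W, b)
```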

Note that these are what might be called “computer science” issues rather than “brain science” issues. Researchers are drawing rough analogies between some observed properties of real neuronal systems (neurons fire and connect together) but then are pursuing a more abstract question as to how a very simple computational model of such neural networks can learn. And there are further analogies that start building up: learning is due to changes in the strength of neural connections, for instance, and neurons fire after suitable activation. Then there are cognitive properties of human minds that might be modeled, as well, which leads us to a consideration of working memory in building these models.

It is this latter consideration of working memory that is critical: stimuli presentations must be held long enough that neural connections can process them and learn from them. Hochreiter and Schmidhuber’s Long Short-Term Memory (LSTM) methodology is as ad hoc as most CS approaches in that it observes a limitation in a computational architecture and the algorithms that operate within it, and then tries to remedy the limitation with architectural variations. There tends to be a tinkering and tweaking that goes on in the gradual evolution of these kinds of systems until something starts working. Theory walks hand-in-hand with practice in applied science.

Given that, however, it should be noted that there are researchers who are attempting to create more biologically plausible architectures that solve some of the issues with working memory and training neural networks. For instance, Frank, Loughry, and O’Reilly at the University of Colorado have been developing a computational model that emulates the circuits that connect the frontal cortex and the basal ganglia. The model uses an elaborate series of activating and inhibiting connections to maintain perceptual stimuli in working memory, and it shows excellent performance on specific temporal presentation tasks. In its attempt to preserve a degree of fidelity to known brain science, it does lose some of the simplicity that purely CS-driven architectures provide, but I think it has a better chance of helping overcome another vexing problem for ANNs. Specifically, the slow learning properties of ANNs bear only scant resemblance to much human learning. We don’t require many, many presentations of a given stimulus in order to learn it; often, one presentation is sufficient. Reconciling the slow tuning of ANN models, even recurrent ones, with this property of human-like intelligence remains an open issue, and more biology may be the key.
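
For flavor, here is a deliberately loose toy in Python of that maintain-versus-update gating idea (my own caricature, not the actual Frank, Loughry, and O’Reilly circuit model, which implements the gate with interacting excitatory and inhibitory populations):

```python
def run_trial(stimuli, gate_signals):
    """gate_signals[t] == 'update' loads stimuli[t] into the memory slot;
    'maintain' holds the previous contents against interference."""
    memory = None
    for stim, gate in zip(stimuli, gate_signals):
        if gate == "update":
            memory = stim   # disinhibited: the stimulus enters memory
        # 'maintain': recurrent excitation holds memory; stimulus ignored
    return memory

# Store 'A', ignore three distractors, recall at the end of the trial.
print(run_trial(["A", "x", "y", "z"],
                ["update", "maintain", "maintain", "maintain"]))  # -> A
```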

The Inevitability of Cultural Appropriation

Picasso in Native Headdress

I’m on a TGV from Paris to Monaco. The sun was out this morning and the Jardin des Tuileries was filled with homages in tulips to various still lifes at the Louvre. Two days ago, at the Musée du quai Branly—Jacques Chirac, I saw the Picasso Primitif exposition that showcased the influence of indigenous arts on Picasso’s work through the years, often by presenting statues from Africa or Papua New Guinea side-by-side with examples of Picasso’s efforts. If you never made the connection between his cubism and the statuary of Chad (like me), it is eye opening. He wasn’t particularly culturally sensitive—like everyone else until at least the 1960s—because the fascinating people and their cultural works were largely aesthetic objects to him. If he was aware of the significance of particular pieces (and he might have been), it was something he rarely acknowledged or discussed. The photos that tie Picasso to the African statues are the primary thread of the exhibition, each one, taken at his La Californie atelier in Cannes or in Paris or whatnot, inscribed by the curators with a dainty red circle or oval to highlight a grainy African statue lurking in the background. Sometimes they provide a blow-up in case you can’t quite make it out. It is only with a full Native American headdress given to Picasso by the actor Gary Cooper that we see him actively mugging for a camera and providing weight to the show’s theme. Then, next, Brigitte Bardot is leaning over him at the La Californie studio and her cleavage renders the distant red oval uninteresting.

I am writing daily about things I don’t fully understand but try to imbue with a sense of character, of interest, and even of humor. In Against Superheroes I try to give a feel for Turkey, despite having never been there and having been introduced to only one Turk, a computational linguist for the language, exactly once. Did I do a good job? I can’t say. The audience is not necessarily Turks who would find fault with my renderings. Yet I do strive towards accuracy. I drill down with Google Earth. I read the history. I read recent politics and analysis and try to imagine what it can be like to be a person there, immersed in that cultural microcosm.

Similar things are afoot in ¡Reconquista!, my newest novel. Though I grew up in the border region with Mexico, I, unlike my son who took three years of Spanish in California, have only telegraphic and pornographic Spanish at my command. Yet I am developing an elaborate plot that weaves together the lives of an underemployed blue-collar white man and a revolutionary-minded Hispanic woman professor who drinks tequila like it’s water and speaks in elaborate abstractions about topics like, well, cultural appropriation. That’s a fighting phrase for her, despite the other incongruities in the tapestry of her life.

Should I feel confident about writing like this? And if I should not, what can I write about? And the obverse might apply: should an outsider feel free to write about the array of complex social and political issues that make up America? In 2016, Lionel Shriver, the author of a book that got some press and was made into a movie, caused an uproar when she donned a sombrero in Brisbane, Australia and declared that the cultural appropriation that might arise from, especially, white males writing about other cultures should be treated as a celebration of those cultures rather than an attack upon them. Identity is a nebulous concept, she seemed to be saying, and tying it down to ability, disability, tendency, orientation, upbringing, religion, culture, or nationality does a disservice to the spinning of a good yarn.

I’m certainly not fully in agreement with this, but I do sympathize with the notion that it is critical for writers to embrace the complexity of the pluralistic world we now live in. Doing less than that, avoiding painting pictures that are as polyglot and multifaceted as America and Europe, leaves little room for authenticity unless the works are written by a balanced committee. Perhaps the more important take-away is that building a more diverse collection of critics and reviewers can help, in turn, provide a better filter for the authenticity that, perhaps, critics of Shriver are looking for. This would parallel efforts to rectify the lack of diversity among Hollywood producers, directors, writers, actors, and voting members of the Academy.

I will close by noting that a chubby little French senior is attempting surgery to extract a splinter from his finger across from me. His wife was helping for a bit, too, stabbing at his index with a white Swiss Army knife that he spent some time surveying and unfolding before landing on the right weapon for the job. She hurt him too much, though, it seemed, and he waved her away. This, in public, and in first class? I suppose I need more data points on the French mind that is increasingly moving towards a closed focus on preserving Frenchness against the outsider. Safe for splinter-stabbing, I suppose.

The Ethics of Knowing

In the modern American political climate, I’m constantly finding myself at sea in trying to unravel the motivations and thought processes of the Republican Party. The best summation I can arrive at involves the obvious manipulation of the electorate—but that is not terrifically new—combined with a persistent avoidance of evidence and facts.

In my day job, I research a range of topics trying to get enough of a grasp on what we do and do not know such that I can form a plan that innovates from the known facts towards the unknown. Here are a few recent investigations:

  • What is the state of thinking about the origins of logic? Logical rules form into broad classes that range from the uncontroversial (modus tollens, propositional logic, predicate calculus) to the speculative (multivalued and fuzzy logic, or quantum logic, for instance). In most cases we make an assumption based on linguistic convention that they are true and then demonstrate their extension, despite the observation that they are tautological. Synthetic knowledge has no similar limitations but is assumed to be girded by the logical basics.
  • What were the early Christian heresies, how did they arise, and what was their influence? Marcion of Sinope is perhaps the most interesting of these, in parallel with the Gnostics, asserting that the cruel tribal god of the Old Testament was distinct from the New Testament Father, and proclaiming perhaps (see various discussions) a docetic Jesus figure. The leading “mythicists” like Robert Price are invaluable in this analysis. The thin braid of early Christian history and the constant humanity that arises in morphing the faith before settling down after Nicaea (well, and then after Martin Luther) reminds us that abstractions and faith have a remarkable persistence in the face of cultural change.
  • How do mathematical machines take on so many forms while achieving the same abstract goals? Machine learning, as a reification of human-like learning processes, can imitate neural networks (or an extreme sketch and caricature of what we know about real neural systems), or can be just a parameter-slicing machine like Support Vector Machines or ID3, or can be a Bayesian network or mixture model of parameters. We call them generative or non-generative, we categorize them by discrete or continuous decision surfaces, and we label them in a range of useful ways. But why should they all achieve similar outcomes with similar ranges of error (see the sketch just after this list)? Indeed, Random Forests were the belles of the ball until Deep Learning took the tiara.
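
As a small demonstration of that convergence, the sketch below (hypothetical toy data via scikit-learn; exact numbers will vary with the seed) pits a margin-based learner against a tree ensemble on the same synthetic task. Structurally they have almost nothing in common, yet their cross-validated accuracies typically land within a few points of each other.

```python
# Two structurally unrelated learners on the same synthetic problem.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

X, y = make_classification(n_samples=500, n_features=20, random_state=0)
for name, model in [("SVM", SVC()),
                    ("Random Forest", RandomForestClassifier(random_state=0))]:
    scores = cross_val_score(model, X, y, cv=5)  # 5-fold cross-validation
    print(f"{name}: mean accuracy {scores.mean():.3f}")
```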

In each case, I try to work my way, as carefully as possible, through the thicket of historical and intellectual concerns that provide point and counterpoint to the ideas. It feels ethically wrong to make a short, fast judgment about any such topics. I can’t imagine doing anything less with a topic as fraught as the US health care system. It’s complex, indeed, Mr. President.

So, I tracked down a foundational paper on this idea of ethics and epistemology. It dates to 1877 and provides a grounding for why and when we should believe in anything. William Clifford’s paper, The Ethics of Belief, tracks multiple lines of argumentation and the consequences of believing without adequate grounds. Even tentative conviction comes with moral risk, as Clifford shows in his thought experiments.

In summary, though, there is no more important statement than Clifford’s final assertion that it is wrong to believe without sufficient evidence. It’s that simple. And it’s even more wrong to act on those beliefs.

The Dynamics of Dignity

My wife got a Jeep Wrangler Unlimited Rubicon a few days back. It has necessitated a new education in off-road machinery like locking axles, low 4, and disconnectable sway bars. It seemed the right choice for our reinsertion into New Mexico, a land that was only partially accessible by cheap, whatever-you-can-afford, vehicles twenty years ago when we were grad students. So we had to start driving random off-road locations and found Faulkner’s Canyon in the Robledos. Billy the Kid used this area as a refuge at one point and we searched out his hidey-hole this morning but ran out of LTE coverage and couldn’t confirm the specific site until returning from our adventure. We will try another day!

Billy the Kid was, of course, a killer of questionable moral standing.

With the Neil Gorsuch nomination to SCOTUS, his role in the legal and moral philosophies surrounding assisted suicide has come under scrutiny. In everyday discussions, the topic often centers on the notion of dignity for the dying. Indeed, the autonomy of the person (and with it some assumption of rational choice) combines with a consideration of alternatives to the human-induced death based on pain, discomfort, loss of physical or mental faculties, and also the future-looking speculation about these possibilities.

Now I combined legal and moral in the same sentence because that is also one way to consider the way in which law is or ought to be formulated. But, in fact, one can also claim that the two don’t need to overlap; law can exist simply as a system of rules that does not include moral repercussions and, if the two have a similar effect on behavior, it is merely a happenstance. Insofar as they are not overlapping, a moral argument can be used to criticize a law.

In formulating a law, then, and regardless of its relationship to a moral norm, the language that is used performs a significant function in directing the limits of the application of the ideas involved. And the language goes further by often challenging the existing holistic relationships in our individual and mental representations of the terms. This is also why objective morality seems so nonsensical: in making a moral proposition one is assuming that the language and terms are identifiably externally and internally referential to the objective basis. It is an impossible task that results in either everyday revisionary squeamishness (“Well, sure, ‘do not kill’ should be ‘do not murder,’ but that might exclude killing in warfare or retribution because, well, look at the fate of the Amalekites”) or a reversion to personal feeling about the matters at hand. Hardly objective at all.

Dignity, then, should be considered as part of this dynamic definitional structure. It has evolved in the legal framework to have at least three meanings, as Lois Shepherd analyzes in some depth in her article, “Dignity and Autonomy after Washington v. Glucksberg: An Essay about Abortion, Death, and Crime,” in the Cornell Journal of Law and Public Policy. For the topic of assisted suicide or euthanasia, SCOTUS and lower courts have used a definition that is in accord with the notion that the individual should be allowed to avoid extreme discomfort and loss of faculties. In so doing, they preserve their physical and mental dignity that arises from their autonomous and rational selves. Any concerns over the latter bring additional scrutiny as to whether they can be said to have autonomy.

The other ideas of dignity, though, include the right of a defendant in a criminal trial to represent herself. And, if given assistance from a court-appointed attorney, the assistant must act in a manner that preserves the jury’s perception of the dignity of the defendant. And, finally, again related to the autonomy of the individual with regard to medical decision-making, there is the claim that it interferes with the dignity of a woman to deny her the right to abort a fetus, because such a denial imposes a barrier to her autonomy, and that autonomy has precedence over any case for the fetus to, as yet, have legal status as an individual.

These are arguable points, as we all know from the struggles over and opposition to abortion and assisted-suicide rights. And it is just this dynamism in definitional limits that has evolved through legal engagement at the edges of the semantics of dignity.

(As a postscript to this post, I’ll just add that I’m not trying to specifically pull out legal positivism versus natural law distinctions. Instead, I think there may be an overlooked area of philosophy of language and its intersection with epistemology that could use some emphasis. Where the positivists might agree with me on the general disconnect between moral and legal justifications for laws, they might not have embraced the role of linguistic evolution that is apparent in the definition of terms like “dignity.” It is there, I suggest, that law gets shaped, as we can surmise from any consideration of “fairness” as a legal concept.)

Twilight of the Artistic Mind

Deep Dream Generated Image: deepdreamgenerator.com

Kristen Stewart, of Twilight fame, co-authored a paper on using deep learning neural networks in a new movie that she is directing. The basic idea is very old but the details and scale are more recent. If you take an artificial neural network and have it autoencode the input stream through a bottleneck, you can then submit any stimulus and get some reflection of the training in the output. The output can be quite surreal, too, because the effect of bottlenecking, combined with other optimizations, results in an exaggeration of the features that define the input data set. If the input is images, the output will contain echoes of those images.
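
A minimal sketch of that bottleneck idea in Python with PyTorch (the layer sizes are arbitrary choices of mine, and this is not the architecture from the Stewart paper): the narrow middle layer forces the network to keep only the dominant features of its training images, which is what produces the surreal echoes.

```python
import torch
import torch.nn as nn

autoencoder = nn.Sequential(
    nn.Flatten(),
    nn.Linear(64 * 64, 256), nn.ReLU(),
    nn.Linear(256, 32),              # the bottleneck: 4096 -> 32 dimensions
    nn.ReLU(),
    nn.Linear(32, 256), nn.ReLU(),
    nn.Linear(256, 64 * 64), nn.Sigmoid(),
)

x = torch.rand(1, 1, 64, 64)         # stand-in for a training image
loss = nn.functional.mse_loss(autoencoder(x), x.flatten(1))
loss.backward()                      # trained to reconstruct its own input
```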

For Stewart’s effort, the goal was to transfer her highly stylized concept art into the movie scene. So they trained the network on her concept image and then submitted frames from the film to the network. The result reflected aspects of the original stylized image and the input image, not surprisingly.

There has been a long meditation on the unique status of art and music as a human phenomenon since the beginning of the modern era. The efforts at actively deconstructing the expectations of art play against a background of conceptual genius or divine inspiration. The abstract expressionists and the aleatoric composers show this as a radical 20th Century urge to re-imagine what art might be when freed from the strictures of formal ideas about subject, method, and content.

Is there any significance to the current paper? Not a great deal. The bottom line was that a lot of tweaking was required to achieve a result that was subjectively pleasing and fit the production goals of the film. That is a long way from automated art and perhaps mostly reflects the ability of artificial neural networks to encode complex transformations that are learned directly from examples. I was reminded of the Nadsat filters available for Unix in the 90s that transformed text into the fictional argot of A Clockwork Orange. Other examples were available, too. The difference was that those were hand-coded while the film example learned from examples. Not hard to do in the language case, though, and likely easier in certain computational respects due to the smaller range of symbol values.

So it’s a curiosity at best, but plaudits to Stewart for trying new things in her film efforts.

Apprendre à traduire

Google Translate has always been a useful tool for awkward gists of short texts. The method used was based on building a phrase-based statistical translation model. To do this, you gather up “parallel” texts that are existing human translations. You then “align” them by trying to find the most likely corresponding phrases in each sentence or sets of sentences. Often, between languages, fewer or more sentences will be used to express the same ideas. Once you have that collection of phrasal translation candidates, you can guess the most likely translation of a new sentence by looking up the sequence of likely phrase groups that correspond to that sentence. IBM was the progenitor of this approach in the late 1980s.
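
A toy version of the resulting phrase-table lookup, in Python (the phrase pairs and probabilities are invented for illustration; a real decoder also weighs a target-language model and handles reordering rather than translating monotonically):

```python
# Invented phrase table: source phrase -> candidate translations with scores.
phrase_table = {
    ("la", "maison"): [("the", "house", 0.8), ("the", "home", 0.2)],
    ("est",): [("is", 0.9)],
    ("bleue",): [("blue", 0.85)],
}

def translate(tokens):
    out, i = [], 0
    while i < len(tokens):
        # Prefer the longest matching source phrase at position i.
        for span in range(len(tokens) - i, 0, -1):
            phrase = tuple(tokens[i:i + span])
            if phrase in phrase_table:
                best = max(phrase_table[phrase], key=lambda c: c[-1])
                out.extend(best[:-1])   # drop the trailing score
                i += span
                break
        else:
            out.append(tokens[i])       # unseen word: pass through untouched
            i += 1
    return " ".join(out)

print(translate(["la", "maison", "est", "bleue"]))  # -> the house is blue
```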

It’s simple and elegant, but it was always criticized for telling us very little about language. Other methods that use techniques like interlingual transfer and parsers showed a more linguist-friendly face. In these methods, the source sentence is parsed into a parse tree, and then that parse tree is converted into a generic representation of the meaning of the sentence. Next, a generator uses that representation to create a surface-form rendering in the target language. The interlingua must be like the deep meaning of linguistic theories, though the computer science versions of it tended to look a lot like ontological representations with fixed meanings. Flexibility was never the strong suit of these approaches, but their flaws ran much deeper than that.

For one, nobody was able to build a robust parser for any particular language. Next, the ontologies were never vast enough to accommodate the rich productivity of real human language. Generators, being the inverse of parsers, remained only toy projects in the computational linguistics community. And, at the end of the day, no functional systems were built.

Instead, the statistical methods plodded along but had their own limitations. For instance, the translation of a never-before-seen sentence consisting of never-before-seen phrases is the null set. Rare and strange words in the data have problems too, because they have very low probabilities and are swamped by well-represented candidates that lack the nuances of the rarer form. The model doesn’t care, of course; the probabilities rule everything. So you need more and more data. But then you get noisy data mixed in with the good data, distorting the probabilities. And you have to handle completely new words and groupings like proper nouns and numbers that are due to the unique productivity of these classes of forms.

So, where to go from here? For Google and its recent commitment to Deep Learning, the answer was to apply Deep Learning Neural Network approaches. The approach threw every little advance of recent history at the problem, to pretty good effect. For instance, to cope with novel and rare words, they broke the input text up into sub-word letter groupings. The segmentation of the groupings was itself based on a learned model of the most common break-ups of terms, though these didn’t necessarily correspond to syllables or other common linguistic expectations. Sometimes they also used character-level models. The models were then combined into an ensemble, which is a common way of overcoming brittleness and overtraining on subsets of the data set. They used GPUs in some cases, as well as reduced-precision arithmetic, to speed up the training of the models. They also used an attention-based intermediary between the encoder layers and the decoder layers so that the decoder could weigh the relevant parts of the broader context within a sentence as it generated each piece of the translation.
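
The sub-word trick can be sketched in a few lines of Python in the spirit of the byte-pair-encoding/wordpiece family (the tiny corpus and the number of merges are invented for illustration): repeatedly merge the most frequent adjacent symbol pair, and common chunks emerge with no regard for syllables.

```python
from collections import Counter

corpus = ["low", "lower", "lowest", "newer", "wider"]
words = [list(w) + ["</w>"] for w in corpus]    # start from characters

for _ in range(6):                              # six merge operations
    pairs = Counter()
    for w in words:
        pairs.update(zip(w, w[1:]))             # count adjacent symbol pairs
    (a, b), _count = pairs.most_common(1)[0]    # most frequent pair
    for w in words:                             # merge that pair everywhere
        i = 0
        while i < len(w) - 1:
            if (w[i], w[i + 1]) == (a, b):
                w[i:i + 2] = [a + b]
            i += 1

print(words)  # e.g. ['low</w>'], ['low', 'er</w>'], ... depending on ties
```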

The results improved translation quality by as much as 60% over the baseline phrase-based approach and, interestingly, came close to the average human translator’s performance. Is this enough? Not at all. You are not going to translate poetry this way any time soon. The productiveness of human language and the open classes of named entities remain a barrier. The subtleties of pragmatics might still vex any data-driven approach—at least until there are a few examples in the corpora. And there might need to be a multi-sensory model somehow merged with the purely linguistic one to help manage some translation candidates. For instance, knowing the way in which objects fall could help move a translation from “plummeted” to “settled” to the ground.

Still, data-driven methods continue to reshape the intelligent machines of the future.