Inferred Modular Superparrots

The buzz about ChatGPT and related efforts has been surprisingly resistant to the standard deflationary pressure of the Gartner hype cycle. Quantum computing definitely fizzled, though it appears to be moving toward the plateau of productivity with recent expansions in the number of practical qubits available from IBM and Origin Quantum in China, as well as additional government funding driven by national security interests and fears. But ChatGPT has attracted more sustained attention because people can play with it easily, without needing to understand something like Shor’s algorithm for factoring integers. Instead, you just feed it a prompt and are amazed that it writes so well. And related image generators are delightful and may represent a true displacement of creative professionals even at this early stage, with video hallucinators evolving rapidly too.

But are Large Language Models (LLMs) like ChatGPT doing much more than stitching together recorded fragments of the internet-scale corpus of text they ingested? Are they inferring patterns in any way beyond being stochastic parrots? And why would scaling up a system result in qualitatively new capabilities, if there are any at all?

Some new work covered in Quanta Magazine offers intriguing suggestions that there is a bit more going on in LLMs, although the subtitle contains the word “understanding,” which I think is premature. At heart is the idea that as networks scale up under wiring rules that are not highly uniform or correlated, they tend to break up into collections of distinct subnetworks (substitute “graphs” for “networks” if you are a specialist). The theory, then, is that ingesting a sufficient magnitude of text into a sufficiently large network, together with the error minimization involved in tuning that network to match output to input, also segregates groupings that the Quanta author and researchers at Princeton and DeepMind refer to as skills…
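To make the graph intuition concrete, here is a minimal sketch (my own toy illustration, not the researchers’ actual method, and assuming Python’s networkx package) of how community structure shows up as a non-uniformly wired graph grows:

```python
# Toy illustration: grow preferential-attachment graphs of increasing
# size and measure how cleanly they decompose into communities.
import networkx as nx
from networkx.algorithms.community import greedy_modularity_communities, modularity

for n in (100, 1_000, 10_000):
    # Preferential attachment with triangle formation: a stand-in for
    # wiring rules that are neither uniform nor perfectly correlated.
    g = nx.powerlaw_cluster_graph(n=n, m=3, p=0.3, seed=42)
    comms = greedy_modularity_communities(g)
    print(f"n={n}: {len(comms)} communities, modularity={modularity(g, comms):.2f}")
```

The point is only qualitative: non-uniform wiring yields graphs that decompose into separable communities, which is the structural analogue of the segregated “skills” hypothesis.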

Reverse Engineering the Future

I’ve been enjoying streaming Apple TV+’s Foundation, based on Asimov’s classic books. The show is very different from the high-altitude narrative of the books and, frankly, I couldn’t see it being very engaging if it had been rendered the way the books were written. The central premise of the predictability of vast civilizations via “psychohistory” always struck me as outlandish, even as a teen. From my limited understanding of actual history, it seemed strange that anything that happened in the past fit more than the roughest of patterns. I nevertheless read all the “intellectual history” books that come out, in the hope that there are underlying veins that explain the surface rocks strewn chaotically across the past. Cousin marriage bans lead to the rise of individualism? Geography is the key? People want to mimic one another? Economic inequality is the actual key?

Each case is built on some theoretical insight that is stabilized by a broad empirical scaffolding. Each sells some books and gets some play in TED talks and book reviews. But then they seem to fade from public awareness as the fad passes and new ideas bubble up. Maybe that’s because they can’t help us predict the future exactly (well, Piketty perhaps…see below). But what can?

The mysterious world of stocks and bonds is an area where there seems to be no end of speculation (figurative and literal) about ways to game the system and make money, or even to understand macroeconomic trends. It’s not that economics lacks empirical powers; it’s just that it still doesn’t have the kind of reliability that we expect from the physical sciences…

Entanglements: Collected Short Works

Now available in Kindle, softcover, and hardcover versions, Entanglements assembles a decade of short works by author, scientist, entrepreneur, and inventor Mark William Davis.

The fiction includes an intimate experimental triptych on the evolution of sexual identities. A genre-defying poetic meditation on creativity and environmental holocaust competes with conventional science fiction about quantum consciousness and virtual worlds. A postmodern interrogation of the intersection of storytelling and film rounds out the collected works as a counterpoint to an introductory dive into the ethics of altruism.

The nonfiction is divided into topics ranging from literary theory to philosophical concerns of religion, science, and artificial intelligence. Legal theories are magnified to examine the meaning of liberty and autonomy. A qualitative mathematics of free will is developed over the course of two essays and contextualized as part of the algorithm of evolution. What meaning really amounts to is always a central concern, whether discussing politics, culture, or ideas.

The works show the author’s own evolution in his thinking about our entanglement with reality as driven by underlying metaphors that transect science, reason, and society. For Davis, metaphors and the constellations of words that help frame them are the raw materials of thought, and their evolution and refinement are the central narrative of our growth as individuals in a webwork of societies and systems.

Entanglements is for readers who are in love with ideas and the networks of language that support and innervate them. It is a metalinguistic swim along a polychromatic reef of thought where fiction and nonfictional analysis coexist like coral and fish in a greater ecosystem.

Mark William Davis is the author of three dozen scientific papers and patents in cognitive science, search, machine translation, and even the structure of art…

One Shot, Few Shot, Radical Shot

Exunoplura is back up after a sad excursion through the challenges of hosting providers. To be blunt, they mostly suck. Between systems that just don’t work right (SSL certificate provisioning, in this case) and bad-to-counterproductive support experiences, it’s enough to make one want to self-host. But hosting is mostly, as they say of war, long boring periods punctuated by moments of terror as things go frustratingly sideways. In any case, we are back up again after two hosting-provider side trips!

Honestly, I’d like to see an AI agent effectively navigate these technological challenges. Where even human performance is fleeting and imperfect, the notion that an AI could learn how to deal with the uncertain corners of the process strikes me as currently unthinkable. But there are some interesting recent developments worth noting and discussing in the journey toward what is called “general AI,” a framework as flexible as people are, rather than narrowly tied to a specific task like visually inspecting welds or answering a few questions about weather, music, and so forth.

First, there is the work by the OpenAI folks on testing massive language models against one-shot and few-shot learning problems. In these learning problems, the number of presentations of the training cases is limited, rather than presenting huge numbers of exemplars and “fine-tuning” the model’s responses. What is a language model? Well, it varies across approaches, but it is typically a weighted context of words of varying length, with the weights reflecting the probabilities of those words in those contexts over a massive collection of text corpora. For the OpenAI model, GPT-3, the total number of parameters (the learned weights of the network, not literal word counts) is an astonishing 175 billion, trained using 45 TB of text…
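As a cartoon of the “weighted contexts” idea (my sketch, nothing to do with GPT-3’s actual transformer architecture), here is the smallest possible statistical language model, a bigram model:

```python
# Toy bigram language model: count word pairs in a corpus and turn the
# counts into conditional probabilities P(word | previous word).
from collections import Counter, defaultdict

corpus = "the cat sat on the mat and the cat slept".split()

counts = defaultdict(Counter)
for prev, word in zip(corpus, corpus[1:]):
    counts[prev][word] += 1

def p(word, prev):
    """Estimate P(word | prev) from the bigram counts."""
    total = sum(counts[prev].values())
    return counts[prev][word] / total if total else 0.0

print(p("cat", "the"))  # 2/3: "the" is followed by "cat" twice, "mat" once
```

GPT-3 replaces these explicit counts with 175 billion learned weights, but the underlying objective, predicting the next word from context, is the same.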

Running, Ancient Roman Science, Arizona Dive Bars, and Lightning Machine Learning

I just returned from running in Chiricahua National Monument, Sedona, the Painted Desert, and Petrified Forest National Park, taking advantage of late spring before the heat becomes too intense. Even so, by the time I reached Massai Point in Chiricahua through 90+ degree canyons, I had only around a liter of water left, and I still had to slow down and walk out after running short of liquids two-thirds of the way down. There is an eerie, uncertain nausea that hits when hydration runs low under high stress. Cliffs and steep ravines take on a wolfish quality. The mind works to keep the feet from stumbling, and the lips grow serrated edges of parched skin that bite off without relieving the dryness.

I would remember that days later as I prepped to overnight with a wilderness permit in Petrified Forest only to discover that my Osprey Exos pack frame had somehow been bent, likely due to excessive manhandling by airport checked baggage weeks earlier. I considered my options and drove eighty miles to Flagstaff to replace the pack, then back again.

I arrived in time to join Dr. Richard Carrier in an unexpected dive bar in Holbrook, Arizona, as the sunlight turned to amber and a platoon of Navajo pool sharks descended on the place for billiards and beers. I had read that Dr. Carrier would be stopping there, and it was convenient to my next excursion, so I picked up signed copies of his new book, The Scientist in the Early Roman Empire, as well as his classic, On the Historicity of Jesus, which remains part of the controversial samizdat of so-called “Jesus mythicism.”

If there is a distinguishing characteristic of OHJ, it is the application of Bayesian reasoning to the problems of historical method…
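For those who haven’t seen the machinery, the core of the method is just Bayes’s theorem, here stated in the expanded form that historical arguments of this kind tend to use (my paraphrase of the standard formula, not a quotation from OHJ):

$$P(h \mid e) = \frac{P(e \mid h)\,P(h)}{P(e \mid h)\,P(h) + P(e \mid \lnot h)\,P(\lnot h)}$$

where $h$ is the hypothesis (say, that a historical Jesus existed) and $e$ is the surviving evidence. The debate then shifts from rhetoric to arguing over the priors and likelihoods.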

Black and Gray Boxes with Autonomous Meta-Cognition

Vijay Pande of the VC firm Andreessen Horowitz (who passed on my startups twice, but hey, it’s just business!) has a relevant article in The New York Times concerning fears of the “black box” of deep learning and related methods: is the lack of explainability, and the limited capacity for interrogating the underlying decision making, a deal-breaker for applications in critical areas like medical diagnosis or parole decisions? His point is simple, and related to the previous post’s suggestion that our capacity to truly understand many aspects of human cognition may be limited. Even a doctor may only be able to point to a nebulous collection of clinical experiences for certain observational aspects of the job, such as reading images for indicators of cancer. At least the algorithm has been trained on a significantly larger collection of data than the doctor could ever encounter in a professional lifetime.

So the human is almost as much a black box (maybe a gray box?) as the algorithm. One difference that needs to be considered, however, is that a deep learning algorithm may make unexpected errors when confronted with unexpected inputs. The classic example from the early history of artificial neural networks involved a DARPA test of detecting military tanks in photographs. In the apocryphal-to-legendary formulation of the story, there was a difference in cloud cover between the tank images and the non-tank images. The end result was that the system performed spectacularly on the training and test data sets but then failed miserably on new data that lacked the cloud-cover factor. I recently recalled this slightly differently, substituting film grain for the cloudiness. In any case, it became a discussion point about the limits of data-driven learning and how radically incorrect solutions can be arrived at without a careful understanding of how these systems work…
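The failure mode is easy to reconstruct in miniature. Here is a toy version (my illustration, not the DARPA experiment; assumes numpy and scikit-learn) in which a confound that tracks the labels during training evaporates on new data:

```python
# Toy "tank detector" trap: brightness (standing in for cloud cover)
# perfectly tracks the label in training but is meaningless afterward.
import numpy as np
from sklearn.linear_model import LogisticRegression

rng = np.random.default_rng(0)
n = 500

# Training set: label-1 ("tank") images happen to be darker.
labels = rng.integers(0, 2, n)
brightness = np.where(labels == 1, 0.3, 0.7) + rng.normal(0, 0.05, n)
signal = rng.normal(0, 1, n)  # an uninformative "real" feature
X_train = np.column_stack([brightness, signal])

clf = LogisticRegression().fit(X_train, labels)
print("train accuracy:", clf.score(X_train, labels))  # near 1.0

# New data: brightness no longer tracks the label.
labels2 = rng.integers(0, 2, n)
X_new = np.column_stack([rng.normal(0.5, 0.05, n), rng.normal(0, 1, n)])
print("new-data accuracy:", clf.score(X_new, labels2))  # near 0.5, chance level
```

The classifier latches onto brightness because it is the easiest separating feature, which is exactly the cloud-cover trap.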

I, Robot and Us

What happens if artificial intelligence (AI) technologies become significant economic players? The topic has come up in various ways for the past thirty years, perhaps longer. One model, the so-called technological singularity, posits that self-improving machines may be capable of a level of knowledge generation and disruption that will eliminate humans from economic participation. How far out this singularity might be is a matter of speculation, but I have my doubts that we understand intelligence well enough to start worrying about the impacts of such radical change.

Barring something essentially unknowable because we lack sufficient priors to make an informed guess, we can use evidence of the impact of mechanization on certain economic sectors, like agribusiness or transportation manufacturing, to try to plot out how it might impact other sectors. Aghion, Jones, and Jones’s “Artificial Intelligence and Economic Growth” takes a deep dive into the topic. The math is not particularly hard, though the reasons behind many of the equations are tied up in macro- and microeconomic theory that requires a specialist’s understanding to fully grok.

Of special interest are the potentially limiting role of inputs and the effects of organizational competition. For instance, automation speed-ups may be limited by the human components of an economic activity, and even further by the fundamental physics of the activity. The pointed example is that power plants are limited by thermodynamics; no amount of additional mechanization can change that. Other factors related to inputs, or to the complexity of a certain stage of production, may likewise drag economic growth down to a capped, limiting level.
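A minimal sketch of the kind of production function behind that argument (my compressed paraphrase of the paper’s setup, not its exact notation): output is a CES aggregate over tasks,

$$Y = \left( \int_0^1 X_i^{\rho} \, di \right)^{1/\rho}, \qquad \rho < 0,$$

and with $\rho < 0$ the tasks are poor substitutes for one another, so the scarcest, hardest-to-automate inputs dominate the aggregate. Growth is then dragged toward the rate of the slowest-improving essential task, which is Baumol’s cost disease in modern dress.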

Organizational competition and intellectual property considerations come into play as well. While the authors suggest that corporations will remain relevant, they expect them to become more horizontal, eliminating much of the middle tier of management and outsourcing components of their production…

Apprendre à traduire

Google Translate has always been a useful tool for getting awkward gists of short texts. The original method was based on building a phrase-based statistical translation model. To do this, you gather up “parallel” texts that are existing human translations. You then “align” them by trying to find the most likely corresponding phrases in each sentence or set of sentences. Often, between languages, fewer or more sentences will be used to express the same ideas. Once you have that collection of phrasal translation candidates, you can guess the most likely translation of a new sentence by looking up the sequence of likely phrase groups that corresponds to that sentence. IBM was the progenitor of this approach in the late 1980s.
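Here is a toy phrase-based decoder (my sketch of the general idea, not Google’s system; the phrase table and its probabilities are invented for illustration):

```python
# Toy phrase-based translation: greedily match the longest source
# phrase in the table and emit its most probable translation.
phrase_table = {
    ("apprendre", "à", "traduire"): [("learning to translate", 0.7)],
    ("apprendre",): [("to learn", 0.6), ("learning", 0.4)],
    ("à",): [("to", 0.8), ("at", 0.2)],
    ("traduire",): [("to translate", 0.9)],
}

def translate(words):
    out, i = [], 0
    while i < len(words):
        # Prefer the longest source phrase that has a table entry.
        for j in range(len(words), i, -1):
            candidates = phrase_table.get(tuple(words[i:j]))
            if candidates:
                best, _ = max(candidates, key=lambda c: c[1])
                out.append(best)
                i = j
                break
        else:
            out.append(words[i])  # pass unknown words through untranslated
            i += 1
    return " ".join(out)

print(translate("apprendre à traduire".split()))  # learning to translate
```

Real systems add a language model and a search over reorderings, but the phrase-table lookup is the heart of it.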

It’s simple and elegant, but it was always criticized for telling us very little about language. Other methods, using techniques like interlingual transfer and parsers, showed a more linguist-friendly face. In these methods, the source language is parsed into a parse tree, and that parse tree is converted into a generic representation of the meaning of the sentence. Next, a generator uses that representation to create a surface-form rendering in the target language. The interlingua was meant to be like the deep meaning of linguistic theories, though the computer science versions of it tended to look a lot like ontological representations with fixed meanings. Flexibility was never the strong suit of these approaches, but their flaws ran much deeper than that.
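A toy version of that pipeline (my illustration of the architecture, not any real system; the micro-lexicons are invented) makes the brittleness of fixed ontological meanings easy to see:

```python
# Toy interlingua pipeline: "parse" a pre-chunked French clause into a
# language-neutral frame, then generate English from the frame. The
# pre-chunking sidesteps the hard parsing problem entirely.
FR_LEX = {"le chat": "CAT", "mange": "EAT", "le poisson": "FISH"}
EN_LEX = {"CAT": "the cat", "EAT": "eats", "FISH": "the fish"}

def parse(subj, verb, obj):
    """Map a subject-verb-object French clause to an interlingua frame."""
    return {"AGENT": FR_LEX[subj], "ACTION": FR_LEX[verb], "PATIENT": FR_LEX[obj]}

def generate(frame):
    """Render the interlingua frame as an English surface form."""
    return f"{EN_LEX[frame['AGENT']]} {EN_LEX[frame['ACTION']]} {EN_LEX[frame['PATIENT']]}"

print(generate(parse("le chat", "mange", "le poisson")))  # the cat eats the fish
```

Anything outside the lexicon simply fails, which previews the robustness problems described next.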

For one, nobody was able to build a robust parser for any particular language. Next, the ontologies were never vast enough to accommodate the rich productivity of real human language. And generators, being the inverse of parsers, remained toy projects within the computational linguistics community…

A Big Data Jeremiad and the Moral Health of America

The averages of polls were wrong. The past-performance-weighted, hyper-parameterized, stratified-sampled, Monte Carlo-ized collaborative predictions fell as critically short in the general election as they had in the Republican primary. There will be much soul-searching to establish why that might have been; from ground-game engagement to voter turnout, from pollster bias to sampling defects, the hit list will continue to grow.
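For a sense of what that machinery looks like, here is a minimal Monte Carlo poll aggregator (my toy illustration of the genre, not any forecaster’s actual model; the polls and their errors are invented, and numpy is assumed):

```python
# Toy Monte Carlo poll aggregation: draw simulated outcomes from each
# poll's margin and sampling error, then report a win probability.
import numpy as np

rng = np.random.default_rng(1)
polls = [(0.04, 0.03), (0.03, 0.025), (0.05, 0.035)]  # (margin, std error)

sims = 100_000
draws = np.array([rng.normal(m, s, sims) for m, s in polls])
avg = draws.mean(axis=0)  # simulated polling average per trial
print("win probability:", (avg > 0).mean())  # confidently high
```

Note what the machinery cannot see: a systematic bias shared by every poll, which is roughly the 2016 story, shifts all the draws together and is invisible to the sampling-error math.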

Things were less predictable than they seemed. During the 2008 and 2012 elections, the losing party’s proxies held that the polls were inherently flawed, though the polls proved predictive in the end. Now, in 2016, they were inherently flawed and not at all predictive.

But what the polls showed was instructive even if their numbers were not quite right. Specifically, there was remarkable turnout for Trump among white, less-educated voters who long for radical change to their economic lives. The Democratic candidate was less clearly engaging.

Another difference emerged, however. Despite efforts to paint Hillary Clinton as corrupt or a liar, objective fact-checkers concluded that she was, in fact, one of the most honest candidates in recent history, and that Donald Trump was one of the worst, approached in utter mendacity only by Michele Bachmann. We can couple that with his race-baiting, misogyny, hostility, divorces, anti-immigrant scapegoating, and other childish antics. Yet these moral failures did not prevent his supporters from voting for him in great numbers.

But his moral failures may be precisely why his supporters found him appealing. Evangelicals decided for him because Clinton was a threat to overturning Roe v. Wade, while he was an unknown who had said a few contradictory things in opposition. His other moral issues were less important—even forgivable. In reality, though, this particular divide is an exemplar of a broader division in the moral fabric of America…

Startup Next

I’m thrilled to announce my new startup, Like Human. The company is focused on making significant new advances to the state of the art in cognitive computing and artificial intelligence. We will remain a bit stealthy for another six months or so and then will open up shop for early adopters.

I’m also pleased to share with you Like Human’s logo, which goes by the name Logo McLogoface, or LM for short. LM combines imagery from nuclear warning signs, Robby the Robot from Forbidden Planet, and Leonardo da Vinci’s Vitruvian Man. I think you will agree about Mr. McLogoface’s agreeability:

[Like Human logo: Logo McLogoface]

You can follow developments at @likehumancom on Twitter, and I will make a few announcements here as well.