Tagged: big data

Intelligence Augmentation and a Frictionless Economy

The ever-present Tom Davenport weighs in at the Harvard Business Review on artificial intelligence (AI) and its impact on the knowledge workers of the future. The theme is intelligence augmentation (IA), where knowledge workers use technology to improve their productivity and create new business opportunities. And those new opportunities don’t displace others, per se, but introduce new efficiencies. This was also captured in the New York Times in a round-up of talent and service marketplaces that reduce the costs of acquiring skills and services, creating more efficient economic interactions by disintermediating sources of friction.

I’ve noticed the proliferation of services for connecting home improvement contractors to customers lately, and have benefited from them in several ongoing renovation/construction projects. Meanwhile, Amazon Prime has absorbed an increasingly large portion of our shopping, even cutting out Whole Foods runs, often with next-day delivery. Between pricing transparency and the removal of barriers (delivery costs, long delays, searching for reliable contractors), the economic impacts might be large enough to be considered a revolution, though perhaps a consumer revolution rather than a worker productivity one.

Here’s the concluding paragraph from an IEEE article I just wrote that will appear in the San Francisco Chronicle in the near future:

One of the most interesting risks also carries with it the potential for enhanced reward. Don’t they always? That is, some economists see economic productivity largely stabilizing if not stagnating. Industrial revolutions driven by steam engines, electrification, telephony, and even connected computing radically reshaped our economy in the past and produced leaps in worker productivity, but there is no clear candidate for those kinds of changes in the near future. Big data feeding into more intelligent systems may be the driver of the next economic wave, though revolutions are always messier than anyone expected.

But maybe it will be simpler and less messy than I imagine, just intelligence augmentation helping with our daily engagement with a frictionless economy.

Inequality and Big Data Revolutions

I had some interesting new talking points in my Rock Stars of Big Data talk this week. On the same day, MIT Technology Review published Technology and Inequality by David Rotman, which surveys the link between a growing wealth divide and technological change. Part of my motivating argument for Big Data, borrowed from Paul Krugman of Nobel Prize and New York Times fame, is that intelligent systems are likely the next industrial revolution. Krugman builds on Robert Gordon’s analysis of past industrial revolutions, which reached some dire conclusions about slowing economic growth in America. Intelligent systems will have an enormous impact on everyday life, disrupting everything from low-wage work through to knowledge work. And how does Big Data lead to that disruption?

Krugman’s optimism was built on the presumption that the brittleness of intelligent systems so far can be overcome by more and more data. There are some examples where we are seeing incremental improvements due to data volumes. For instance, having larger sample corpora to use for modeling spoken language enhances automatic speech recognition. Google Translate builds on work that I had the privilege to be involved with in the 1990s that used “parallel texts” (essentially line-by-line translations) to build automatic translation systems based on phrasal lookup. The more examples of how things are translated, the better the system gets. But what else improves with Big Data? Maybe instrumenting many cars and crowdsourcing driving behaviors through city streets would provide the best data-driven approach to self-driving cars. Maybe instrumenting individuals will help us automate some of the things we do effortlessly that are strangely difficult for machines, like folding towels and understanding complex visual scenes.
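To make the phrasal-lookup idea concrete, here is a minimal sketch in Python. The phrase table is hand-written for illustration; a real system mines phrase pairs from millions of aligned sentence pairs and scores competing translations probabilistically, so treat this as a toy under those assumptions.

```python
# Toy sketch of translation by phrasal lookup over "parallel texts".
# The phrase pairs below are invented stand-ins for what a real system
# would mine from large volumes of aligned text.
phrase_table = {
    ("the", "house"): ["la", "casa"],
    ("the", "car"): ["el", "coche"],
    ("is", "red"): ["es", "roja"],
    ("is",): ["es"],
    ("big",): ["grande"],
}

def translate(sentence, table, max_len=3):
    """Greedy longest-match lookup: cover the input left to right with the
    longest known source phrase; unknown words pass through untranslated."""
    words, out, i = sentence.lower().split(), [], 0
    while i < len(words):
        for n in range(min(max_len, len(words) - i), 0, -1):
            key = tuple(words[i:i + n])
            if key in table:
                out.extend(table[key])
                i += n
                break
        else:  # no phrase matched at this position
            out.append(words[i])
            i += 1
    return " ".join(out)

print(translate("the house is big", phrase_table))  # la casa es grande
print(translate("the car is red", phrase_table))    # el coche es roja (agreement error)
```

The second output carries a gender agreement error, which is exactly the kind of brittleness that more, and longer, phrase examples gradually chip away at.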

But regardless of the methods, the consequences need to be considered. Our current fascination with Big Data may not lead to Industrial Revolution 4 in five years or twenty, but unless there is some magical barrier that we are not aware of, IR4 seems inevitable. And the impacts will perhaps be more profound than those of past revolutions because, unlike those transitions, the direct displacement of workers is a key component of the IR4 plan. In Rotman’s article, Thomas Piketty’s r > g is invoked: when the return on capital (r) exceeds the economic growth rate (g), wealth concentrates among the richest members of our society. The result is a barbell distribution of economic opportunities in which the middle class has been dismantled by (per Gordon) the equalization of labor costs through outsourcing to low-cost nations. But at least there remains a left bell to that barbell, in that it is largely impossible to eliminate the service jobs that are critical to retail, restaurants, logistics, health care, and a raft of other economic sectors.
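As a back-of-envelope illustration of why r > g matters, here is a small Python sketch. The rates and starting values are invented; the point is only that a persistent gap between r and g compounds into a steadily rising capital-to-income ratio.

```python
# Back-of-envelope illustration of r > g: when the return on capital (r)
# exceeds the growth rate of the overall economy (g), wealth held as capital
# compounds faster than total income, so its owners claim a rising share.
# All numbers are invented for illustration.
r = 0.05          # annual return on capital
g = 0.02          # annual economic growth rate
capital = 100.0   # wealth of capital owners
income = 100.0    # total annual income of the economy

for year in range(51):
    if year % 10 == 0:
        print(f"year {year:2d}: capital-to-income ratio = {capital / income:.2f}")
    capital *= 1 + r
    income *= 1 + g
```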

All that changes in IR4, and the barbell turns into the hammer from the Olympic hammer throw as the owners of capital take over the entire cost structure for a huge range of economic activities. The middle may not initially be gone, however, as maintenance of the machinery will require a skilled workforce. Even this will become a point of Big Data optimization, as predictive maintenance and self-healing systems optimize against their failure modes over usage cycles.
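As a rough sketch of what that kind of optimization might look like, the Python below flags machines for service based on a fleet’s observed failure history. The cycle counts and the simple mean-life threshold are invented for illustration; real predictive-maintenance systems fit survival or anomaly models over far richer sensor data.

```python
# Minimal sketch of data-driven predictive maintenance: estimate a crude
# mean life from past failures, then flag machines for service before they
# reach it. The fleet and cycle counts are invented.
cycles_at_failure = [980, 1040, 1010, 995, 1060, 1025]  # historical failures

mean_life = sum(cycles_at_failure) / len(cycles_at_failure)
safety_margin = 0.9  # service at 90% of the observed mean life

def needs_service(cycles_so_far: int) -> bool:
    """Flag a machine once it nears the fleet's observed mean life."""
    return cycles_so_far >= safety_margin * mean_life

fleet = {"press-A": 640, "press-B": 930, "press-C": 1005}
for machine, cycles in fleet.items():
    status = "schedule maintenance" if needs_service(cycles) else "ok"
    print(f"{machine}: {cycles} cycles -> {status}")
```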

So let’s go back to Gordon’s pessimism (economics is, after all, the “dismal science”). What headwinds and tailwinds are left in IR4? Perhaps the most cogent tailwind is the recommended use of redistributive methods to accelerate educational opportunities while reducing the debt load of American students. The other areas discussed include unlimited immigration to offset declines in hours worked per capita due to retirement and demographic effects, but Gordon’s application of this is not necessarily valid in IR4, where low-skilled immigration would cease because of a lack of economic opportunities and even higher-skilled workers might find themselves displaced.

One lesson learned from past industrial revolutions is that they created more opportunities than worker displacements. Steam power displaced animal labor and the workers needed to shoe and train and feed those animals. Diesel trains displaced steam engine builders and mechanics. Cars and aircraft displaced trains. But in each case there were new jobs that accompanied the shift. We might be equally optimistic about IR4, speculating about robot trainers and knowledge engineers, massive extraction industries and materials production, or enhanced creative and entertainment systems like Michael Crichton’s dystopian Westworld of the early 70s. Is this enough to buffer against the headwind of the loss of the service sector? Perhaps, but it will not come without enormous global disruption.

Profiled Against a Desert Ribbon

Catch a profile of me in this month’s IEEE Spectrum Magazine. Note Yggdrasil in the background! It’s been great working with IEEE’s Cloud Computing Initiative (CCI) these last two years. CCI will be ending soon, but its impact will live on, for instance in the Intercloud Interoperability Standard, among other ways. Importantly, I’ll be at the IEEE Big Data Initiative Workshop in Hoboken, NJ, at the end of the month, working on the next initiative in support of advanced data analytics. Note that Hoboken and Jersey City have better views of Manhattan than Manhattan itself!

“Animal” was the name of the program and it built simple decision trees based on yes/no answers (does it have hair? does it have feathers?). If it didn’t guess your animal it added a layer to the tree with the correct answer. Incremental learning at its most elementary, but it left an odd impression on me: how do we overcome the specification of rules to create self-specifying (occasionally, maybe) intelligence? I spent days wandering the irrigation canals of the lower New Mexico Rio Grande trying to overcome this fatal flaw that I saw in such simplified ideas about intelligence. And I didn’t really go home for days, it seemed, given the freedom to drift through my pre-teen and then teen years in a way I can’t imagine today, creating myself among my friends and a penumbra of ideas, the green chile and cotton fields a thin ribbon surrounded by stark Chihuahuan desert.
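For the curious, here is a minimal Python reconstruction of how a program like Animal works. The starting tree, questions, and prompts are my own invention rather than the original code, but the grow-the-tree-on-a-wrong-guess step is the same idea.

```python
# A minimal sketch of the "Animal" guessing game: a binary tree of yes/no
# questions with animal names at the leaves. When the program guesses wrong,
# it asks for a new distinguishing question and grows the tree by one node.

class Node:
    def __init__(self, text, yes=None, no=None):
        self.text = text              # a question at internal nodes, an animal at leaves
        self.yes, self.no = yes, no

    def is_leaf(self):
        return self.yes is None and self.no is None

def ask(prompt):
    return input(prompt + " (y/n) ").strip().lower().startswith("y")

def play(node):
    if not node.is_leaf():
        play(node.yes if ask(node.text) else node.no)
        return
    if ask("Is it a " + node.text + "?"):
        print("I guessed it!")
        return
    # Wrong guess: learn the new animal by splitting this leaf into a question.
    animal = input("I give up. What was it? ")
    question = input("Give me a yes/no question that is true for a " + animal + ": ")
    node.yes, node.no, node.text = Node(animal), Node(node.text), question

# Seed tree: one question, two animals.
root = Node("Does it have feathers?", yes=Node("duck"), no=Node("dog"))

if __name__ == "__main__":
    while True:
        play(root)
        if not ask("Play again?"):
            break
```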

Industrial Revolution #4

Paul Krugman at the New York Times consumes Robert Gordon’s analysis of economic growth and the role of technology and comes up more hopeful than Gordon. The kernel of Krugman’s hope is that Big Data analytics can provide a shortcut to intelligent machines by bypassing the specification and programming that was once assumed to be a requirement for artificial intelligence. Instead, we don’t specify but use “data-intensive ways” to achieve a better result. And we might get to IR#4, following Gordon’s taxonomy, where IR stands for “industrial revolution”: IR#1 was steam and locomotives; IR#2 was everything up to computers; IR#3 is computers and cell phones and whatnot.

Krugman implies that IR#4 might spur the typical economic consequences of grand technological change, including the massive displacement of workers, but, as in previous revolutions, it is also assumed that economic growth built from new industries will ultimately eclipse the negatives. This is not new, of course. Robert Anton Wilson argued decades ago for the R.I.C.H. economy (Rising Income through Cybernetic Homeostasis). Wilson may have been on acid, but Krugman wasn’t yet tuned in, man. (A brief aside: the Krugman/Wilson notions probably break down over extraction and agribusiness/land-rights issues. If labor is completely replaced by intelligent machines, the land and the ingredients it contains nevertheless remain a bottleneck for economic growth. Look at the global demand for copper and rare earth materials, for instance.)

But why the particular focus on Big Data technologies? Krugman’s hope teeters on the assumption that data-intensive algorithms possess a fundamentally different scale and capacity than human-engineered approaches. Having risen through the computational linguistics and AI community working on data-driven methods for approaching intelligence, I can certainly sympathize with the motivation, but there are really only modest results to report at this time. For instance, statistical machine translation is still of fairly poor quality, and is arguably better than the rules-based methods of the 70s and 80s only in scale and in the diversity of the languages covered. Recent achievements like the DARPA grand challenge for self-driving vehicles were not achieved through data-intensive methods but through careful examination of the limits of the baseline system. In that case, the baseline was a system that used a scanning laser rangefinder to avoid obstacles while following a map, and the improvement was marginally outrunning the distance limitations of the rangefinder by using optical image recognition to support a modest speedup. Speech recognition is better due to accumulating many examples of labeled, transcribed speech, true. And we can certainly guess that the relevance of advertising placed on a web page is better than it once was, if only because it is an easy problem to attack without the necessity of deep considerations of human understanding, unless you take our buying behavior to be a deep indicator of our beings. We can also see some glimmers of data-intensive methods in the IBM Watson system, though the Watson team will be the first to tell you that they dealt with only medium-scale data (Wikipedia) in the design of their system.

Still, there is a clear economic-growth argument for the upshot of replacing workers in everything from manual drudgery straight through to fairly intelligent drudgery, which gives an economist like Krugman reason for hope. Now, if the limitations of energy and resource requirements can just be overcome, we can all retire to RICH, creative lives.

An Exit to a New Beginning

I am thrilled to note that my business partner and I sold our Big Data analytics startup to a large corporation yesterday. I am currently unemployed but start anew doing the same work on Monday.

Thrilled is almost too tame a word. Ecstatic does a better job of describing the mood around here and the excitement we have over having triumphed in Sili Valley. There are many war stories that we’ve been swapping over the last 24 hours, including how we nearly shut down/rebooted at the start of 2012. But now it is over, and we have just a bit of cleanup work left to dissolve the existing business structures, plus a short vacation to attend to.

Experimental Psychohistory

Kalev Leetaru at UIUC highlights the use of sentiment analysis to retrospectively predict the Arab Spring using Big Data in this paper. Dr. Leetaru took English transcriptions of Egyptian press sources and looked at aggregate measures of positive and negative sentiment terminology. Sentiment terminology is fairly simple in this case, consisting of positive and negative adjectives primarily, but could be more discriminating by checking for negative modifiers (“not happy,” “less than happy,” etc.). Leetaru points out some of the other follies that can arise from semi-intelligent broad measures like this one applied too liberally:

It is important to note that computer-based tone scores capture only the overall language used in a news article, which is a combination of both factual events and their framing by the reporter. A classic example of this is a college football game: the hometown papers of both teams will report the same facts about the game, but the winning team’s paper will likely cast the game as a positive outcome, while the losing team’s paper will have a more negative take on the game, yielding insight into their respective views towards it.

This is an old issue in computational linguistics. In the “pragmatics” of automatic machine translation, for example, the classic case is how to translate references to fighters in a rebellion. They could be anything from “terrorists” to “freedom fighters,” depending on the perspective of the translator and the original writer.

In Leetaru’s work, the end result was an unusually high churn of negative-going sentiment as the events of the Egyptian revolution unfolded.
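For a sense of how that kind of aggregate tone scoring works, here is a minimal Python sketch. It is not Leetaru’s actual pipeline: the word lists, the two-token negation window, and the sample sentences are all invented, and real studies use large sentiment lexicons over years of news text.

```python
# Minimal sketch of lexicon-based tone scoring: count positive and negative
# terms, flipping polarity after simple negators ("not happy", "less than
# happy"). The tiny word lists below are invented for illustration.
import re

POSITIVE = {"happy", "peaceful", "stable", "prosperous", "calm"}
NEGATIVE = {"angry", "violent", "unstable", "corrupt", "chaotic"}
NEGATORS = {"not", "no", "never", "less"}

def tone(text: str) -> float:
    """Return (positive - negative) counts divided by total tokens."""
    words = re.findall(r"[a-z']+", text.lower())
    score = 0
    for i, w in enumerate(words):
        if w in POSITIVE or w in NEGATIVE:
            polarity = 1 if w in POSITIVE else -1
            # Flip polarity if a negator appears in the two preceding tokens.
            if any(prev in NEGATORS for prev in words[max(0, i - 2):i]):
                polarity = -polarity
            score += polarity
    return score / max(len(words), 1)

articles = [
    "The markets were stable and the mood was peaceful.",
    "Protesters were not happy; officials called the crowd violent.",
]
for a in articles:
    print(f"{tone(a):+.3f}  {a}")
```

Averaging such scores over all articles per day gives the kind of aggregate tone curve whose downward churn preceded the Egyptian revolution in Leetaru’s analysis.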

But is it repeatable or generalizable? I’m skeptical. The rise of social media, enhanced government suppression of the media, spamming, disinformation, rapid technological change, the distributed availability of technology, and the evolving government understanding of social dynamics can all significantly smear out the priors associated with the positive signal relative to the indeterminacy of the messaging.