Concrete AI tasks for forecasting

This page contains a list of relatively well-specified AI tasks designed for forecasting. All current entries were used in the 2016 Expert Survey on Progress in AI.

List

Translate a text written in a newly discovered language into English as well as a team of human experts, using a single other document in both languages (like a Rosetta stone). Suppose all of the words in the text can be found in the translated document, and that the language is a difficult one.
Translate speech in a new language given only unlimited films with subtitles in the new language. Suppose the system has access to training data for other languages, of the kind used now (e.g. same text in two languages for many languages and films with subtitles in many languages).
Perform translation about as well as a human who is fluent in both languages but unskilled at translation, for most types of text, and for most popular languages (including languages that are known to be difficult, like Czech, Chinese and Arabic).
Provide phone banking services as well as human operators can, without annoying customers more than humans. This includes many one-off tasks, such as helping to order a replacement bank card or clarifying how to use part of the bank website to a customer.
Correctly group images of previously unseen objects into classes, after training on a similar labeled dataset containing completely different classes. The classes should be similar to the ImageNet classes.
One-shot learning: see only one labeled image of a new object, and then be able to recognize the object in real-world scenes, to the extent that a typical human can (i.e. including in a wide variety of settings). For example, see only one image of a platypus, and then be able to recognize platypuses in nature photos. The system may train on labeled images of other objects. Currently, deep networks often need hundreds of examples in classification tasks¹, but there has been work on one-shot learning for both classification² and generative tasks³.

^{1 Lake et al. (2015). Building Machines That Learn and Think Like People}
^{2 Koch (2015). Siamese Neural Networks for One-Shot Image Recognition}
^{3 Rezende et al. (2016). One-Shot Generalization in Deep Generative Models}

See a short video of a scene, and then be able to construct a 3D model of the scene that is good enough to create a realistic video of the same scene from a substantially different angle. For example, constructing a short video of walking through a house from a video taking a very different path through the house.
Transcribe human speech with a variety of accents in a noisy environment as well as a typical human can.
Take a written passage and output a recording that an expert listener can’t distinguish from one made by a voice actor.
Routinely and autonomously prove mathematical theorems that are publishable in top mathematics journals today, including generating the theorems to prove.
Perform as well as the best human entrants in the Putnam competition—a math contest whose questions have known solutions, but which are difficult for the best young mathematicians.
Defeat the best Go players, training only on as many games as the best Go players have played. For reference, DeepMind’s AlphaGo has probably played a hundred million games of self-play, while Lee Sedol has probably played 50,000 games in his life¹.

^{1 Lake et al. (2015). Building Machines That Learn and Think Like People}

Beat the best human Starcraft 2 players at least 50% of the time, given a video of the screen. Starcraft 2 is a real-time strategy game characterized by:

Continuous time play
Huge action space
Partial observability of enemies
Long-term strategic play, e.g. preparing for and then hiding surprise attacks.

Play a randomly selected computer game, including difficult ones, about as well as a human novice, after playing the game for less than 10 minutes of game time. The system may train on other games.
Play new levels of Angry Birds better than the best human players. Angry Birds is a game where players try to efficiently destroy 2D block towers with a catapult. For context, this is the goal of the IJCAI Angry Birds AI competition¹.

^{1 aibirds.org}

Outperform professional game testers on all Atari games using no game-specific knowledge. This includes games like Frostbite, which require planning to achieve sub-goals and have posed problems for deep Q-networks^{1, 2}.

^{1 Mnih et al. (2015). Human-level control through deep reinforcement learning}
^{2 Lake et al. (2015). Building Machines That Learn and Think Like People}

Outperform human novices on 50% of Atari games after only 20 minutes of training play time and no game-specific knowledge. For context, the original Atari-playing deep Q-network outperforms professional game testers on 47% of games¹, but used hundreds of hours of play to train².

^{1 Mnih et al. (2015). Human-level control through deep reinforcement learning}
^{2 Lake et al. (2015). Building Machines That Learn and Think Like People}

Fold laundry as well and as fast as the median human clothing store employee.
Beat the fastest human runners in a 5 kilometer race through city streets using a bipedal robot body.
Physically assemble any LEGO set given the pieces and instructions, using non-specialized robotics hardware. For context, Fu et al. (2016)¹ successfully join single large LEGO pieces using model-based reinforcement learning and online adaptation.

^{1 Fu et al. (2016). One-Shot Learning of Manipulation Skills with Online Dynamics Adaptation and Neural Network Priors}

Learn to efficiently sort lists of numbers much larger than in any training set used, the way Neural GPUs can do for addition¹, but without being given the form of the solution. For context, Neural Turing Machines have not been able to do this², but Neural Programmer-Interpreters³ have been able to do this by training on stack traces (which contain a lot of information about the form of the solution).

^{1 Kaiser & Sutskever (2015). Neural GPUs Learn Algorithms}
^{2 Zaremba & Sutskever (2015). Reinforcement Learning Neural Turing Machines}
^{3 Reed & de Freitas (2015). Neural Programmer-Interpreters}
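The core of this task is generalization to lists far longer than any seen in training. A minimal sketch of how that could be checked is given here in Python. The names `generalizes_to_length` and `truncating_sorter` are hypothetical, and the built-in `sorted` stands in for a learned system; none of this comes from the cited papers.

```python
import random
from typing import Callable, List


def generalizes_to_length(
    sort_fn: Callable[[List[int]], List[int]],
    test_len: int,
    trials: int = 20,
    seed: int = 0,
) -> bool:
    """Return True if sort_fn sorts every random list of length test_len correctly."""
    rng = random.Random(seed)  # fixed seed so the check is repeatable
    for _ in range(trials):
        xs = [rng.randint(0, 10**6) for _ in range(test_len)]
        if sort_fn(xs) != sorted(xs):
            return False
    return True


def truncating_sorter(xs: List[int], max_len: int = 20) -> List[int]:
    """Hypothetical stand-in for a system that only handles training-sized lists.

    It sorts the first max_len elements and leaves the rest untouched.
    """
    return sorted(xs[:max_len]) + xs[max_len:]
```

A system that passes at the training length (say, 20 elements) but fails at a much larger length (say, 1000) would not meet the bar described above.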

Write concise, efficient, human-readable Python code to implement simple algorithms like quicksort. That is, the system should write code that sorts a list, rather than just being able to sort lists. Suppose the system is given only:

A specification of what counts as a sorted list
Several examples of lists undergoing sorting by quicksort
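For illustration only, the following sketch shows roughly what the two ingredients might look like: a specification of "sorted" as a checkable predicate, and the kind of short, readable quicksort a system would be expected to produce. The function names are hypothetical, and this version favors readability over in-place efficiency.

```python
from collections import Counter
from typing import List


def is_sorted_version(original: List[int], candidate: List[int]) -> bool:
    """Specification: candidate is non-decreasing and is a permutation of original."""
    in_order = all(a <= b for a, b in zip(candidate, candidate[1:]))
    return in_order and Counter(original) == Counter(candidate)


def quicksort(items: List[int]) -> List[int]:
    """Return a new sorted list, partitioning around the middle element."""
    if len(items) <= 1:
        return list(items)
    pivot = items[len(items) // 2]
    smaller = [x for x in items if x < pivot]
    equal = [x for x in items if x == pivot]
    larger = [x for x in items if x > pivot]
    return quicksort(smaller) + equal + quicksort(larger)
```

The predicate checks an output against the specification without saying anything about how to sort, which is the distinction the task draws between sorting lists and writing code that sorts.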

Answer any “easily Googleable” factoid question posed in natural language better than an expert on the relevant topic (with internet access), having found the answers on the internet. Examples of factoid questions:

“What is the poisonous substance in Oleander plants?”
“How many species of lizard can be found in Great Britain?”

Answer any “easily Googleable” factual but open ended question posed in natural language better than an expert on the relevant topic (with internet access), having found the answers on the internet. Examples of open ended questions:

“What does it mean if my lights dim when I turn on the microwave?”
“When does home insurance cover roof replacement?”

Give good answers in natural language to factual questions posed in natural language for which there are no definite correct answers. For example: “What causes the demographic transition?”, “Is the thylacine extinct?”, “How safe is seeing a chiropractor?”
Write an essay for a high-school history class that would receive high grades and pass plagiarism detectors. For example, answer a question like ‘How did the whaling industry affect the industrial revolution?’
Compose a song that is good enough to reach the US Top 40. The system should output the complete song as an audio file.
Produce a song that is indistinguishable from a new song by a particular artist, e.g. a song that experienced listeners can’t distinguish from a new song by Taylor Swift.
Write a novel or short story good enough to make it to the New York Times best-seller list.
For any computer game that can be played well by a machine, explain the machine’s choice of moves in a way that feels concise and complete to a layman.
Play poker well enough to win the World Series of Poker.
After spending time in a virtual world, output the differential equations governing that world in symbolic form. For example, the agent is placed in a game engine where Newtonian mechanics holds exactly and the agent is then able to conduct experiments with a ball and output Newton’s laws of motion.
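To make the ball example concrete, here is a deliberately simplified sketch. It assumes the form of the law (constant vertical acceleration) in advance, which the task itself would not allow, so it only illustrates what "conduct an experiment and output an equation" might look like. All function names and the chosen constants are hypothetical.

```python
from typing import List


def simulate_ball(y0: float, v0: float, g: float, dt: float, steps: int) -> List[float]:
    """Heights of a ball under exact Newtonian gravity, sampled every dt seconds."""
    return [y0 + v0 * (i * dt) - 0.5 * g * (i * dt) ** 2 for i in range(steps)]


def estimate_acceleration(heights: List[float], dt: float) -> float:
    """Average the second finite difference of the observed heights."""
    second_diffs = [
        (heights[i + 1] - 2 * heights[i] + heights[i - 1]) / dt**2
        for i in range(1, len(heights) - 1)
    ]
    return sum(second_diffs) / len(second_diffs)


def describe_law(heights: List[float], dt: float) -> str:
    """Format the fitted law as a symbolic-looking differential equation."""
    return f"d2y/dt2 = {estimate_acceleration(heights, dt):.2f}"


# "Experiment": drop a ball in a world with g = 9.81 and observe it for one second.
observations = simulate_ball(y0=10.0, v0=2.0, g=9.81, dt=0.01, steps=100)
law = describe_law(observations, dt=0.01)
```

A system meeting the task would have to discover the form of the equation as well as its constants, rather than fit a single parameter as this sketch does.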

G Deepwater says:

2017-01-06 at 3:01 AM

Real world application of AI for maximum impact could include assessment of and optimal distribution strategies for life support critical planetary resources such as clean water, healthy forests, healthy fisheries, arable land and agricultural surplus, means of clean energy production. With a focus on these areas, a method of circumventing certain extinction for nearly all life forms on this planet could be devised, and factors preventing an equitable and sustainable future for all beings on this planet could be overcome.
