Tim Bryant and Brian Knudson spent two years preparing for the Cyber Grand Challenge, a $55 million hacking contest cooked up by Darpa, the visionary research operation inside the U.S. Defense Department. But when the contest begins next Thursday evening in a ballroom at the Paris Hotel in Las Vegas, Darpa won’t let them participate.
That’s because the contest pits code against code. It’s a battle of autonomous systems designed to identify security holes in software programs they’ve never encountered—and patch those holes on the fly. Working with others at defense contractor Raytheon, Bryant and Knudson spent those two years building such a system. Today, contest organizers will roll a supercomputer into the Paris Hotel that includes the latest version of Raytheon’s creation, and once that happens, Bryant, Knudson, and the rest of the ten-person team, dubbed Deep Red, can’t guide their system or change it in any way. They become bystanders—like the rest of us. They’ll fly into Vegas, take a seat in the ballroom, and see what happens.
“It’ll be a tense three hours,” Bryant says. “All we can do is worry.”
At around 5 pm local time next Thursday, Darpa will flip the switch on seven supercomputers—seven racks of servers—each loaded with one of the seven autonomous security bots designed by the contest’s seven finalists. Each bot will defend its rack while attacking the others, trying to identify vulnerabilities in the software running these machines. They’ll work to patch holes in their own machines—without hampering the surrounding software—while showing they can exploit holes in others. They may even bar certain network traffic from their own machines, as an added protection.
But not even the people who designed these bots can predict how they’ll perform. No one knows what sort of software Darpa will load onto those machines. And that makes the contest a true test of how well software can protect software. And it could transform the very nature of cybersecurity.
Disaster or a New Day
Of course, the contest could also go horribly wrong, like Darpa’s last Grand Challenge, in which an army of robots stumbled into Internet infamy. A lot of this depends on how Darpa has designed the playing field, Bryant says. But many see next week’s competition as an early step towards a new way of building security software. Today, identifying and patching security holes is a very human skill, and those who possess it are few and far between. If we can build bots that protect machines without relying on direct human intervention, our machines will be that much safer.
The promise is there, with many online operations, including Google, already exploring automated security. Darpa’s contest will only accelerate this movement, says David Brumley, the director of Carnegie Mellon’s security and privacy institute, who’s leading another team in the competition. And that couldn’t come at a better time, he says, as more and more online devices, the so-called Internet of Things, move into daily life.
Others are less sanguine about the prospects of automated bug hunters. “This is a long ways off,” says Orion Hindawi, CEO of the security company Tanium. “It’s an extremely expensive way to solve the problem.” But at the very least, Darpa’s challenge will likely show that today’s autonomous systems know how to find familiar bugs, and find them quickly. “If you pit them against a really good hacker,” Brumley says of these bots, “they’re going to be able to totally slaughter them on volume.” But when it comes to finding more complex and unexpected threats, machines still lag behind their human counterparts.
Good at Math
Machines are good at math. The more these bots can turn cyberdefense into a math problem, the more likely they are to succeed. Working from inside a startup called ForAllSecure, Brumley’s team created a system that essentially represents software programs as a set of equations. If the system can solve these equations, he says, it can find holes.
The bots will also use other common techniques for pinpointing security holes, including fuzz testing and symbolic execution, both of which look for specific inputs that will break a piece of software. Machines can apply these methods much faster than humans, but not quite fast enough.
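To make fuzz testing concrete, here is a minimal sketch in Python. It is an illustration only, not any contestant's code: the target `parse_record` is a hypothetical toy parser with a deliberately planted bug, and the `fuzz` helper simply throws random byte strings at it until one causes a crash. Real fuzzers are far more sophisticated about choosing inputs.

```python
import random


def parse_record(data: bytes) -> int:
    """Hypothetical toy target: the first byte declares the payload length."""
    if not data:
        return 0
    declared = data[0]
    payload = data[1:]
    checksum = 0
    # Planted bug: the declared length is trusted without checking
    # how many payload bytes actually arrived.
    for i in range(declared):
        checksum ^= payload[i]  # IndexError when declared > len(payload)
    return checksum


def fuzz(target, trials=1000, max_len=8, seed=0):
    """Feed random byte strings to target; return the first input that crashes it."""
    rng = random.Random(seed)  # seeded so a run can be reproduced
    for _ in range(trials):
        length = rng.randrange(max_len + 1)
        data = bytes(rng.randrange(256) for _ in range(length))
        try:
            target(data)
        except Exception:
            return data  # this input breaks the program
    return None


crash = fuzz(parse_record)
print("crashing input:", crash)
```

Symbolic execution goes after the same kind of input from the other direction: rather than guessing, it would reason about the condition under which the loop runs past the end of the payload and ask a solver for an input that satisfies it.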
The trouble is that the scope of the problem facing these bots—even in the confines of this contest, much less the real world—is immense. The bots don’t have enough time to check every mathematical possibility. Humans can speed this process through intuition—feeling their way to particularly promising areas of attack—but machines can’t.
To make up for this weakness, systems like the one designed by ForAllSecure will lean on the probabilities of game theory. The company’s bot, Brumley says, will tackle the contest in much the same way a gambler tackles the multi-armed bandit problem.
“You walk into Vegas. You have a bunch of slot machines. Your goal is to make as much money as possible. But you don’t know anything about them. That’s how we’re approaching the Cyber Grand Challenge,” he says. “We’re getting a whole bunch of software programs. We don’t know anything about them. We run an analysis.”
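The slot-machine framing can be sketched in a few lines of Python. This is a generic illustration of the multi-armed bandit problem, not ForAllSecure's system: the article does not say which strategy the company uses, so the UCB1 rule below is an assumption, as are the function name `ucb1` and the made-up payout probabilities. Each "arm" stands in for a program the bot could spend its next slice of analysis time on, and a "win" stands in for that slice turning up something useful.

```python
import math
import random


def ucb1(payout_probs, rounds=2000, seed=0):
    """Play a simulated bandit with the UCB1 rule; return pulls per arm.

    payout_probs are hidden from the player and used only to simulate rewards.
    """
    rng = random.Random(seed)
    n_arms = len(payout_probs)
    pulls = [0] * n_arms
    wins = [0.0] * n_arms

    for t in range(1, rounds + 1):
        if t <= n_arms:
            arm = t - 1  # start by trying every arm once
        else:
            # Pick the arm with the best observed average plus an
            # exploration bonus that shrinks as the arm is tried more often.
            arm = max(
                range(n_arms),
                key=lambda a: wins[a] / pulls[a]
                + math.sqrt(2 * math.log(t) / pulls[a]),
            )
        reward = 1.0 if rng.random() < payout_probs[arm] else 0.0
        pulls[arm] += 1
        wins[arm] += reward

    return pulls


# Three hypothetical "machines" with different, unknown payout rates.
pulls = ucb1([0.05, 0.20, 0.60])
print("pulls per arm:", pulls)
```

The player starts out knowing nothing, samples each option, and gradually shifts most of its effort to whichever one has paid off best so far while still checking the others occasionally.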
Machine learning, where machines modify their behavior based on past experiences, can help with this, but only up to a point. As Google has said, finding the data needed to drive machine learning in the security world isn’t easy. Brumley and Knudson say that we haven’t yet reached the point where deep neural networks, which have proven so adept at recognizing images and spoken words and, in some cases, understanding natural language, can also help pinpoint security holes. But some companies, including Google, are already exploring the possibilities.
A New View
So, yes, we’ll have to wait and see how effective these security bots really are. The good news is that with this hacking contest, seeing is a lot easier. The Def Con security conference, held in Las Vegas each August, has long hosted Capture the Flag, a contest in which human hackers compete to find and exploit security holes. But it wasn’t the easiest thing to watch. After all, the action played out inside a network of computers. The Cyber Grand Challenge will also happen in cyberspace, but Darpa has devised a new way of showing us what that looks like.
Together with a San Francisco Bay Area gaming company called voidAlpha, Darpa spent the last few months building a “visualization” of what’s going on inside those seven supercomputers. Honeycombs of colored hexagons represent software services running inside the machines, and various colored beams show data flowing into these services, including probes from the seven competing bots. These visualizations will show us when a bot finds a security hole, when it patches a hole, and when it demonstrates an exploit on another machine.
For Darpa, the human contestants, and the seasoned hackers who will provide color commentary, this visualization is a milestone. It could help other researchers and entrepreneurs understand the power of these autonomous bots, spurring their evolution. “We’ve had a really hard time over the years getting to the point where people understand what goes on with hacking, but this visualization—even though it’s really just the first gen, kinda like an 1980s video game—can really make a difference,” Brumley says.
But the contestants aren’t completely sure how this will work, how deeply it will demonstrate what those autonomous bots are doing. Bryant says that although his team has spent ages preparing for the Cyber Grand Challenge, they didn’t see the visualization until WIRED wrote about it last month. This contest—in every way, it seems—is a trip into the unknown.