Engineering at Prezi

Prezi Got Pwned: A Tale of Responsible Disclosure

Disclaimer: For reference, Prezi runs a Bug Bounty Program that invites attacks like the one detailed below.

The emails that arrive in a security engineer’s inbox can be put into three broad categories.

1) Readable
Details of a new Budapest craft beer bar
Links to articles about actual real-life hoverboards

2) Archivable
Announcement on changes to company travel policy
Links to articles about how that whole hoverboard thing was a scam

3) Mutable
Replies to the announcement on changes to the company travel policy
People teasing me about me taking the hoverboard thing seriously.

Every now and then there are those emails which fit into the “Shiiiiiit” category.

We received one such email on Dec 3, 2013 at 03:06:24AM when a hacker by the name of Nicolas G wrote to us asking for our PGP key. What made it more serious was that this was Nicolas’ second email, and his first had also been worrying:

As you may or may not know, you can download a prezi in a portable format, which means your prezi’s XML and all the referenced media objects are zipped together with a player for Windows and OS X. However, not every image/video in an online prezi is necessarily stored on our infrastructure. When a portable prezi is created, these resources have to be fetched from their original location. We call this and the underlying infrastructure “conversion” and “conversion service” (we pride ourselves on our ability to give things obscure and irrelevant names).

You can probably guess what comes next: If it’s possible to download anything from the internet, how about “file:///etc/passwd”? Headshot! Nicolas was able to read any files on our local file system that were accessible by the conversion process.
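For readers who have not met this class of bug before, here is a minimal Python sketch of it. This is not our conversion code: naive_fetch and demo are hypothetical names, and a temporary file stands in for /etc/passwd. The only real ingredient is that Python's urllib.request.urlopen serves file:// URLs by default.

```python
import pathlib
import tempfile
import urllib.request


def naive_fetch(url: str) -> bytes:
    """Fetch a media resource with no scheme check (hypothetical helper)."""
    with urllib.request.urlopen(url) as response:
        return response.read()


def demo() -> bytes:
    """Read a local file through the fetcher, the way the attack did."""
    # The temporary file is a stand-in for a sensitive file like /etc/passwd.
    with tempfile.TemporaryDirectory() as tmp:
        secret = pathlib.Path(tmp) / "secret.txt"
        secret.write_bytes(b"local secret")
        # as_uri() produces a file:// URL; urlopen reads it like any other URL.
        return naive_fetch(secret.as_uri())


if __name__ == "__main__":
    print(demo())
```

Anything the fetching process is allowed to read can be returned this way, which is why the user the process runs as matters so much.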

We learnt a lot. Luckily the conversion process was run by a restricted user, but we could have implemented a better restriction. And obviously, we should never have allowed either the download or return of anything from file:// (or similar) locations. (self) Footshot!

The improvement we made was to check that the protocol of every URL is http or https before we fetch it. The same check must hold for every hop, even if an http-served URL redirects us somewhere else.
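A minimal sketch of that kind of check, using only Python's standard library. It is an illustration under assumptions rather than our production fix: is_allowed_url, SchemeCheckingRedirectHandler and safe_fetch are hypothetical names, and the 10 second timeout is arbitrary.

```python
import urllib.error
import urllib.request
from urllib.parse import urlsplit

ALLOWED_SCHEMES = {"http", "https"}


def is_allowed_url(url: str) -> bool:
    """Return True only for http:// and https:// URLs."""
    return urlsplit(url).scheme.lower() in ALLOWED_SCHEMES


class SchemeCheckingRedirectHandler(urllib.request.HTTPRedirectHandler):
    """Apply the same scheme check to every redirect target."""

    def redirect_request(self, req, fp, code, msg, headers, newurl):
        if not is_allowed_url(newurl):
            raise urllib.error.URLError(f"redirect to disallowed URL: {newurl}")
        return super().redirect_request(req, fp, code, msg, headers, newurl)


def safe_fetch(url: str, timeout: float = 10.0) -> bytes:
    """Fetch a URL, refusing non-http(s) schemes up front and on redirects."""
    if not is_allowed_url(url):
        raise ValueError(f"disallowed URL scheme: {url}")
    # Passing a subclass replaces the default redirect handler in the opener.
    opener = urllib.request.build_opener(SchemeCheckingRedirectHandler)
    with opener.open(url, timeout=timeout) as response:
        return response.read()
```

With this in place, safe_fetch("file:///etc/passwd") raises ValueError before any I/O happens. The check is an allowlist on purpose: listing the two schemes we want is safer than trying to enumerate every scheme we do not.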

We also identified the specific usage of some libraries (Python’s urllib and PyCURL, JDK’s URL, etc) as a root cause, and so we quickly checked our existing code base for similar occurrences. We have an internal tool called Repoguard, which runs regexes against each line a Prezi developer pushes, and triggers manual code reviews when matches are found. We enhanced its ruleset as well.
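To give a rough idea of the approach, here is a small Python sketch of line-by-line regex matching. It is not Repoguard and does not use its rule format: the rule names, the patterns and the scan_added_lines function are all assumptions made up for this illustration.

```python
import re
from typing import Iterable, List, NamedTuple

# Hypothetical rules: each flags a call that can fetch an arbitrary URL.
RULES = {
    "python-urllib-fetch": re.compile(r"\burllib2?\.(request\.)?urlopen\s*\("),
    "pycurl-usage": re.compile(r"\bpycurl\.Curl\s*\("),
    "java-url-open": re.compile(r"\bnew\s+URL\s*\("),
}


class Finding(NamedTuple):
    rule: str
    line_number: int
    line: str


def scan_added_lines(lines: Iterable[str]) -> List[Finding]:
    """Return one finding per (line, rule) match, in input order."""
    findings = []
    for number, line in enumerate(lines, start=1):
        for rule, pattern in RULES.items():
            if pattern.search(line):
                findings.append(Finding(rule, number, line.strip()))
    return findings


if __name__ == "__main__":
    pushed = ["import os", "data = urllib2.urlopen(url).read()"]
    for finding in scan_added_lines(pushed):
        print(f"{finding.rule}: line {finding.line_number}: {finding.line}")
```

A match here is only a hint that a human should look at the change. Regexes this simple will miss aliased imports and flag harmless calls, which is why the matches lead to a manual review rather than an automatic rejection.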

Nice bug, neat fix, bounty paid, everyone can sleep safe. Well, until Nicolas sent that second email…

Why would Nicolas asking us for our PGP key make us come over all Clay Davis? Here’s why:

Most of our infrastructure runs on AWS, and the machines are managed with Chef. When you create a server from an AMI, it goes through a bootstrap process, where Chef’s client software is installed. Each chef-client is authorised on the hosted Chef server separately, so every server has to contain an RSA private key to authenticate itself. Since these keys are generated during bootstrap, a special bootstrap key (called chef-validator) is needed at the first contact. After the bootstrap phase the validator key is thrown away. With the validator key the machine’s configuration becomes visible. This is not enough to kick off new (fully functioning) machines, because we don’t store sensitive data in cookbooks directly. Sensitive data is stored in encrypted data bags, and the encryption key is accessible only to a privileged group of developers (who start/stop machines regularly). During the bootstrap process we upload this secret as well.

Some of our machines - like the conversion nodes - are in autoscale groups, so for them the whole process is automated. We have to store the validator key and the encrypted data bag secret somewhere, and AWS has a convenient location to place them called user data. This can be either a shell script or a cloud-init configuration file, and it runs the first time the machine boots. Perfect place for these secrets, we thought.

What could possibly go wrong? Plenty, if you’ve heard of AWS’s Instance Metadata and User Data service, which provides various info about your instance, including the user data. It is accessible only from a running EC2 instance, and provides information only about the requestor node. If you have an AWS instance, you can access it over plain HTTP at a fixed link-local address. And Nicolas had quite a few. Headshot (again)!

Stuff has happened since we opened that email:
1) Nicolas has written a talk about his experiences (including repeatedly shooting us in the head), which he delivered at Insomnihack 2014.
2) We now filter non-public IP addresses as well.
3) We started watching The Wire again.
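
Item 2 in the list above can be sketched in Python with the standard ipaddress module. As before, this is an illustration under assumptions and not our actual filter: is_public_address and resolve_public_addresses are hypothetical names. The address 169.254.169.254 used in the example is the link-local address on which AWS serves instance metadata.

```python
import ipaddress
import socket
from typing import List


def is_public_address(address: str) -> bool:
    """Return True only for globally routable IP addresses."""
    # is_global is False for loopback, private and link-local ranges.
    return ipaddress.ip_address(address).is_global


def resolve_public_addresses(hostname: str, port: int = 80) -> List[str]:
    """Resolve a hostname and refuse it if any address is non-public."""
    infos = socket.getaddrinfo(hostname, port, proto=socket.IPPROTO_TCP)
    addresses = sorted({info[4][0] for info in infos})
    blocked = [a for a in addresses if not is_public_address(a)]
    if blocked:
        raise ValueError(f"{hostname} resolves to non-public address(es): {blocked}")
    return addresses


if __name__ == "__main__":
    for candidate in ("8.8.8.8", "10.0.0.5", "127.0.0.1", "169.254.169.254"):
        print(candidate, is_public_address(candidate))
```

One caveat with a sketch like this: the check only helps if the fetch then connects to one of the addresses that was checked. If the hostname is resolved a second time when the connection is made, the answer may have changed in between.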

Update: Here are Nicolas’ Insomnihack slides. We would like to once again express our greatest gratitude towards him, and towards everyone else participating in bug bounty programs.