personal cloud | Jennifer Kramer

Building a Personal Cloud Computer

Wednesday I presented a talk at the Austin Personal Cloud meetup about Building a Personal Cloud computer. Murphy was in full effect, so both of the cameras we had to record the session died, and I forgot to start my audio recorder. I’ve decided to write out the notes that I should have had, so here’s the presentation if it had been read.

In this presentation we’re talking about building a personal cloud computer. This is one approach to the personal cloud, there are certainly others, but this is the one that has been ringing true to me lately.

A lot of what people have been talking about when they speak about the personal cloud is really personal pervasive storage. These are things like Dropbox or Evernote. It’s the concept of having your files everywhere, and being able to give permission to things that want to access them. Think Google Drive, as well.

These concepts are certainly valid, but I’m more interested in software, and I think computing really comes down to running programs. For me, the personal cloud has storage, but its power is in the fact that it executes programs for me, just like my personal computer at home.

That computer in the slide is a Commodore +4, the first computer I ever laid fingers on.

Back then, the idea of running programs for yourself still appealed to the dreamers. They made movies like TRON, and we anthropomorphized the software we were writing. These were our programs doing work for us, and if we were just smart enough and spent enough time at it, we could change our lives and change the world.

This idea isn’t new, in fact AI pioneers were talking about it back in the 50s. John McCarthy was thinking about it back then, as Alan Kay relates when he talks about his 3rd age of computing:

They had in view a system that, when given a goal, could carry out the details of the appropriate computer operations and could ask for and receive advice, offered in human terms, when it was stuck. An agent would be a ‘soft robot’ living and doing its business within the computer world.

That’s been the dream for a long time…

But that never really happened. The personal computer revolution revolutionized business, and it changed how we communicated with each other, but before the Internet things didn’t interconnect to the point where software could be a useful helper, and then we all went crazy making money with .com 1.0 and Web 2.0, and it was all about being easy and carving out a market niche. Then something else hit…

Mobile exploded. If you’ll notice, mobile applications never really had an early adopter phase. There was no early computing era for mobile. You could say that PDAs were it, but without connectivity that isn’t the same as the world we have now. Most developers couldn’t get their app onto a mobile device until the iOS app store hit, but that platform was already locked down. There was no experimentation phase with no boundaries. We still haven’t had the ability to have an always-connected device in our pocket that can run whatever we want. The Ubuntu phones may be that, but we’re 6 iterations into the post-iPhone era.

And who doesn’t love mobile? Who doesn’t love their phone? They’re great, they’re easy to use, they solve our problems. What’s wrong with them? Why do we need something else? Well, let’s compare them to what we’ve got…

With the PC we had a unique device in so far as we owned the hardware, we owned our data, and EULA issues aside, we owned the software. You could pack up your PC, take it with you to the top of a mountain in Nepal, and write your great novel or game or program, with no worries about someone deactivating it or the machine being EOLed. Unfortunately the PC is stuck at your house, unscalable, badly networked, loaded with an OS that was designed for compatibility with programs written 25 years ago. It isn’t an Internet era machine.

With the web we got Software as a Service (SaaS), and with this I’m thinking about the Picasas and Flickrs and Bloggers of the world. No software to maintain, no hardware to maintain, access to some of your data (but not all of it, such as not having access to traffic metrics with Flickr unless you paid, and only export rights if you were paid up). But in this new world you can’t guarantee your continuity of experience. Flickr releases a redesign and the experience you’ve depended on goes away. The way you’ve organized and curated your content no longer makes sense. Or maybe as in the case of sites like Gowalla, the whole thing just disappears one day.

Mobile has its own issues. You often don’t own the hardware, you’re leasing it or it’s locked up and difficult to control. You can’t take your phone to another provider, you can’t install whatever software you want on it. Sometimes it’s difficult to get data out. How do you store the savegame files from your favorite iPhone game without a whole-device snapshot? How do you get files out of a note taking app if it doesn’t have Dropbox integration? In the end, you don’t even really own a lot of that software. Many apps only work with specific back-end services, and once your phone gets older, support starts to disappear. Upgrade or throw it in the junk pile.

Cloud offers us new options. We don’t have to own the hardware, we can just access it through standards compliant means. That’s what OpenStack is all about. OpenStack’s a platform, but OpenStack is also an API promise. If you can do it with X provider, you can also do it with Y provider. No vendor lock-in is even one of the bullet points on our homepage at HP Cloud.

Implicit in cloud is that you own your own data. You may pay to have it mutated, but you own the input and the output. A lot of the software we use in cloud systems is either free, or stuff that you own (usually by building it or tweaking it yourself). It’s a lot more like the old PC model than Mobile or SaaS.

All of these systems solve specific types of problems, and for the Personal Cloud to really take off, I think it needs to solve a problem better than the alternatives. It has to be the logical choice for some problem set. (At the meetup we spent a lot of time discussing exactly what that problem could be, and if the millennials would even have the same problems those of us over 30 do. I’m not sure anyone has a definitive answer for that yet.)

This is what I think the Personal Cloud is waiting for. This explosion of data from all our connected devices, from the metrics of everything we do, read, and say, and what everyone around us says and does. I think the Personal Cloud has a unique place, being Internet-native, as the ideal place to solve those problems. We’re generating more data from our activities than ever before, and the new wave of Quantified Self and Internet of Things devices is just going to amplify that. How many data points a day does my FitBit generate? Stephen Wolfram’s been collecting personal analytics for decades, but how many of us have the skill to create our own suite of tools to analyze it, like he does?

The other play the Personal Cloud can make is as a defense against the productization of you. Bruce Sterling was talking about The Stacks years ago, but maybe there’s an actual defensive strategy against just being a metric in some billion dollar corporation’s database. I worked on retail systems for a while, and it wouldn’t surprise me at all if based on the order of items scanned out of your cart at Target (plus some anonymized data mining from store cameras) they could reconstruct your likely path through the store. Track you over time based on your hashed credit card information, and they know a whole lot about you. You don’t know a whole lot about them, though. Maybe the Personal Cloud’s place is to alert you to when you’re being played.

In the end I think the Personal Cloud is about you. It’s about privacy, it’s about personal empowerment. It’s uniquely about you and your needs. The Personal Computer was personal too, but it can’t keep up, so the Personal Cloud Computer will take up that mantle.

The new dream, I think, is that the Personal Cloud Computer runs those programs for you, and acts like your own TRON. It’s your guardian, your watchdog, your companion in a world gone data mad. Just like airbags in your car protect you against the volume of other automobiles and your own lack of perfect focus, so your Personal Cloud protects you against malicious or inconsiderate manipulation and your own data privacy unawareness.

To do this I think the Personal Cloud Computer has to play a central role in your digital life. I think it needs to be a place that other things connect to, a central switching station for everything else.

And I think this is the promise it can fulfill. The PC was a computer that was personal. We could write diary entries, work on our novel for years, collect our photos. In the early days of the Internet, we could even be anonymous. We could play and pretend, we could take on different personas and try them out, like the freedom you have when you move to a new place or a new school or job. We had the freedom to disappear, to be forgotten. This is a freedom that kids today may not have. Everything can connect for these kids (note the links to my LinkedIn profile, Flickr Photos, Twitter account, etc in the sidebar), though they don’t. They seem to be working around this, routing around the failure, but Google and others are working against that. Facebook buys Instagram because that’s where the kids are. Eventually everything connects and is discoverable, though it may be years after the fact.

So how do I think this looks, when the code hits the circuits? I think the Personal Cloud Computer (or ‘a’ personal cloud computer) will look like this:

A Migratory – Think OpenStack APIs, and an orchestration tool optimized for provider price/security/privacy/whuffie.
Standards Compliant – Your PCC can talk to mine, and Facebook knows how to talk to both.
Remotely Accessible – Responsive HTML5 on your Phone, Tablet and Desktop. Voice and Cards for Glass.
API Nexus – Everything connects through it, so it can track what’s going on.
with Authentication – You authenticate with it, Twitter authenticates with it, you don’t have a password at Twitter.
Application Hosting – It all comes down to running Apps, just like the PC. No provider can build everything, apps have to be easy to port and easy to build.
Permission Delegation – These two apps want to talk to each other, so let them. They want to share files, so expose a cloud storage container/bucket for them to use.
Managed Updates – It has to be up to date all the time, look to Mobile for this.
Notifications – It has to be able to get ahold of you, since things are happening all the time online.
and Dynamic Scaling Capabilities – Think spinning up a Hadoop cluster to process your lifelog camera data for face and word detection every night, then spinning it down when it’s done.
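To make the API Nexus and Permission Delegation items a little more concrete, here is a minimal sketch in Python. Everything in it is hypothetical: the `Nexus` class, its method names, and the app names are made up for illustration and don't describe any existing product. The idea is only that every call between apps passes through one place, which checks whether the owner delegated that permission and keeps a record either way.

```python
from dataclasses import dataclass, field


@dataclass
class Nexus:
    """Hypothetical hub that every app-to-app call passes through."""
    grants: set = field(default_factory=set)  # (caller, target) pairs the owner approved
    log: list = field(default_factory=list)   # every attempted call, allowed or not

    def delegate(self, caller: str, target: str) -> None:
        """The owner allows `caller` to talk to `target`."""
        self.grants.add((caller, target))

    def call(self, caller: str, target: str, action: str) -> bool:
        """Record the attempt and report whether it is permitted."""
        allowed = (caller, target) in self.grants
        self.log.append((caller, target, action, allowed))
        return allowed


nexus = Nexus()
nexus.delegate("photo-app", "storage-bucket")
print(nexus.call("photo-app", "storage-bucket", "upload"))  # delegated, so permitted
print(nexus.call("ad-tracker", "storage-bucket", "read"))   # never delegated, so refused
print(nexus.log)                                            # both attempts are on record
```

Because refused calls are logged too, the same structure that enforces permissions is also the thing that can tell you who has been trying to get at your data.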

So how do we actually make this happen? What bits and bobs already exist that look like they’d be good foundational pieces, or good applications to sit on top?

No presentation these days would be complete without a mention of Docker, and this one is no different. If you haven’t heard of Docker, it’s the hot new orchestration platform that makes bundling up apps and deploying lightweight Linux container images super-easy. It’s almost a PaaS in a box, and has blown up like few projects before it in the last 6 months. Docker lets you bundle up an application and run it on a laptop, a home server, in a cloud, or on a managed Platform as a Service. One image, multiple environments, multiple capacities. Looking at that Ubuntu Edge, that looks like a perfect way to sandbox applications iOS style, but still give them what they need to be functional.

Hubot is a chat bot, a descendant of the IRC bots that flourished in the 90’s. Hubot was built by GitHub, and was originally designed to make orchestration and system management easier. Since they connect and collaborate in text based chat rooms, Hubot sits in there waiting for someone to give it a command. Once it hears a command, it goes off and does it, whether it be to restart a server, post an image or tell a joke. You can imagine that you could have a Personal Cloud Computer bot that you’d say ‘I’m on my way home, and it’s pot roast night’ to, and it would switch on the Air Conditioner, turn on the TV and queue up your favorite show, and fire up the crock pot.
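The listen-and-dispatch pattern is simple enough to sketch. The snippet below is Python, not Hubot’s own scripting API; the `Bot` class, the `respond` decorator, and the two handlers are assumptions made up for illustration. It only shows the shape of the idea: register a pattern, and when a chat message matches, run the function attached to it.

```python
import re


class Bot:
    """Toy chat bot: maps message patterns to handler functions."""

    def __init__(self):
        self.handlers = []  # list of (compiled pattern, function)

    def respond(self, pattern):
        """Decorator that registers a handler for messages matching `pattern`."""
        def register(func):
            self.handlers.append((re.compile(pattern, re.IGNORECASE), func))
            return func
        return register

    def hear(self, message):
        """Run every handler whose pattern matches and collect the replies."""
        replies = []
        for pattern, func in self.handlers:
            match = pattern.search(message)
            if match:
                replies.append(func(match))
        return replies


bot = Bot()


@bot.respond(r"on my way home")
def heading_home(match):
    return "Switching on the air conditioner"


@bot.respond(r"(\w+ \w+) night")
def dinner(match):
    return f"Starting the crock pot for {match.group(1)}"


for reply in bot.hear("I'm on my way home, and it's pot roast night"):
    print(reply)
```

One message can trigger several handlers, which is what lets a single sentence fan out into the air conditioner, the TV, and the crock pot.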

The great thing about Hubot, and the thing about these Personal Cloud Bots, is that like WordPress Plugins, they’re developed largely by the community. GitHub being who they are, Hubot embraces the open development model, and users have developed hundreds of scripts that add functionality to Hubot. I expect we’ll see the same thing with the Personal Cloud Computer.

I’ve talked about Weavrs pretty extensively here on the blog before, so I won’t go into serious depth, but I think that the Personal Cloud Computer is the perfect place for something like Weavrs to live. Weavrs are social bots that have big-data derived personalities, you can create as many of them as you like, and watch them do their thing. That’s a nice playground to play with personalities, to experiment and see what bubbles to the top from the chaos of the internet.

If you listen to game developers talk, you’ll start to hear about that initial dream that got them into game development, the dream of a system that tells stories, or tells stories collaboratively with you. The Kickstarted game Sir, You Are Being Hunted has been playing with this, specifically with their procedurally generated British Countryside Generator. I think there’s a lot of room for that closely personal kind of entertainment experience, and the Personal Cloud Computer could be a great place to do it.

Aaron Cope is someone you should be following if you aren’t. He used to be at Flickr, and is now at the Cooper-Hewitt Design Museum in New York. His Time Pixels talk is fantastic. Two of the things that Aaron has worked on of interest are Parallel Flickr (a networkable backup engine for Flickr that lets you back up your photos and your contacts’ photos, but is API compatible with Flickr) and privatesquare (a foursquare checkin proxy that lets you keep your checkins private if you want, or make them public). That feels like a really great Personal Cloud app to me, because it plays to that API Nexus feature.

The Numenta guys are doing some really interesting stuff, and have open sourced their brain simulation system that does pattern learning and prediction. They want people to use it and build apps on top of it, and we’re a long way away from real use, but that could lead to some cool personal data insights that you run yourself. HP spent a bunch of money on Autonomy because extracting insights from the stream of data has a lot of value. Numenta could be a similar piece for the Personal Cloud.

That’s the Adafruit Pi Printer, Berg has their Little Printer, and they’re building a cloud platform for these kind of things. These devices bring the internet to the real world in interesting ways, and there’s a lot of room for personal innovation. People want massively personalized products, and the Personal Cloud Computer can be a good data conduit for that.

Beyond printers, we have internet connected thermostats and doorknobs, and some of those service companies will inevitably go away before people stop using their products. What happens to your wifi thermostat or wifi lightbulbs when the company behind it goes away? Personal Cloud lets you support that going forward, it lets you maintain your own service continuity.

Having an always-on personal app platform lets us utilize interesting APIs provided by other companies to process our data in ways we can’t with open source or our own apps. Mashape has a marketplace that lets you pick and switch between API providers, and lets you extend your Personal Cloud in interesting ways, like getting a sentiment analysis for your Twitter followers.

In addition to stuff we can touch over the network, there’s a growing market of providers that let you trigger meatspace actions through an API. Taskrabbit has an API, oDesk does, Shapeways does, and we haven’t even begun to scratch the possibilities that opens up.

One thing to watch is how the Enterprise market is adapting to utility computing and the cloud. The problems they have (marketplaces, managed permissions, security for apps that run on premises, big data) are problems that all of us will have in a few years. We can make the technology work with enterprise and startups, but for end users, we have to make it simple. We have to iPhone it.

So where do we start? I think we have to start with a just good enough, minimum viable product that solves a real problem people have. Early adopters adopt a technology that empowers them or excites them in some way, and whatever Personal Cloud platforms appear, they have to scratch an itch. This is super-critical. I think the VRM stuff from Doc Searls is really interesting, but it doesn’t scratch an itch that I have today in a way I can comprehend. If you’ve been talking about something for years, what will likely happen is not that it’ll eventually grow up, it’s that something radical will come out of left field that uses some of those ideas, but doesn’t honor all of them. That’s my opinion, at least. I think the Personal Cloud community that’s been going for years with the Internet Identity Workshop probably won’t be where the big new thing comes from, but a lot of their ideas will be in it. That’s just my gut feeling.

The last caveat is that Apple and Microsoft and Google are perfectly positioned to make this happen with vendor lock-in easily. They all already do cloud. They all have app stores. They have accounts for you, and they want to keep you in their system. Imagine an Apple App Store that goes beyond your iPhone, iPad and even Apple TV, and lets you run apps in iCloud. That’s an easy jump for them, and a huge upending of the Personal Cloud world. Google can do the exact same thing, and they’re even more likely to.

So thanks for your time, and for listening (reading). If you have comments, please share them. It’s an exciting time.

The Personal Cloud: Innovation Happens at the Edges

A couple of days ago I was cleaning up my recently migrated server, and ran across a directory filled with a couple thousand text files and some Perl scripts. The directory wasn’t obviously named, but after some poking around I realized I was looking at the remains of a small consulting gig from 4 years ago. It was a pretty straightforward data mining job: There was a bunch of information on a public web site that an organization needed. Filings or grant applications or something like that. I needed to download it and remix it into a spreadsheet. What should have been a really easy spider and collate job ended up being complicated by the fact that said web host had a rate limiting module set up, so no IP address could grab more than 10-20 pages every hour. There were thousands of them.

If this problem sounds familiar, it’s similar to what Aaron Swartz was doing, and the problem that he was trying to overcome when he snuck that laptop into an MIT closet. In my case there was no login or private, privileged access, and I was running all this stuff in the middle of the night so as not to inconvenience anyone else, but the problem remained: If I’d followed the rate limiter’s desires, it would have taken weeks or months to grab the data.
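The arithmetic behind that estimate is easy to check. Here is a back-of-the-envelope sketch in Python; the 2,000 page count stands in for "a couple thousand", and the six-hour nightly window is purely an assumption, since the exact numbers weren’t recorded.

```python
import math


def nights_needed(pages, pages_per_hour, hours_per_night):
    """How many nights a polite, rate-limited crawl would take."""
    hours = pages / pages_per_hour
    return math.ceil(hours / hours_per_night)


# Assumed figures: roughly 2,000 pages, crawling 6 hours each night.
print(nights_needed(2000, 10, 6))  # at the strict end of the limit: 34 nights
print(nights_needed(2000, 20, 6))  # at the generous end: 17 nights
```

Under those assumptions the job lands somewhere between two and a half weeks and about five weeks, and every additional thousand pages pushes it further out.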

I ended up getting around the rate limiter by using something called Tor, The Onion Router. Tor works by sending your traffic through a distributed network of hundreds of other participants’ computers, anonymizing your physical and digital location in the process. For me, that meant that I could download all the files in 15 minutes or so in the middle of the night. For other people it means posting to Twitter or accessing dissident web sites from Syria or China or wherever.

Running across these files reminded me of something I’ve been thinking about for a while: That what the Personal Cloud really needs to take off is an immediate problem-solving use case, and to find useful examples, we might want to look in the grayer areas of the internet. We can talk about companies bidding for our orders VRM style or Internet of Things devices dumping metrics into our personal data warehouses, but both of those things are going to require a lot of supporting infrastructure before they’re really viable. If you want to get a lot of people excited about something today, you solve a problem they have today, today. And that brings me to the edges…

The Edge: Where Innovation Happens

One common aphorism shared by those in technology or innovation is that new things develop at the boundaries. Change and chaos happens at the edges, it happens at the borders, where things mix and intermingle. Here’s the MIT Media Lab’s Joi Ito talking about it, for instance. MIT is a large, stable organization. Businesses are large, stable organizations. The Media Lab is where they meet, where they cross-populate and where the friction in developing new ideas is reduced as much as possible. The same can be said about border towns. New York City is an American border town. It’s the edge between our country and a whole bunch of immigrants, both old and new. The mix of ideas and talents and experiences creates new things.

Joi Ito: Innovate on the Edges and Embrace… by FORAtv

A lot of the innovation that happens on the edge happens in the gray area outside the strictly legal, or deep in the illegal. Across our southern border we have very advanced drug and gun smuggling tunnels, complete with ventilation and electricity. Neal Stephenson’s last book REAMDE was largely about northern border smuggling. Chocolate, toy-filled Kinder Eggs are illegal in this country, so people smuggle those in. I brought some back the last time I went to Mexico, and some friends brought back a whole carton when they went to Germany recently. In Gaza they even have KFC delivered by tunnel.

Given that innovation happens at the edges, that people solve their problems at the edges using interesting methods, and that the Personal Cloud needs some need-driven use cases in order to flourish, I think it’s useful to look at some of the ways people are using things like the Personal Cloud already for dubiously legal purposes (though the legality they’re avoiding isn’t always our own). Perhaps by digging into what makes them compelling, and how their developers have solved those problems, we can learn something about developing Personal Clouds for everybody.

Some Personal Cloud Definitions

When looking for products that fit the Personal Cloud mold, I’m specifically looking for interesting uses of on-demand computing and networking. Especially things that don’t inherently scale beyond the individual, either due to privacy concerns, the need to be distributed, or some other unique aspect of the approach.

A job that only takes 10% more time to run for another person isn’t a good candidate for a personal cloud, because the economy of scale is going to keep it expensive. Running your own mail server is a bad idea these days, because your data and address can be portable (with IMAP and a personal domain) and running the spam filtering and staying on whitelists is hard. It’s a lot better to register your own domain and let a trustworthy third party do it. I should mention that Phil Windley has a good post about IMAP being a proto-Personal Cloud protocol, if you haven’t read it.

So, with that said, let’s look at some examples…

Tor: Anonymize All the Things

So back to Tor. Tor is built as a distributed, self-organizing network. There are Tor nodes that you connect to, the address for which you get either by getting passed an IP address on the side, or by looking one up publicly where that won’t get you thrown in prison. Once connected to the Tor network your public internet traffic is bounced through the network of Tor nodes in a randomized, encrypted way, and eventually finds its way onto the public internet through an exit node. Tor Bridges are the unlisted nodes, the ones whose addresses get passed around on the side, and they give people a way into the network in places where the public node list is blocked.
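The "onion" in the name refers to how that bouncing works: the message is wrapped in one layer per node, and each node can peel off only its own layer, which tells it nothing except where to send the rest. Below is a toy sketch in Python. It is not Tor’s actual protocol: the XOR cipher is deliberately simplistic and not secure, and the node names and keys are made up for illustration.

```python
import hashlib
import json


def toy_cipher(key: bytes, data: bytes) -> bytes:
    """XOR with a SHA-256 derived keystream. Illustration only, NOT secure."""
    stream = b""
    counter = 0
    while len(stream) < len(data):
        stream += hashlib.sha256(key + counter.to_bytes(4, "big")).digest()
        counter += 1
    return bytes(a ^ b for a, b in zip(data, stream))


def wrap(message, destination, route):
    """Wrap `message` in one layer per (node, key) in `route`, innermost layer last."""
    next_hop, data = destination, message
    for name, key in reversed(route):
        layer = json.dumps({"next": next_hop, "data": data}).encode()
        data = toy_cipher(key, layer).hex()
        next_hop = name
    return data  # what the sender hands to the first node


def peel(key, onion):
    """What a single node does: remove its own layer, learn only the next hop."""
    layer = json.loads(toy_cipher(key, bytes.fromhex(onion)))
    return layer["next"], layer["data"]


route = [("first", b"key-1"), ("second", b"key-2"), ("third", b"key-3")]
onion = wrap("hello", "example.org", route)

hop, onion = peel(b"key-1", onion)
print(hop)            # the first node only learns it should forward to "second"
hop, onion = peel(b"key-2", onion)
print(hop)            # the second node only learns about "third"
hop, payload = peel(b"key-3", onion)
print(hop, payload)   # only the last node sees the destination and the message
```

The first node knows who you are but not where the traffic is going, and the last node knows where it is going but not who sent it.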

The people who run Tor Bridges are paying for your traffic twice, because your connections come into their machine and then out again. Running a Tor Bridge is a labor of love, done by people who believe in anonymity and freedom of speech. It doesn’t pay, but knowing that a political dissident somewhere can speak freely about an oppressive regime has a karmic payoff.

A few years ago Amazon’s EC2 cloud computing service started offering a free micro level of service. You could sign up and run a really small cloud server for development or testing without paying. It didn’t cost Amazon much to run them, performance wasn’t really great, but it got people onto their platform. Usually people start up Amazon provided server instances to install software and play around on, but the folks behind Tor realized that they could create a pre-configured server image with the Tor Bridge on it, and let people spin those up in Amazon’s free usage tier. They call it the Tor cloud. You still pay for bandwidth, but if you bridge 15 Gig of bandwidth a month, your bill will only be around $3. It’s less than the price of a latte, and you do something good for internet freedom. You don’t have to know a lot about the cloud to set it up, you just register for Amazon Web Services, pick the image, and hit Start. The images are pre-configured to download software updates and patches, so there’s virtually no maintenance work. Just the kind of simplicity you need for a Personal Cloud feature.

Back It Up Or Lose It: The Archive Team

I’ve harped on our tendency to not take care of the things we create before. Web sites get acquired and shutter within months. Promises are made that users will be able to export their data, but promises are made to be broken. Fortunately for us, there’s a group of archivists led by Jason Scott called Archive Team. Archive Team scrapes sites that are destined for the Internet trash heap, and uploads the data to the Internet Archive. So far they’ve archived sites like Apple’s MobileMe homepages, Yahoo Groups, and are currently trying to grab as much of Posterous as they can before Twitter drops the axe. This may sound pointless till a few years after a company acquires and then shutters the site your mom or sister blogs at or posts family photos to, and you realize there’s no way for you to get that stuff back.

Archive Team runs into a lot of the same issues I had around rate limiters. Yahoo! and Twitter don’t want them slurping down the whole site, they want to take the engineering resources off those projects and let them die a quiet, cost-cutting death. To get around this, Archive Team offers a virtual machine, the Archive Team Warrior.

The Archive Team Warrior is a distributed but centrally managed web spider. The Archive Team central server slices the archiving work up into little chunks, and the Warrior on your computer asks the server for some work to do. The central server gives it a small to-do list of URLs to fetch, and the Warrior starts downloading those until it hits the site’s rate limit. Any data it can download, it sends back to the Archive Team server for bundling and uploading into the Internet Archive. Then it waits and retries until the site will let it back in.
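That ask, fetch, upload, wait loop can be sketched in a few lines of Python. This is not the Warrior’s actual code: `Tracker`, `RateLimitedSite`, and `run_warrior` are hypothetical stand-ins, the "site" is simulated in memory, and where a real worker would sleep until the rate limit resets, the sketch just resets the counter.

```python
from collections import deque


class RateLimited(Exception):
    """Raised by the simulated site when the fetch allowance is used up."""


class Tracker:
    """Stand-in for the central server: hands out URL chunks, collects results."""

    def __init__(self, urls, chunk_size):
        self.queue = deque(urls[i:i + chunk_size] for i in range(0, len(urls), chunk_size))
        self.archive = {}

    def request_work(self):
        return self.queue.popleft() if self.queue else None

    def upload(self, results):
        self.archive.update(results)


class RateLimitedSite:
    """Stand-in for the target site: allows `limit` fetches per window."""

    def __init__(self, limit):
        self.limit = limit
        self.used = 0

    def fetch(self, url):
        if self.used >= self.limit:
            raise RateLimited(url)
        self.used += 1
        return f"<html>{url}</html>"

    def new_window(self):
        self.used = 0


def run_warrior(tracker, site):
    """Work through chunks until none are left; return how many times we had to wait."""
    waits = 0
    while (chunk := tracker.request_work()) is not None:
        pending = deque(chunk)
        results = {}
        while pending:
            url = pending[0]
            try:
                results[url] = site.fetch(url)
                pending.popleft()
            except RateLimited:
                tracker.upload(results)  # send back whatever we managed to grab
                results = {}
                site.new_window()        # a real worker would sleep here instead
                waits += 1
        tracker.upload(results)
    return waits


tracker = Tracker([f"/page/{i}" for i in range(10)], chunk_size=4)
print(run_warrior(tracker, RateLimitedSite(limit=3)))  # number of waits
print(len(tracker.archive))                            # pages archived
```

The rate limit applies per worker, which is why many Warriors on many home connections finish a job that one machine never could.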

The Archive Team manages the projects, and the Warrior presents a simple web interface where you can tweak a few settings and track how you’re doing. Most importantly, it’s hands-off. You can set it up once, and let it run in the background forever. It manages its own software updates, and you can tell it to work on whatever the Archive Team’s priorities are, and ignore it from then on. If you have a PC sitting around that you don’t use a lot, running the Warrior is a nice way to give back to the Internet that’s given us so much. It’s good karma, and it’s easy.

Pirate All the Things: Seedboxes

So far we’ve talked primarily about projects which give good karma, now let’s talk about a project that is often used for… not so good karma. In 2001 the BitTorrent protocol was introduced, allowing for a (then) secure way to share lots of files in a bandwidth-optimized fashion. Users get pieces of a file, trackers know who’s downloading the file at any one time, and clients cooperate to distribute the pieces as widely as possible. When you’re downloading a file from BitTorrent it’s entirely likely you’ll be downloading chunks of it from people who don’t have the entire file yet, and likewise you’ll be sharing parts of the files you’ve downloaded with other people who don’t have those pieces yet. By working this way everyone gets it faster.
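Here is a small simulation of that piece swapping, written in Python. It is a sketch of the general idea, not the BitTorrent protocol itself: the `run_swarm` function, the peer names, and the pick-the-first-available-source rule are all assumptions made for illustration, and the `peers` dictionary stands in for what a tracker would know.

```python
def run_swarm(peers, num_pieces):
    """Simulate rounds of piece exchange.

    peers maps a peer name to the set of piece numbers it already holds.
    Each round, every peer fetches at most one missing piece from any other
    peer that had it at the start of the round.
    Returns a list of (round, receiver, sender, piece) transfers.
    """
    everything = set(range(num_pieces))
    transfers = []
    round_no = 0
    while any(have != everything for have in peers.values()):
        round_no += 1
        snapshot = {name: set(have) for name, have in peers.items()}
        progressed = False
        for name in sorted(peers):
            for piece in sorted(everything - snapshot[name]):
                sources = [other for other in sorted(snapshot)
                           if other != name and piece in snapshot[other]]
                if sources:
                    peers[name].add(piece)
                    transfers.append((round_no, name, sources[0], piece))
                    progressed = True
                    break  # one piece per peer per round
        if not progressed:
            break  # some piece is held by nobody, so stop instead of looping forever
    return transfers


swarm = {"seed": {0, 1, 2, 3}, "alice": {0, 1}, "bob": {2, 3}}
for transfer in run_swarm(swarm, num_pieces=4):
    print(transfer)
```

In this run alice and bob finish the file by trading with each other, and neither of them ever touches the seed, even though neither had the whole file when they started.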

While BitTorrent might have been secure once, it’s now entirely likely that your ISP knows what you’re downloading, who you’re downloading it from, and what you’re sharing back. They can look at payload sizes, the trackers you’re talking to, traffic bursts, and pretty reasonably reconstruct your activity. If they’re the MPAA or other pirate-hunting groups they can even run their own clients and integrate themselves into the network. Running a BitTorrent client from your home computer and downloading anything remotely illegal is like asking the bagger at the grocery store to help you out with your shoplifted goodies.

So let’s say you’re sharing something that you think should be legal but isn’t, or you’re trying to use BitTorrent for a legal end, like sharing a bundle of book materials or distributing an Operating System or a big chunk of GeoCities and don’t have the bandwidth at home to support it. (Or, sure, you could be downloading Iron Man 3.) This is where Seedboxes come into play. A Seedbox is a server at an ISP somewhere that just runs a BitTorrent client. You can use them to get your torrents out to a bunch of people really fast, or you can use them to download files that you wouldn’t be comfortable with downloading to your home IP. You can even buy them in another country, increasing the difficulty of tracing the traffic back to you.

Seedboxes are managed servers, you don’t install software updates on them, the provider does that, but they likely won’t give you much in the way of customer support. Lots of them use a Web UI called ruTorrent, an open source frontend for the rTorrent BitTorrent client. You don’t SSH into these machines, you probably don’t even have a server login, but you can use the web UI, and conduct your business in the cloud.

In this way ruTorrent Seedboxes are a perfect prototype for our Personal Cloud. The providers don’t watch the servers or monitor their quality. Privacy is implicit when you’re doing something at the edge of legality. What they don’t know won’t hurt them as much when Interpol comes calling. The web UIs are built for self-service. You have a login, but the web UI is your entire management plane. rTorrent has an Android front-end, but most people likely manage them through the web. There isn’t any software on your home computer, just a username and password to a web site somewhere. The data’s yours, and if you wanted to shove it sideways into a cloud storage provider, you probably could.

Points of Presence: The Personal VPN

As an addendum to these offerings, a sort of post-script on the idea of exploiting technologies at the edges for personal gain, I’d be remiss if I didn’t mention personal VPNs. Tor’s good for anonymity, but what if you just want to appear like you’re somewhere else? Say, for instance, somewhere the new season of Sherlock, Doctor Who or Downton Abbey is available for streaming 6 months or a year before it comes to your country. (Or vice versa, where we get new episodes of Mad Men a year before they do.) What do you do then?

The same technology that your company uses to securely connect you to your corporate network can be used to make you appear to be in the UK, or the US Midwest, or Japan, or wherever else the content is region-limited. You run the software (likely built-in to your Operating System), and connect somewhat securely to a computer in some other country or even continent, and all your internet traffic appears to come from there.

A few years ago I was in Mexico over Christmas, and there were some really good deals on Steam’s Holiday Sale. I have a US account, with a US billing address and a US credit card, but I couldn’t buy anything because my computer was with me in Mexico. I ended up installing a bunch of software on one of my servers and setting up a VPN to it, just to buy some cheap games. These days I could just plunk down a few bucks and be good to go, and a lot of people do.

A Few Lessons Learned

Users have problems, and will go to considerable lengths to solve them. None of these services are as easy as they could be, either because they’re niche offerings (Tor and Archive Team) or because of their dubious legality (Seedboxes). ruTorrent is a lot easier to use than it probably was, but it still isn’t as easy as using the Netflix or iPlayer iPad apps. The Warrior is a 174 meg download that requires installing VirtualBox on your computer. The Tor Cloud Bridge requires signing up for Amazon Web Services, and navigating their UI. Getting a VPN provider or Seedbox requires research, means dealing with a company that might not be entirely legit, and really falls in the class of early adopter technologies.

Even though all this stuff is hard to use, people do it. Seedboxes and private VPNs give people things they want. You may not have known that you wanted to watch the new season of Dr. Who before it comes out in the US, but once you know you can, you’ll go to some pretty extreme lengths to make that happen. Motivation can be powerful, and people will overcome serious technical hurdles if they’re properly motivated.

So looking at these examples, we can see that a Personal Cloud app really needs to offer 3 things:

1. Motivation: It needs to solve a real, immediate problem.

2. Self-Service: It needs to be super-easy to start using and offer a familiar, understandable interface.

3. Hands-Off: It needs to have software updates and easy maintenance built-in.

Any Personal Cloud offerings that don’t check these boxes may get some niche use, and may excite developers, but they aren’t going to start climbing up the adoption curve. As you build your Personal Cloud app, keep these things in mind. Users have needs we can solve, and we can empower them, but our solutions need to be compelling, simple to use, and simple to maintain.

The Archive Project

Ideas are funny things. Some are fleeting: You’ll be reading your twitter stream, one will pop into your head, and two tweets further it’s gone. Sometimes you can backtrack, reconstruct your experience and get it back. Sometimes it’s gone forever. Other ideas stick with you. They nestle into your brain and make a home for themselves, popping up when you read something tangentially related, or when you’re staring at the blank sheet of a new project.

For me, The Archive is that idea.

Prologue

As a kid I really loved anecdotal stories. One of my favorites was a series of sermons told in the form of the life story of a missionary named Otto Koning, relating the lessons he learned working with a tribe in New Guinea. Otto is a masterful storyteller, and I probably listened to the tapes dozens of times. Hearing him describe his experiences almost made you feel like you were there, and gave a really unique insight into a time and place that would have otherwise been undocumented.

Chad and the San Marcos dialup stack (circa early 1996).

When I started working my first internet job at a small ISP in San Marcos, Texas in 1995, I began to spend a lot of time riding shotgun on tech support house calls with Chad Neff. By now Chad has probably fixed half of the computers in San Marcos, but before he became the town’s resident Internet Guy, he had an entire career as an artist and printmaker. You still see his work popping up on eBay, and his prints as set dressing on movies and TV shows, especially Star Trek: The Next Generation. Chad also did a stint in the Army, in signals intelligence, plus a bunch of years in the Mounted Park Patrol and Police Reserve. Needless to say, Chad has a lot of stories.

When Chad wasn’t telling stories, we’d brainstorm the big idea that was going to make us internet millionaires (back then being a millionaire was an impressive thing). One of the ideas that we had, probably on the way to one of San Marcos’s funeral homes (Chad designed the awning over the entrance to one of them), was the Permanent Internet Memorial. The internet has the unique ability, compared to traditional headstones, of actually telling you something about a person that’s longer than a few sentences. We knew back then that storage and bandwidth were just going to get cheaper, so it seemed like a logical idea: Start a company designed to last forever, and charge a one-time fee to create a permanent memorial on the web.

Needless to say, the idea didn’t go any further than that ride, but the core concept of extended longevity on the internet, mashed up with the explosion in self-publishing and data-driven explorable sites, eventually coalesced into the idea for The Archive Project. These ideas solidified in early 2000, and this is the concept as I had it then:

The Idea

The Archive Project is a web database for personal stories, indexable by place, theme, time, person and object. The building block of The Archive is the story, a personal anecdote about something that happened to you. Once you’ve created a story, you tell the system where it happened, when it happened, and you can tag other people in it.

Shoes — There’s a great story behind these shoes, but the flickr page only tells a little bit of it.

Users would be able to tag people in stories that may or may not be users. Eventually if a user signed up, they could claim all those people tags as themselves, assuming the original author validated it was really them.
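To make the story-and-tag idea a little more concrete, here is a minimal sketch of how the records might look. Nothing like this was ever built, so every name here (`Story`, `PersonTag`, `request_claim`, `approve_claim`) is made up for illustration, and the place and time fields are plain strings for simplicity.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class PersonTag:
    """A person mentioned in a story; they may not be a registered user yet."""
    name: str
    claimed_by: Optional[str] = None     # user id once the author approves a claim
    pending_claim: Optional[str] = None  # user id waiting on the author's approval


@dataclass
class Story:
    """Hypothetical story record, indexable by place, time, theme, object and person."""
    author: str
    title: str
    text: str
    place: str   # e.g. "Austin, Texas"
    when: str    # e.g. "Spring 1976"
    themes: list = field(default_factory=list)
    objects: list = field(default_factory=list)
    people: list = field(default_factory=list)

    def tag_person(self, name):
        tag = PersonTag(name)
        self.people.append(tag)
        return tag

    def request_claim(self, name, user_id):
        """A newly registered user asks to claim an unclaimed tag as themselves."""
        for tag in self.people:
            if tag.name == name and tag.claimed_by is None:
                tag.pending_claim = user_id
                return True
        return False

    def approve_claim(self, name, approver):
        """Only the story's author can confirm that the claimant is really that person."""
        if approver != self.author:
            return False
        for tag in self.people:
            if tag.name == name and tag.pending_claim is not None:
                tag.claimed_by = tag.pending_claim
                tag.pending_claim = None
                return True
        return False
```

The point of the two-step claim is that a tag never becomes a link to a real account until the original author has signed off on it.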

I think I was designing this system before geocoders were as prevalent as they are now, because I actually requested and received a burned DVD copy of the USGS’s global gazetteer. The idea was you’d be able to type a place name like Austin, Texas into the system and the site would be able to drop the story on a map, which you could then make precision modifications to. With OpenStreetMap this is really easy, but at the time it was still something of an unknown.

When pinning a story in time, you’d be able to say broad things like The Early 50’s or Spring 1976, or burrow down to specific dates. You’d be able to put together strands of memories into an overall story, like Our Year In Paris.
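One way to handle both broad and precise times is to store every story time as a date range, so a decade, a season and a single day are all the same kind of thing. This is only a sketch: the `TimeSpan` name is invented, and the boundaries I picked for "early 50's" and "spring" are arbitrary assumptions.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class TimeSpan:
    """A labelled, inclusive date range; a single day has start == end."""
    label: str
    start: date
    end: date

    def overlaps(self, other):
        return self.start <= other.end and other.start <= self.end

    def contains(self, other):
        return self.start <= other.start and other.end <= self.end


# Assumed boundaries, chosen only for illustration.
early_50s = TimeSpan("The Early 50's", date(1950, 1, 1), date(1953, 12, 31))
spring_1976 = TimeSpan("Spring 1976", date(1976, 3, 1), date(1976, 5, 31))
one_day = TimeSpan("June 2, 1952", date(1952, 6, 2), date(1952, 6, 2))
```

With ranges like these, a strand such as Our Year In Paris is just another span, and finding the stories inside it is a `contains` check.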

My goal was to create a site where people would be able to publish their life story, like the vanity autobiography publishers of yesteryear. By wrapping the anecdotes that make up a life story in semantic data, you’d be able to surf through the system in what I hoped would be really interesting ways. You’d be able to explore stories from people who lived in San Francisco in the 60s, or who migrated from the Midwest to New York in the 70s, or read about what it was like from an adult point of view when you were growing up. You could read stories by people who travelled great distances when they were young, or from people who stayed in the same place their whole lives. You’d be able to read stories about sewing machines in New York or stories about cars in Arizona. You’d be able to find a narrative across all kinds of contexts.

Some stories would have associated media: photos, audio recordings or video. It would be like a museum for the human race, and the opportunities for interesting curation would be enormous.

The authors, the people who contributed content to the site, would know that The Archive existed solely to serve as a caretaker for their stories. Like Wikipedia it wouldn’t be sold, and their kids and grandkids and great-grandkids could add new stories to theirs, and their contribution would be part of a permanent family history.

There’s even an opportunity to have Real and Fictional versions of the Archive, where fans could assemble consolidated versions of their favorite stories or characters’ lives. For instance, on December 18th, 2009 in Colorado, Jeff Winger had a fight with some fly dancers, and was rescued by his friends.

Imagine the mobile possibilities: You could be standing in a random location, open an Archive browser app, and read stories that happened there before. There’s nothing stopping museums, or a place like Mount Vernon from creating stories from George Washington’s life.

The Archive Project never got beyond dreams and some rough architecture diagrams. I knew what I wanted to build, but the scope was large and I knew it would be difficult to promote. It would be way too easy to fail, and once you accepted your first story from a user, you would be honor bound to host the thing forever.

Present Tense

Things are a little different now. It’s become possible to host vanity projects, even at a reasonable size, for not that much money. Creating socially conscious organizations is easier than it was, and there’s more support. Most importantly, though, over the last dozen years we’ve gotten really good at creating database-centric social web sites without reinventing the wheel. Personally, I learned a lot of lessons from building Specialized Bicycle Components’ social network, the Riders Club. Specifically, features don’t matter if they aren’t easy to use, and in the end you’re really there to enable your users’ use of the site; they’re not there to populate your dream.

Privacy was always a sticky wicket with The Archive Project. It could be a gold mine for identity theft, mostly in enabling spear phishing and social engineering, but these days the risk is less, I think, because people realize that so much of their lives is already available to people who want to know. The reward from publishing your memories is greater than the risk of someone doing something bad with them.

Aaron Cope’s talk at the New Zealand National Digital Forum sparked some interesting thoughts about The Archive, since it’s essentially a catalog of memories. The idea of assigning artisanal integers to each memory, and building the entire thing in a way that it can be human shardable (something I’m going to write a blog post about soon), makes a lot of sense. Having the system be able to collate data from both a centrally hosted repository and a network of individual Archive sites that individuals could run themselves or for a group would be really powerful, and act as protection against the collapse of the central site.

I think you could prototype a version of The Archive pretty quickly these days, and I may spend part of early next year doing just that. I think the idea is still valid, and if things like Storify have shown us anything, it’s that people crave narrative.

Conclusion

The Archive is one of those things I want to exist. If Wikimedia had something like this already that wasn’t a wiki (I don’t think people should be able to edit others’ stories unless they have permission), I’d put this idea to bed. But it hasn’t happened, and it needs to.

Interviewing my parents on video. Not everyone has this chance.

We have the technology to record and share our experiences. We could hold on to our history, but we’re letting it slip through our fingers. The best stories get passed on to the kids, and maybe to the grandkids, but a few generations out the person is just an entry on a family tree. I’ve interviewed my parents on video about their lives, but I don’t have a place to put it, or best practices on how to turn it into something other people could learn from, so the project has stalled. Individual communities have started story archiving projects, and there are Best Of or focused media collections like StoryCorps, but nobody’s taken this to the web, to make it easy for everyone.

So let’s make it happen. If you’re interested in working on The Archive, if you have thoughts or ideas, or if you know of a project like it that already exists, drop me a line.

Personal Cloud Contest Winner and Entries

The entries are in for the Personal Cloud Contest, the judges have considered them carefully, and the winner of the 2013 SXSW Gold Badge is… Carlos Ovalle! Read on for all the entries.


The Personal Cloud Computer

DEC KL10 — This is what computers were like before PCs. (photo by phrenologist)

40 years ago the development of the Personal Computer sparked a revolution. It took a decade for PCs to land in the home, and another decade for them to land in a majority of US homes, but it created an entire industry. Having a computer that was yours led to generations of hackers and programmers. It created Microsoft and Apple, and led to the rise of Amazon and Google.

The PC is now in decline. In 2008 the laptop outsold the desktop, and now the tablet is eating the laptop’s lunch. As form factors have shrunk and the Internet has become a more dominant element of most users’ experience, the computer you own that runs software you own and has explicit privacy is disappearing. We store our spreadsheets and documents in Google Drive, we post our pictures on Flickr, we store our correspondence in Gmail, we chat with our friends on Facebook or Twitter.

HP Touchpad — HP TouchPad, a $500 paper weight. (photo by traferty)

No one is learning how to program on Facebook, especially when their only device is a cell phone or tablet. It’s dangerous to store your personal pictures only in Flickr. Your Google Drive documents and Gmail email are a clever hacker away from being in someone else’s hands. On the internet, services die. Devices become orphans and eventually the content on them is lost.

Maybe it’s time for a new paradigm, something that preserves the hackability and ownership of the PC, but takes advantage of all the new technologies we’ve come up with in the last 40 years. Maybe that thing is…

The Personal Cloud Computer: The essentials of single user focus, software and data ownership, but the portability, networkability and burstability of the cloud, the display flexibility of HTML5 interfaces, the hackability of Linux and the flexibility of a PaaS.

So what does the Personal Cloud Computer (PC2, maybe? Let’s try it out.) look like, specifically its fundamental architecture, organization and software use cases? Well, let’s start from the top…

Architecture

I think we’re looking at something like a PaaS similar to CloudFoundry, but with a UI front end like WordPress, and tuned to run apps for you, not run apps for web consumption. You’ll access it via HTTPS, it’ll be optimized for desktop, tablet and mobile, and it’ll have API access routes for stand-alone applications or hardware devices. By default your distribution may come with a set of plugins (from the desktop metaphor, these are our programs), but no one wants to be limited to one programming language, so something like CloudFoundry makes sense. You’ll be able to run plugins written in Java, Python, Ruby, PHP, etc. Initially each PC2 platform creator will probably have its own plugin spec, but developer demand will push them towards a common, unified interface spec.

Dog — Even mongrels can be beautiful. (photo by w1n9zr0)

Logging into the UI should be as secure as possible. Maybe we’ll use two-factor authentication with bearer tokens, maybe there will be a super-secure pay-for service that holds the master password for your device. However we do it, login needs to be safe, and lost-password recovery needs to be really, really, really difficult to hack. Maybe you need to round up a quorum of your friends and coworkers, and by combining bits of a key you’ve given them, they can re-generate your master reset password.
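The quorum-of-friends idea has a well-known construction behind it: Shamir's secret sharing, where the secret is the constant term of a random polynomial and each friend holds one point on the curve. Here is a minimal sketch of that technique, not a vetted implementation. The function names and the choice of field are my own assumptions, and a real system would want an audited library rather than this.

```python
import secrets

# 2**521 - 1 is a Mersenne prime, larger than any 64-byte secret.
PRIME = 2**521 - 1


def split_secret(secret, shares, threshold):
    """Split an integer secret into `shares` points; any `threshold` of them recover it."""
    if not 0 <= secret < PRIME:
        raise ValueError("secret must be a non-negative integer smaller than PRIME")
    if not 1 <= threshold <= shares:
        raise ValueError("threshold must be between 1 and the number of shares")
    # Random polynomial of degree threshold-1 whose constant term is the secret.
    coeffs = [secret] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    points = []
    for x in range(1, shares + 1):
        y = 0
        for c in reversed(coeffs):  # Horner's rule, modulo PRIME
            y = (y * x + c) % PRIME
        points.append((x, y))
    return points


def recover_secret(points):
    """Lagrange-interpolate the polynomial at x = 0 to get the constant term back."""
    secret = 0
    for i, (xi, yi) in enumerate(points):
        num, den = 1, 1
        for j, (xj, _) in enumerate(points):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) is the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret
```

You would hand one point to each of, say, five friends with a threshold of three. Any three of them can rebuild the reset password, and any two of them learn nothing useful about it.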

WordPress has learned that software updates are a big issue, and having the update interface be as integrated and simple as possible is a huge deal. Apple figured out that having devices live their entire lives without being tethered to a PC was an important feature. PC2s will need something similar. Updates for the core platform and plugins should be easy, as secure as possible and baked in.

For memory consumption’s sake, we’ll probably follow the iOS model. Programs only run when you’re making requests of them. They can schedule tasks to wake themselves up with a central platform scheduler, and can run little chunks of code to check things in the background, but they don’t run continually when you’re not using them. The core platform also provides a notification/alert hub, so if your scheduled task needs to tell you something, it can push it to you.
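As a rough illustration of that scheduler-plus-notification-hub arrangement, here is a small sketch. The class and method names (`Scheduler`, `NotificationHub`, `wake_at`, `run_due`) are invented for this example, and times are plain numbers rather than real clock values.

```python
import heapq
import itertools


class NotificationHub:
    """Collects messages that plugins want pushed to the owner."""

    def __init__(self):
        self.inbox = []

    def push(self, plugin, message):
        self.inbox.append((plugin, message))


class Scheduler:
    """Plugins don't run continually; they ask to be woken at a given time."""

    def __init__(self, hub):
        self.hub = hub
        self._queue = []
        self._counter = itertools.count()  # tie-breaker for tasks due at the same time

    def wake_at(self, when, plugin, task):
        heapq.heappush(self._queue, (when, next(self._counter), plugin, task))

    def run_due(self, now):
        """Run every task due at or before `now`; return how many ran."""
        ran = 0
        while self._queue and self._queue[0][0] <= now:
            _, _, plugin, task = heapq.heappop(self._queue)
            message = task()  # a short chunk of plugin code
            if message is not None:
                self.hub.push(plugin, message)
            ran += 1
        return ran
```

A task that returns nothing just runs quietly. A task that returns a string gets that string pushed to you through the hub, so plugins never need to know how notifications are delivered.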

The interface between the core and the plugins should be network-able. You’ll want the flexibility to run your PC2 in the cloud, but execute a program on your phone, or your house’s thermostat, or your car. Authentication will probably be similar to OAuth, or the two-factor unique password setup that Google does. You’ll pair devices with your PC2 by entering a network identifier for your PC2 into the device, then the device will generate a random key, which you’ll punch into a devices section of your PC2 interface. If you lose your cell phone, you can go into the PC2 interface and turn off its access without resetting everything.
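The useful property of that pairing scheme is that every device has its own key, so revoking one leaves the rest alone. A minimal sketch, with a made-up `DeviceRegistry` class standing in for the devices section of the PC2 interface and an assumed key length:

```python
import hmac
import secrets


def new_pairing_key():
    """What the device would generate and show you to type into the PC2 (length is an assumption)."""
    return secrets.token_hex(8)


class DeviceRegistry:
    """Hypothetical per-device key store on the PC2 side."""

    def __init__(self):
        self._keys = {}  # device name -> pairing key

    def pair(self, name, key):
        """Record the key the device generated and the owner punched in."""
        self._keys[name] = key

    def is_authorized(self, name, key):
        expected = self._keys.get(name)
        # compare_digest avoids leaking how many characters matched.
        return expected is not None and hmac.compare_digest(expected, key)

    def revoke(self, name):
        """Turn off one device's access without touching the others."""
        self._keys.pop(name, None)
```

Lose your phone, call `revoke("phone")`, and the car and the thermostat keep working.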

Sharing should be baked in to the platform. You’ll be able to grant read and write access to files, or between plugins, to other PC2 installs. You may even share back to centralized services, or pull from centralized services like a car sharing service or traffic updates. You could share where your car is with a city-wide traffic nexus that shares back the ability to create a route based on live traffic conditions, for instance.

Your UI would be driven like building a Facebook App. Plugins feed UI markup back to the PC2’s display layer, and it arranges things so the UI can be optimized for a plane of tiles style desktop UI, a tablet, or a single-tile phone UI. You may even have baked in interface specifications for voice or visual interfaces, so you can control apps with your voice or eyeball movements in your Google Glass.

Microsoft is probably tackling a bunch of these problems with Metro, or whatever it’s called today. I’m not sure if I trust them to succeed. These PC2 solutions would have to grow organically; defining the entire spec at once would be a recipe for disaster. Learn the lessons, but design for simplicity. Nobody’s going to be building Word on this platform in the first year or two.

A bunch of use cases now and over the next few years are going to be built around pay per use or subscription APIs (for facial detection in your lifestream videos, or machine translation, or whatever the next thing is). Having a centralized billing platform for those will be important. You’ll either have accounts with a few external services that plugins can use, or the billing and payment part will be built into the platform. You’d have an internal provider model, so plugins would be able to discover their options without needing to know the authentication or implementation details themselves.
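Here is one way the internal provider model could look, as a sketch only. The `ProviderRegistry` class, the capability names and the per-call costs are all hypothetical. The idea it shows is that a plugin asks for a capability, and the platform picks the account, makes the call and records the charge.

```python
class ProviderRegistry:
    """Hypothetical platform service: plugins ask for a capability, not a vendor."""

    def __init__(self):
        self._providers = {}  # capability -> list of (name, cost_per_call, handler)
        self.charges = []     # billing ledger of (plugin, provider, cost)

    def register(self, capability, name, cost_per_call, handler):
        """The owner links an external account; credentials live inside the handler."""
        self._providers.setdefault(capability, []).append((name, cost_per_call, handler))

    def options(self, capability):
        """Let a plugin discover what is available without any auth details."""
        return [name for name, _, _ in self._providers.get(capability, [])]

    def call(self, plugin, capability, payload):
        providers = self._providers.get(capability)
        if not providers:
            raise LookupError("no provider for " + capability)
        # Assumption: the first registered provider is the owner's preferred one.
        name, cost, handler = providers[0]
        self.charges.append((plugin, name, cost))
        return handler(payload)
```

Because every call goes through the registry, the ledger gives you one bill across all your plugins, which is the centralized billing piece.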

Utilizing cloud services would be similar to subscription APIs. Letting plugins burst their CPU or disk usage should be a service the PC2 provides. Your thermostat should be able to request a Hadoop run to churn consumption data, utility billing rates and weather forecasts once a week. The thermostat doesn’t need to know how to spin up the Hadoop cluster, but a ‘can run Hadoop jobs’ component can be a part of the PC2, and it can know how to use various cloud services and be able to optimize based on price. (I’m looking at you, Amazon EC2 spot market.)
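The optimize-on-price part can be sketched as a small planning function: given a job size, a deadline and a budget, pick the cheapest offer that can still finish in time. The `Offer` fields, the `plan_burst` name and the prices used below are all made up for illustration, and real spot pricing is messier than a single number.

```python
import math
from dataclasses import dataclass


@dataclass
class Offer:
    """A hypothetical price quote from a cloud provider."""
    provider: str
    price_per_node_hour: float
    max_nodes: int


def plan_burst(offers, node_hours, deadline_hours, budget):
    """Return (offer, nodes, cost) for the cheapest offer that fits, or None."""
    # Nodes needed to finish `node_hours` of work before the deadline.
    nodes_needed = math.ceil(node_hours / deadline_hours)
    best = None
    for offer in offers:
        if nodes_needed > offer.max_nodes:
            continue  # this provider can't go wide enough to meet the deadline
        cost = node_hours * offer.price_per_node_hour
        if cost > budget:
            continue
        if best is None or cost < best[2]:
            best = (offer, nodes_needed, cost)
    return best
```

The thermostat plugin would only say "I need about 40 node-hours by tomorrow morning and I'll spend up to ten dollars". Whether that lands on a spot market or somewhere pricier is the PC2's problem.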

So what do we have? We have a base UI framework with robust integration options, strong login security, a networkable plugin interface, a centralized scheduler, integrated sharing, integrated API billing and a burstable cloud resource provider. We’ve created a blueprint for an Operating System, something designed for the strengths of the cloud, but something very different from what we have now. Something like Salesforce.com, but for people, not businesses.

Organization

I think there will be a bunch of companies and groups creating platforms like this. Some will flourish, some will die. Early adopters will bear the brunt of the pain, but they’ll put up with it for the advantages, just like they always do. I think most of the successful groups will look like Automattic. WordPress is an easy example to point to: they’ve done really well financially and still embrace the open source model. They make money from their hosted solution, but you can install it yourself if you want. I don’t worry that they’re going to hire a new VP really focused on ‘maximizing value’, and make a deal with Microsoft so their mobile UI is only optimized for Windows Phone. I know they have an open source ethos from the top down, so I trust them.

But in the beginning someone’s going to have to start cobbling these things together into a value-providing alpha. Will it be me? Will it be you?

Use Cases

Johnny Mnemonic Pinball — Like this, but… not. (photo by robinvanmourik)

It doesn’t take too much imagination to think of things that a platform like this could provide, but it takes the right combination of experience and imagination to get it off the ground. Most of the people who would get this kind of platform are early adopters who are already involved in the cloud. They may run VMs in a couple different clouds, they may have written integration and maintenance software. The first programs they’re going to build will be things that the PC2 is uniquely suited for, namely tying together your internet of things, and running consuming and consolidating services.

Your PC2 may be a great place to tie all your home automation and quantified self stuff together. You may have Zigbee devices and the Nest and your Withings scale and your Expereal app and your food logger and your Fitbit. You may like the services, but wouldn’t it be interesting to know if you walked more on days when it was cold, or what combination of exercise, travel and food intake led to your greatest happiness? That’s data I’d want to keep long term, long after those respective companies bite the dust. That’s a perfect PC2 application. It’s big data for people.

The PC2 could also be a great place to host personal Weavr type bots. It’s an always-on platform that has API access, both free and paid, and the UI options mean you could get a back-channel or tweaking interface to your weavr in your car, or on your cell phone.

With Tropo or Google Voice, your PC2 could be the center of your personal message hub. You could call your PC2 and ask it things, Siri-style, or other people could call it and you could intelligently channel them to what they need to get to. All the audio data would live in your own cloud storage, so if you wanted to run analytics on it 5 years down the road, you could. Hey, voice-driven Twitter-style sharing with just your friends: call in, record a clip, and it gets sent to all your buddies.

Someone will eventually build an office suite for the PC2. It will start simple, and then it will get smarter. With easy cloud access you’ll be able to run Wolfram Alpha style processing on your data, on demand. Once the (open source) software’s written once, everyone can use it, they just have to pay for the CPU horsepower.

The PC2 initially wouldn’t have more memory or CPU demand than a low end VM or cell phone, which means that if you didn’t want to pay for a cloud server, or had already used your free Amazon EC2 option, you could run your PC2 on, oh… a Raspberry Pi.

TL;DR

PC2s are a response to a market opportunity, and a technological tipping point. People need tools to thrive, and their PCs are turning into services they rent. All the pieces are in place for a new approach; nothing new really needs to be invented. The only thing that remains is to start writing code and see if this is something people actually want. Of course, that’s the hardest part.

SXSW 2013 Badge Contest

I’m giving away a SXSW 2013 Gold Badge to the person who has the most innovative idea about how a person could use the cloud. The contest is open till midnight November 25th, so you can talk to your family and friends over Thanksgiving, come up with some great ideas, and maybe get a chance to see Elon Musk, Joi Ito, Nate Silver and many others at SXSW Interactive and Film, 2013.

SXSW 2013 Personal Cloud Contest

(The contest is now closed.)

Platform Persistence, Virtual Death and Pocket Worlds

Note: This is a long, rambling, train of thought post. The tl;dr version is: Emotional connection to bots happens, we get sad when things we care for go away, so there’s a big ethical risk associated with human-acting bots living in unportable platforms. We members of the ‘Bot 2.0’ community need to address this before we get too far.

A little over a year ago I started playing a cloud-based iPhone game called GodVille. GodVille describes itself as a Zero Player Game. You take the role of a god, you create a hero, and you send that hero out into the game world to fight on your behalf. Your hero is an independent being. When you come back to check on them, they will have recorded an entertaining diary of monsters fought, treasures collected, and items sold, all without your input. You only have four influence options on your hero: you can encourage them, which makes them heal faster, discourage them, which makes them fight better, shout down at them, and activate some of the items they pick up.

While it isn’t a very interactive game, it’s still a compelling experience. I check on my hero every day or two, look for interesting items to activate, and encourage him as much as I can.

Your GodVille hero can’t permanently die. They can be killed, but they’ll just wait around in the ground, writing notes in their diary until you resurrect them. (They’ll get tired of waiting for you and dig themselves out after a few days.) Not killing these bot-like characters is common in online games; permanent death is generally reserved for the hardcore modes of single player releases. (A really interesting article in wired.co.uk postulates that the free to play model is driving this, because developers don’t want to give you an excuse to walk away from their microtransactions, or get the feeling that your money was wasted.)

Once sufficiently powerful, your GodVille hero can adopt a pet, its own sub-bot that helps it fight and gains its own levels. My hero adopted a pet earlier this year. Over a few weeks I watched the pet (a dust bunny named Felix) fight alongside my hero, shield him from attacks and help heal him. The pet went up in level, gained some abilities, and everything was going just peachy.

Then I opened the app one day, and the pet was dead. My hero was carrying around Felix’s corpse. I went to the web and searched for pet resurrection, but found it wasn’t possible. Sometimes the hero will pay to have the pet resurrected, sometimes they’ll just bury them. After a grieving period, they’ll adopt a new one.

Felix’s death had a lot more of an emotional impact on me than I expected. I didn’t know Felix, I never met it, it really only existed as a few hundred bytes of data on a server somewhere. I’ve had more interactions with lamps in my house than I did with Felix. If you tip a lamp I really like off a table and shatter it into a million pieces, I may be angry, but I likely won’t feel an immediate emotional loss.

A Lamp with Feelings

Felix’s death was hard because I’d made an emotional connection to him, watching him interact with my hero. His death highlighted my powerlessness in the game. I can resurrect my hero, within the confines of the game mechanic, but I can’t resurrect his pet. No matter what I do, no matter how hard I try, I can’t bring Felix back to life.

Someday, inevitably, GodVille will shut down. People will move on to other projects, the server bill won’t get paid, iPhone apps won’t be the hot thing anymore. My hero, his diary and pet will disappear, and because he only lives inside the GodVille system (and being part of that system is a fundamental aspect of who he is), he will be gone forever.

Bruce Sterling at SXSW 2010 (photo by jonl)

Bruce Sterling gave a great talk about this at SXSW in 2010, about how the Internet doesn’t take care of its creations. We build and throw away. Startups form, grow like crazy, and if they don’t sufficiently hockey stick, they close. Or they get popular but not popular enough, and the team gets hired away to bigger players. Either way, the service shutters, the content and context disappears, history is lost. If it’s bad to have this happen to your restaurant checkins and photos, how much worse is it when it happens to virtual beings you’ve created an emotional attachment to? As creators, if we encourage platforms like this, roach motels where content comes in and never comes out, what does that say about us?

Eighteen and a half years ago I created my first character on a text based multiplayer internet game called Ghostwheel, hosted by my first ISP, Real/Time Communications. Ghostwheel was a MOO, an Object Oriented version of a Multi-User Dungeon, the progenitor of today’s MMORPGs like World of Warcraft. In a MOO you can create characters, build environments and objects, talk to other people, fight, and even create bots.

Real/Time Communications hosted Ghostwheel on a small server in their data center, a 486 desktop machine. People from all over the world connected to that server, created characters, and wove shared stories together over the early boom years of the internet.

A late 90’s Austin Ghostwheel meetup

Eventually Real/Time Communications lost interest in hosting and maintaining Ghostwheel (and eventually Real/Time itself disappeared), so we took it elsewhere. As someone with colocated servers and ISP experience, I ended up hosting it on one of my machines. It now lives in a cloud VM, and even though the players have left for newer, more exciting destinations, everything they created, the characters, the setting, the dusty echoes of romances and feuds and plots all still exist. It still exists because someone with the wherewithal got their hands on it, and cared enough about it to keep it going, and it exists because MOO is an open source platform that doesn’t depend on one company being in business.

While piecing together the thoughts for this post it occurred to me that the MOO server could probably be compiled on some modern Linux-based smartphone. They have more than enough CPU power and memory, and even a 3G connection is fine for text. I could conceivably load Ghostwheel on one and carry it around in my pocket. A whole world, nearly a thousand characters, tens of thousands of rooms and objects, dozens and dozens of species of monsters, all living in my pocket. I could hand it to people and ask them about the weight of a world. Every time I think about that it blows my mind. There’s definitely the kernel of something new and weird there.

So, back to my point. As I’ve talked about before, there’s a whole species of autonomous bots appearing around us that we relate to as nearly human. Like my GodVille character, we don’t have direct control over them, their autonomy being one of the things that makes them seem more human. They’re coming, they’re awesome, and I think in a few years they’ll be as common as Facebook accounts.

The most exciting work I’ve seen in this field is from the good folks at Philter Phactory and their Weavrs system. Weavrs are social bots defined by location, work and play interests, and groups of emotional tags. The Weavrs system hooks into Twitter, generates its own personal web pages (kind of like a bot-only mini-Tumblr) for each weavr, and is extensible through API driven modules called prosthetics. Some example prosthetics include the dreams prosthetic, which folds images the weavr has reposted into strange, creepy kaleidoscopes.

Weavrs are easy to create, they produce some compelling content, and they’re fun to watch. I’ve created a few, my wife has one, several of my friends have them. Interest is picking up from marketing and branding agencies, and where the cool hunters go, tech interests will inevitably follow.

The thing that’s starting to concern me is the possibility that Bots 2.0 could end up being another field like social networking where the hosted model gets out ahead of ownership and portability. What happens when the service hosting our bots disappears? What happens to all its posts, its images, its conversations? (I suppose I wouldn’t be qualified to work at a cloud provider if I didn’t have strong feelings about data portability.)

Weavrs as a whole isn’t open source, but it has lots of open source bits. Philter Phactory is trying to run a business, and I don’t begrudge them that. They have the first mover advantage in a field that’s going to be huge. I’m sure data portability is on their radar, but it’s a lot easier to prototype and build a service when you’re the only one running it. Conversely, it’s a lot easier to scale out a platform designed to be run stand-alone than to create a stand-alone version of a platform.

Once a few more folks start to realize how interesting and useful these things are, I think we’re going to see a Cambrian Explosion of social bots, and I’m sure plenty of entrants in the field won’t be thinking in terms of portability. They’ll be thinking about the ease of centralized deployment and management, and the reams of juicy data they can mine out of these things.

I remember in the early 2000s feeling a similar excitement about self publishing (blogging). It was obviously going to be something that was going to be around forever once it was perfected. You could see the power in its first fits and starts, and it was just going to keep getting better. I think there are more than superficial similarities between self publishing platforms and social bot platforms, in fact.

Thinking back on that evolution, I think the archetype that we should hope for would be the WordPress model. I remember Matt Mullenweg visiting the Polycot offices in 2004 or so. He was passionate, had a great project on his hands, and I’m embarrassed to say that we weren’t smart enough to figure out a way to help him with it. Matt, Automattic and the WordPress community have done a great job of managing the vendor lock-in problem while still providing a great hosted service people are willing to pay for. They get the best of both worlds: the custom WordPress sites and associated developer community, millions of blogs hosted by ISPs, the plugin developers, and still get to run a nicely profitable, extremely popular managed service. If wordpress.com goes away (god forbid), someone will still be maintaining the core codebase, and you’ll be able to export your data and run your own instance as long as you like. (Just remember to register your own domain name.)

I hope that the social bot community evolves something similar. I think that platforms are coming online to encourage that, and I think the people in the field are smart and recognize the ethical implications. Maybe in a year you’ll be able to run your bots on a hosted service or, if you’re motivated, run your own bot server and fiddle with its innards as you please. Who knows, you may even run them on your smartphone.

The Quantified Car: Progressive Snapshot

Let me paint you a picture: It’s the near future. Your insurance company sends you a little gadget that you carry around. It notes when you get a little too aggressive or if you’re out partying too late, and automatically sends the information wirelessly back to your insurance company (say, the 164th largest company in the US). If you do something they consider risky, it might even alert you with a buzz or beep. If you fit their definition of lower-risk (by, say, not being out past midnight) they give you a discount on your policy.

Sounds like the future, doesn’t it? Pervasive metric collection, big data analytics, pattern and custom behavior based pricing optimization? Except that it’s been available since last year, just for your car.

Progressive calls it Snapshot. It’s a little device that plugs into your car’s debug port. It gets its power from the car’s battery, reads the car’s metrics directly from the car’s computer and reports automatically back to Progressive via the AT&T network. They know how fast you’ve gone (speed 1 second ago – speed now = braking speed per second, over a 7 mph drop per second and you’re a risk), they know when you drive (the cell network includes time as part of its protocol, so you never need to set its clock), what you drive (the vehicle identification number is transmitted) and likely generally where you drive (since they presumably know what cell tower the device is talking to). There isn’t a GPS in it so they don’t know exactly where you are (so unlike a car rental monitor, they don’t know if you were breaking the law by speeding in a specific place). Since they’re your insurance company they also know a lot about you financially (they use your credit history to determine your rate, for instance), where you live (if you live in a shady neighborhood, full coverage might be more expensive), how old you are (if you’re young you pay more), your gender (girls pay less) and whatever other data they can derive from your name and address (oh, you gave them your social security number when you signed up, didn’t you?).
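To make that braking arithmetic concrete, here’s a minimal sketch of how a hard-brake check could work on once-per-second speed readings. The 7 mph per second threshold is the figure mentioned above; the function name, the sample data, and the idea that the check is this simple are my own assumptions for illustration, not anything Progressive has published about how the device actually works.

```python
# Hypothetical illustration only: not Progressive's actual algorithm.
HARD_BRAKE_MPH_PER_SEC = 7.0  # threshold quoted in the text above


def hard_brake_events(speeds_mph, threshold=HARD_BRAKE_MPH_PER_SEC):
    """Return (second_index, drop_mph) for every one-second speed drop over the threshold.

    speeds_mph is assumed to be a list of speed readings taken one second apart.
    """
    events = []
    for i in range(1, len(speeds_mph)):
        # speed 1 second ago - speed now = braking speed per second
        drop = speeds_mph[i - 1] - speeds_mph[i]
        if drop > threshold:
            events.append((i, drop))
    return events


# Made-up readings: cruising, then stomping on the brakes.
samples = [30, 31, 29, 20, 12, 10, 10]
print(hard_brake_events(samples))  # [(3, 9), (4, 8)]
```

A drop of exactly 7 mph wouldn’t be flagged here, since the description above says “over” 7 mph; a real device might well draw that line differently.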

To some people this is just usage based car insurance, which has been around for a while. For those in the experience and conversion monetization business, this is something else. It’s an insane treasure trove of data, willingly given by customers. Their privacy statement explicitly says so: ‘To meet our legal obligations to state departments of insurance, we retain information collected or derived from the device for the time we determine is required by law; after which we will de-personalize the data and keep it indefinitely.’ Imagine what kind of data their analysts are rolling in! “Here’s a snow day in Texas, notice how 50% of people who live in higher income neighborhoods aren’t commuting today, presumably because they can telecommute, but only 15% of people in less advantaged neighborhoods are staying at home.” “35% of 32-35 year old primary drivers with 1+ children make 3+ mid-day trips during the week, while only 25% of 36-42 year olds do.” “Here’s the rage-graph of peak braking velocity grouped by age, notice how it drops from 21 on, then spikes again for men at what’s considered the mid-life-crisis.” If dating sites can produce interesting graphs like these, imagine what insurance companies can do.

Some people would be shouting Big Brother and 1984 at this point, but in reality it’s no more than Google, Facebook or your cell phone provider know about you. When your pill bottles report back to your insurance company, it’s no more than your health insurance provider knows about you. The future is going to be behavior modification heavy. Unless society reacts strongly and begins to value privacy and anonymity more, it’s how everything’s going to be. Google makes its money because it knows who you are and can optimize your ad viewing experience to maximize the money you spend. Insurance companies want people who brake slowly, don’t drive at peak times or even drive much at all. Energy companies want people to not use power at peak times. Some companies may even want to use the data to optimize our collective driving experience by crowdsourcing the speed of traffic to avert gridlock.

Progressive doesn’t do a great job of explaining what Snapshot is in its commercials. It isn’t an easy story to tell in 30 seconds, but it isn’t hard to convince someone to try it when they’re on the phone switching car insurance. “Save up to 30%, no possibility of my rate increasing? Sign me up!” In fact, if my sister-in-law hadn’t mentioned driving slower due to her Snapshot beeping at her for braking hard, I probably would have never realized that it was the quantified self in car form. For those of us in the ‘data industry,’ the potential is scary, but for some of middle America a 30% reduction in car insurance is worth the loss of privacy. How long and how far it’ll go, only the future can tell.

Update:

I haven’t seen a teardown of the Snapshot device yet. I’d be curious to see what’s inside. It also seems like Allstate has something similar (nay, identical) called Allstate Drive Wise. In fact, they were fighting it out over whether each could offer it.

If I were writing the Bruce Sterling or Cory Doctorow version of this story, there would be an enterprising group of car tuners on the border, maybe driving stolen cars through Nuevo Laredo. Their tech guru comes up with some fancy way to rewrite the VIN data as it’s read off the engine debug port, and they realize they can make a little dongle that sits between the debug port and the Snapshot device to smooth out acceleration data. They get somebody in Shanghai or Monterrey to make 10,000 of them, and since it’s a legal grey area, get people to mark them up 150% and sell them with targeted Facebook ads. “Maximize your insurance discount! Drive how you want!” Who knows if the Snapshot devices can have their firmware updated over the air, but if they can, imagine a running tech battle between the car chippers and Progressive programmers like the DirecTV access card hackers back in the day.

On a more melancholy note (maybe this is the William Gibson version), with so many people using Snapshot, essentially a wirelessly connected black box for your car, statistically there have to be some people who have had accidents and likely died, with the graph of data from the exact moment of the crash sitting on Progressive’s servers. Imagine a particularly affected individual in the data processing department collecting those and spinning art out of them in an anonymous only-on-the-internet memorial.