New York | Frankfurt | At Sea

A prospective client, one of the large bulge bracket investment banks, recently asked us to run a test. The hypothesis was as follows: What if we (the bank) could run our emails (hundreds of millions) through an A.I. algorithm and learn interesting things from them that will help us close more business?

So they asked us to take the first step and test Sia, our A.I. engine, against a large email data set. Here are the results.

We used the public Enron email data set for this purpose; it’s a very large, complex (highly specific language) and “dirty” (spam, long threads, etc.) set of emails. We loaded it into our system last Thursday and processed the results over the weekend.

We extracted a large number of entity types by default; none of this is trained or part of a specific taxonomy.

We looked at common metrics to assess performance and manually checked 20,000 results. Our precision is better than 1 out of 2 from the start and our recall is very high, which leads to an overall F1 score that’s above expectations. Again, all of this is our default entity extraction, without training on data source, type of entity or taxonomy.
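For readers who want to see how these three numbers relate, here is a minimal sketch in Python. The counts in the usage lines are made up for illustration and are not figures from our test; the function name `precision_recall_f1` is likewise just a name chosen for this example.

```python
def precision_recall_f1(tp, fp, fn):
    """Compute precision, recall and F1 from manually checked counts.

    tp: extracted entities that were correct (true positives)
    fp: extracted entities that were wrong (false positives)
    fn: real entities the engine missed (false negatives)
    """
    # Precision: of everything we extracted, how much was right?
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    # Recall: of everything that was there, how much did we find?
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    # F1: harmonic mean of precision and recall.
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1


# Hypothetical counts, NOT results from the Enron run.
p, r, f1 = precision_recall_f1(tp=60, fp=40, fn=10)
print(f"precision={p:.2f} recall={r:.2f} f1={f1:.2f}")
```

Because F1 is a harmonic mean, a very high recall can lift the overall score only so far; the lower of the two numbers (here, precision) still dominates.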

Area Under the Curve (AUC) is typically used to assess whether a result is good enough or not; internal benchmarks from companies such as Google or Facebook set this at 0.8. Our AUC is between 0.77 and 0.9, off the bat, without training or optimization.
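As a rough illustration of what an AUC number means, the sketch below assumes the curve in question is the ROC curve (the usual reading of "AUC", though that is an assumption here) and computes it as the share of positive/negative pairs in which the positive item gets the higher score. The labels and scores are invented for the example, and `roc_auc` is a name picked for this sketch, not part of Sia.

```python
def roc_auc(labels, scores):
    """ROC AUC via pairwise comparison of positives and negatives.

    labels: 1 for a correct entity, 0 for an incorrect one
    scores: the confidence score assigned to each item
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(pos) * len(neg))


# Invented labels and scores, for illustration only.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]
print(round(roc_auc(labels, scores), 3))
```

A value of 0.5 means the scores rank correct and incorrect items no better than chance; 1.0 means every correct item outranks every incorrect one. The pairwise loop is quadratic, which is fine for a sketch but not for millions of emails.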

Not too shabby. Sure, Google and Facebook would have done a lot better, and I am sure there are a few startups out of Stanford or MIT that could show better results as well. But my point is this: A.I. out of the box that works is here!

While those results have a lot of room to improve, we were able to show that within a few short days, without any training or custom work, we can deliver results that meet or exceed benchmarks. Results that, for the most part, are better than what humans can achieve. Results that can now, within a few short weeks, be improved, tweaked and adapted.

This means that A.I. is no longer a large, uncertain project but a reality that is within the grasp of every enterprise. The risk is limited (it’s SaaS), the resources required are manageable (a PM, an expert or two, and a budget that’s barely more expensive than your last holiday party) and the opportunities to make it your own, learn from it, improve and adapt it are literally limitless.

Get started with A.I. out of the box now and reap the fruits of your efforts before the snow melts.

