March 13, 2014 · machine learning presentation

Detecting Fraudulent Skype Users via Machine Learning

As part of my Data Science class with General Assembly, we each gave a presentation about a real-world application of data science. My talk was about using machine learning to detect fraud on Skype, and was based upon an excellent paper by Microsoft Research published in November 2013.

Although Skype already had measures in place to detect fraud (e.g., credit card fraud, spam instant messages), the research team's goal was to improve the detection of "stealthy fraudulent users" who evade Skype's defenses for a prolonged period. They built a machine learning classifier that flags potentially fraudulent users, and it detected 68% of these users at a false positive rate of 5%. The novelty of their approach was fusing disparate data types (profile information, Skype product usage, and Skype social activity) into a single classifier.
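
To make those two numbers concrete, here is a minimal sketch of how a detection rate (true positive rate) and a false positive rate are computed from a classifier's predictions. The function name `detection_rates` and the toy labels are my own illustration, not code or data from the paper; the counts are simply chosen so that the result matches the reported 68% and 5%.

```python
def detection_rates(y_true, y_pred):
    """Return (true positive rate, false positive rate) for binary labels.

    Labels: 1 = fraudulent user, 0 = legitimate user.
    """
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # fraud, flagged
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # fraud, missed
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # legitimate, flagged
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # legitimate, not flagged

    if tp + fn == 0 or fp + tn == 0:
        raise ValueError("need at least one fraudulent and one legitimate user")

    tpr = tp / (tp + fn)  # share of fraudulent users that were detected
    fpr = fp / (fp + tn)  # share of legitimate users wrongly flagged
    return tpr, fpr


# Hypothetical toy data: 100 fraudulent users followed by 100 legitimate ones.
y_true = [1] * 100 + [0] * 100
# Assumed predictions: 68 of the fraudulent and 5 of the legitimate users are flagged.
y_pred = [1] * 68 + [0] * 32 + [1] * 5 + [0] * 95

tpr, fpr = detection_rates(y_true, y_pred)
print(f"detection rate: {tpr:.0%}, false positive rate: {fpr:.0%}")
```

The two rates use different denominators: the detection rate is measured over fraudulent users only, while the false positive rate is measured over legitimate users only. Since legitimate users typically far outnumber fraudulent ones, even a 5% false positive rate can mean a large absolute number of wrongly flagged accounts.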

My presentation slides are embedded below, and are also available on Speaker Deck.

If you enjoyed the slides and would like more details, Microsoft's research paper provides a great introduction to the modern machine learning workflow and does not require a statistics background to understand. Alternatively, you can read a less technical article summarizing the paper.

If you would like to read about other applications of machine learning, here are a few of my favorite articles:

For a longer list of articles and research papers like these, check out the GitHub repository for my Data Science class!
