SparkTutorials.net

shutdown
2015

SparkTutorials.net was an educational platform dedicated to teaching Apache Spark for data science, providing comprehensive tutorials, examples, and resources to help developers and data scientists master distributed computing and big data processing.

The Problem

In 2015, Apache Spark was gaining popularity, but learning resources were scattered and inconsistent:

  • Limited comprehensive tutorials for beginners
  • No centralized resource for Spark best practices
  • Difficulty finding practical examples for real-world use cases
  • Lack of structured learning paths for different skill levels
  • Inconsistent documentation across different Spark components

The Solution

SparkTutorials.net provided:

  • Comprehensive tutorials covering Spark Core, SQL, Streaming, and MLlib
  • Practical examples with real-world datasets and use cases
  • Progressive learning paths from beginner to advanced concepts
  • Interactive code examples with Jupyter notebooks
  • Best practices and performance optimization guides

Current state

I shut down this project long ago. By then it had reached over half a million page views and was a key resource for many in the community.

See my concluding writeup.