Books and Resources

Core Reading

Books directly related to the course topics.

Stream Processing

  1. G. Shapira, T. Palino, R. Sivaram, K. Petty – Kafka: The Definitive Guide: Real-Time Data and Stream Processing at Scale, 2nd ed., O’Reilly 2022. Free access
  2. G. Maas, F. Garillot – Stream Processing with Apache Spark, O’Reilly 2019.
  3. F. Hueske, V. Kalavri – Stream Processing with Apache Flink, O’Reilly 2019.
  4. A. Bellemare – Building Event-Driven Microservices, O’Reilly 2021.
  5. T. Akidau, S. Chernyak, R. Lax – Streaming Systems: The What, Where, When, and How of Large-Scale Data Processing, O’Reilly 2018.
  6. J. Korstanje – Machine Learning for Streaming Data with Python, Packt 2022.

Apache Spark

  1. J. S. Damji, B. Wenig, T. Das, D. Lee – Learning Spark, 2nd ed., O’Reilly 2020.
  2. B. Chambers, M. Zaharia – Spark: The Definitive Guide, O’Reilly 2018.
  3. J. Quddus – Machine Learning with Apache Spark Quick Start Guide, Packt.

MLOps and Model Deployment

  1. N. Gift, A. Deza – Practical MLOps: Operationalizing Machine Learning Models, O’Reilly 2022.
  2. V. Lakshmanan, S. Robinson, M. Munn – Machine Learning Design Patterns, O’Reilly 2021.

Supplementary Reading

Machine Learning with Python

  1. A. Géron – Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow, 2nd ed., O’Reilly.
  2. W. McKinney – Python for Data Analysis, 2nd ed., O’Reilly.
  3. J. Grus – Data Science from Scratch, 2nd ed., O’Reilly.
  4. S. Raschka – Python Machine Learning, 2nd ed., Packt.
  5. T. Hastie, R. Tibshirani, J. Friedman – The Elements of Statistical Learning, Springer 2017. Free access

Deep Learning

  1. F. Chollet – Deep Learning with Python, Manning.
  2. J. Howard, S. Gugger – Deep Learning for Coders with fastai and PyTorch, O’Reilly.
  3. A. Koul, S. Ganju, M. Kasam – Practical Deep Learning for Cloud, Mobile & Edge, O’Reilly 2019.

Docker and Tools

  1. K. Matthias, S. P. Kane – Docker: Up & Running, O’Reilly.
  2. P. Bell, B. Beer – Introducing GitHub, O’Reilly.

Python

  1. C. Althoff – The Self-Taught Programmer, Triangle Connection 2017.
  2. A. Sweigart – Automate the Boring Stuff with Python, No Starch Press.

Quantum Computing

  1. A. Jacquier, O. Kondratyev – Quantum Machine Learning and Optimisation in Finance: On the Road to Quantum Advantage.

Useful Websites

Software

  1. GitHub
  2. Git documentation
  3. Python
  4. PyPI – Python Package Index
  5. Docker
  6. Apache Kafka
  7. Apache Spark

Python Libraries

  1. NumPy
  2. Pandas
  3. Scikit-learn
  4. Matplotlib
  5. JupyterLab
  6. TensorFlow
  7. Keras
  8. Beautiful Soup

Editor

  1. Visual Studio Code

Datasets

Courses and Tutorials

  1. Machine Learning – Andrew Ng (YouTube)
  2. Python Programming for Data Science – T. Beuzen
  3. Chris Albon – Data Science & AI Notes