Info
Real Time Analytics
Code: 222891-D
Winter semester 2022/2023, SGH Warsaw School of Economics
Basic information about this course can be found in the syllabus.
List of books I recommend.
If you don't know what Python is, go here.
Schedule
Lectures
Lectures are held in hybrid mode. Attendance is OPTIONAL. Lectures take place in Aula I, building G.
20-02-2023 (Monday) 08:00-09:30 - Lecture 1
Completed topics:
- structured and unstructured data
- introduction to Big Data
- data generation processes
- OLAP and OLTP data processing models
27-02-2023 (Monday) 08:00-09:30 - Lecture 2
Completed topics:
- batch processing vs data stream processing
- ETL
- the MapReduce pattern
- business requirements for the data stream
- definitions of: event, event stream processing, event analysis
- batch apps and streaming apps
06-03-2023 (Monday) 08:00-09:30 - Lecture 3
Completed topics:
- time in streaming data processing
- operation of the client-server system: REST API
13-03-2023 (Monday) 08:00-09:30 - Lecture 4
Completed topics:
- Lambda and Kappa architectures
- pub/sub communication for Apache Kafka
Lectures end with a TEST: 10 questions - 20 minutes. The test is conducted via MS Teams.
Labs
- 21-03-2023 (Tuesday) 08:00-11:30 - C4D, 2 groups
- 28-03-2023 (Tuesday) 08:00-11:30 - C4D, 2 groups
- 04-04-2023 (Tuesday) 08:00-11:30 - C4D, 2 groups
- 18-04-2023 (Tuesday) 08:00-11:30 - C4D, 2 groups
- 25-04-2023 (Tuesday) 08:00-11:30 - C4D, 2 groups
- 09-05-2023 (Tuesday) 08:00-11:30 - C4D, 2 groups
- 16-05-2023 (Tuesday) 08:00-11:30 - C4D, 2 groups
- 23-05-2023 (Tuesday) 08:00-11:30 - C4D, 2 groups
- 30-05-2023 (Tuesday) 08:00-11:30 - C4D, 2 groups
- 06-06-2023 (Tuesday) 08:00-11:30 - C4D, 2 groups
Place
Lectures 1-4: G-Aula I
Labs 1-10: C-4D
Exam
The lectures end with a test during the last class. A passing result on the test (above 13 points) is required to take part in the exercises.
After the exercises, homework will be assigned via the MS Teams platform. Passing all exercises and homework tasks entitles you to complete the project.
The project should be carried out in groups of no more than 5 people.
Project requirements:
- The project should present a BUSINESS PROBLEM that can be solved using information processed online. (This does not mean that you cannot use batch processing, e.g. to generate a model.)
- Data should be sent to Apache Kafka and further processed and analyzed from there.
- You are free to choose the programming language; this applies to each component of the project.
- BI tools can be used.
- Data sources can be a table, artificially generated data, IoT, etc.
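
As an illustration of the requirements above, here is a minimal sketch of artificially generated events being serialized and handed to a sender. It is only an assumption about how a project might start, not a required design: the event schema, the topic name `project-events`, and the helper names `generate_events` and `publish` are all hypothetical. The sender is passed in as a function so the sketch runs without a Kafka broker.

```python
import json
import random
import time
from typing import Callable, Dict, Iterable, Iterator, List, Tuple


def generate_events(n: int, seed: int = 0) -> Iterator[Dict]:
    """Yield n artificial sensor readings (hypothetical schema)."""
    rng = random.Random(seed)
    for i in range(n):
        yield {
            "event_id": i,
            "sensor": f"sensor-{rng.randint(1, 3)}",
            "value": round(rng.uniform(0.0, 100.0), 2),
            "event_time": time.time(),  # time at which the event was created
        }


def publish(
    events: Iterable[Dict],
    send: Callable[[str, bytes], object],
    topic: str = "project-events",  # hypothetical topic name
) -> int:
    """Serialize each event to JSON bytes and pass it to `send`.

    Returns the number of events handed to the sender.
    """
    count = 0
    for event in events:
        send(topic, json.dumps(event).encode("utf-8"))
        count += 1
    return count


# In-memory stand-in for a Kafka producer, so the sketch runs anywhere.
outbox: List[Tuple[str, bytes]] = []
sent = publish(generate_events(5), lambda topic, payload: outbox.append((topic, payload)))
print(f"handed {sent} events to the sender")
```

With the kafka-python package and a broker available (the address `localhost:9092` is an assumption), the same `publish` function could be given `lambda topic, payload: producer.send(topic, payload)`, where `producer = KafkaProducer(bootstrap_servers="localhost:9092")`.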
Technology
To participate in the classes, you must know, or at least be able to use, the following information technologies: