Course syllabus






General information


Course name: Big Data (Dades Massives)

Course code: 572667

Academic year: 2019-2020

Coordinator: Jordi Vitria Marca

Department: Department of Mathematics and Computer Science

Credits: 3

Single program: Yes





Spark API



Estimated hours of dedication

Total hours: 75


In-person activities



-  Theory and practice



Supervised/directed work


Independent learning




Competences developed


Learn the difference between classical computing and big data computing.


Learn to use a big data cloud infrastructure.


Learn how to store massive data sets.


Learn how to deal with high-velocity data sources.


Learn how to process massive data sets.


Learn how to manage the life-cycle of data science projects.





Learning objectives


Knowledge-related

Most data science problems involve working with large volumes of information that have to be stored, cleaned, and processed before they are useful for machine learning algorithms. This course explains how to develop an end-to-end data science application, so that students can build data products based on big data technologies.



Thematic blocks


1. Introduction to Big Data.

*  Introduction to classical computing

*  Evolution of infrastructure

*  Evolution of Big Data

*  What is Big Data (the five Vs)

*  Need for Big Data infrastructure

2. Introduction to Cloud Infrastructure

3. Introduction to Docker and Kubernetes

4. Big Data Storage

5. Big Data Ingestion

6. Big Data Processing

7. Data Science Lifecycle Management



Methodology and training activities


All sessions follow a practical approach: the teacher explains a concept and the students then apply it on their own.




Accredited assessment of learning


This course is assessed through one exam and two assignments. The grading breakdown is as follows:

  • Homework (60%, 2 assignments, 30% each)
  • Exam (40%)
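
As an illustration of the breakdown above, the weighted final mark could be computed as in the following minimal sketch. The function name `final_grade` and the 0-10 marking scale used in the example are assumptions made for illustration; the syllabus does not specify a scale, a rounding rule, or any minimum mark per component.

```python
def final_grade(assignment_1: float, assignment_2: float, exam: float) -> float:
    """Weighted mark: two assignments at 30% each and one exam at 40%.

    All three marks are assumed to be on the same scale (for example 0-10);
    the scale itself is an assumption, not something the syllabus states.
    """
    return 0.30 * assignment_1 + 0.30 * assignment_2 + 0.40 * exam


# Hypothetical marks: 8.0 and 7.0 on the assignments, 6.0 on the exam.
print(round(final_grade(8.0, 7.0, 6.0), 2))  # 2.4 + 2.1 + 2.4
```

Under the single-assessment option, the exam alone determines the mark, so no weighting is involved.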


Single assessment

Exam (100%)