Data Science Laboratory

Objectives

1. To know the fundamental concepts and methodologies of Data Science.

2. To know how to explore data and build graphical visualisations.

3. To know and learn to apply simple methods of statistical learning.

4. To create and use decision tree and Bayesian classification models.

5. To use methods for evaluating supervised models.

6. To acquire competence in using the R and Python languages in Data Science applications.

Program

1. Introduction to concepts and methodologies of Data Science.
2. Using R and Python programming environments.
3. Identification of data sources; preprocessing, exploration and visualisation of data.
4. Introduction to supervised statistical learning.
5. Decision trees for regression and classification.
6. Naive Bayes method for classification.
7. Model evaluation with two samples and with cross-validation.

Teaching Methodologies

The lectures use an expository and demonstrative method, supported by examples. In the laboratory sessions, students work on projects guided by a previously prepared protocol. All support materials are available online through the Moodle Learning Management System. This platform is also used for submitting laboratory reports and for project discussion.

Bibliography

Essential

- McKinney, W. (2017). Python for Data Analysis (2nd ed.). O’Reilly Media.

- Keen, K. J. (2018). Graphics for Statistics and Data Analysis with R (2nd ed.). CRC Press.

- James, G., Witten, D., Hastie, T., & Tibshirani, R. (2013). An Introduction to Statistical Learning: with Applications in R. Springer.

Complementary

- Milovanović, I. (2013). Python Data Visualization Cookbook (1st ed.). Packt Publishing.

- Berthold, M. R., Borgelt, C., Höppner, F., & Klawonn, F. (2010). Guide to Intelligent Data Analysis: How to Intelligently Make Sense of Real Data (1st ed.). Texts in Computer Science. Springer-Verlag London.

Code

01060999

ECTS Credits

6

Classes

  • Practical and laboratory classes - 45 hours
  • Lectures - 15 hours

Evaluation Methodology

  • Written test: 30%
  • Participation in lab and lecture sessions: 10%
  • Project: 60%