Course name:
Data Mining
Course coordinator:
Marzena Kryszkiewicz
Course status:
Elective (restricted choice)
Level of study:
Second-cycle (Master's) studies
Programme:
Computer Science
Course group:
Technical courses - advanced
Course code:
EDAMI
Nominal semester:
3 / academic year 2015/2016
Number of ECTS credits:
6
Student workload (hours) required to achieve the learning outcomes:
30 hours of lectures; 30 hours of preparation for tests; 15 hours of laboratory exercises; 15 hours of preparation for the laboratory exercises; 15 hours of project meetings; 45 hours of implementation of project assignments
Number of ECTS credits for classes requiring direct participation of academic teachers:
30 hours of lectures; 15 hours of laboratory exercises; 15 hours of project meetings, which gives approx. 2.5 ECTS
Language of instruction:
English
Number of ECTS credits the student earns in practical classes:
15 hours of laboratory exercises; 15 hours of preparation for laboratory exercises; 15 hours of project meetings; 45 hours of implementation of project assignments, which gives approx. 4 ECTS
Class types and hours per semester:
  • Lecture: 30 h
  • Tutorials: 0 h
  • Laboratory: 15 h
  • Project: 15 h
  • Computer classes: 0 h
Prerequisites:
Knowledge of databases is recommended.
Student limit:
30
Course objective:
The objective of the course is to familiarize students with important topics in data mining. The techniques and algorithms presented are of practical value: they are well suited to the discovery of hidden data in large real-world data sources. The methods presented are expected to have a great impact on the evolution of database systems towards effective knowledge base systems. After completing the course, students should be capable of efficiently discovering novel, non-trivial and useful knowledge from large data resources.
Course content:
  • Data mining as a multidisciplinary area: Roots and development of the data mining area. Current challenges in data mining. Classification of data mining tasks.
  • Data preprocessing: Data cleaning. Data integration and transformation. Data reduction. Discretization and concept hierarchy generation.
  • Data mining language: Specifying required properties of the knowledge to be discovered by means of a sample data mining language.
  • Frequent patterns and association rules: Scalable methods of discovering frequent patterns and association rules in transactional and relational databases. Modifications of the algorithms capable of dealing with hierarchy and negation. Use of imposed constraints to efficiently reduce the discovery process.
  • Concise models of frequent patterns: Generators, closed itemsets and k-disjunction-free sets as basic elements of lossless representations of frequent patterns. Discovery of concise representations of frequent patterns. Use of the models for deriving all frequent patterns.
  • Concise models of association rules: Generators, closed itemsets and pseudo-closed sets as building blocks of lossless representations of association rules. Mechanisms of deriving association rules from their representations.
  • Functional and approximate dependencies: Scalable methods of discovering functional and approximate dependencies in large databases.
  • Other patterns and rules: Scalable methods of discovering sequential patterns, episode rules, quantitative rules, frequent patterns, decision tree classifiers and rough set decision rules.
  • Clustering: Scalable methods of clustering objects. Use of multidimensional indexing techniques to support the discovery of clusters and outliers.
  • Reasoning under incompleteness: Legitimate approach to reasoning from data with missing values. Mining from partial knowledge.
  • Data mining applications: Sample applications of data mining in the financial, telecommunication, biomedical and DNA areas. Brief overview of selected data mining systems.
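To illustrate the topics on frequent patterns, closed itemsets and association rules, the following is a minimal sketch rather than course material. The course description does not name a specific algorithm; the sketch assumes a level-wise, Apriori-style search, and the function names, the toy transactions and the thresholds are hypothetical.

```python
from itertools import combinations


def frequent_itemsets(transactions, min_support):
    """Return {itemset: support count} for every itemset that occurs in at
    least min_support transactions (level-wise, Apriori-style search)."""
    transactions = [frozenset(t) for t in transactions]
    # Level 1: count single items.
    counts = {}
    for t in transactions:
        for item in t:
            key = frozenset([item])
            counts[key] = counts.get(key, 0) + 1
    level = {s: c for s, c in counts.items() if c >= min_support}
    result = dict(level)
    k = 2
    while level:
        prev = set(level)
        # Join step: unite frequent (k-1)-itemsets into k-item candidates.
        candidates = {a | b for a in prev for b in prev if len(a | b) == k}
        # Prune step: every (k-1)-subset of a candidate must be frequent.
        candidates = {
            c for c in candidates
            if all(frozenset(s) in prev for s in combinations(c, k - 1))
        }
        level = {}
        for c in candidates:
            support = sum(1 for t in transactions if c <= t)
            if support >= min_support:
                level[c] = support
        result.update(level)
        k += 1
    return result


def closed_itemsets(frequent):
    """Keep itemsets that have no proper superset with the same support."""
    return {
        s: c for s, c in frequent.items()
        if not any(s < other and c == frequent[other] for other in frequent)
    }


def association_rules(frequent, min_confidence):
    """Return (antecedent, consequent, confidence) triples derived from the
    frequent itemsets; confidence = support(itemset) / support(antecedent)."""
    rules = []
    for itemset, support in frequent.items():
        for size in range(1, len(itemset)):
            for antecedent in map(frozenset, combinations(itemset, size)):
                confidence = support / frequent[antecedent]
                if confidence >= min_confidence:
                    rules.append((antecedent, itemset - antecedent, confidence))
    return rules


# Hypothetical toy data: five transactions over the items a, b, c.
transactions = [{"a", "b"}, {"a", "b", "c"}, {"a", "b", "c"}, {"a"}, {"c"}]
frequent = frequent_itemsets(transactions, min_support=2)
closed = closed_itemsets(frequent)
rules = association_rules(frequent, min_confidence=1.0)
```

On this toy data all seven non-empty itemsets are frequent, but only four of them ({a}, {c}, {a, b} and {a, b, c}) are closed, which shows why closed itemsets can serve as a more concise representation.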
Assessment methods:
To pass the EDAMI course, students must obtain a passing grade in each of the three course components: the lecture part (assessed on the basis of two tests), the project part (assessed on the basis of the implemented software, the tests carried out, a report and a presentation of the project) and the laboratory part (considered successfully completed if all 5 laboratory tasks are done correctly). A positive final grade is determined from the average of the test grade and the project grade. If the test grade is lower than the project grade, the final grade is that average rounded down. Otherwise, it is that average rounded up.
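The rounding rule above can be sketched as follows. This is an illustration only: the description does not state the grading scale, so the sketch assumes grades in half-grade steps (3.0, 3.5, 4.0, 4.5, 5.0) with 3.0 as the lowest passing grade, and rounds the average to the nearest half-grade step in the required direction. The function and parameter names are hypothetical.

```python
import math

PASS_GRADE = 3.0      # assumption: lowest passing grade
REQUIRED_LAB_TASKS = 5  # from the course description: all 5 tasks must be correct


def final_grade(tests, project, lab_tasks_done):
    """Return the final grade, or None if any course component is not passed."""
    if tests < PASS_GRADE or project < PASS_GRADE:
        return None
    if lab_tasks_done < REQUIRED_LAB_TASKS:
        return None
    doubled_average = tests + project  # equals 2 * average of the two grades
    if tests < project:
        # Test grade is the lower one: round the average down to a half-grade step.
        return math.floor(doubled_average) / 2
    # Otherwise: round the average up to a half-grade step.
    return math.ceil(doubled_average) / 2
```

For example, under these assumptions a test grade of 3.5 and a project grade of 4.0 average to 3.75 and yield 3.5, while a test grade of 4.0 and a project grade of 3.5 yield 4.0.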
Examination:
No
Literature:
  • Han J., Kamber M., Pei J., Data Mining: Concepts and Techniques, The Morgan Kaufmann Series in Data Management Systems, 3rd edition, Morgan Kaufmann, 2011.
  • Fayyad U.M., Piatetsky-Shapiro G., Smyth P., Uthurusamy R. (eds.), Advances in Knowledge Discovery and Data Mining, AAAI, Menlo Park, California, 1996.
  • Kryszkiewicz M., Concise Representations of Frequent Patterns and Association Rules, Prace Naukowe, Elektronika, Oficyna Wydawnicza Politechniki Warszawskiej, z. 142 (2002).
  • Communications of the ACM, Vol. 39, No. 11, November 1996.
  • Ganter B., Wille R., Formal Concept Analysis: Mathematical Foundations, Springer-Verlag, 1999.
  • A number of recent data mining publications accessible via the Internet. The instructor will recommend the respective publications during the course.
Course website:
Remarks:
The project task is to design, implement and experimentally evaluate selected data mining algorithms. The aim of the laboratory is to acquaint students with modern data mining technologies. During the laboratory classes, students will become familiar with the possibilities of carrying out data mining using a selected commercial system, for example, IBM Warehouse Design Studio.

Learning outcomes

General academic profile - knowledge

Outcome EDAMI_W01
has knowledge of discovering patterns and dependencies by means of data mining methods
Verification: test
Related programme outcomes: K_W06, K_W08, K_W09
Related area outcomes: T2A_W04, T2A_W07, T2A_W03
Outcome EDAMI_W02
has knowledge of methods of representing frequent patterns and reasoning about them
Verification: test
Related programme outcomes: K_W06
Related area outcomes: T2A_W04
Outcome EDAMI_W03
has knowledge of modern data mining technologies
Verification: laboratory exercises
Related programme outcomes: K_W11
Related area outcomes: T2A_W03, T2A_W04, T2A_W07

General academic profile - skills

Outcome EDAMI_U01
is capable of planning and implementing a knowledge discovery process as well as of interpreting its results
Verification: project
Related programme outcomes: K_U01, K_U06, K_U09, K_U13
Related area outcomes: T2A_U01, T2A_U08, T2A_U09, T2A_U11, T2A_U18
Outcome EDAMI_U02
is capable of presenting the plan, implementation and results of a knowledge discovery process in oral and written form
Verification: project
Related programme outcomes: K_U03, K_U06
Related area outcomes: T2A_U03, T2A_U08, T2A_U09
Outcome EDAMI_U03
is capable of discovering knowledge using modern data mining technologies
Verification: laboratory exercises
Related programme outcomes: K_U13
Related area outcomes: T2A_U18