Summary of topics offered - Department of Informatics (FBE)


Basic information

Type of work: Diploma thesis
Topic:
Comparison of open-source data mining tools for textual data analysis
State of topic:
approved (prof. Ing. Cyril Klimeš, CSc. - head of department)
Thesis supervisor:
doc. Ing. František Dařena, Ph.D.
Faculty: Faculty of Business and Economics
Supervising department:
Department of Informatics - FBE
Max. no. of students: 5
Proposed by:
Summary: A variety of open-source solutions can be used to mine knowledge from textual data. These solutions implement many commonly used machine learning algorithms. They differ in how they support transforming raw data into a suitable format, in their technical capabilities (memory management, speed), in the variety of outputs they provide, in how simple steps can be combined into more complex tasks, etc. The aim of the thesis is to propose experiments employing inductive supervised and unsupervised learning, carry them out with selected open-source tools (c5, Weka, SVMlight, Cluto, R, Octave, Python, Perl), and evaluate, on the basis of specified criteria, how suitable these tools are for specific types of tasks.



Limitations of the topic

To sign up for the topic, a student must satisfy one of the following restrictions.

Restrictions by study
The table lists the study programmes in which a student must be enrolled in order to sign up for the topic.

Programme
C-SIA System Engineering and Informatics

Limit to courses
The table lists the courses a student must complete in order to register for the topic.

Department
Course title
No suitable data found.