Summary of topics offered - Department of Informatics (FBE)
|Type of work:||Diploma thesis|
Comparison of open-source data mining tools for textual data analysis
|State of topic:||approved (prof. Ing. Cyril Klimeš, CSc. - head of department)|
|doc. Ing. František Dařena, Ph.D.|
|Faculty:||Faculty of Business and Economics|
Department of Informatics - FBE
|Max. no. of students:||5|
|Summary:||A variety of open-source solutions can be used to mine knowledge from textual data. These solutions implement many commonly used machine learning algorithms. They differ in how they support transforming raw data into a suitable format, in their technical capabilities (memory management, speed), in the variety of outputs they provide, in how simple steps can be combined into more complex tasks, etc. The aim of the thesis is to propose experiments employing inductive supervised and unsupervised learning, carry them out with selected open-source tools (c5, Weka, SVMlight, Cluto, R, Octave, Python, Perl), and evaluate how suitable these tools are for specific types of tasks on the basis of specified criteria.|
Limitations of the topic
To sign up for the topic, a student must satisfy one of the following restrictions.
Restrictions by study
The table shows the study programmes in which a student has to be enrolled in order to sign up for the topic.
C-SIA System Engineering and Informatics
The table shows the courses a student has to complete in order to register for the topic.
No suitable data found.