Data mining is the field concerned with analyzing large data sets to discover new patterns, drawing on methods from database management, data processing and statistical inference.
Weka is a package that offers users a collection of learning schemes and tools for data mining. The algorithms that Weka provides can be applied directly to a dataset or called from your own Java code.
When you launch the program, it presents four applications to choose from: 'Explorer', 'Experimenter', 'KnowledgeFlow' and 'Simple CLI'.
The first of these, Explorer, allows you to open a dataset or a database and edit it as you wish. You can filter the data, change the attributes and view the result as a bar chart. You can also classify the data with a learning scheme of your choice and perform a cost/benefit analysis that displays the cost matrix and the threshold curve.
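To make the idea of a cost matrix and a threshold more concrete, here is a minimal sketch in plain Java. It does not use Weka's own classes; the names `CostSketch`, `totalCost` and `confusionAt`, as well as the scores and costs, are made up for illustration. It assumes a two-class problem in which each instance has a predicted probability of being positive, and it shows how the total cost changes as the decision threshold moves.

```java
class CostSketch {
    // Total cost = sum over all cells of count[actual][predicted] * cost[actual][predicted].
    static double totalCost(int[][] confusion, double[][] cost) {
        double total = 0.0;
        for (int actual = 0; actual < confusion.length; actual++) {
            for (int predicted = 0; predicted < confusion[actual].length; predicted++) {
                total += confusion[actual][predicted] * cost[actual][predicted];
            }
        }
        return total;
    }

    // Builds a 2x2 confusion matrix (index 0 = negative, 1 = positive).
    // An instance is predicted positive when its score reaches the threshold.
    static int[][] confusionAt(double[] scores, boolean[] isPositive, double threshold) {
        int[][] confusion = new int[2][2];
        for (int i = 0; i < scores.length; i++) {
            int actual = isPositive[i] ? 1 : 0;
            int predicted = scores[i] >= threshold ? 1 : 0;
            confusion[actual][predicted]++;
        }
        return confusion;
    }

    public static void main(String[] args) {
        // Hypothetical scores and labels, for illustration only.
        double[] scores = {0.9, 0.8, 0.6, 0.4, 0.2};
        boolean[] isPositive = {true, true, false, true, false};
        // Assumed costs: a false positive costs 1, a false negative costs 5.
        double[][] cost = {{0.0, 1.0}, {5.0, 0.0}};
        for (double t : new double[] {0.7, 0.5, 0.3, 0.1}) {
            double c = totalCost(confusionAt(scores, isPositive, t), cost);
            System.out.println("threshold " + t + " -> total cost " + c);
        }
    }
}
```

Sweeping the threshold like this and recording the cost at each step is, roughly, what a threshold curve summarizes; the program does this for you and plots the result.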
In addition to this, the program includes tools for data clustering, association rule mining and attribute evaluation. You can also use it for data plotting, as it lets you view and analyze scatter plots for each possible attribute combination.
The program is also suitable for developing new machine learning schemes. In the Experimenter, you configure an experiment by choosing its type (classification or regression), selecting the desired dataset and algorithm, and then running it. The results can be saved in ARFF or CSV format, or written to a database through JDBC.
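For readers who have not seen an ARFF file, the sketch below builds a very small one as a string. It is plain Java and does not call Weka; the class `ArffSketch`, the method `toArff`, and the relation and attribute names are hypothetical. It only handles numeric attributes, which is a simplification: real ARFF files can also declare nominal, string and date attributes.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

class ArffSketch {
    // Writes a header (@relation, one @attribute line per column) followed by
    // a @data section with one comma-separated row per instance.
    static String toArff(String relation, List<String> numericAttributes, List<double[]> rows) {
        StringBuilder sb = new StringBuilder();
        sb.append("@relation ").append(relation).append("\n\n");
        for (String name : numericAttributes) {
            sb.append("@attribute ").append(name).append(" numeric\n");
        }
        sb.append("\n@data\n");
        for (double[] row : rows) {
            for (int i = 0; i < row.length; i++) {
                if (i > 0) {
                    sb.append(',');
                }
                sb.append(row[i]);
            }
            sb.append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // Hypothetical per-fold results, for illustration only.
        List<double[]> rows = new ArrayList<>();
        rows.add(new double[] {1, 0.91});
        rows.add(new double[] {2, 0.88});
        System.out.print(toArff("results", Arrays.asList("fold", "accuracy"), rows));
    }
}
```

The same rows written without the header lines would essentially be a CSV file, which is why the two formats are often offered side by side.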
You can also analyze and test a results file. The program lets you choose the significance level and the comparison field, as well as the sorting criteria and the test base.
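As a rough illustration of what a significance test on such results involves, the sketch below computes a plain paired t-statistic for two sets of scores measured on the same runs. This is the textbook formula only; the program's own tester may apply corrections that are not reproduced here. The class `PairedTTestSketch`, the method `tStatistic` and the accuracy values are hypothetical.

```java
class PairedTTestSketch {
    // Paired t-statistic: mean of the per-run differences divided by the
    // standard error of that mean (sample variance with n - 1 in the denominator).
    // If all differences are identical the variance is zero and the result
    // is infinite or NaN.
    static double tStatistic(double[] a, double[] b) {
        if (a.length != b.length || a.length < 2) {
            throw new IllegalArgumentException("need two equally sized samples with at least 2 values");
        }
        int n = a.length;
        double mean = 0.0;
        for (int i = 0; i < n; i++) {
            mean += (a[i] - b[i]) / n;
        }
        double sumSquares = 0.0;
        for (int i = 0; i < n; i++) {
            double deviation = (a[i] - b[i]) - mean;
            sumSquares += deviation * deviation;
        }
        double variance = sumSquares / (n - 1);
        return mean / Math.sqrt(variance / n);
    }

    public static void main(String[] args) {
        // Hypothetical accuracies of two algorithms on the same four runs.
        double[] schemeA = {0.90, 0.92, 0.88, 0.91};
        double[] schemeB = {0.85, 0.90, 0.86, 0.87};
        System.out.println("t = " + tStatistic(schemeA, schemeB));
    }
}
```

The resulting value is compared against the critical value of the t-distribution with n - 1 degrees of freedom at the chosen significance level; the comparison field decides which measure (accuracy in this sketch) is being compared.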
Weka is an easy-to-use application, yet it is designed for those who are already familiar with data mining procedures and database analysis. Using this software, you can view and analyze ARFF data files, as well as perform data clustering and regression.