Results of the study

Preliminary Phase Results.

The first two research questions align with the first objective: identifying the two best-performing classifiers, which were then used as base models.

o Which classifiers are the top performers in predicting water quality using time series data?

o How do these top-performing classifiers compare in terms of accuracy and efficiency?

To answer these questions, we present the preliminary findings in three ways. First, we rank the five models against one another: the ranking shows how many times each classifier won or lost against the other four, and WEKA ranks the classifiers according to wins. Second, we present the result for each of the five classifiers using the percent correct statistic, which measures the accuracy of a machine learning model. Lastly, we check whether there is a statistically significant difference between ZeroR, the baseline classifier, and each of the other four classifiers.

Ranking results for the five classifiers
1. SMO – Won 4, Lost 0
2. NaiveBayes – Won 3, Lost 1
3. DecisionStump – Won 1, Lost 2
4. LWL – Won 1, Lost 2
5. ZeroR – Won 0, Lost 4
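
The ordering above can be reproduced from the win and loss counts alone. The sketch below is illustrative only: it assumes the classifiers are ordered by net wins (wins minus losses), which for these counts gives the same order as sorting by wins. The function name `rank_by_net_wins` is hypothetical and is not part of WEKA.

```python
# Win/loss counts copied from the ranking list above: name -> (won, lost).
results = {
    "SMO": (4, 0),
    "NaiveBayes": (3, 1),
    "DecisionStump": (1, 2),
    "LWL": (1, 2),
    "ZeroR": (0, 4),
}


def rank_by_net_wins(counts):
    """Return classifier names ordered by wins minus losses, best first.

    Python's sort is stable, so classifiers with equal net wins
    (DecisionStump and LWL here) keep their input order.
    """
    return sorted(
        counts,
        key=lambda name: counts[name][0] - counts[name][1],
        reverse=True,
    )


ranking = rank_by_net_wins(results)
for position, name in enumerate(ranking, start=1):
    won, lost = results[name]
    print(f"{position}. {name} (won {won}, lost {lost}, net {won - lost})")
```

DecisionStump and LWL have identical counts, so their relative order in the ranking carries no information.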

Percent Correct Statistic
SMO – 92.89%
NaiveBayes – 89.74%
DecisionStump – 84.29%
LWL – 84.99%
ZeroR – 80.40%
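
To make the statistic concrete, percent correct is the share of instances whose predicted class matches the actual class. ZeroR always predicts the most frequent class, so its score equals the proportion of the majority class in the evaluated data. The sketch below is a minimal illustration under stated assumptions: the labels `safe` and `unsafe` are made-up placeholders, not values from the study's data set, and the function names are hypothetical.

```python
from collections import Counter


def percent_correct(actual, predicted):
    """Percentage of instances whose predicted label equals the actual label."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("actual and predicted must be non-empty and equal in length")
    hits = sum(a == p for a, p in zip(actual, predicted))
    return 100.0 * hits / len(actual)


def zero_r_predictions(train_labels, n_instances):
    """Mimic a ZeroR-style baseline: always predict the majority training class."""
    majority_label = Counter(train_labels).most_common(1)[0][0]
    return [majority_label] * n_instances


# Placeholder labels (not study data): four of five instances share one class.
toy_labels = ["safe", "safe", "safe", "safe", "unsafe"]
baseline = percent_correct(toy_labels, zero_r_predictions(toy_labels, len(toy_labels)))
print(f"Baseline percent correct on the toy labels: {baseline:.2f}%")
```

Read this way, a ZeroR score of 80.40% indicates that roughly four in five instances belong to the majority class, so a classifier has to exceed that figure to add any value.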

Statistically Significant results against ZeroR

NaiveBayes achieved a classification accuracy of 89.74% (+/- 1.41%), which is statistically significantly better than ZeroR at 80.40% (+/- 0.04%).
SMO achieved a classification accuracy of 92.89% (+/- 0.80%), which is statistically significantly better than ZeroR at 80.40% (+/- 0.04%).
LWL achieved a classification accuracy of 84.99% (+/- 2.89%), which is statistically significantly better than ZeroR at 80.40% (+/- 0.04%).
DecisionStump achieved a classification accuracy of 84.89% (+/- 1.49%), which is statistically significantly better than ZeroR at 80.40% (+/- 0.04%).
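
For readers who want to see how such a comparison can be computed, the sketch below shows a corrected resampled paired t-test, the kind of test the WEKA Experimenter offers as its corrected paired t-tester. It is a sketch under stated assumptions, not the study's actual procedure: it assumes 100 paired runs from 10 repetitions of 10-fold cross-validation (so the test/train ratio is 1/9), and the per-run differences are made-up placeholders, not results from this experiment. The critical value 1.984 is the two-sided 0.05 threshold for 99 degrees of freedom.

```python
import math
import statistics


def corrected_resampled_t(diffs, test_train_ratio):
    """Corrected resampled t statistic for paired per-run accuracy differences.

    diffs            -- per-run differences (classifier minus baseline)
    test_train_ratio -- test set size divided by training set size per run
    The usual 1/k variance factor is inflated by the test/train ratio to
    account for the overlap between training sets across runs.
    """
    k = len(diffs)
    if k < 2:
        raise ValueError("need at least two paired runs")
    mean_diff = statistics.fmean(diffs)
    var_diff = statistics.variance(diffs)  # sample variance, k - 1 denominator
    if var_diff == 0:
        raise ValueError("differences have zero variance; t is undefined")
    return mean_diff / math.sqrt((1.0 / k + test_train_ratio) * var_diff)


# Two-sided critical value at the 0.05 level for k - 1 = 99 degrees of freedom.
T_CRITICAL_DF99 = 1.984

# Placeholder differences in percentage points (NOT study data): 100 runs
# centred on 12.0 with a small repeating spread.
placeholder_diffs = [12.0 + 0.5 * ((i % 5) - 2) for i in range(100)]

t_stat = corrected_resampled_t(placeholder_diffs, test_train_ratio=1.0 / 9.0)
is_significant = abs(t_stat) > T_CRITICAL_DF99
print(f"t = {t_stat:.2f}, significant at 0.05: {is_significant}")
```

The correction makes the test more conservative than an ordinary paired t-test, which matters here because runs that share most of their training data are not independent.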

Summary and analysis.
This section should focus on showing that SMO and NaiveBayes are the two best-performing classifiers for predicting water quality from time series data. Second, it should compare their efficiency against each other and against the other three classifiers. The discussion should support the reliability of the results by noting that 500 sets of results were loaded for the experiment, corresponding to 100 experimental runs for each of the five classifiers, and that a 0.05 significance level was used.
When discussing statistical significance, also report the (+/- %) value, which is the deviation.