Random Forest is a supervised machine learning algorithm that is widely used for classification and regression problems.
For this problem, we used it as a classifier.
The goal is to predict the quality of our wines. Each wine, red or white, is classified as “Fair” or “Very Good”.
Our wines have these features: alcohol, density, pH, residual sugar, free sulfur dioxide, chlorides, volatile acidity, total sulfur dioxide, citric acid, and fixed acidity.
We predicted the quality of our wines in 4 scenarios:
We tried different combinations of hyperparameters using scikit-learn's GridSearchCV. We chose the combination with the best accuracy that neither overfit nor underfit the model.
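The hyperparameter search described above can be sketched as follows. This is a minimal illustration, not the configuration actually used: the data is synthetic (generated with `make_classification` as a stand-in for the wine table), the grid values in `param_grid` are hypothetical, and the 10% hold-out only mirrors the "10% of data" mentioned in the results. Comparing the mean training score with the cross-validated score is one common way to spot overfitting.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the wine data: 10 numeric features, binary label.
X, y = make_classification(
    n_samples=300, n_features=10, n_informative=5, random_state=0
)

# Hold out 10% of the rows for the final predictions.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.10, stratify=y, random_state=0
)

# Hypothetical grid; the values actually searched are not given in the text.
param_grid = {
    "n_estimators": [25, 50],
    "max_depth": [3, 6, None],
    "min_samples_leaf": [1, 5],
}

search = GridSearchCV(
    RandomForestClassifier(random_state=0),
    param_grid,
    scoring="accuracy",
    cv=5,
    return_train_score=True,  # needed to compare train vs. validation accuracy
)
search.fit(X_train, y_train)

best = search.best_index_
train_acc = search.cv_results_["mean_train_score"][best]
cv_acc = search.cv_results_["mean_test_score"][best]
test_acc = search.score(X_test, y_test)  # accuracy of the refit best model

print("best parameters:", search.best_params_)
print(f"train={train_acc:.2f}  cv={cv_acc:.2f}  test={test_acc:.2f}")
```

A training accuracy far above the cross-validated accuracy suggests overfitting. Low values for both suggest underfitting.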
The final accuracy was 93%.
The classifier made 135 predictions (10% of the data) and predicted a total of 125 true positives and 10 false positives.
The final accuracy was 94%, slightly higher than with all features.
After the first prediction, the Random Forest ranks the features by importance.
We used the rfpimp package for this ranking.
The classifier made 135 predictions (10% of the data) and predicted a total of 127 true positives and 8 false positives.
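The feature-ranking step can be sketched as below. Note the substitution: instead of the package named in the text, this sketch uses scikit-learn's `permutation_importance`, which measures the drop in held-out accuracy when a column is shuffled. The data is synthetic, so the feature names are only labels taken from the list above, and the cut-off `top_k = 5` is a hypothetical choice, not the one used for the reported results.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

feature_names = [
    "alcohol", "density", "pH", "residual sugar", "free sulfur dioxide",
    "chlorides", "volatile acidity", "total sulfur dioxide", "citric acid",
    "fixed acidity",
]

# Synthetic stand-in data; only the column names come from the text.
X, y = make_classification(
    n_samples=400, n_features=10, n_informative=4, n_redundant=0, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.10, stratify=y, random_state=0
)

# First fit: all features.
rf_all = RandomForestClassifier(n_estimators=100, random_state=0)
rf_all.fit(X_train, y_train)

# Rank features by the mean drop in accuracy when each column is shuffled.
result = permutation_importance(
    rf_all, X_test, y_test, n_repeats=10, random_state=0
)
ranking = np.argsort(result.importances_mean)[::-1]  # most important first
for i in ranking:
    print(f"{feature_names[i]:<22}{result.importances_mean[i]: .3f}")

# Second fit: keep only the top-ranked features (top_k is hypothetical).
top_k = 5
keep = ranking[:top_k]
rf_top = RandomForestClassifier(n_estimators=100, random_state=0)
rf_top.fit(X_train[:, keep], y_train)

acc_all = rf_all.score(X_test, y_test)
acc_top = rf_top.score(X_test[:, keep], y_test)
print(f"all features: {acc_all:.2f}  top {top_k} features: {acc_top:.2f}")
```

Ranking the features on the same held-out rows that are later used for the final score can make the second score optimistic. A separate validation split for the ranking avoids that.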
The final accuracy was 85%.
The classifier made 396 predictions (10% of the data) and predicted a total of 338 true positives and 58 false positives.
The final accuracy was 85%, the same as with all features.
After the first prediction, the Random Forest ranks the features by importance.
We used the rfpimp package for this ranking.
The classifier made 396 predictions (10% of the data) and predicted a total of 338 true positives and 58 false positives.
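As a quick cross-check, the reported percentages can be recomputed from the prediction counts above. The sketch assumes that the "true positives" and "false positives" counts denote correct and incorrect predictions respectively, so that their ratio to the total is the accuracy. The helper `accuracy_percent` is introduced here for illustration only.

```python
# (correct, incorrect, stated %) in the order the four scenarios are reported.
reported = [
    (125, 10, 93),
    (127, 8, 94),
    (338, 58, 85),
    (338, 58, 85),
]


def accuracy_percent(correct: int, incorrect: int) -> float:
    """Share of correct predictions among all predictions, in percent."""
    return 100.0 * correct / (correct + incorrect)


for correct, incorrect, stated in reported:
    total = correct + incorrect
    pct = accuracy_percent(correct, incorrect)
    print(f"{correct}/{total} = {pct:.1f}% (stated: {stated}%)")

# Output:
# 125/135 = 92.6% (stated: 93%)
# 127/135 = 94.1% (stated: 94%)
# 338/396 = 85.4% (stated: 85%)
# 338/396 = 85.4% (stated: 85%)
```

Each computed value rounds to the stated percentage.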