All our Data Science projects include bite-sized activities to test your knowledge and practice in an environment with constant feedback.
All our activities include solutions with explanations on how they work and why we chose them.
Store the features in X
and the target y
.
Store the values in the variables in X_train
, X_test
,y_train
, y_test
, and random_state
.
Train a Random Forest Classifier using the training data, and store the model in rf
. You can specify the model parameters such as the maximum depth of the tree or the minimum number of samples required to split an internal node.
Calculate the accuracy
of both the training and testing sets and run the code in a Jupyter Notebook.
Store the results in the variables train_accuracy
and test_accuracy
.
The expected accuracy for a simple problem varies depending on the specifics of the problem and data. However, for a well-defined and simple problem with a large and diverse training dataset, a well-trained machine learning model could achieve an accuracy of over 80% in some cases.
Store the precision, recall, and f1-score of the positive class in the variables precision
,recall
and f_1_score
Which model presents the worst performance in the test dataset?