Mastering DataFrame Mutations with Wine Quality data

input

What is the maximum amount of citric acid in the wine dataset?

Enter the answer to 1 decimal place.
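
One way to approach this, sketched on a tiny made-up frame rather than the real wine data. The column name `citric acid` (with a space) is an assumption about how the column is named before any renaming.

```python
import pandas as pd

# Toy stand-in for the wine data; the values are invented for illustration.
df = pd.DataFrame({"citric acid": [0.0, 0.56, 1.0, 0.31]})

# Series.max() returns the largest value; round() keeps one decimal place.
max_citric = round(float(df["citric acid"].max()), 1)
print(max_citric)
```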

multiplechoice

How many missing values are in the dataset?

Inspect the dataset and your initial analysis to check for missing values.
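
A minimal sketch of counting missing values, using a toy frame with one deliberately missing entry. The column names and values are illustrative only.

```python
import numpy as np
import pandas as pd

# Toy frame with a single NaN; not the real wine data.
df = pd.DataFrame({"pH": [3.5, np.nan, 3.2], "quality": [5, 6, 5]})

# isna() flags missing cells; sum() counts them per column.
missing_per_column = df.isna().sum()

# Summing the per-column counts gives the total for the whole frame.
total_missing = int(missing_per_column.sum())
print(missing_per_column)
print(total_missing)
```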

input

What is the median wine quality?

Enter the answer to 1 decimal place.
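
The median can be read straight off the column. The sketch below uses invented quality scores, so the printed value is not the answer for the real dataset.

```python
import pandas as pd

# Invented quality scores for illustration.
df = pd.DataFrame({"quality": [5, 6, 5, 7, 6]})

# Series.median() returns the middle value of the sorted scores.
median_quality = round(float(df["quality"].median()), 1)
print(median_quality)
```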

codevalidated

Rename DataFrame columns to an appropriate format

Rename the columns to use underscores instead of spaces. For example, rename fixed acidity to fixed_acidity. Skip single-word columns. Set inplace=True.
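
A possible sketch of the renaming step on a toy frame. Building the mapping with a dictionary comprehension is one option among several; the three columns shown are just a sample.

```python
import pandas as pd

# Toy frame with two multi-word columns and one single-word column.
df = pd.DataFrame({"fixed acidity": [7.4], "citric acid": [0.0], "pH": [3.51]})

# Map only the names that contain a space, so single-word columns are skipped.
new_names = {col: col.replace(" ", "_") for col in df.columns if " " in col}

# inplace=True modifies df directly instead of returning a new DataFrame.
df.rename(columns=new_names, inplace=True)
print(list(df.columns))
```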

codevalidated

Drop the first and last rows

Perform the modification and store in a new variable: df_first_last.
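
One way to do this is to look up the first and last index labels and pass them to `drop`. The sketch assumes the index labels are unique, as with a default integer index.

```python
import pandas as pd

# Toy frame with a default integer index.
df = pd.DataFrame({"quality": [5, 6, 7, 5]})

# df.index[0] and df.index[-1] are the labels of the first and last rows.
df_first_last = df.drop(index=[df.index[0], df.index[-1]])
print(df_first_last)
```

Because `drop` returns a new DataFrame by default, `df` itself is left unchanged. Positional slicing with `df.iloc[1:-1]` would select the same rows.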

codevalidated

Remove the row with the maximum total sulfur dioxide from the dataset

Locate and remove the row with the maximum value for total_sulfur_dioxide and store the result in a new variable: df_drop.
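
A minimal sketch using `idxmax` to find the row label and `drop` to remove it. The values are invented; note that `idxmax` returns only the first label if the maximum occurs more than once.

```python
import pandas as pd

# Toy values; the middle row holds the maximum.
df = pd.DataFrame({"total_sulfur_dioxide": [34.0, 289.0, 60.0]})

# idxmax() returns the index label of the (first) maximum value.
max_label = df["total_sulfur_dioxide"].idxmax()

# drop() returns a new DataFrame without that row; df is unchanged.
df_drop = df.drop(index=max_label)
print(df_drop)
```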

codevalidated

Convert the quality column to float

All the data types are float except for the quality column. Create a new column in the df DataFrame named quality_float which contains the values of quality, but with a float type.
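
The conversion can be sketched with `astype`, shown here on a toy integer column.

```python
import pandas as pd

# Toy integer quality scores.
df = pd.DataFrame({"quality": [5, 6, 7]})

# astype(float) returns a float64 copy; the original column keeps its integer dtype.
df["quality_float"] = df["quality"].astype(float)
print(df.dtypes)
```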

codevalidated

Remove the density, residual sugar, and chlorides columns from the dataset

Modify the DataFrame by dropping the three columns density, residual_sugar, and chlorides, and store your result as df_drop_three.
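
A short sketch of dropping several columns at once, assuming the columns have already been renamed with underscores. The single row of values is illustrative.

```python
import pandas as pd

# Toy frame containing the three columns to drop plus one to keep.
df = pd.DataFrame({
    "density": [0.9978],
    "residual_sugar": [1.9],
    "chlorides": [0.076],
    "pH": [3.51],
})

# drop(columns=...) returns a new DataFrame without the listed columns.
df_drop_three = df.drop(columns=["density", "residual_sugar", "chlorides"])
print(list(df_drop_three.columns))
```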

codevalidated

Create a new column that calculates the alcohol content in terms of percentage (%)

Get the percentage of alcohol content for each datapoint and store your result in a new column alcohol_perc.

codevalidated

Evaluate the amount of sulphates and citric acid in the red wine

Create a new column in the DataFrame that contains the sum of sulphates and citric_acid. Store your result in a new column: sulphate_citric_acid.
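
Adding two columns is an element-wise operation in pandas. A minimal sketch with invented values:

```python
import pandas as pd

# Invented values chosen so the sums are easy to check by hand.
df = pd.DataFrame({"sulphates": [0.5, 0.75], "citric_acid": [0.0, 0.25]})

# The + operator adds the two Series row by row.
df["sulphate_citric_acid"] = df["sulphates"] + df["citric_acid"]
print(df)
```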

codevalidated

Create a new column that identifies if the alcohol content is below the mean of the alcohol content in the dataset.

Modify the dataset accordingly and store your result in a new column deviation_alcohol.
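
A possible sketch, assuming the new column should hold booleans (True when the row's alcohol value is below the mean). The task text does not state the expected representation, so treat that choice as an assumption.

```python
import pandas as pd

# Invented alcohol values; their mean is 11.0.
df = pd.DataFrame({"alcohol": [9.0, 10.0, 11.0, 14.0]})

mean_alcohol = df["alcohol"].mean()

# Comparing a Series with a scalar yields a boolean Series.
# Assumption: True marks rows strictly below the mean.
df["deviation_alcohol"] = df["alcohol"] < mean_alcohol
print(df)
```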

codevalidated

Convert the wine quality scores into categorical labels: `low`, `medium`, `high`

Convert the wine quality scores into categorical labels. Classify a score as low if it is 5 or below, medium if it is greater than 5 and up to 7, and high if it is greater than 7. Store your result in a new column quality_label.
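
One way to express these thresholds is `pd.cut` with right-closed bins, sketched below on invented scores. The reading that a score of exactly 7 counts as medium follows from "high if greater than 7"; using `pd.cut` rather than another approach is a choice made for this sketch.

```python
import numpy as np
import pandas as pd

# Invented scores covering each category and both boundaries.
df = pd.DataFrame({"quality": [3, 5, 6, 7, 8]})

# Right-closed bins: (-inf, 5] -> low, (5, 7] -> medium, (7, inf) -> high.
bins = [-np.inf, 5, 7, np.inf]
labels = ["low", "medium", "high"]

df["quality_label"] = pd.cut(df["quality"], bins=bins, labels=labels)
print(df)
```

The resulting column has a categorical dtype, which keeps the low, medium, high ordering.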

codevalidated

Create a new column that calculates the ratio of free sulfur dioxide to total sulfur dioxide.

Modify the DataFrame to obtain the ratio and store your result in a new column free_total_ratio.
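
The ratio is an element-wise division of the two columns. A minimal sketch with invented values, assuming the underscore column names from the earlier renaming step:

```python
import pandas as pd

# Invented values chosen so each ratio is exactly 0.25.
df = pd.DataFrame({
    "free_sulfur_dioxide": [10.0, 15.0],
    "total_sulfur_dioxide": [40.0, 60.0],
})

# The / operator divides the two Series row by row.
df["free_total_ratio"] = df["free_sulfur_dioxide"] / df["total_sulfur_dioxide"]
print(df)
```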

Bolaji Bamiro

