Use a vectorized operation to compute the mean age of all passengers in df.
Enter the exact value returned.
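As a rough sketch of this activity, assuming df is a pandas DataFrame with an Age column (the sample rows below are made up and will not give the dataset's real answer):

```python
import pandas as pd

# Hypothetical sample rows; the real df comes from the project's dataset.
df = pd.DataFrame({"Age": [22.0, 38.0, 26.0, 35.0]})

# Series.mean() is vectorized: no explicit Python loop over the rows.
mean_age_value = df["Age"].mean()
print(mean_age_value)
```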
Create a boolean series named is_survivor using the Survived column, indicating whether each passenger survived.
Your result should look something like this:
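For illustration only, here is a sketch on made-up rows. It assumes Survived is coded as 1 for survived and 0 otherwise, which the text does not state explicitly:

```python
import pandas as pd

# Hypothetical sample rows (assumed coding: 1 = survived, 0 = did not).
df = pd.DataFrame({"Survived": [0, 1, 1, 0]})

# The comparison is applied to the whole column at once and yields booleans.
is_survivor = df["Survived"] == 1
print(is_survivor.head())
```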

# Fill missing ages with the column mean; assign back instead of using inplace on a column selection
df['Age'] = df['Age'].fillna(df['Age'].mean())
Calculate the mean age from the Age column and store it in a variable named mean_age.
Create a series named age_difference that holds each passenger's age minus mean_age.
Your result would look something like this:
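A minimal sketch on made-up ages, assuming the difference is taken as Age minus mean_age:

```python
import pandas as pd

# Hypothetical sample rows, not the real dataset.
df = pd.DataFrame({"Age": [22.0, 38.0, 26.0, 35.0]})

mean_age = df["Age"].mean()

# Subtracting a scalar from a Series broadcasts over every row.
age_difference = df["Age"] - mean_age
print(age_difference.head())
```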

Calculate the minimum and maximum values from the Fare column and store them in variables named fare_min and fare_max.
Next, create a series named normalized_fare to store the normalized fare values, calculated using the formula:
(fare - fare_min) / (fare_max - fare_min).
Your result would look something like this:
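The formula is min-max scaling, which maps the smallest fare to 0 and the largest to 1. A sketch on made-up fares, assuming fare_max is greater than fare_min so the denominator is not zero:

```python
import pandas as pd

# Hypothetical sample rows, not the real dataset.
df = pd.DataFrame({"Fare": [10.0, 20.0, 30.0, 50.0]})

fare_min = df["Fare"].min()
fare_max = df["Fare"].max()

# Min-max scaling applied to the whole column in one expression.
normalized_fare = (df["Fare"] - fare_min) / (fare_max - fare_min)
print(normalized_fare.head())
```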

Create a series named family_size by calculating the total family size for each passenger.
This is done by summing the values from the Siblings/Spouses Aboard and Parents/Children Aboard columns, and adding 1 (to include the passenger themselves).
Your result would look something like this:
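Sketched on made-up rows, using the two column names given in the activity:

```python
import pandas as pd

# Hypothetical sample rows, not the real dataset.
df = pd.DataFrame({
    "Siblings/Spouses Aboard": [1, 1, 0, 1],
    "Parents/Children Aboard": [0, 0, 0, 2],
})

# Element-wise sum of the two columns, plus 1 for the passenger themselves.
family_size = df["Siblings/Spouses Aboard"] + df["Parents/Children Aboard"] + 1
print(family_size.head())
```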

Calculate the fare per family member by dividing the Fare column by the family_size.
Store the result in a series named fare_per_family_member.
Your result would look something like this:
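A possible sketch on made-up rows. Because family_size is always at least 1, the division cannot hit zero:

```python
import pandas as pd

# Hypothetical sample rows, not the real dataset.
df = pd.DataFrame({
    "Fare": [20.0, 30.0, 10.0, 40.0],
    "Siblings/Spouses Aboard": [1, 1, 0, 1],
    "Parents/Children Aboard": [0, 0, 0, 2],
})

# family_size is rebuilt here so the example runs on its own.
family_size = df["Siblings/Spouses Aboard"] + df["Parents/Children Aboard"] + 1

# Series / Series divides row by row, aligned on the index.
fare_per_family_member = df["Fare"] / family_size
print(fare_per_family_member.head())
```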

Create a series named fare_weight by dividing the Fare values by the maximum fare value.
Then, calculate the weighted age by multiplying the Age column by fare_weight and store the result in a series named weighted_age.
Your result would look something like this:
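On made-up rows, the two steps might be sketched as follows (the sample values are assumptions):

```python
import pandas as pd

# Hypothetical sample rows, not the real dataset.
df = pd.DataFrame({"Age": [20.0, 30.0, 40.0], "Fare": [10.0, 20.0, 40.0]})

# Each fare as a fraction of the largest fare (between 0 and 1 for non-negative fares).
fare_weight = df["Fare"] / df["Fare"].max()

# Row-by-row product of two Series.
weighted_age = df["Age"] * fare_weight
print(weighted_age.head())
```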

Sort the Fare column in ascending order and store it in a series named sorted_fares.
Then, calculate the cumulative fare percentage by taking the cumulative sum of the sorted_fares and dividing it by the total fare sum.
Multiply the result by 100 to express it as a percentage.
Store the final result in a series named cumulative_fare_percentage.
Your result would look something like this:
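A sketch on made-up fares. Note that sort_values keeps the original index labels, so the result is ordered by fare rather than by index:

```python
import pandas as pd

# Hypothetical sample rows, not the real dataset.
df = pd.DataFrame({"Fare": [16.0, 8.0, 32.0, 8.0]})

sorted_fares = df["Fare"].sort_values()  # ascending by default

# Running total divided by the grand total, expressed as a percentage.
cumulative_fare_percentage = sorted_fares.cumsum() / sorted_fares.sum() * 100
print(cumulative_fare_percentage)
```

The last value is always 100, since the running total ends at the full sum.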

Calculate the first (Q1) and third (Q3) quartiles of the Fare column.
Then, compute the interquartile range (IQR) as Q3 - Q1.
Identify any outliers where the fare is less than Q1 - 1.5 * IQR or greater than Q3 + 1.5 * IQR.
Store the result as a boolean series named is_fare_outlier.
Your result would look something like this:
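The rule above can be sketched on made-up fares. This assumes the quartiles come from Series.quantile with its default linear interpolation; the variable names q1, q3, and iqr are illustrative:

```python
import pandas as pd

# Hypothetical sample rows, not the real dataset.
df = pd.DataFrame({"Fare": [10.0, 12.0, 14.0, 16.0, 18.0, 200.0]})

q1 = df["Fare"].quantile(0.25)
q3 = df["Fare"].quantile(0.75)
iqr = q3 - q1

# Combine the two vectorized comparisons with | (element-wise OR).
is_fare_outlier = (df["Fare"] < q1 - 1.5 * iqr) | (df["Fare"] > q3 + 1.5 * iqr)
print(is_fare_outlier)
```

The parentheses around each comparison are required because | binds more tightly than < and > in Python.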

Sort the dataset by its index and then calculate the rolling average of the Fare column over a window of 10 rows.
Set the minimum number of periods required for a calculation to 1, so the first rows still produce a value.
Store the result in a series named rolling_average_fare.
Your result would look something like this:
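A sketch on three made-up rows with a shuffled index. With min_periods=1, early rows average over however many values are available instead of returning NaN:

```python
import pandas as pd

# Hypothetical sample rows with an out-of-order index, not the real dataset.
df = pd.DataFrame({"Fare": [10.0, 20.0, 30.0]}, index=[2, 0, 1])

df_sorted = df.sort_index()

# Window of 10 rows; partial windows are allowed from the first row on.
rolling_average_fare = df_sorted["Fare"].rolling(window=10, min_periods=1).mean()
print(rolling_average_fare)
```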
