All our Data Science projects include bite-sized activities to test your knowledge and practice in an environment with constant feedback.
All our activities include solutions with explanations on how they work and why we chose them.
To calculate the revenue, create a new column revenue by summing the values of the budget and gross columns of the df dataframe.
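A minimal sketch of this step, assuming pandas and a small made-up sample in place of the project's real df (the rows and values are hypothetical):

```python
import pandas as pd

# Hypothetical sample data; the real df comes from the project's dataset.
df = pd.DataFrame({
    "name": ["Movie A", "Movie B"],
    "budget": [50_000_000, 150_000_000],
    "gross": [120_000_000, 100_000_000],
})

# Following the activity's definition: revenue = budget + gross.
df["revenue"] = df["budget"] + df["gross"]
```

Adding two columns with + works element-wise, so each row gets its own sum.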
To determine the percentage profit, create a new column percentage_profit by dividing the gross value by the revenue and multiplying the result by 100. This gives the percentage of profit made by each movie in the dataframe.
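One possible way to write this, assuming pandas and a hypothetical sample where revenue has already been computed as budget plus gross:

```python
import pandas as pd

# Hypothetical sample data with a precomputed revenue column.
df = pd.DataFrame({
    "budget": [50_000_000, 25_000_000],
    "gross": [50_000_000, 75_000_000],
})
df["revenue"] = df["budget"] + df["gross"]

# percentage_profit = gross / revenue * 100, computed row by row.
df["percentage_profit"] = df["gross"] / df["revenue"] * 100
```

With these sample values, the two rows come out at 50 and 75 percent.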
Create a new column high_budget_movie with the value True if the movie's budget is greater than 100 million and False otherwise.
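A short sketch of this comparison, assuming pandas and that budget is stored in whole currency units (so 100 million is 100_000_000); the sample rows are hypothetical:

```python
import pandas as pd

# Hypothetical sample data; budget is assumed to be in whole currency units.
df = pd.DataFrame({"budget": [50_000_000, 150_000_000, 100_000_000]})

# A comparison on a column returns a boolean Series, one value per row.
df["high_budget_movie"] = df["budget"] > 100_000_000
```

Because the comparison is strictly greater than, a budget of exactly 100 million is marked False.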
Create a new column successful_movie with the value True if the movie's profit is greater than 0 and False otherwise. Here, profit refers to the revenue of the movie.
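A sketch under the activity's own definition, where profit means the revenue column; pandas and the sample values are assumptions made for illustration:

```python
import pandas as pd

# Hypothetical sample data; revenue is assumed to exist already.
df = pd.DataFrame({"revenue": [170_000_000, 0, 250_000_000]})

# The activity treats revenue as the profit, so compare revenue against 0.
df["successful_movie"] = df["revenue"] > 0
```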
Create a new column is_critically_acclaimed which is True if the value of the score column is greater than 8 and False otherwise.
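A possible implementation, assuming pandas and a hypothetical sample of scores:

```python
import pandas as pd

# Hypothetical sample scores.
df = pd.DataFrame({"score": [8.5, 8.0, 6.1]})

# True only for scores strictly above 8.
df["is_critically_acclaimed"] = df["score"] > 8
```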
Create a new column is_new_release which is True if the value of the year column is greater than 2020 and False otherwise.
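The same pattern applies here; this sketch assumes pandas and that year is stored as an integer (the sample years are hypothetical):

```python
import pandas as pd

# Hypothetical sample years, assumed to be integers.
df = pd.DataFrame({"year": [2019, 2020, 2021]})

# Movies released in 2021 or later count as new releases.
df["is_new_release"] = df["year"] > 2020
```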
Create a new column is_long_movie which is True if the value of the runtime column is greater than 150 minutes and False otherwise.
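A sketch of this check, assuming pandas and that runtime is recorded in minutes (the sample runtimes are hypothetical):

```python
import pandas as pd

# Hypothetical sample runtimes, assumed to be in minutes.
df = pd.DataFrame({"runtime": [95, 150, 181]})

# A runtime of exactly 150 minutes does not count as long.
df["is_long_movie"] = df["runtime"] > 150
```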
Drop all the rows where the successful_movie column value is False. Use the inplace parameter to make the changes permanent.
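One way this could look, assuming pandas and a hypothetical sample: select the index labels of the unsuccessful rows, then pass them to drop with inplace=True.

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "name": ["Movie A", "Movie B", "Movie C"],
    "successful_movie": [True, False, True],
})

# Index labels of the rows where successful_movie is False.
rows_to_drop = df[~df["successful_movie"]].index

# inplace=True modifies df itself instead of returning a new dataframe.
df.drop(rows_to_drop, inplace=True)
```

The helper name rows_to_drop is only for readability; the index expression can also be passed to drop directly.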
Drop all the rows where the value of budget is greater than 100 million and store the new dataframe in the variable high_budget_df. Don't drop from the original dataframe.
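A sketch of a non-destructive drop, assuming pandas, budgets in whole currency units, and hypothetical sample rows. Without inplace=True, drop returns a new dataframe and leaves the original untouched.

```python
import pandas as pd

# Hypothetical sample data; budget is assumed to be in whole currency units.
df = pd.DataFrame({
    "name": ["Movie A", "Movie B", "Movie C"],
    "budget": [50_000_000, 150_000_000, 100_000_000],
})

# drop returns a new dataframe by default, so df keeps all of its rows.
high_budget_df = df.drop(df[df["budget"] > 100_000_000].index)
```

Following the activity's wording literally, high_budget_df ends up holding the rows whose budget is at most 100 million.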
To remove the budget column from the movie dataframe, use the drop method and specify the column name budget. Be sure to specify the axis to indicate that it's a column and not a row. Additionally, set the inplace parameter to True to make the change permanent.
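A minimal sketch, assuming pandas and a hypothetical sample standing in for the movie dataframe:

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "name": ["Movie A", "Movie B"],
    "budget": [50_000_000, 150_000_000],
    "gross": [120_000_000, 100_000_000],
})

# axis=1 targets columns (axis=0 would target rows); inplace=True edits df directly.
df.drop("budget", axis=1, inplace=True)
```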
To remove the director and writer columns from the movie dataframe, use the drop method and pass in the column names director and writer. Specify the axis to indicate that they are columns and not rows. Set the inplace parameter to False to create a new dataframe named new_df without modifying the original dataframe.
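A sketch of dropping several columns into a new dataframe, assuming pandas and hypothetical sample rows:

```python
import pandas as pd

# Hypothetical sample data.
df = pd.DataFrame({
    "name": ["Movie A", "Movie B"],
    "director": ["Director X", "Director Y"],
    "writer": ["Writer X", "Writer Y"],
})

# Pass a list to drop several columns at once; inplace=False returns a copy.
new_df = df.drop(["director", "writer"], axis=1, inplace=False)
```

Since inplace=False is the default, leaving it out gives the same result; it is spelled out here to match the activity.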