All our Data Science projects include bite-sized activities to test your knowledge and practice in an environment with constant feedback.
All our activities include solutions with explanations on how they work and why we chose them.
To calculate the revenue, a new column revenue can be generated by summing the values of the budget and gross columns of the df dataframe.
To determine the percentage profit, a new column percentage_profit can be created by dividing the gross value by the revenue and multiplying the result by 100. This gives the percentage of profit made by each movie in the dataframe.
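The two calculations above can be sketched as follows. This is a minimal example that assumes df has numeric budget and gross columns; the sample values are made up for illustration and are not from the project's dataset.

```python
import pandas as pd

# Hypothetical sample data standing in for the project's df dataframe.
df = pd.DataFrame({
    "budget": [100, 50],
    "gross": [300, 25],
})

# Revenue as defined in this activity: budget plus gross.
df["revenue"] = df["budget"] + df["gross"]

# Percentage profit as defined in this activity: gross / revenue * 100.
df["percentage_profit"] = df["gross"] / df["revenue"] * 100
```

Both operations are vectorized, so they apply to every row at once without an explicit loop.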
Create a new column high_budget_movie with the value True if the movie's budget is greater than 100 million and False otherwise.
Create a new column successful_movie with the value True if the movie's profit is greater than 0 and False otherwise. Here, profit refers to the movie's revenue column.
Create a new column is_critically_acclaimed which is True if the value of score column is greater than 8 and False otherwise.
Create a new column is_new_release which is True if the value of year column is greater than 2020 and False otherwise.
Create a new column is_long_movie which is True if the value of runtime column is greater than 150 minutes and False otherwise.
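One possible way to build the five boolean columns described above is a plain comparison on each source column, which returns a True/False Series. The sketch below assumes the columns budget, gross, score, year, and runtime exist and that budget is stored in plain dollars; the sample rows are hypothetical.

```python
import pandas as pd

# Hypothetical sample data; column names follow the activity descriptions.
df = pd.DataFrame({
    "budget": [150_000_000, 20_000_000],
    "gross": [400_000_000, 5_000_000],
    "score": [8.4, 6.1],
    "year": [2021, 2015],
    "runtime": [160, 95],
})
df["revenue"] = df["budget"] + df["gross"]

# Each comparison yields a boolean Series that becomes the new column.
df["high_budget_movie"] = df["budget"] > 100_000_000
df["successful_movie"] = df["revenue"] > 0  # profit is taken to mean revenue here
df["is_critically_acclaimed"] = df["score"] > 8
df["is_new_release"] = df["year"] > 2020
df["is_long_movie"] = df["runtime"] > 150
```

Note that all comparisons are strict, so a movie with a score of exactly 8 or a runtime of exactly 150 minutes gets False.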
Drop all the rows where the successful_movie column value is False. Use the inplace parameter to make the changes permanent.
Drop all the rows where the value of budget is greater than 100 million and store the new dataframe in the variable high_budget_df. Don't drop from the original dataframe.
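The two row-dropping activities above differ only in whether the original dataframe is modified. A minimal sketch, assuming successful_movie is already a boolean column and budget is in plain dollars (the sample rows are hypothetical):

```python
import pandas as pd

# Hypothetical sample data with the columns these activities rely on.
df = pd.DataFrame({
    "budget": [150_000_000, 20_000_000, 80_000_000],
    "successful_movie": [True, False, True],
})

# Select the index labels of unsuccessful movies and drop them in place.
unsuccessful_idx = df[~df["successful_movie"]].index
df.drop(unsuccessful_idx, inplace=True)

# Without inplace, drop returns a new dataframe and leaves df untouched.
over_budget_idx = df[df["budget"] > 100_000_000].index
high_budget_df = df.drop(over_budget_idx)
```

After this runs, df still contains the movie with a budget above 100 million, while high_budget_df does not.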
To remove the budget column from the movie dataframe, use the drop method and specify the column name budget. Be sure to specify the axis to indicate that it is a column and not a row. Additionally, set the inplace parameter to True to make the change permanent.
To eliminate the director and writer columns from the movie dataframe, use the drop method and pass in the column names director and writer. Specify the axis to indicate that they are columns and not rows. Set the inplace parameter to False to create a new dataframe named new_df without modifying the original dataframe.
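The column-dropping steps above might look like the sketch below. It assumes the movie dataframe is held in a variable named df and uses a made-up single row with a hypothetical name column for illustration.

```python
import pandas as pd

# Hypothetical sample data standing in for the movie dataframe.
df = pd.DataFrame({
    "name": ["Example Movie"],
    "budget": [1_000_000],
    "director": ["A. Director"],
    "writer": ["A. Writer"],
})

# axis=1 targets columns; inplace=True modifies df directly.
df.drop("budget", axis=1, inplace=True)

# inplace=False (the default) returns a new dataframe and leaves df as it was.
new_df = df.drop(["director", "writer"], axis=1, inplace=False)
```

Passing columns=["director", "writer"] instead of a label list with axis=1 is an equivalent spelling that some readers find clearer.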