Hotel Booking Analysis: From Basics to Insights

Explore a rich hotel booking dataset through engaging activities! From basic filtering to advanced visualizations, enhance your Pandas skills as you uncover booking trends. Ready to turn raw data into insights? Let's dive in!
Project Created by

Lohith Unnam

Project Activities

All our Data Science projects include bite-sized activities to test your knowledge and practice in an environment with constant feedback.

All our activities include solutions with explanations on how they work and why we chose them.

codevalidated

Perform a Left Join on Two DataFrames Using a Common Column

Merge the DataFrames df_numeric and df_string on the common column Booking_ID using a left join. Store the resulting DataFrame in the variable df.
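A minimal sketch of the left join, assuming two small toy frames in place of the project's real df_numeric and df_string (the sample values and the lead_time and market_segment_type columns used here are hypothetical):

```python
import pandas as pd

# Hypothetical toy data; the real project supplies df_numeric and df_string.
df_numeric = pd.DataFrame({
    "Booking_ID": ["INN00001", "INN00002", "INN00003"],
    "lead_time": [224, 5, 1],
})
df_string = pd.DataFrame({
    "Booking_ID": ["INN00001", "INN00003"],
    "market_segment_type": ["Offline", "Online"],
})

# Left join: keep every row of df_numeric; rows without a match in
# df_string get NaN in the columns that come from df_string.
df = df_numeric.merge(df_string, on="Booking_ID", how="left")
print(df)
```

A left join never drops rows from the left frame, so the result has as many rows as df_numeric as long as Booking_ID is unique in df_string.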

multiplechoice

Which of the following methods is used to filter rows in a DataFrame?

multiplechoice

Which of these is NOT a valid way to select columns in pandas?

codevalidated

Filter Hotel Bookings by Number of Adults

Filter the DataFrame df to keep only the rows where the number of adults is greater than 2. Store the resulting DataFrame in the variable df_n_adults.
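One way to express this filter is boolean indexing. The sketch below assumes the adult count lives in a column named no_of_adults, which the activity text does not spell out, and uses made-up rows:

```python
import pandas as pd

# Hypothetical rows; the column name no_of_adults is an assumption.
df = pd.DataFrame({
    "Booking_ID": ["INN00001", "INN00002", "INN00003", "INN00004"],
    "no_of_adults": [2, 3, 1, 4],
})

# The comparison builds a boolean Series; indexing with it keeps the True rows.
df_n_adults = df[df["no_of_adults"] > 2]
print(df_n_adults)
```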

codevalidated

Filter and Sort 2018 Bookings by Lead Time and Price

Filter the DataFrame df to include only bookings from the arrival year 2018 with a lead time greater than 100. After filtering, sort the results by avg_price_per_room in descending order. Store the resulting DataFrame in the variable filtered_df.
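The two conditions can be combined with `&` before sorting. This is a sketch on made-up rows, using the column names mentioned in the activity:

```python
import pandas as pd

# Hypothetical sample rows for illustration only.
df = pd.DataFrame({
    "arrival_year": [2018, 2018, 2017, 2018],
    "lead_time": [150, 50, 200, 120],
    "avg_price_per_room": [90.0, 120.0, 80.0, 110.0],
})

# Each condition needs its own parentheses because & binds tighter than ==.
mask = (df["arrival_year"] == 2018) & (df["lead_time"] > 100)

# Sort the surviving rows from the most to the least expensive.
filtered_df = df[mask].sort_values("avg_price_per_room", ascending=False)
print(filtered_df)
```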

multiplechoice

Which method is used for grouping data in pandas?

codevalidated

Calculate Average Room Price by Room Type

Group the DataFrame df by room_type_reserved and calculate the average avg_price_per_room for each room type. Store the result in avg_price_by_room.
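A possible shape for this aggregation, shown on hypothetical rows (the room type labels below are illustrative):

```python
import pandas as pd

# Hypothetical sample rows.
df = pd.DataFrame({
    "room_type_reserved": ["Room_Type 1", "Room_Type 1", "Room_Type 4"],
    "avg_price_per_room": [80.0, 100.0, 130.0],
})

# Select the price column after grouping so only that column is averaged.
avg_price_by_room = df.groupby("room_type_reserved")["avg_price_per_room"].mean()
print(avg_price_by_room)
```

The result is a Series indexed by room type, which can then be sorted or inspected for its largest value.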

input

What is the costliest room type in the hotel bookings dataset?

Enter the name of the room type that has the highest average price per room.

codevalidated

Analyze Monthly Room Price Statistics

Group the DataFrame df by arrival_month and calculate the count, mean, and median of avg_price_per_room for each month. Store the result in the variable monthly_stats.
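Passing a list of function names to agg() produces one column per statistic. A sketch on made-up data:

```python
import pandas as pd

# Hypothetical sample rows.
df = pd.DataFrame({
    "arrival_month": [1, 1, 1, 2],
    "avg_price_per_room": [60.0, 80.0, 130.0, 100.0],
})

# One output column per aggregation, one row per month.
monthly_stats = df.groupby("arrival_month")["avg_price_per_room"].agg(
    ["count", "mean", "median"]
)
print(monthly_stats)
```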

codevalidated

Calculate Total Revenue and Booking Count by Year and Month

Use groupby() and agg() to calculate the sum of avg_price_per_room and the booking count for each combination of arrival_year and arrival_month. Store the resulting DataFrame in the variable revenue_summary.
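One way to get both figures in a single pass is named aggregation. In the sketch below the output column names total_revenue and booking_count are assumptions, as are the sample rows; the activity's validator may expect a different layout:

```python
import pandas as pd

# Hypothetical sample rows.
df = pd.DataFrame({
    "arrival_year": [2017, 2017, 2018],
    "arrival_month": [10, 10, 1],
    "avg_price_per_room": [100.0, 50.0, 80.0],
})

# Named aggregation: new_column=(source_column, function).
# total_revenue and booking_count are illustrative names.
revenue_summary = (
    df.groupby(["arrival_year", "arrival_month"])
      .agg(
          total_revenue=("avg_price_per_room", "sum"),
          booking_count=("avg_price_per_room", "count"),
      )
      .reset_index()
)
print(revenue_summary)
```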

codevalidated

Create Dummy Variables for Meal Plans

Generate dummy variables for the type_of_meal_plan column in the DataFrame df. After creating the dummy variables, drop the original type_of_meal_plan column, if it exists. Set drop_first=True. Store the result in df.
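A sketch of the encoding step with pd.get_dummies(), using hypothetical meal plan labels. When the column is passed through the `columns` argument, get_dummies() already removes the original column, so the explicit drop here is only a guard:

```python
import pandas as pd

# Hypothetical sample rows; the meal plan labels are illustrative.
df = pd.DataFrame({
    "Booking_ID": ["INN00001", "INN00002", "INN00003", "INN00004"],
    "type_of_meal_plan": ["Meal Plan 1", "Not Selected", "Meal Plan 2", "Meal Plan 1"],
})

# drop_first=True omits the first category to avoid a redundant column.
df = pd.get_dummies(df, columns=["type_of_meal_plan"], drop_first=True)

# Guard for the "if it exists" clause; a no-op when the column is already gone.
df = df.drop(columns=["type_of_meal_plan"], errors="ignore")
print(df.columns.tolist())
```

With three categories and drop_first=True, two indicator columns remain; a row with both set to false or zero belongs to the dropped category.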

multiplechoice

What technique is used to convert continuous data into categorical data?

multiplechoice

Which of the following is NOT a valid way to create bins in pandas?

codevalidated

Categorizing Hotel Room Prices by Binning

Create a new column price_category in the DataFrame df by binning the avg_price_per_room into three distinct categories:

  • Budget: [0, 180]
  • Standard: (180, 360]
  • Luxury: (360, 540]
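The three intervals above map naturally onto pd.cut(). This is a sketch on made-up prices; include_lowest=True is used so that a price of exactly 0 falls into the first bin, matching its closed left edge:

```python
import pandas as pd

# Hypothetical prices chosen to hit the bin edges.
df = pd.DataFrame({"avg_price_per_room": [0.0, 95.5, 180.0, 200.0, 400.0]})

# Bins are right-closed by default: (0, 180], (180, 360], (360, 540].
# include_lowest=True also admits the left edge of the first bin.
df["price_category"] = pd.cut(
    df["avg_price_per_room"],
    bins=[0, 180, 360, 540],
    labels=["Budget", "Standard", "Luxury"],
    include_lowest=True,
)
print(df)
```

Values outside the outer edges would become NaN, so the bin range should cover the full price range of the data.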

codevalidated

Calculate Total Nights Stayed Using Apply Function

Use the apply() function to create a new column total_nights in the DataFrame df. This column is the sum of no_of_weekend_nights and no_of_week_nights, giving the total number of nights stayed for each booking.
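A row-wise apply() for this sum might look like the following sketch, which uses made-up night counts:

```python
import pandas as pd

# Hypothetical sample rows.
df = pd.DataFrame({
    "no_of_weekend_nights": [1, 2, 0],
    "no_of_week_nights": [2, 3, 1],
})

# axis=1 passes each row to the function as a Series.
df["total_nights"] = df.apply(
    lambda row: row["no_of_weekend_nights"] + row["no_of_week_nights"],
    axis=1,
)
print(df)
```

Adding the two columns directly (`df["no_of_weekend_nights"] + df["no_of_week_nights"]`) gives the same values and is usually faster; apply() is used here because the activity asks for it.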

codevalidated

Convert All String Columns to Lowercase

Create a new DataFrame named string_df and store all the string columns in that DataFrame. Use the applymap() function to convert all string values in string_df to lowercase.
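A sketch of the element-wise conversion, on hypothetical data (the booking_status column and all values are assumptions). pandas 2.1 renamed DataFrame.applymap() to DataFrame.map() and deprecated the old name, so the example falls back to applymap() only when map() is unavailable:

```python
import pandas as pd

# Hypothetical sample rows; booking_status is an assumed column name.
df = pd.DataFrame({
    "room_type_reserved": ["Room_Type 1", "Room_Type 4"],
    "booking_status": ["Not_Canceled", "Canceled"],
    "lead_time": [224, 5],
})

# Keep only the columns that hold strings.
string_cols = [col for col in df.columns if pd.api.types.is_string_dtype(df[col])]
string_df = df[string_cols]

# Apply str.lower to every cell. Newer pandas exposes this as DataFrame.map();
# older versions only have DataFrame.applymap().
if hasattr(string_df, "map"):
    string_df = string_df.map(str.lower)
else:
    string_df = string_df.applymap(str.lower)
print(string_df)
```

This assumes the string columns contain no missing values; str.lower would raise on NaN.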

multiplechoice

Which plot type is best for showing the distribution of a continuous variable?

codevalidated

Bar Plot of Average Lead Time by Market Segment Type

Create a bar plot that visualizes the average lead_time for each market_segment_type in df. Set color='orange' to make the bars orange.
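A sketch of the bar plot using pandas plotting on top of matplotlib, with hypothetical rows. The Agg backend is selected only so the script runs without a display; in a notebook that line can be dropped:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; an assumption for script use
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample rows.
df = pd.DataFrame({
    "market_segment_type": ["Online", "Online", "Offline", "Corporate"],
    "lead_time": [100, 50, 30, 10],
})

# Average lead time per segment, then one bar per segment.
avg_lead_time = df.groupby("market_segment_type")["lead_time"].mean()

fig, ax = plt.subplots()
avg_lead_time.plot(kind="bar", color="orange", ax=ax)
ax.set_xlabel("Market segment type")
ax.set_ylabel("Average lead time")
fig.tight_layout()
```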

multiplechoice

Which plot type is best for showing the relationship between two continuous variables?

codevalidated

Scatter Plot of Lead Time vs. Average Price per Room

Create a scatter plot to visualize the relationship between lead_time and avg_price_per_room.
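One possible version uses DataFrame.plot.scatter(). The rows are made up, and the Agg backend line is only there so the sketch runs without a display:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; an assumption for script use
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample rows.
df = pd.DataFrame({
    "lead_time": [224, 5, 1, 211, 48],
    "avg_price_per_room": [65.0, 106.7, 60.0, 100.0, 94.5],
})

# One point per booking: lead time on x, room price on y.
fig, ax = plt.subplots()
df.plot.scatter(x="lead_time", y="avg_price_per_room", ax=ax)
fig.tight_layout()
```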

codevalidated

Booking Status Distribution Across Market Segments

Plot a stacked bar chart showing the distribution of canceled vs. not canceled bookings across different market segments.
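A stacked bar chart can be built from a cross-tabulation of segment against status. The sketch assumes a booking_status column with the values Canceled and Not_Canceled, neither of which is named in the activity text, and uses made-up rows:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; an assumption for script use
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample rows; booking_status and its labels are assumptions.
df = pd.DataFrame({
    "market_segment_type": ["Online", "Online", "Offline", "Online", "Offline"],
    "booking_status": ["Canceled", "Not_Canceled", "Not_Canceled",
                       "Canceled", "Not_Canceled"],
})

# Rows are segments, columns are statuses, cells are booking counts.
counts = pd.crosstab(df["market_segment_type"], df["booking_status"])

# stacked=True piles the status counts on top of each other per segment.
fig, ax = plt.subplots()
counts.plot(kind="bar", stacked=True, ax=ax)
ax.set_ylabel("Number of bookings")
fig.tight_layout()
```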

codevalidated

Room Type Popularity Based on Arrival Month

Plot a line chart showing the number of bookings for each room type over different months of the year.
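A sketch of this chart counts bookings per month and room type, then pivots the room types into columns so that each one becomes its own line. The rows are hypothetical:

```python
import matplotlib
matplotlib.use("Agg")  # headless backend; an assumption for script use
import matplotlib.pyplot as plt
import pandas as pd

# Hypothetical sample rows.
df = pd.DataFrame({
    "arrival_month": [1, 1, 2, 2, 2],
    "room_type_reserved": ["Room_Type 1", "Room_Type 4", "Room_Type 1",
                           "Room_Type 1", "Room_Type 4"],
})

# size() counts rows per (month, room type); unstack() moves room types
# into columns, filling month/room pairs with no bookings with 0.
bookings = (
    df.groupby(["arrival_month", "room_type_reserved"])
      .size()
      .unstack(fill_value=0)
)

fig, ax = plt.subplots()
bookings.plot(kind="line", marker="o", ax=ax)
ax.set_xlabel("Arrival month")
ax.set_ylabel("Number of bookings")
fig.tight_layout()
```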

This project is part of

Data Wrangling with Pandas