All our Data Science projects include bite-sized activities to test your knowledge and practice in an environment with constant feedback.
All our activities include solutions with explanations on how they work and why we chose them.
Let's start by identifying the top 5 sources through which people are recruited at this company!
Your task is to first identify the top 5 recruitment sources using the value_counts() method on the RecruitmentSource column. Store the results in the top_recruitment_sources variable.
Second, using top_recruitment_sources, plot a bar plot containing the following details:
X-axis: 'Recruitment Source'
Y-axis: 'Number of Hires'
Title: 'Top Five Recruitment Sources by Number of Hires'
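One way to approach this activity is sketched below. It is a minimal sketch, not the official solution: the small dataframe `df` and its source names are hypothetical stand-ins for the project's dataset, and only the column name RecruitmentSource and the labels come from the activity text.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical sample data; only the column name comes from the activity.
df = pd.DataFrame({
    "RecruitmentSource": (
        ["Indeed"] * 6 + ["LinkedIn"] * 5 + ["Google Search"] * 4
        + ["Employee Referral"] * 3 + ["Diversity Job Fair"] * 2
        + ["CareerBuilder"] * 1
    )
})

# value_counts() sorts by frequency in descending order,
# so head(5) keeps the five most common sources.
top_recruitment_sources = df["RecruitmentSource"].value_counts().head(5)

fig, ax = plt.subplots()
top_recruitment_sources.plot(kind="bar", ax=ax)
ax.set_xlabel("Recruitment Source")
ax.set_ylabel("Number of Hires")
ax.set_title("Top Five Recruitment Sources by Number of Hires")
fig.tight_layout()
# Call plt.show() to display the figure when running as a script.
```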
The result should match the following output:
Let's plot a Stacked Bar Plot to understand the distribution of performance scores by gender within the Sales department.
First, filter the dataframe on the Department column to extract the records of the Sales department. Store the results in the sales_gender_performance variable.
Now, using sales_gender_performance, plot a stacked bar plot containing the following details:
X-axis: 'Gender'
Y-axis: 'Count'
Title: 'Performance Scores by Gender in Sales'
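The filter-then-stack steps could look roughly like the sketch below. The activity does not name the gender or performance columns, so `Sex` and `PerformanceScore` are assumptions here, as are the sample rows and the choice of `pd.crosstab` to build the counts.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical sample data; "Sex" and "PerformanceScore" are assumed column names.
df = pd.DataFrame({
    "Department": ["Sales", "Sales", "Sales", "Sales", "IT/IS", "Production"],
    "Sex": ["F", "M", "F", "M", "F", "M"],
    "PerformanceScore": [
        "Fully Meets", "Fully Meets", "Exceeds",
        "Needs Improvement", "Exceeds", "Fully Meets",
    ],
})

# Keep only the rows that belong to the Sales department.
sales_gender_performance = df[df["Department"] == "Sales"]

# Count rows per (gender, score) pair; each score becomes one stacked segment.
counts = pd.crosstab(
    sales_gender_performance["Sex"],
    sales_gender_performance["PerformanceScore"],
)

fig, ax = plt.subplots()
counts.plot(kind="bar", stacked=True, ax=ax)
ax.set_xlabel("Gender")
ax.set_ylabel("Count")
ax.set_title("Performance Scores by Gender in Sales")
fig.tight_layout()
```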
The result should match the following output:
Let's plot a histogram to visualise how satisfied the employees are!
Use the EmpSatisfaction column to plot the histogram using the following details:
X-axis: 'Satisfaction Score'
Y-axis: 'Frequency'
Title: 'Distribution of Employee Satisfaction Scores'
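A possible version of this histogram is sketched below. The sample scores are hypothetical, and `bins=5` is an assumption made for illustration, since the activity does not specify a bin count.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical satisfaction scores on a 1-5 scale.
df = pd.DataFrame({"EmpSatisfaction": [5, 4, 3, 4, 5, 3, 3, 2, 4, 3, 1, 5]})

fig, ax = plt.subplots()
# bins=5 is an assumed setting: one bin per score on a 1-5 scale.
df["EmpSatisfaction"].plot(kind="hist", bins=5, ax=ax)
ax.set_xlabel("Satisfaction Score")
ax.set_ylabel("Frequency")
ax.set_title("Distribution of Employee Satisfaction Scores")
fig.tight_layout()
```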
The result should match the following output:
Which Employee Satisfaction Score has the highest frequency?
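Besides reading the answer off the histogram, the most frequent score can be checked numerically. The sketch below uses hypothetical scores, so the value it finds says nothing about the real dataset.

```python
import pandas as pd

# Hypothetical satisfaction scores; the real answer depends on the project's data.
df = pd.DataFrame({"EmpSatisfaction": [5, 4, 3, 4, 5, 3, 3, 2, 4, 3, 1, 5]})

# Frequency of each score, ordered by score for easy reading.
score_frequencies = df["EmpSatisfaction"].value_counts().sort_index()

# idxmax() returns the index label (the score) with the largest count.
most_frequent_score = score_frequencies.idxmax()
print(score_frequencies)
print("Most frequent score:", most_frequent_score)
```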
Plot a scatter plot between Salary and EngagementSurvey to see if there is any relationship between them.
Use the following details:
X-axis: 'Salary'
Y-axis: 'Engagement Survey Score'
Title: 'Relationship Between Salary and Engagement Score'
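As a rough illustration, the scatter plot might be built as follows. The salary and survey values are hypothetical; only the two column names and the labels come from the activity.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical sample values for the two columns named in the activity.
df = pd.DataFrame({
    "Salary": [52000, 61000, 75000, 98000, 120000],
    "EngagementSurvey": [3.2, 4.5, 4.1, 2.8, 4.9],
})

fig, ax = plt.subplots()
# One point per employee: salary on the x-axis, survey score on the y-axis.
ax.scatter(df["Salary"], df["EngagementSurvey"])
ax.set_xlabel("Salary")
ax.set_ylabel("Engagement Survey Score")
ax.set_title("Relationship Between Salary and Engagement Score")
fig.tight_layout()
```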
The result should match the following output:
Visualize the relationship between employee satisfaction and the number of special projects, with bubble size representing salary.
First, scale down the Salary column by dividing it by 1000 for use as the bubble size marker.
Next, using the dataframe df, plot a scatter plot containing the following details:
X-axis: 'Employee Satisfaction'
Y-axis: 'Number of Special Projects'
Bubble Size: 'Salary'
Title: 'Employee Satisfaction vs. Number of Special Projects (Bubble Size: Salary)'
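The scaling and plotting steps can be sketched as below. The column name `SpecialProjectsCount`, the variable name `bubble_size`, the `alpha` setting, and the sample rows are all assumptions for illustration; the activity only names the Salary column and the divisor 1000.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical sample data; "SpecialProjectsCount" is an assumed column name.
df = pd.DataFrame({
    "EmpSatisfaction": [3, 4, 5, 2],
    "SpecialProjectsCount": [0, 2, 6, 1],
    "Salary": [50000, 62500, 80000, 100000],
})

# Scale salaries down so the marker areas stay readable on the plot.
bubble_size = df["Salary"] / 1000

fig, ax = plt.subplots()
# The s argument sets each marker's area; alpha keeps overlapping bubbles visible.
ax.scatter(
    df["EmpSatisfaction"],
    df["SpecialProjectsCount"],
    s=bubble_size,
    alpha=0.5,
)
ax.set_xlabel("Employee Satisfaction")
ax.set_ylabel("Number of Special Projects")
ax.set_title(
    "Employee Satisfaction vs. Number of Special Projects (Bubble Size: Salary)"
)
fig.tight_layout()
```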
The result should match the following output:
Plot a line plot to visualize the trend of employee terminations over the years.
First, extract the year from the DateofTermination column and store the results in a new column YearTerminated.
Next, count the number of terminations per year and sort the results. Store this in the terminations_per_year variable.
Now, using terminations_per_year, plot a line plot containing the following details:
X-axis: 'Year Terminated'
Y-axis: 'Number of Terminations'
Title: 'Trend of Employee Terminations Over the Years'
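The three steps above could be sketched as follows. The dates are hypothetical, missing values stand in for employees who were never terminated, and sorting by year (rather than by count) is an assumption about what "sort the results" means here.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical termination dates; None marks employees who are still active.
df = pd.DataFrame({
    "DateofTermination": [
        "2016-04-24", None, "2015-09-07", "2016-11-30", None, "2018-01-15",
    ]
})

# Parse the strings as dates, then pull out the year (missing dates become NaN).
df["DateofTermination"] = pd.to_datetime(df["DateofTermination"])
df["YearTerminated"] = df["DateofTermination"].dt.year

# value_counts() skips NaN by default; sort_index() orders the counts by year.
terminations_per_year = df["YearTerminated"].value_counts().sort_index()

fig, ax = plt.subplots()
terminations_per_year.plot(kind="line", marker="o", ax=ax)
ax.set_xlabel("Year Terminated")
ax.set_ylabel("Number of Terminations")
ax.set_title("Trend of Employee Terminations Over the Years")
fig.tight_layout()
```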
The result should match the following output:
Create a pie chart that portrays the distribution of employees by their Marital Status.
First, count the occurrences of each marital status category in the MaritalDesc column and store the results in the marital_status_counts variable.
Next, using marital_status_counts, plot a pie chart containing the following details:
Use autopct formatted as %1.1f%% to display the percentage of each slice.
Set ylabel='' to clear the default Y-axis label.
Set the title to 'Employee Distribution by Marital Status'.
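A minimal sketch of the counting and pie chart steps, using hypothetical marital status values, might look like this. It passes the settings listed above straight to the pandas plotting call.

```python
import pandas as pd
import matplotlib.pyplot as plt

# Hypothetical sample data; only the column name comes from the activity.
df = pd.DataFrame({"MaritalDesc": ["Single", "Married", "Single", "Divorced"]})

# One count per marital status category, most common first.
marital_status_counts = df["MaritalDesc"].value_counts()

fig, ax = plt.subplots()
marital_status_counts.plot(
    kind="pie",
    autopct="%1.1f%%",  # print each slice's share with one decimal place
    ylabel="",          # drop the label pandas would otherwise put on the y-axis
    title="Employee Distribution by Marital Status",
    ax=ax,
)
fig.tight_layout()
```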
The result should match the following output:
Which state do the most employees come from?
Use the value_counts() method to find out.
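A short sketch of this lookup is shown below. The activity does not name the column, so `State` is an assumption, and the sample values are hypothetical and say nothing about the real answer.

```python
import pandas as pd

# Hypothetical sample data; "State" is an assumed column name.
df = pd.DataFrame({"State": ["MA", "TX", "MA", "CT", "MA"]})

# value_counts() lists states from most to least common,
# so the first index label is the state with the most employees.
state_counts = df["State"].value_counts()
top_state = state_counts.index[0]
print(state_counts)
print("State with the most employees:", top_state)
```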