Find the total number of companies listed and input the answer as an integer.
Given a specific company name, list all the job titles available in the job_listings_by_company_title dictionary. Store the job titles in a list named job_titles. Use IBM as the company name.
Create a function named get_job_details_by_id() that takes a job ID as an argument and returns the job details as a dictionary. If the job ID is not found, return None.
The definition of the function should look like this:
    def get_job_details_by_id(job_id):
        # Your code goes here
The returned dictionary should have the following structure:
    {
        'type': ...,
        'location': ...,
        'criteria': ...,
        'posted_date': ...,
        'link': ...
    }
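A minimal sketch of one way to write this, assuming job_details_by_id is a dictionary keyed by job ID whose values already have the structure above. The sample entry and every value in it are made up for illustration.

```python
# Hypothetical sample data; the real job_details_by_id comes from the project.
job_details_by_id = {
    "101": {
        "type": "remote",
        "location": "Cape Town, Western Cape, South Africa",
        "criteria": "Entry level",
        "posted_date": "2023-01-15",
        "link": "https://example.com/jobs/101",
    },
}


def get_job_details_by_id(job_id):
    # dict.get returns None when the key is missing, which matches the
    # "return None if the job ID is not found" requirement.
    return job_details_by_id.get(job_id)


print(get_job_details_by_id("101")["type"])  # remote
print(get_job_details_by_id("999"))          # None
```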
Find all the companies that have remote jobs. Store the company names in a list named remote_companies. The company names should be unique.
Calculate the average salary of all the jobs listed. Round the answer to the nearest integer and input the answer as an integer.
Find the highest salary of all the jobs listed. Round the answer to the nearest integer and input the answer as an integer.
Find the company with the most job listings and input the company name as a string. If there are multiple companies with the same number of job listings, input the company name that comes first in alphabetical order.
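One way to handle the alphabetical tie-break is to compare on a compound key. The sketch below assumes you have already built a dictionary mapping each company name to its number of listings; the name listing_counts and the counts in it are hypothetical.

```python
# Hypothetical counts; in the project these would be derived from
# job_listings_by_company_title.
listing_counts = {"IBM": 3, "Experian": 3, "Progressive Edge": 1}

# min() with a compound key: the negated count puts the largest count first,
# and the name itself breaks ties in alphabetical order.
top_company = min(listing_counts, key=lambda name: (-listing_counts[name], name))

print(top_company)  # Experian
```

Here IBM and Experian are tied on three listings, so the alphabetically earlier name wins.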
Create a function named get_job_type_distribution() that takes a company name as an argument and returns a dictionary containing the job type distribution for the company. The keys of the dictionary should be the job types and the values should be the number of jobs for each job type. If the company name is not found, return None.
The definition of the function should look like this:
    def get_job_type_distribution(company_name):
        # Your code goes here
The returned dictionary should have the following structure:
    {
        'onsite': ...,
        'remote': ...,
        'hybrid': ...
    }
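A possible sketch using collections.Counter. The layout of the two dictionaries is an assumption here, not something the activity text confirms: job_listings_by_company_title is taken to map company to title to a list of job IDs, and job_details_by_id to map job ID to a details dictionary with a 'type' key. Adjust the loops if your data is shaped differently.

```python
from collections import Counter

# ASSUMED shapes (not confirmed by the activity text):
#   job_listings_by_company_title: {company: {title: [job_id, ...]}}
#   job_details_by_id:             {job_id: {"type": ..., ...}}
# The entries below are made-up sample data.
job_listings_by_company_title = {
    "IBM": {"Data Engineer": ["1", "2"], "Analyst": ["3"]},
}
job_details_by_id = {
    "1": {"type": "remote"},
    "2": {"type": "onsite"},
    "3": {"type": "remote"},
}


def get_job_type_distribution(company_name):
    titles = job_listings_by_company_title.get(company_name)
    if titles is None:
        return None  # unknown company
    # Count the 'type' of every job ID listed under any title of the company.
    counts = Counter(
        job_details_by_id[job_id]["type"]
        for job_ids in titles.values()
        for job_id in job_ids
    )
    return dict(counts)
```

Note that this sketch only returns keys for job types that actually occur for the company; if a fixed set of keys is required, initialise them to zero first.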
Create a function named get_location_statistics() that takes a company name as an argument and returns a dictionary containing the location statistics for the company. The keys of the dictionary should be the locations and the values should be the number of jobs for each location. If the company name is not found, return None.
The definition of the function should look like this:
    def get_location_statistics(company_name):
        # Your code goes here
The returned dictionary should have the following structure:
    {
        location: ...,
        ...
    }
Categorize the jobs into salary ranges. Create a dictionary named salary_range_distribution that contains the salary range distribution for all the jobs listed. The keys of the dictionary should be the salary ranges and the values should be the number of jobs for each salary range.
The salary ranges should be as follows:
    (0, 50000): "0-50k",
    (50001, 75000): "50-75k",
    (75001, 100000): "75-100k",
    (100001, float('inf')): "Above 100k"
The salary ranges are inclusive of the lower bound and exclusive of the upper bound. For example, the salary range (0, 50000) includes all the salaries greater than or equal to 0 and less than 50000. The salary range (100001, float('inf')) includes all the salaries greater than or equal to 100001.
The expected output is as follows:
    salary_range_distribution: {
        (0, 50000): 2,
        (50001, 75000): 1,
        (75001, 100000): 1,
        (100001, float('inf')): 1
    }
This is an example output; your output may be different.
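The counting step can be sketched as follows. The helper name count_salary_ranges and the input list are hypothetical; where the salaries actually live in the project dictionaries is not specified here, so the sketch works on a plain list of numbers.

```python
# Range bounds taken from the activity text.
SALARY_RANGES = [
    (0, 50000),
    (50001, 75000),
    (75001, 100000),
    (100001, float("inf")),
]


def count_salary_ranges(salaries):
    # Start every range at zero so empty ranges still appear in the result.
    distribution = {bounds: 0 for bounds in SALARY_RANGES}
    for salary in salaries:
        for low, high in SALARY_RANGES:
            # Lower bound inclusive, upper bound exclusive, per the stated rule.
            if low <= salary < high:
                distribution[(low, high)] += 1
                break
    return distribution


# Hypothetical salaries, chosen only to illustrate the counting.
salary_range_distribution = count_salary_ranges(
    [30000, 45000, 60000, 80000, 120000]
)
print(salary_range_distribution)
```

Under the rule as literally stated, a salary of exactly 50000, 75000 or 100000 falls into no range, and this sketch simply leaves such values uncounted.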
Apply a 10% salary increment to all the jobs listed and update the dictionary.
If you still encounter a failure after incrementing the salaries, try re-reading the dictionaries job_listings_by_company_title and job_details_by_id once again.
Create a function named get_jobs_by_location() that takes a location as an argument and returns a list of job IDs.
The definition of the function should look like this:
    def get_jobs_by_location(location):
        # Your code goes here
The returned list should contain distinct IDs. If there are no jobs at the given location, return an empty list.
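A minimal sketch, assuming job_details_by_id is keyed by job ID and each value carries a 'location' string. The sample entries are made up. Because dictionary keys are unique, the resulting IDs are distinct without any extra de-duplication.

```python
# Hypothetical sample data; only the 'location' field matters for this sketch.
job_details_by_id = {
    "1": {"location": "Johannesburg, Gauteng, South Africa"},
    "2": {"location": "Cape Town, Western Cape, South Africa"},
    "3": {"location": "Johannesburg, Gauteng, South Africa"},
}


def get_jobs_by_location(location):
    # Exact string match on the location; an empty list results when
    # nothing matches.
    return [
        job_id
        for job_id, details in job_details_by_id.items()
        if details["location"] == location
    ]


print(get_jobs_by_location("Johannesburg, Gauteng, South Africa"))  # ['1', '3']
print(get_jobs_by_location("Nowhere"))                              # []
```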
Remove all the job listings for the location Johannesburg, Gauteng, South Africa from both dictionaries, job_listings_by_company_title and job_details_by_id, and update both dictionaries.
Create a function named get_company_name_abbreviation() that takes a company name as an argument and returns the abbreviation of the company name. If the company name is not found, return None.
The definition of the function should look like this:
    def get_company_name_abbreviation(company_name):
        # Your code goes here
The abbreviation is formed from the first letter of each word in the company name.
For example, the abbreviation of IBM is I, the abbreviation of Experian is E, the abbreviation of Progressive Edge is PE, and the abbreviation of Ovations Technologies (Pty) Ltd is OTPL.
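The abbreviation rule can be sketched as below. The helper abbreviate is hypothetical, and the sketch assumes that the company names are the top-level keys of job_listings_by_company_title; the empty inner dictionaries are placeholders. It takes the first alphabetic character of each word, which is one way to make "(Pty)" contribute "P" as in the example.

```python
# ASSUMPTION: company names are the top-level keys of this dictionary.
# The inner values are placeholders and are not used here.
job_listings_by_company_title = {
    "IBM": {},
    "Experian": {},
    "Progressive Edge": {},
    "Ovations Technologies (Pty) Ltd": {},
}


def abbreviate(name):
    # Hypothetical helper: first alphabetic character of each
    # whitespace-separated word, so punctuation such as "(" is skipped.
    letters = []
    for word in name.split():
        first = next((ch for ch in word if ch.isalpha()), None)
        if first is not None:
            letters.append(first)
    return "".join(letters)


def get_company_name_abbreviation(company_name):
    if company_name not in job_listings_by_company_title:
        return None  # unknown company
    return abbreviate(company_name)


print(get_company_name_abbreviation("Ovations Technologies (Pty) Ltd"))  # OTPL
```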
Create a dictionary named company_name_abbreviation that contains the company names and their abbreviations. The keys of the dictionary should be the company names, and the values should be the abbreviations.
The expected dictionary looks like the following:
    company_name_abbreviation: {
        'IBM': 'I',
        'Experian': 'E',
        'Progressive Edge': 'PE',
        'Ovations Technologies (Pty) Ltd': 'OTPL',
        ...
    }