SQL Case Study on Data Science Jobs
Introduction:
In today's data-driven world, understanding salary trends across various roles in the tech industry is crucial for both job seekers and employers. This blog post delves into a comprehensive analysis of a dataset containing salary information for different job titles, experience levels, and locations. By leveraging SQL queries, we aim to uncover insights and trends that can help individuals make informed career decisions and assist companies in benchmarking their compensation packages.
About the Dataset:
The dataset used in this analysis includes detailed salary
information for various roles in the tech industry. Here are the key columns in
the dataset:
- work_year:
The year the data was collected.
- experience_level:
The experience level of the employee, categorized as Junior (JR),
Mid-level (MI), Senior (SE), and Executive (EX).
- employment_type:
The type of employment, such as full-time (FT), part-time (PT), contract
(CT), or internship (IN).
- job_title:
The job title of the employee, such as Data Scientist, Machine Learning
Engineer, AI Engineer, etc.
- salary:
The salary of the employee in the local currency.
- salary_currency:
The currency in which the salary is paid.
- salary_in_usd:
The salary converted to USD for standardization.
- employee_residence:
The country of residence of the employee.
- remote_ratio:
The percentage of remote work, ranging from 0 (no remote work) to 100
(fully remote).
- company_location:
The location of the company.
- company_size:
The size of the company, categorized as Small (S), Medium (M), and Large
(L).
This dataset provides a rich source of information to analyze salary trends across different dimensions, such as job titles, experience levels, and geographic locations. In the following sections, we will use SQL queries to extract meaningful insights from this data and discuss the implications of our findings.
SQL Queries and Results:
Here I will discuss about the situations and questions arises regarding the Data Science salary dataset.
- Q1. You're a compensation analyst employed by a multinational corporation. Your assignment is to pinpoint countries who give work fully remotely, for the title 'managers’ paying salaries exceeding $90,000 USD.
SELECT DISTINCT company_location FROM salaries
WHERE job_title LIKE '%Manager%' and remote_ratio = 100 and salary > 90000;
So above are the countries who give work fully remotely, for the title 'managers’ paying salaries exceeding $90,000 USD, and if anybody wants to get hired as a manager post can visit above country.
- Q2. As a remote work advocate working for a progressive HR tech startup who place their fresher's clients in large tech firms. you're tasked with identifying top 5 country having greatest count of large (company size) number of companies.
SELECT job_title, AVG(salary_in_usd) AS average_salary
FROM salaries
GROUP BY job_title
ORDER BY average_salary DESC;
So above are the top 5 countries having greatest count of large no. of companies.
- Q3. Picture yourself as a data scientist working for a workforce management platform. Your objective is to calculate the percentage of employees who enjoy fully remote roles with salaries exceeding $100,000 USD, shedding light on the attractiveness of high-paying remote positions in today's job market.
SET @`total` = (SELECT COUNT(*) FROM salaries WHERE salary_in_usd > 100000);
SET @`count` = (SELECT COUNT(*) FROM salaries WHERE salary_in_usd > 100000 AND remote_ratio = 100);
SET @`percentage` = round(((SELECT @count)/(SELECT @total)) * 100,2);
SELECT @percentage AS `final_percentage`;
So above are the percentage of employees who enjoy fully remote roles with salaries exceeding $100,000 USD, shedding light on the attractiveness of high-paying remote positions in today's job market.
- Q4. Imagine you're a data analyst working for a global recruitment agency. Your task is to identify the locations where entry-level average salaries exceed the average salary for that job title in market for entry level, helping your agency guide candidates towards lucrative opportunities.
SELECT t1.job_title, company_location, avg_salary, avg_salary_country_wise FROM
(SELECT job_title, AVG(salary_in_usd) as `avg_salary` FROM salaries
GROUP BY job_title) t1
INNER JOIN
(SELECT job_title, company_location, AVG(salary_in_usd) as `avg_salary_country_wise` FROM salaries
GROUP BY job_title, company_location) t2
ON
t1.job_title = t2.job_title
WHERE avg_salary_country_wise > avg_salary;
So there are 141 results for the locations where entry-level average salaries exceed the average salary for that job title in market for entry level.
- Q5. You've been hired by a big HR consultancy to look at how much people get paid in different countries. Your job is to find out for each job title for which country pays the maximum average salary. This helps you to place your candidates in those countries.
SELECT t1.job_title, t2.max_salary, t1.company_location
FROM (
SELECT t1.job_title, MAX(t1.avg_salary) as max_salary
FROM (
SELECT job_title, company_location, AVG(salary_in_usd) AS avg_salary
FROM salaries
GROUP BY job_title, company_location
ORDER BY job_title ASC
) AS t1
GROUP BY t1.job_title
) AS t2
INNER JOIN (
SELECT job_title, company_location, AVG(salary_in_usd) AS avg_salary
FROM salaries
GROUP BY job_title, company_location
ORDER BY job_title ASC
) AS t1
ON t1.job_title = t2.job_title AND t1.avg_salary = t2.max_salary;
There are 147 results that shows each job title for which country pays the maximum average salary.
- Q6. As a data-driven Business Consultant, you've been hired by a multinational corporation to analyze salary trends across different company locations. Your goal is to pinpoint locations where the average salary has consistently increased over the past few years. Specifically, you will focus on countries where data is available for three years only (the present year and the past two years), providing insights into locations experiencing sustained salary growth.
WITH temptbl AS (
SELECT *
FROM salaries
WHERE company_location IN (
SELECT company_location
FROM (
SELECT
company_location,
AVG(salary_in_usd) AS avg_salary,
COUNT(DISTINCT work_year) AS cnt
FROM salaries
WHERE work_year >= (YEAR(CURRENT_DATE()) - 2)
GROUP BY company_location
HAVING cnt = 3
) t1
)
)
SELECT
company_location,
MAX(CASE WHEN work_year = 2022 THEN ROUND(average, 2) END) AS avg_salary_2022,
MAX(CASE WHEN work_year = 2023 THEN ROUND(average, 2) END) AS avg_salary_2023,
MAX(CASE WHEN work_year = 2024 THEN ROUND(average, 2) END) AS avg_salary_2024
FROM (
SELECT
company_location,
work_year,
AVG(salary_in_usd) AS average
FROM temptbl
GROUP BY company_location, work_year
) t2
GROUP BY company_location
HAVING avg_salary_2024 > avg_salary_2023 AND avg_salary_2023 > avg_salary_2022
ORDER BY company_location;
So above are the 8 country where the average salary has consistently increased over the past few years.
- Q7. Picture yourself as a workforce strategist employed by a global HR tech startup. Your mission is to determine the percentage of fully remote work for each experience level in 2021 and compare it with the corresponding figures for 2024, highlighting any significant increases or decreases in remote work adoption over the years.
SELECT
t3.experience_level,
t3.pct21,
t4.pct24,
ROUND((t4.pct24 - t3.pct21), 2) AS `change_in_pct`
FROM
(SELECT
remote_2021.experience_level,
ROUND((remote_count * 100.0) / total_count, 2) AS `pct21`
FROM
(SELECT experience_level, COUNT(*) AS remote_count
FROM salaries
WHERE remote_ratio = 100 AND work_year = 2021
GROUP BY experience_level) AS remote_2021
INNER JOIN
(SELECT experience_level, COUNT(*) AS total_count
FROM salaries
WHERE work_year = 2021
GROUP BY experience_level) AS total_2021
ON remote_2021.experience_level = total_2021.experience_level) AS t3
INNER JOIN
(SELECT
remote_2024.experience_level,
ROUND((remote_count * 100.0) / total_count, 2) AS `pct24`
FROM
(SELECT experience_level, COUNT(*) AS remote_count
FROM salaries
WHERE remote_ratio = 100 AND work_year = 2024
GROUP BY experience_level) AS remote_2024
INNER JOIN
(SELECT experience_level, COUNT(*) AS total_count
FROM salaries
WHERE work_year = 2024
GROUP BY experience_level) AS total_2024
ON remote_2024.experience_level = total_2024.experience_level) AS t4
ON t3.experience_level = t4.experience_level
ORDER BY t3.experience_level
LIMIT 0, 1000;
So above are the information about the percentage of fully remote work for each experience level in 2021 and compare it with the corresponding figures for 2024.
- Q8. AS a Compensation specialist at a Fortune 500 company, you're tasked WITH analyzing salary trends over time. Your objective is to calculate the average salary increase percentage for each experience level and job title between the years 2023 and 2024, helping the company stay competitive IN the talent market
SELECT t1.job_title, company_location, avg_salary, avg_salary_country_wise FROM
(SELECT job_title, AVG(salary_in_usd) as `avg_salary` FROM salaries
GROUP BY job_title) t1
INNER JOIN
(SELECT job_title, company_location, AVG(salary_in_usd) as `avg_salary_country_wise` FROM salaries
GROUP BY job_title, company_location) t2
ON
t1.job_title = t2.job_title
WHERE avg_salary_country_wise > avg_salary;
Here INSIGHTS.
- Q9. You're a database administrator tasked with role-based access control for a company's employee database. Your goal is to implement a security measure where employees in different experience level (e.g. Entry Level, Senior level
SELECT t1.job_title, company_location, avg_salary, avg_salary_country_wise FROM
(SELECT job_title, AVG(salary_in_usd) as `avg_salary` FROM salaries
GROUP BY job_title) t1
INNER JOIN
(SELECT job_title, company_location, AVG(salary_in_usd) as `avg_salary_country_wise` FROM salaries
GROUP BY job_title, company_location) t2
ON
t1.job_title = t2.job_title
WHERE avg_salary_country_wise > avg_salary;
Here INSIGHTS.
Comments
Post a Comment