SQL Case Study on Data Science Jobs

SQL Case Study on Data Science Jobs


Introduction:

In today's data-driven world, understanding salary trends across various roles in the tech industry is crucial for both job seekers and employers. This blog post delves into a comprehensive analysis of a dataset containing salary information for different job titles, experience levels, and locations. By leveraging SQL queries, we aim to uncover insights and trends that can help individuals make informed career decisions and assist companies in benchmarking their compensation packages.


About the Dataset:

The dataset used in this analysis includes detailed salary information for various roles in the tech industry. Here are the key columns in the dataset:

  • work_year: The year the data was collected.
  • experience_level: The experience level of the employee, categorized as Junior (JR), Mid-level (MI), Senior (SE), and Executive (EX).
  • employment_type: The type of employment, such as full-time (FT), part-time (PT), contract (CT), or internship (IN).
  • job_title: The job title of the employee, such as Data Scientist, Machine Learning Engineer, AI Engineer, etc.
  • salary: The salary of the employee in the local currency.
  • salary_currency: The currency in which the salary is paid.
  • salary_in_usd: The salary converted to USD for standardization.
  • employee_residence: The country of residence of the employee.
  • remote_ratio: The percentage of remote work, ranging from 0 (no remote work) to 100 (fully remote).
  • company_location: The location of the company.
  • company_size: The size of the company, categorized as Small (S), Medium (M), and Large (L).

This dataset provides a rich source of information to analyze salary trends across different dimensions, such as job titles, experience levels, and geographic locations. In the following sections, we will use SQL queries to extract meaningful insights from this data and discuss the implications of our findings.


SQL Queries and Results:

Here I will discuss about the situations and questions arises regarding the Data Science salary dataset.

  • Q1. You're a compensation analyst employed by a multinational corporation. Your assignment is to pinpoint countries who give work fully remotely, for the title 'managers’ paying salaries exceeding $90,000 USD.
  SELECT DISTINCT company_location FROM salaries
  WHERE job_title LIKE '%Manager%' and remote_ratio = 100 and salary > 90000;


So above are the countries who give work fully remotely, for the title 'managers’ paying salaries exceeding $90,000 USD, and if anybody wants to get hired as a manager post can visit above country.


  • Q2. As a remote work advocate working for a progressive HR tech startup who place their fresher's clients in large tech firms. you're tasked with identifying top 5 country having greatest count of large (company size) number of companies.
  SELECT job_title, AVG(salary_in_usd) AS average_salary
  FROM salaries
  GROUP BY job_title
  ORDER BY average_salary DESC;  



So above are the top 5 countries having greatest count of large no. of companies.


  • Q3. Picture yourself as a data scientist working for a workforce management platform. Your objective is to calculate the percentage of employees who enjoy fully remote roles with salaries exceeding $100,000 USD, shedding light on the attractiveness of high-paying remote positions in today's job market.
  SET @`total` = (SELECT COUNT(*) FROM salaries WHERE salary_in_usd > 100000);
  SET @`count` = (SELECT COUNT(*) FROM salaries WHERE salary_in_usd > 100000 AND remote_ratio = 100);
  SET @`percentage` = round(((SELECT @count)/(SELECT @total)) * 100,2);

  SELECT @percentage AS `final_percentage`;





So above are the percentage of employees who enjoy fully remote roles with salaries exceeding $100,000 USD, shedding light on the attractiveness of high-paying remote positions in today's job market.


  • Q4. Imagine you're a data analyst working for a global recruitment agency. Your task is to identify the locations where entry-level average salaries exceed the average salary for that job title in market for entry level, helping your agency guide candidates towards lucrative opportunities.
  SELECT t1.job_title, company_location, avg_salary, avg_salary_country_wise FROM
      (SELECT job_title, AVG(salary_in_usd) as `avg_salary` FROM salaries 
      GROUP BY job_title) t1
  INNER JOIN
      (SELECT job_title, company_location, AVG(salary_in_usd) as `avg_salary_country_wise` FROM salaries
      GROUP BY job_title, company_location) t2
  ON
  t1.job_title = t2.job_title
  WHERE avg_salary_country_wise > avg_salary;





So there are 141 results for the locations where entry-level average salaries exceed the average salary for that job title in market for entry level.


  • Q5. You've been hired by a big HR consultancy to look at how much people get paid in different countries. Your job is to find out for each job title for which country pays the maximum average salary. This helps you to place your candidates in those countries.
  SELECT t1.job_title, t2.max_salary, t1.company_location 
  FROM (
      SELECT t1.job_title, MAX(t1.avg_salary) as max_salary 
      FROM (
          SELECT job_title, company_location, AVG(salary_in_usd) AS avg_salary 
          FROM salaries
          GROUP BY job_title, company_location
          ORDER BY job_title ASC
      ) AS t1
      GROUP BY t1.job_title
  ) AS t2
  INNER JOIN (
      SELECT job_title, company_location, AVG(salary_in_usd) AS avg_salary 
      FROM salaries
      GROUP BY job_title, company_location
      ORDER BY job_title ASC
  ) AS t1
  ON t1.job_title = t2.job_title AND t1.avg_salary = t2.max_salary;






There are 147 results that shows each job title for which country pays the maximum average salary.


  • Q6. As a data-driven Business Consultant, you've been hired by a multinational corporation to analyze salary trends across different company locations. Your goal is to pinpoint locations where the average salary has consistently increased over the past few years. Specifically, you will focus on countries where data is available for three years only (the present year and the past two years), providing insights into locations experiencing sustained salary growth.
 WITH temptbl AS (
      SELECT * 
      FROM salaries 
      WHERE company_location IN (
          SELECT company_location 
          FROM (
              SELECT 
                  company_location, 
                  AVG(salary_in_usd) AS avg_salary, 
                  COUNT(DISTINCT work_year) AS cnt 
              FROM salaries
              WHERE work_year >= (YEAR(CURRENT_DATE()) - 2)
              GROUP BY company_location 
              HAVING cnt = 3
          ) t1
      )
  )

  SELECT 
      company_location,
      MAX(CASE WHEN work_year = 2022 THEN ROUND(average, 2) END) AS avg_salary_2022,
      MAX(CASE WHEN work_year = 2023 THEN ROUND(average, 2) END) AS avg_salary_2023,
      MAX(CASE WHEN work_year = 2024 THEN ROUND(average, 2) END) AS avg_salary_2024
  FROM (
      SELECT 
          company_location, 
          work_year, 
          AVG(salary_in_usd) AS average 
      FROM temptbl
      GROUP BY company_location, work_year
  ) t2
  GROUP BY company_location
  HAVING avg_salary_2024 > avg_salary_2023 AND avg_salary_2023 > avg_salary_2022
  ORDER BY company_location;





So above are the 8 country where the average salary has consistently increased over the past few years.


  • Q7. Picture yourself as a workforce strategist employed by a global HR tech startup. Your mission is to determine the percentage of fully remote work for each experience level in 2021 and compare it with the corresponding figures for 2024, highlighting any significant increases or decreases in remote work adoption over the years.
  SELECT 
      t3.experience_level, 
      t3.pct21, 
      t4.pct24, 
      ROUND((t4.pct24 - t3.pct21), 2) AS `change_in_pct`
  FROM
      (SELECT 
          remote_2021.experience_level, 
          ROUND((remote_count * 100.0) / total_count, 2) AS `pct21`
      FROM 
          (SELECT experience_level, COUNT(*) AS remote_count 
          FROM salaries 
          WHERE remote_ratio = 100 AND work_year = 2021
          GROUP BY experience_level) AS remote_2021
      INNER JOIN 
          (SELECT experience_level, COUNT(*) AS total_count 
          FROM salaries 
          WHERE work_year = 2021
          GROUP BY experience_level) AS total_2021
      ON remote_2021.experience_level = total_2021.experience_level) AS t3
  INNER JOIN
      (SELECT 
          remote_2024.experience_level, 
          ROUND((remote_count * 100.0) / total_count, 2) AS `pct24`
      FROM 
          (SELECT experience_level, COUNT(*) AS remote_count 
          FROM salaries 
          WHERE remote_ratio = 100 AND work_year = 2024
          GROUP BY experience_level) AS remote_2024
      INNER JOIN 
          (SELECT experience_level, COUNT(*) AS total_count 
          FROM salaries 
          WHERE work_year = 2024
          GROUP BY experience_level) AS total_2024
      ON remote_2024.experience_level = total_2024.experience_level) AS t4
  ON t3.experience_level = t4.experience_level
  ORDER BY t3.experience_level
  LIMIT 0, 1000;





So above are the information about the percentage of fully remote work for each experience level in 2021 and compare it with the corresponding figures for 2024.


  • Q8. AS a Compensation specialist at a Fortune 500 company, you're tasked WITH analyzing salary trends over time. Your objective is to calculate the average salary increase percentage for each experience level and job title between the years 2023 and 2024, helping the company stay competitive IN the talent market
  SELECT t1.job_title, company_location, avg_salary, avg_salary_country_wise FROM
      (SELECT job_title, AVG(salary_in_usd) as `avg_salary` FROM salaries 
      GROUP BY job_title) t1
  INNER JOIN
      (SELECT job_title, company_location, AVG(salary_in_usd) as `avg_salary_country_wise` FROM salaries
      GROUP BY job_title, company_location) t2
  ON
  t1.job_title = t2.job_title
  WHERE avg_salary_country_wise > avg_salary;



Here INSIGHTS.


  • Q9. You're a database administrator tasked with role-based access control for a company's employee database. Your goal is to implement a security measure where employees in different experience level (e.g. Entry Level, Senior level
  SELECT t1.job_title, company_location, avg_salary, avg_salary_country_wise FROM
      (SELECT job_title, AVG(salary_in_usd) as `avg_salary` FROM salaries 
      GROUP BY job_title) t1
  INNER JOIN
      (SELECT job_title, company_location, AVG(salary_in_usd) as `avg_salary_country_wise` FROM salaries
      GROUP BY job_title, company_location) t2
  ON
  t1.job_title = t2.job_title
  WHERE avg_salary_country_wise > avg_salary;



Here INSIGHTS.

Comments

Popular posts from this blog

Unveiling the Gold Mine: The Intricate World of Data and its Transformative Impact