LogoInterview Master

SQL Learning

SQL Aggregations and Grouping

Turning Data into Insights with SQL

Welcome to the core of SQL data analysis: aggregation functions and grouping operations. These powerful tools allow you to condense vast amounts of raw data into meaningful summaries, enabling you to uncover trends, calculate key metrics, and answer critical business questions efficiently.

Think of SQL aggregations like a skilled analyst who can quickly summarize market trends, customer behavior, or product performance. Instead of sifting through individual records, you use functions like COUNT, SUM, AVG, MIN, and MAX, combined with GROUP BY, to get the big picture.

Why Master SQL Aggregations?

  • Data analysis: Extract key statistics (totals, averages, counts) from large datasets.
  • Business intelligence: Create summary reports and dashboards that drive strategic decisions.
  • Technical interviews: Aggregation and grouping questions are fundamental in data-focused roles.
  • Performance optimization: Aggregate data at the database level, reducing load on applications.
  • Data exploration: Quickly understand the distribution and characteristics of your data.

In this section, we'll explore each essential aggregation function and grouping clause with practical examples using a hypothetical TokTuk dataset. You'll learn how to move beyond simple data retrieval to performing sophisticated analysis directly within your database.

Essential SQL Aggregation and Grouping Tools

Let's dive into the building blocks of SQL data analysis. Each function and clause serves a distinct purpose in transforming raw data into actionable intelligence:

COUNT

Counts the number of rows or non-null values. Ideal for answering questions like "How many users signed up last month?" or "How many videos received comments?". Use COUNT(*) for total rows or COUNT(column) to count non-null values in a specific column. COUNT(DISTINCT column) counts unique non-null values.

-- Count total number of users
SELECT COUNT(*) AS total_users FROM Users;

-- Count videos with titles (non-null check)
SELECT COUNT(title) AS videos_with_titles FROM Videos;

-- Count unique users who interacted
SELECT COUNT(DISTINCT user_id) AS unique_interacting_users
FROM Interactions;
Learn more about COUNT

SUM

Calculates the total sum of values in a numeric column. Use this to find metrics like total views across all videos, total revenue generated, or total time spent watching content.

-- Calculate total views across all TokTuk videos
SELECT SUM(views) AS total_platform_views FROM Videos;

-- Sum views for videos uploaded in 2023
SELECT SUM(views) AS total_2023_views
FROM Videos
WHERE SUBSTR(upload_date, 1, 4) = '2023';
Learn more about SUM

AVG

Computes the average (mean) value of a numeric column. Answers questions like "What's the average number of views per TokTuk video?" or "What's the average number of likes per user?". NULL values are typically ignored in the calculation.

-- Find average views per video on TokTuk
SELECT AVG(views) AS average_video_views FROM Videos;

-- Calculate average views per video for a specific user
SELECT AVG(views) AS user_avg_views
FROM Videos
WHERE user_id = 123;
Learn more about AVG

MIN and MAX

Find the minimum (smallest) and maximum (largest) values in a column. Useful for identifying extremes: the most/least viewed video, the earliest/latest signup date, the highest/lowest interaction count. Works on numeric, date, and text columns.

-- Find the highest and lowest view counts
SELECT
  MIN(views) AS least_viewed,
  MAX(views) AS most_viewed
FROM Videos;

-- Get the date range of user signups
SELECT
  MIN(join_date) AS first_user_joined,
  MAX(join_date) AS latest_user_joined
FROM Users;
Learn more about MIN/MAX

GROUP BY

Groups rows that share the same values in one or more specified columns into a summary row. Aggregate functions (COUNT, SUM, AVG, etc.) are then applied to each group independently. This is crucial for segmented analysis, like finding the total views per user or the number of interactions per video category.

-- Count videos uploaded by each user
SELECT
  user_id,
  COUNT(*) AS videos_uploaded
FROM Videos
GROUP BY user_id
ORDER BY videos_uploaded DESC;

-- Calculate average views by video upload month
SELECT
  SUBSTR(upload_date, 1, 7) AS upload_month, -- Extracts 'YYYY-MM'
  AVG(views) AS average_views,
  COUNT(*) AS videos_in_month
FROM Videos
GROUP BY upload_month
ORDER BY upload_month;
Learn more about GROUP BY

HAVING

Filters the results of a GROUP BY query based on the aggregated values. While the WHERE clause filters rows *before* grouping, HAVING filters the groups *after* aggregation. Use this to answer questions like "Which users have uploaded more than 5 videos?" or "Which video categories have an average view count above 1000?".

-- Find users who have interacted (liked, commented, etc.) 3+ times
SELECT
  user_id,
  COUNT(*) AS interaction_count
FROM Interactions
GROUP BY user_id
HAVING COUNT(*) >= 3
ORDER BY interaction_count DESC;

-- Find video categories where the average views exceed the overall average
SELECT
  category,
  AVG(views) AS avg_category_views
FROM Videos
GROUP BY category
HAVING AVG(views) > (SELECT AVG(views) FROM Videos) -- Subquery for overall average
ORDER BY avg_category_views DESC;
Learn more about HAVING

CASE WHEN

Adds conditional logic (if-then-else) within your SQL query. While not strictly an aggregation function, it's incredibly useful when combined with GROUP BY. You can use CASE WHEN to categorize data before aggregation (e.g., grouping videos into "Short", "Medium", "Long" based on duration) or to perform conditional aggregation (e.g., counting only "like" interactions).

-- Categorize videos by view count and count videos in each category
SELECT
  CASE
    WHEN views < 1000 THEN 'Low Views'
    WHEN views BETWEEN 1000 AND 10000 THEN 'Medium Views'
    ELSE 'High Views'
  END AS view_category,
  COUNT(*) AS video_count,
  AVG(views) AS avg_views_in_category
FROM Videos
GROUP BY view_category;

-- Count likes vs comments per video
SELECT
  video_id,
  COUNT(CASE WHEN interaction_type = 'like' THEN 1 END) AS like_count,
  COUNT(CASE WHEN interaction_type = 'comment' THEN 1 END) AS comment_count
FROM Interactions
GROUP BY video_id;
Learn more about CASE WHEN

Next Steps

You now have a solid overview of SQL"s core aggregation and grouping capabilities. The best way to master these concepts is through practice.

  • Explore each function in detail using the links provided above.
  • Try praciting by modifying the code editor above.
  • Review common interview questions involving these functions.
  • Combine aggregations with Joins and Subqueries for more complex analysis.

Loading SQL editor...