Computations
So far we've seen how to find individual interesting rows, like posts with high scores. A lot of the queries you'll find on SEDE are of this type—find posts from low-reputation users, all comments over a certain score threshold, all self-answered questions with positive scores for both question and answer, most-edited tag wikis, and so on. But SEDE (and SQL) can also do computations and aggregation.
Consider the question "who answers my questions"—that is, who are the users on the site who tend to help me? We can start with the following query:
SELECT a.Id AS [Post Link], a.Score, au.Id AS [User Link] FROM Posts p JOIN Posts a ON p.Id = a.ParentId JOIN Users au ON a.OwnerUserId = au.Id WHERE p.PostTypeId = 1 AND p.OwnerUserId = ##UserId## ORDER BY a.Score DESCRun Query
This query finds all questions (PostTypeId =
1
) from a given user(##UserId##
)
and returns the answer links, answer scores, and answering users, ordered by answer score (highest first).
With this query we can start to see who provides the most-popular answers to somebody's questions.
But this doesn't tell us anything about overall patterns; it operates at the single-answer level. We're really more interested in patterns–which users tend to give good answers to my (or anybody else's) questions?
The next query produces some aggregate data, like this:

Here's the query:
SELECT SUM(a.Score) as 'Total Score', COUNT(au.Id) as '# Answers', au.Id AS [User Link] FROM Posts p JOIN Posts a ON p.Id = a.ParentId JOIN Users au ON a.OwnerUserId = au.Id WHERE p.PostTypeId = 1 AND p.OwnerUserId = ##UserId## -- We want one row per user: GROUP BY au.Id -- In order by total score, highest first: ORDER BY SUM(a.Score) DESCRun Query
We use the SUM
and
COUNT
functions in the
SELECT
statement to get the row values.
We want the total score (SUM
) of all
answers from each user, and we want to know how many answers there were. That's the first part of the query:
SELECT SUM(a.Score) as 'Total Score', COUNT(au.Id) as '# Answers', au.Id AS [User Link]
(Perhaps you would prefer the average score, not the sum. You can do that with
AVG
.)
The JOIN
s are the same as in the
previous query; we're linking question, answers, and answering users. The restrictions
(WHERE
) are also the same.
When reporting results we group the rows by answering user—one row per user, as opposed to one row per
answer—and we also want to show the highest score first. We used the
SUM
earlier, in the
SELECT
, but it's not actually a value
stored in the table, so we have to use it again here:
GROUP BY au.Id ORDER BY SUM(a.Score) DESC
Aggregate functions you might find helpful, in addition to SUM
and COUNT
, are
AVG
(average),
MIN
, and
MAX
.
Notice, by the way, that while the "base" table in the last two queries is Posts, we don't actually return any columns from that table. That's fine. We're using it entirely as "glue" to hold the rest of the query together. Just because you include a table doesn't mean you have to show anything from it.
Next Steps