tl;dr: WITH ORDINALITY and ORDER BY ordinality are your friends.

Once upon a time I could write fairly simple SQL queries, pull my data into R, and then manipulate the resulting data in R. It worked really well for me. But now I live in a world where my only two tools are PostgreSQL and Javascript, and I don’t know anyone who reaches for Javascript when she needs to reshape a bunch of tabular data.

The problem that I have right now is that each lesson is associated with an ordered list of problemIDs. But these id numbers are not meaningful to humans; the problems have other names that people know about. The lessons table has a column called problemIDs, which contains an array of the problems’ ids (in order). Meanwhile, the problems table has a column of ids and a column of the corresponding problemNames.

Something sort of like this:

lessons

idproblemIDs
100 {1, 2, 3, 4}
101 {5, 6, 7, 8}

problems

idproblemName
1 'some name'
2 'whatever'
3 'more stuff'
4 'yadda yadda'

(OK, not exactly. I have fudged the structure of things to simplify this example. Don’t even get me started on the column full of JSON that I also need to deal with in the real-life version of things.)

My goal is to get the array problemIDs as well as the corresponding array of problemNames—and to have both of these in the same order as the original array problemIDs.

That is, I want rows that look something like this:

100, {1, 2, 3, 4}, {'some name', 'whatever', 'more stuff', 'yadda yadda'}

Philosophically, what I need to do is unnest(problemIDs), then JOIN with the problems table to get the corresponding name for each problem, and then array_agg() everything back together into arrays.

The issue, of course, is that goodness knows what order the query planner is going to do things in, and there is no guarantee that my arrays are going to be in the same order as the original array problemIDs. I had been working my way around this by using subqueries and various functions, but it was all pretty awkward.

But on Friday I learned about the good news of unnest(foo) WITH ORDINALITY. Pair that with array_agg(bar ORDER BY ordinality), and the problem is solved.

SELECT lessons.id, 
array_agg("problemID" ORDER BY ordinality) AS "problemIDs", 
array_agg(problems."problemName" ORDER BY ordinality) AS "problemNames" 
FROM lessons, unnest("problemIDs") WITH ORDINALITY AS "problemID"
INNER JOIN problems
ON "problemID" = problems.id
GROUP BY lessons.id