Guide: Many-to-Many and Join Tables
Join tables are often used to implement many-to-many relationships between tables. Join queries are not directly supported in PowerSync Sync Rules, and require some workarounds depending on the use case. This guide contains some recommended strategies.
Example
As an example, consider a social media application. The app has message boards. Each user can subscribe to boards, make posts, and comment on posts. Posts may also have one or more topics.
Many-to-many: Bucket parameters
For this app, we generally want to sync all posts in boards that users have subscribed to. To simplify these examples, we assume a user has to be subscribed to a board to post.
Boards make a nice grouping of data for Sync Rules: We sync the boards that a user has subscribed to, and the same board data is synced to all users subscribed to that board.
The relationship between users and boards is a many-to-many, specified via the board_subscriptions
table.
To start with, in our PowerSync Sync Rules, we define a bucket and sync the posts. The parameter query is defined using the board_subscriptions
table:
Avoiding joins in data queries: Denormalize relationships (comments)
Next, we also want to sync comments for those boards. There is a one-to-many relationship between boards and comments, via the posts
table. This means conceptually we can add comments to the same board bucket. With general SQL, the query could be:
Unfortunately, joins are not supported in PowerSync’s Sync Rules. Instead, we denormalize the data to add a direct foreign key relationship between comments and boards: (Postgres example)
Now we can add it to the bucket definition in our Sync Rules:
Now we want to sync topics of posts. In this case we added board_id
from the start, so post_topics
is simple in our Sync Rules:
Many-to-many strategy: Sync everything (topics)
Now we need access to sync the topics for all posts synced to the device. There is a many-to-many relationship between posts and topics, and by extension boards to topics. This means there is no simple direct way to partition topics into buckets — the same topics be used on any number of boards.
If the topics table is limited in size (say 1,000 or less), the simplest solution is to just sync all topics in our Sync Rules:
Many-to-many strategy: Denormalize data (topics, user names)
If there are many thousands of topics, we may want to avoid syncing everything. One option is to denormalize the data by copying the topic label over to post_topics
: (Postgres example)
Now we don’t need to sync the topics
table itself, as everything is included in post_topics
. Assuming the topic label never or rarely changes, this could be a good solution.
Next up, we want to sync the relevant user profiles, so we can show it together with comments and posts. For simplicity, we sync profiles for all users subscribed to a board.
One option is to add the author name to each board subscription, similar to what we’ve done for topics
: (Postgres example)
Sync Rules:
Many-to-many strategy: Array of IDs (user profiles)
If we need to sync more than just the name (let’s say we need a last activity date, profile picture and bio text as well), the above approach doesn’t scale as well. Instead, we want to sync the users
table directly. To sync user profiles directly in the bucket for the board, we need a new array.
Adding an array to the schema in Postgres:
By using an array instead of or in addition to a join table, we can use it directly in Sync Rules:
This approach does require some extra effort to keep the array up to date. One option is to use a trigger in the case of Postgres:
Note that this approach does have scaling limitations. When the number of board subscriptions per user becomes large (say over 100 rows per user), then:
- Updating the
subscribed_board_ids
array in Postgres becomes slower. - The overhead is even more pronounced on PowerSync, since PowerSync maintains a separate copy of the data in each bucket.
In those cases, another approach may be more suitable.