Join tables are often used to implement many-to-many relationships between tables. Join queries are not directly supported in PowerSync Sync Rules, and require some workarounds depending on the use case. This guide contains some recommended strategies.
As an example, consider a social media application. The app has message boards. Each user can subscribe to boards, make posts, and comment on posts. Posts may also have one or more topics.
For this app, we generally want to sync all posts in boards that users have subscribed to. To simplify these examples, we assume a user has to be subscribed to a board to post.
Boards make a nice grouping of data for Sync Rules: We sync the boards that a user has subscribed to, and the same board data is synced to all users subscribed to that board.
The relationship between users and boards is a many-to-many, specified via the board_subscriptions table.
To start with, in our PowerSync Sync Rules, we define a bucket and sync the posts. The parameter query is defined using the board_subscriptions table:
board_data:parameters: select board_id from board_subscriptions where user_id = request.user_id() data:- select * from posts where board_id = bucket.board_id
Avoiding joins in data queries: Denormalize relationships (comments)
Next, we also want to sync comments for those boards. There is a one-to-many relationship between boards and comments, via the posts table. This means conceptually we can add comments to the same board bucket. With general SQL, the query could be:
Unfortunately, joins are not supported in PowerSync’s Sync Rules. Instead, we denormalize the data to add a direct foreign key relationship between comments and boards: (Postgres example)
Now we can add it to the bucket definition in our Sync Rules:
board_data:parameters: select board_id from board_subscriptions where user_id = request.user_id() data:- select * from posts where board_id = bucket.board_id# Add comments:- select * from comments where board_id = bucket.board_id
Now we want to sync topics of posts. In this case we added board_id from the start, so post_topics is simple in our Sync Rules:
board_data:parameters: select board_id from board_subscriptions where user_id = request.user_id() data:- select * from posts where board_id = bucket.board_id- select * from comments where board_id = bucket.board_id# Add post_topics:- select * from post_topics where board_id = bucket.board_id
Now we need access to sync the topics for all posts synced to the device. There is a many-to-many relationship between posts and topics, and by extension boards to topics. This means there is no simple direct way to partition topics into buckets — the same topics be used on any number of boards.
If the topics table is limited in size (say 1,000 or less), the simplest solution is to just sync all topics in our Sync Rules:
global_topics: data:- select * from topics where board_id = bucket.board_id
Many-to-many strategy: Denormalize data (topics, user names)
If there are many thousands of topics, we may want to avoid syncing everything. One option is to denormalize the data by copying the topic label over to post_topics: (Postgres example)
Now we don’t need to sync the topics table itself, as everything is included in post_topics. Assuming the topic label never or rarely changes, this could be a good solution.
Next up, we want to sync the relevant user profiles, so we can show it together with comments and posts. For simplicity, we sync profiles for all users subscribed to a board.
One option is to add the author name to each board subscription, similar to what we’ve done for topics: (Postgres example)
board_data:parameters: select board_id from board_subscriptions where user_id = request.user_id() data:- select * from posts where board_id = bucket.board_id- select * from comments where board_id = bucket.board_id- select * from post_topics where board_id = bucket.board_id# Add subscriptions which include the names:- select * from board_subscriptions where board_id = bucket.board_id
Many-to-many strategy: Array of IDs (user profiles)
If we need to sync more than just the name (let’s say we need a last activity date, profile picture and bio text as well), the above approach doesn’t scale as well. Instead, we want to sync the users table directly. To sync user profiles directly in the bucket for the board, we need a new array.
By using an array instead of or in addition to a join table, we can use it directly in Sync Rules:
board_data:parameters: select board_id from board_subscriptions where user_id = request.user_id() data:- select * from posts where board_id = bucket.board_id- select * from comments where board_id = bucket.board_id- select * from post_topics where board_id = bucket.board_id# Add participating users:- select name, last_activity, profile_picture, bio from users where bucket.board_id in subscribed_board_ids
This approach does require some extra effort to keep the array up to date. One option is to use a trigger in the case of Postgres:
CREATEORREPLACEFUNCTION recalculate_subscribed_boards()RETURNSTRIGGERAS $$BEGIN-- Recalculate subscribed_board_ids for the affected userUPDATE usersSET subscribed_board_ids =(SELECT array_agg(board_id)FROM board_subscriptionsWHERE user_id =COALESCE(NEW.user_id, OLD.user_id))WHERE id =COALESCE(NEW.user_id, OLD.user_id);RETURNNULL;END;$$ LANGUAGE plpgsql;CREATETRIGGER trg_board_subscriptions_changeAFTERINSERTORUPDATEORDELETEON board_subscriptionsFOR EACH ROWEXECUTEFUNCTION recalculate_subscribed_boards();
Note that this approach does have scaling limitations. When the number of board subscriptions per user becomes large (say over 100 rows per user), then:
Updating the subscribed_board_ids array in Postgres becomes slower.
The overhead is even more pronounced on PowerSync, since PowerSync maintains a separate copy of the data in each bucket.
In those cases, another approach may be more suitable.