Skip to main content

Introduction

The PowerSync protocol is schemaless, and not directly affected by schema changes. Replicating data from the source database to buckets may be affected by server-side changes to the schema (in the case of Postgres), and may need reprocessing in some cases. The client-side schema is just a view on top of the schemaless data. Updating this client-side schema is immediate when the new version of the app runs, with no client-side migrations required. The developer is responsible for keeping client-side schema changes backwards-compatible with older versions of client apps. PowerSync has some functionality to assist with this:
  1. Different stream queries can be applied based on connection parameters such as client version. (In Sync Rules, this uses client parameters.)
  2. Stream queries can apply simple data transformations to keep data in a format compatible with older clients, for example by aliasing or casting columns. (In Sync Rules, this is done via data query expressions.)

Client-Side Impact of Schema and Sync Config Changes

As mentioned above, the PowerSync system itself is schemaless — the client syncs any data as received, in JSON format, regardless of the data model on the client. The schema as supplied on the client is only a view on top of the schemaless data.
  1. If tables/collections not described by the client-side schema are synced, it is stored internally, but not accessible.
  2. Same applies for columns/fields not described by the client-side schema.
  3. When there is a type mismatch, SQLite’s CAST functionality is used to cast to the type described by the schema.
    1. Data is internally stored as JSON.
    2. SQLite’s CAST is used to cast values to TEXT, INTEGER or REAL.
    3. Casting between types should never error, but it may not fully represent the original data. For example, casting an arbitrary string to INTEGER will likely result in a “0” value.
    4. Full rules for casting between types are described in the SQLite documentation here.
  4. Removing a table/collection is handled on the client as if the table exists with no data.
  5. Removing a column/field is handled on the client as if the values are undefined.
Nothing in PowerSync will fail hard if there are incompatible schema changes. But depending on how the app uses the data, app logic may break. For example, removing a table/collection that the app actively uses may break workflows in the app. To avoid certain types of breaking changes on older clients, data transformations may be used — via column aliasing/casting in Sync Streams, or data query expressions in Sync Rules.

Postgres Specifics

PowerSync keeps the buckets up to date with any incremental data changes, as recorded in the Postgres WAL / received in the logical replication stream. This is also referred to as DML (Data Manipulation Language) queries. However, this does not include DDL (Data Definition Language), which includes:
  1. Creating, dropping or renaming tables.
  2. Changing replica identity of a table.
  3. Adding, dropping or renaming columns.
  4. Changing the type of a column.

Postgres schema changes affecting Sync Streams

DROP table

Dropping a table is not directly detected by PowerSync, and previous data may be preserved. To make sure the data is removed, TRUNCATE the table before dropping, or remove the table from your Sync Streams (or legacy Sync Rules).

CREATE table

The new table is detected as soon as data is inserted.

DROP and re-CREATE table

This is a special case of combining DROP and CREATE. If a dropped table is created again, and data is inserted into the new table, the schema change is detected by PowerSync. PowerSync will delete the old data in this case, as if TRUNCATE was called before dropping.

RENAME table

A renamed table is handled similarly to dropping the old table, and creating a new table with the new name. The rename is only detected when data is inserted, updated or deleted to the new table. At this point, PowerSync effectively does a TRUNCATE of the old table, and replicates the new table. This may be a slow operation if the table is large, and all other replication will be blocked until the new table is replicated.

Change REPLICA IDENTITY

The replica identity of a table is considered changed if either:
  1. The type of replica identity changes (DEFAULT, INDEX, FULL, NOTHING).
  2. The name or type of columns part of the replica identity changes.
The latter can happen if:
  1. Using REPLICA IDENTITY FULL, and any column is added, removed, renamed, or the type changed.
  2. Using REPLICA IDENTITY DEFAULT, and the type of any column in the primary key is changed.
  3. Using REPLICA IDENTITY INDEX, and the type of any column in the replica index is changed.
  4. The primary key or replica index is removed or changed.
When the replica identity changes, the entire table is re-replicated again. This may be a slow operation if the table is large, and all other replication will be blocked until the table is replicated again. Sync Streams/Sync Rules affected by schema changes will fail “soft” — an alert would be generated, but the system will continue processing changes.

Column changes

Column changes such as adding, dropping, renaming columns, or changing column types, are not automatically detected by PowerSync (unless it affects the replica identity as described above). Adding a column with a NULL default value will generally not cause issues. Existing records will have a missing value instead of NULL value, but those are generally treated the same on the client. Adding a column with a different default value, whether it’s a static or computed value, will not have this default automatically replicated for existing rows. To propagate this value, make an update to every existing row. Removing a column will not have the values automatically removed for existing rows on PowerSync. To propagate the change, make an update to every existing row. Changing a column type, and/or changing the value of a column using an ALTER TABLE statement, will not be automatically replicated to PowerSync. In some cases, the change will have no effect on PowerSync (for example changing between VARCHAR and TEXT types). When the values are expected to change, make an update to every existing row to propagate the changes.

Publication changes

A table is not replicated unless it is part of the powersync publication. If a table is added to the publication, it is treated the same as a new table, and any existing data is replicated. This may be a slow operation if the table is large, and all other replication will be blocked until the new table is replicated. There are additional changes that can be made to a table in a publication:
  1. Which operations are replicated (insert, update, delete and truncate).
  2. Which rows are replicated (row filters).
Those changes are not automatically picked up by PowerSync during replication, and can cause PowerSync to miss changes if the changes are filtered out. PowerSync will not automatically recover the data when for example removing a row filter. Use these with caution.

MongoDB Specifics

Since MongoDB is schemaless, schema changes generally do not impact PowerSync. However, adding, dropping, and renaming collections require special consideration.

Adding Collections

Sync Streams/Sync Rules can include collections that do not yet exist in the source database. These collections will be created in MongoDB when data is first inserted. PowerSync will begin replicating changes as they occur in the source database.

Dropping Collections

Due to a limitation in the replication process, dropping a collection does not immediately propagate to synced clients. To ensure the change is reflected, any additional insert, update, replace, or delete operation must be performed in any collection within a synced database.

Renaming Collections

Renaming a synced collection to a name that is not included in Sync Streams (or legacy Sync Rules) has the same effect as dropping the collection. Renaming an unsynced collection to a name that is included in your Sync/Streams/Sync Rules triggers an initial snapshot replication. The time required for this process depends on the collection size. Circular renames (e.g., renaming todostodos_oldtodos) are not directly supported. To reprocess the database after such changes, a Sync Streams/Sync Rules update must be deployed.

MySQL Specifics

MySQL support is currently in a Beta release.
PowerSync keeps the buckets up to date with any incremental data changes as recorded in the MySQL binary log. The binary log also provides DDL (Data Definition Language) query updates, which include:
  1. Creating, dropping or renaming tables.
  2. Truncating tables. (Not technically a schema change, but they appear in the query updates regardless.)
  3. Changing replica identity of a table. (Creation, deletion or modification of primary keys, unique indexes, etc.)
  4. Adding, dropping, renaming or changing the types of columns.
For MySQL, PowerSync detects schema changes by parsing the DDL queries in the binary log. It may not always be possible to parse the DDL queries correctly, especially if they are complex or use non-standard syntax. In such cases, PowerSync will ignore the schema change, but will log a warning with the schema change query. If required, the schema change would then need to be manually handled by redeploying your Sync Streams (or legacy Sync Rules). This triggers a re-replication.

MySQL schema changes affecting Sync Streams

DROP table

PowerSync will detect when a table is dropped, and automatically remove the data from the buckets.

CREATE table

Table creation is detected and handled the first time row events for the new table appear on the binary log.

TRUNCATE table

PowerSync will detect truncate statements in the binary log, and consequently remove all data from the buckets for that table.

RENAME table

A renamed table is handled similarly to dropping the old table, and then creating a new table with existing data under the new name. This may be a slow operation if the table is large, since the “new” table has to be re-replicated. Replication will be blocked until the new table is replicated.

Change REPLICA IDENTITY

The replica identity of a table is considered to be changed if either:
  1. The type of replica identity changes (DEFAULT, INDEX, FULL, NOTHING).
  2. The name or type of columns which form part of the replica identity changes.
The latter can happen if:
  1. Using REPLICA IDENTITY FULL, and any column is added, removed, renamed, or the type changed.
  2. Using REPLICA IDENTITY DEFAULT, and the type of any column in the primary key is changed.
  3. Using REPLICA IDENTITY INDEX, and the type of any column in the replica index is changed.
  4. The primary key or replica index is removed or changed.
When the replication identity changes, the entire table is replicated again. This may be a slow operation if the table is large, and all other replication will be blocked until the table is replicated again. Sync Streams/Sync Rules affected by schema changes will fail “soft” — an alert would be generated, but the system will continue processing changes.

Column changes

Column changes such as adding, dropping, renaming columns, or changing column types, are detected by PowerSync but will generally not result in re-replication. (Unless the replica identity was affected as described above). Adding a column with a NULL default value will generally not cause issues. Existing records will have a missing value instead of NULL value, but those are generally treated the same on the client. Adding a column with a different default value, whether it’s a static or computed value, will not have this default automatically replicated for existing rows. To propagate this value, make an update to every existing row. Removing a column will not have the values automatically removed for existing rows on PowerSync. To propagate the change, make an update to every existing row. Changing a column type, and/or changing the default value of a column using an ALTER TABLE statement, will not be automatically replicated to PowerSync. In some cases, the change will have no effect on PowerSync (for example, changing between VARCHAR and TEXT types). When the values are expected to change, make an update to every existing row to propagate the changes.

SQL Server Specifics

SQL Server support is currently in a Beta release.Schema change handling for SQL Server is supported from PowerSync Service v1.20.2.
SQL Server CDC replication is designed to protect downstream consumers from schema changes. This means some schema changes, like changing the data type of a primary key column are blocked on the database level if CDC is enabled on a table. Other schema changes are allowed but are not automatically propagated to the capture instance for a table. This means that the only sure-fire way to update a capture instance with the latest schema changes is to drop and recreate it, or to create a new capture instance. PowerSync will automatically detect and use the new capture instance for the source table. Note that SQL Server CDC allows a maximum of 2 capture instances per table.

Dropping and Recreating a Capture Instance

-- Disable CDC for the source table
EXEC sys.sp_cdc_disable_table
      @source_schema = N'dbo',
      @source_name = N'<YOUR_TABLE_NAME>',
      @capture_instance = 'all'; -- This drops all capture instances for the source table. If you only want to drop a specific capture instance, use the capture instance name instead of 'all'.

-- Re-enable CDC for the source table
EXEC sys.sp_cdc_enable_table
        @source_schema = N'dbo',
        @source_name   = N'<YOUR_TABLE_NAME>',
        @role_name     = N'cdc_reader',
        @supports_net_changes = 0;

Creating a New Capture Instance

-- Create a new capture instance for the source table by specifying a new capture instance name
EXEC sys.sp_cdc_enable_table
        @source_schema = N'dbo',
        @source_name   = N'<YOUR_TABLE_NAME>',
        @role_name     = N'cdc_reader',
        @supports_net_changes = 0,
        @capture_instance = N'<NEW_CAPTURE_INSTANCE_NAME>'; -- If a capture instance for the table already exists, you have to specify a different name for the new capture instance.

Supported SQL Server schema changes:

CREATE table

Table creation is automatically detected when a new capture instance for a source table that matches your Sync Streams/Sync Rules is created. The table is snapshotted before replication can resume.

DROP table

PowerSync can detect a table drop by checking for the table existence when the capture instance for a table is dropped. This only works if PowerSync is running at the time of the table drop.
PowerSync cannot detect that a table was dropped if it was dropped while the PowerSync Service was stopped. Your sync config (Sync Streams/Sync Rules) will need to be redeployed to ensure that corresponding table’s buckets are removed.

RENAME table

Renaming a table is automatically detected and results in the removal of the bucket data for the old table, followed by a snapshot of the newly renamed table. Once the snapshot is completed, replication will resume.

Column changes

Some column changes are blocked on the database level if CDC is enabled on a table. These include:
  • column renames
  • changing the primary key
  • changing the data type of the primary key column
To perform these changes, CDC needs to be disabled and then re-enabled for the table. Column changes that are not blocked:
  • adding a new column: Until the capture instance has been recreated, the new column will not be replicated.
  • dropping a column: Until the capture instance has been recreated, replicated rows will contain a NULL value for the dropped column.
  • changing the data type of a column to another compatible type: PowerSync will replicate updated rows with the new data type, but historic rows will not be updated. To propagate the changes, make an update to every existing row to propagate the changes.
PowerSync will automatically detect and warn about the above changes, but to update the capture instance with the latest schema changes, CDC needs to be disabled and then re-enabled for the table. See the Dropping and Recreating a Capture Instance section for more details.

New Capture Instance

New capture instances are automatically detected for tables in your Sync Streams/Sync Rules. If a capture instance is newer than the one currently in use, PowerSync will automatically use the new capture instance. Switching to a new capture instance requires re-snapshotting the source table before replication can resume.

Capture instance missing or deleted

If a table in your Sync Streams/Sync Rules is not enabled for CDC, the table cannot be replicated and PowerSync will log a warning. If a table in your Sync Streams/Sync Rules is being replicated and the capture instance is dropped, PowerSync logs a warning and stops replication for that table. Existing data will not be removed, but if CDC is enabled on the table again, the data will be removed and the table re-snapshotted.

See Also