> ## Documentation Index
> Fetch the complete documentation index at: https://docs.powersync.com/llms.txt
> Use this file to discover all available pages before exploring further.

# Data Queries

> Write Data Queries in Sync Rules to select and filter data for buckets using bucket parameters.

Data Queries select the data that form part of a [bucket](/architecture/powersync-service#bucket-system), using the bucket [parameters](/sync/rules/overview#parameters).

Multiple Data Queries can be specified for a single [bucket definition](/sync/rules/overview#bucket-definition).

## Every Data Query Must Use Every Bucket Parameter

Data Queries are used to group data into buckets, so each Data Query must use every bucket [parameter](/sync/rules/overview#parameters).

<Accordion title="Why does each Data Query have to use every bucket parameter?">
  When PowerSync does [incremental replication](/architecture/powersync-service#initial-replication-vs-incremental-replication) of data from your source database, it evaluates every row/document received on the <Tooltip tip="Change data capture - the generic term for tracking deltas on a database. The specific mechanism depends on the source database type: Postgres logical replication, MongoDB change streams, the MySQL binlog, or SQL Server Change Data Capture">CDC stream</Tooltip>, and computes a list of [buckets](/architecture/powersync-service#bucket-system) that row/document belongs to. This allows PowerSync to efficiently update only the specific buckets that are affected by each change event received. PowerSync uses the Data Queries in the Sync Rules bucket definitions to determine which rows/documents belong to which buckets. Therefore, if it was possible for a certain bucket parameter to *not* be used in the `WHERE` clause of a Data Query, the bucket IDs to which the row/document belongs would be ambiguous — we would have to assume "all possible values" for an ambiguous parameter value in the bucket ID – and the row/document would have to be exploded into many buckets. To avoid this, PowerSync imposes the constraint that every Data Query needs to use every parameter defined on the bucket.
</Accordion>

## Supported SQL

The supported SQL in Data Queries is based on a small subset of the SQL standard syntax. Not all SQL constructs are supported. See [Supported SQL](/sync/supported-sql) for full details.

## Examples

#### Grouping by Parameter Query Values

```yaml theme={null}
bucket_definitions:
  owned_lists:
    parameters: |
        SELECT id as list_id FROM lists WHERE
           owner_id = request.user_id()
    data:
      - SELECT * FROM lists WHERE lists.id = bucket.list_id
      - SELECT * FROM todos WHERE todos.list_id = bucket.list_id
```

#### Selecting Output Columns/Fields

When specific columns/fields are selected, only those columns/fields are synced to the client.

This is good practice, to ensure the synced data does not unintentionally change when new columns are added to the schema (in the case of Postgres) or to the data structure (in the case of MongoDB).

Note: An `id` column must always be present, and must have a `text` type. If the primary key is different, use a column alias and/or transformations to output a `text` id column.

```yaml theme={null}
bucket_definitions:
  global:
    data:
      - SELECT id, name, owner_id FROM lists
```

<Note>
  MongoDB uses `_id` as the name of the ID field in collections. Therefore, PowerSync requires using `SELECT _id as id` in the data queries when [using MongoDB](/configuration/source-db/setup) as the backend source database.
</Note>

#### Renaming Columns/Fields

Different names (aliases) may be specified for columns/fields:

```yaml theme={null}
bucket_definitions:
  global:
    data:
      - SELECT id, name, created_timestamp AS created_at FROM lists
```

#### Transforming Columns/Fields

A limited set of operators and functions are available to transform the output value of columns/fields.

```yaml theme={null}
bucket_definitions:
  global:
    data:
      # Cast number to text
      - SELECT id, item_number :: text AS item_number FROM todos
      # Alternative syntax for the same cast
      - SELECT id, CAST(item_number as TEXT) AS item_number FROM todos
      # Convert binary data (bytea) to base64
      - SELECT id, base64(thumbnail) AS thumbnail_base64 FROM todos
      # Extract field from JSON or JSONB column
      - SELECT id, metadata_json ->> 'description' AS description FROM todos
      # Convert time to epoch number
      - SELECT id, unixepoch(created_at) AS created_at FROM todos
```
