title	titleSuffix	description	author	ms.author	ms.reviewer	ms.subservice	ms.topic	ms.custom	ms.date
Exists transformation in mapping data flow	Azure Data Factory & Azure Synapse	Check for existing rows using the exists transformation in Azure Data Factory and Synapse Analytics mapping data flow	kromerm	makromer	daperlov	data-flows	conceptual	synapse	01/05/2024

Exists transformation in mapping data flow

[!INCLUDEappliesto-adf-asa-md]

[!INCLUDEdata-flow-preamble]

The exists transformation is a row filtering transformation that checks whether your data exists in another source or stream. The output stream includes all rows in the left stream that either exist or don't exist in the right stream. The exists transformation is similar to SQL WHERE EXISTS and SQL WHERE NOT EXISTS.

[!VIDEO https://www.microsoft.com/en-us/videoplayer/embed/RE4vZKz]

Configuration

Choose which data stream you're checking for existence in the Right stream dropdown.
Specify whether you're looking for the data to exist or not exist in the Exist type setting.
Select whether or not your want a Custom expression.
Choose which key columns you want to compare as your exists conditions. By default, data flow looks for equality between one column in each stream. To compare via a computed value, hover over the column dropdown and select Computed column.

:::image type="content" source="media/data-flow/exists.png" alt-text="Exists settings":::

Multiple exists conditions

To compare multiple columns from each stream, add a new exists condition by clicking the plus icon next to an existing row. Each additional condition is joined by an "and" statement. Comparing two columns is the same as the following expression:

source1@column1 == source2@column1 && source1@column2 == source2@column2

Custom expression

To create a free-form expression that contains operators other than "and" and "equals to", select the Custom expression field. Enter a custom expression via the data flow expression builder by clicking on the blue box.

:::image type="content" source="media/data-flow/exists1.png" alt-text="Exists custom settings":::

If you are building dynamic patterns in your data flows by using "late binding" of columns via schema drift, you can use the byName() expression function to use the exists transformation without hardcoding (i.e. early binding) the column names. Example: toString(byName('ProductNumber','source1')) == toString(byName('ProductNumber','source2'))

Broadcast optimization

:::image type="content" source="media/data-flow/broadcast.png" alt-text="Broadcast Join":::

In joins, lookups and exists transformation, if one or both data streams fit into worker node memory, you can optimize performance by enabling Broadcasting. By default, the spark engine will automatically decide whether or not to broadcast one side. To manually choose which side to broadcast, select Fixed.

It's not recommended to disable broadcasting via the Off option unless your joins are running into timeout errors.

Data flow script

Syntax

<leftStream>, <rightStream>
    exists(
        <conditionalExpression>,
        negate: { true | false },
        broadcast: { 'auto' | 'left' | 'right' | 'both' | 'off' }
    ) ~> <existsTransformationName>

Example

The below example is an exists transformation named checkForChanges that takes left stream NameNorm2 and right stream TypeConversions. The exists condition is the expression NameNorm2@EmpID == TypeConversions@EmpID && NameNorm2@Region == DimEmployees@Region that returns true if both the EMPID and Region columns in each stream matches. As we're checking for existence, negate is false. We aren't enabling any broadcasting in the optimize tab so broadcast has value 'none'.

In the UI experience, this transformation looks like the below image:

:::image type="content" source="media/data-flow/exists-script.png" alt-text="Exists example":::

The data flow script for this transformation is in the snippet below:

NameNorm2, TypeConversions
    exists(
        NameNorm2@EmpID == TypeConversions@EmpID && NameNorm2@Region == DimEmployees@Region,
	    negate:false,
	    broadcast: 'auto'
    ) ~> checkForChanges

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

data-flow-exists.md

data-flow-exists.md

Exists transformation in mapping data flow

Configuration

Multiple exists conditions

Custom expression

Broadcast optimization

Data flow script

Syntax

Example

Related content

Files

data-flow-exists.md

Latest commit

History

data-flow-exists.md

File metadata and controls

Exists transformation in mapping data flow

Configuration

Multiple exists conditions

Custom expression

Broadcast optimization

Data flow script

Syntax

Example

Related content