
Improve sync plugin supported CSV python script #919

Merged: 5 commits into NVIDIA:dev on Apr 12, 2024

Conversation

cindyyuanjiang

Fixes #914

Changes

  1. Rows that exist in the tools CSV but not on the plugin side are kept in the final output.
  2. After generating the report, the dataframe results are post-processed so that rows with "S" in the "Supported" column have "None" in the "Notes" column of the final output.
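The two changes above can be sketched with pandas. This is a minimal illustration, not the script's actual code: the column names ("Exec", "Supported", "Notes"), the merge key, and the sample data are all hypothetical.

```python
import pandas as pd

# Hypothetical schema and data; the real supported-CSV files have more columns.
plugin_df = pd.DataFrame({
    "Exec": ["FilterExec", "ProjectExec"],
    "Supported": ["S", "PS"],
    "Notes": ["stale note", "partial support"],
})
tools_df = pd.DataFrame({
    "Exec": ["FilterExec", "ProjectExec", "CustomToolsExec"],
    "Supported": ["S", "PS", "S"],
    "Notes": ["stale note", "partial support", "added by tools"],
})

def sync(tools_df: pd.DataFrame, plugin_df: pd.DataFrame) -> pd.DataFrame:
    # Change 1: combine plugin rows over the tools rows so execs that exist
    # only on the tools side survive the sync; plugin values win where both
    # sides have the same exec.
    merged = (plugin_df.set_index("Exec")
              .combine_first(tools_df.set_index("Exec"))
              .reset_index())
    # Change 2: rows marked fully supported ("S") need no explanatory note,
    # so their "Notes" value is normalized to the literal string "None".
    merged.loc[merged["Supported"] == "S", "Notes"] = "None"
    return merged

result = sync(tools_df, plugin_df)
print(result)
```

Note that `combine_first` keeps the union of the two indexes, which is exactly why the tools-only row (`CustomToolsExec` here) is no longer dropped.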

@cindyyuanjiang cindyyuanjiang added bug Something isn't working core_tools Scope the core module (scala) labels Apr 9, 2024
@cindyyuanjiang cindyyuanjiang self-assigned this Apr 9, 2024
Signed-off-by: cindyyuanjiang <[email protected]>

@amahussein amahussein left a comment


Thanks @cindyyuanjiang

I have a question.
I tested the new changes on the most recent code and got the report below.

In the generated files, the "Notes" value for the InMemoryTableScanExec row actually became None, which is the correct value. Still, the report does not list it as one of the changes that were made.
Is this intentional, or was that corner case missed?

**supportedExecs.csv (FROM TOOLS TO PLUGIN)**
Row is removed: MapInArrowExec, S, None, Input/Output, S, S, S, S, S, S, S, S, PS, S, NS, NS, NS, NS, PS, NS, PS, NS, NS, NS

**supportedExprs.csv (FROM TOOLS TO PLUGIN)**
Row is removed: EphemeralSubstring, S, `substr`; `substring`, None, project, str, NA, NA, NA, NA, NA, NA, NA, NA, NA, S, NA, NA, NS, NA, NA, NA, NA, NA, NS, NS
Row is removed: EphemeralSubstring, S, `substr`; `substring`, None, project, pos, NA, NA, NA, S, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NS, NS
Row is removed: EphemeralSubstring, S, `substr`; `substring`, None, project, len, NA, NA, NA, S, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NS, NS
Row is removed: EphemeralSubstring, S, `substr`; `substring`, None, project, result, NA, NA, NA, NA, NA, NA, NA, NA, NA, S, NA, NA, NS, NA, NA, NA, NA, NA, NS, NS

@cindyyuanjiang

cindyyuanjiang commented Apr 10, 2024

> I have a question. I tested the new changes on the most recent code and got the report below.
>
> In the generated files, the "Notes" value for the InMemoryTableScanExec row actually became None, which is the correct value. Still, the report does not list it as one of the changes that were made. Is this intentional, or was that corner case missed?

Thanks @amahussein!

I didn't add this change to the report, but I added a note:

> 3. The "Notes" column for rows with "S" for "Supported" will be updated to "None" in the final output.

I was thinking of the report as documentation of changes "from tools to plugin", without our own post-processing info. I can update this if we want to include those changes in the report.

@amahussein

>> I have a question. I tested the new changes on the most recent code and got the report below.
>>
>> In the generated files, the "Notes" value for the InMemoryTableScanExec row actually became None, which is the correct value. Still, the report does not list it as one of the changes that were made. Is this intentional, or was that corner case missed?
>
> Thanks @amahussein!
>
> I didn't add this change to the report, but I added a note: 3. The "Notes" column for rows with "S" for "Supported" will be updated to "None" in the final output.
>
> I was thinking of the report as documentation of changes "from tools to plugin", without our own post-processing info. I can update this if we want to include those changes in the report.

Hmm, isn't the report supposed to document the changes going into tools?
In other words, it shows the difference between the newly generated files and the CSV files in tools.
That way, we can tell what changes are being introduced by the new sync. It is like a "diff", but a non-textual one.
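That "non-textual diff" idea can be sketched as follows: compare the old and new dataframes keyed on the operator name, and report removed, added, and changed rows. This is a hypothetical illustration; the column names and report wording are assumptions loosely modeled on the report excerpts in this thread.

```python
import pandas as pd

# Hypothetical before/after data, keyed by a made-up "Exec" column.
old = pd.DataFrame({"Exec": ["A", "B", "C"], "Notes": ["n1", "old", "n3"]})
new = pd.DataFrame({"Exec": ["A", "B", "D"], "Notes": ["n1", "None", "n4"]})

def diff_report(old: pd.DataFrame, new: pd.DataFrame, key: str = "Exec") -> list[str]:
    lines = []
    old_i, new_i = old.set_index(key), new.set_index(key)
    # Rows present before but gone after the sync.
    for k in old_i.index.difference(new_i.index):
        lines.append(f"Row is removed: {k}")
    # Rows introduced by the sync.
    for k in new_i.index.difference(old_i.index):
        lines.append(f"Row is added: {k}")
    # Rows present on both sides: report per-column value changes.
    for k in old_i.index.intersection(new_i.index):
        if not old_i.loc[k].equals(new_i.loc[k]):
            lines.append(f"Row is changed: {k}")
            for col in old_i.columns:
                if old_i.loc[k, col] != new_i.loc[k, col]:
                    lines.append(f"    {col}: {old_i.loc[k, col]} -> {new_i.loc[k, col]}")
    return lines

report = diff_report(old, new)
print("\n".join(report))
```

Comparing row values rather than raw text lines is what makes the report robust to column reordering or whitespace, unlike a plain textual `diff`.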

@cindyyuanjiang

> Hmm, isn't the report supposed to document the changes going into tools? In other words, it shows the difference between the newly generated files and the CSV files in tools. That way, we can tell what changes are being introduced by the new sync. It is like a "diff", but a non-textual one.

Thanks @amahussein! I have updated the script to reflect the changes to the report.


@amahussein amahussein left a comment


Thanks @cindyyuanjiang !

I like it better now.
It now reports several rows with updated "Notes" that we could not have seen before.

**supportedDataSource.csv (FROM TOOLS TO PLUGIN)**

**supportedExecs.csv (FROM TOOLS TO PLUGIN)**
Row is changed: InMemoryTableScanExec, S, This is disabled by default because there could be complications when using it with AQE with Spark-3.5.0 and Spark-3.5.1. For more details please check https://github.com/NVIDIA/spark-rapids/issues/10603, Input/Output, S, S, S, S, S, S, S, S, PS, S, S, NS, NS, NS, PS, PS, PS, NS, S, S
    Notes: This is disabled by default because there could be complications when using it with AQE with Spark-3.5.0 and Spark-3.5.1. For more details please check https://github.com/NVIDIA/spark-rapids/issues/10603 -> None
Row is removed: MapInArrowExec, S, None, Input/Output, S, S, S, S, S, S, S, S, PS, S, NS, NS, NS, NS, PS, NS, PS, NS, NS, NS

**supportedExprs.csv (FROM TOOLS TO PLUGIN)**
Row is changed: ArrayExcept, S, `array_except`, This is not 100% compatible with the Spark version because the GPU implementation treats -0.0 and 0.0 as equal; but the CPU implementation currently does not (see SPARK-39845). Also; Apache Spark 3.1.3 fixed issue SPARK-36741 where NaNs in these set like operators were not treated as being equal. We have chosen to break with compatibility for the older versions of Spark in this instance and handle NaNs the same as 3.1.3+, project, array1, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, PS, NA, NA, NA, NS, NS
    Notes: This is not 100% compatible with the Spark version because the GPU implementation treats -0.0 and 0.0 as equal; but the CPU implementation currently does not (see SPARK-39845). Also; Apache Spark 3.1.3 fixed issue SPARK-36741 where NaNs in these set like operators were not treated as being equal. We have chosen to break with compatibility for the older versions of Spark in this instance and handle NaNs the same as 3.1.3+ -> None
Row is changed: ArrayExcept, S, `array_except`, This is not 100% compatible with the Spark version because the GPU implementation treats -0.0 and 0.0 as equal; but the CPU implementation currently does not (see SPARK-39845). Also; Apache Spark 3.1.3 fixed issue SPARK-36741 where NaNs in these set like operators were not treated as being equal. We have chosen to break with compatibility for the older versions of Spark in this instance and handle NaNs the same as 3.1.3+, project, array2, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, PS, NA, NA, NA, NS, NS
    Notes: This is not 100% compatible with the Spark version because the GPU implementation treats -0.0 and 0.0 as equal; but the CPU implementation currently does not (see SPARK-39845). Also; Apache Spark 3.1.3 fixed issue SPARK-36741 where NaNs in these set like operators were not treated as being equal. We have chosen to break with compatibility for the older versions of Spark in this instance and handle NaNs the same as 3.1.3+ -> None
Row is changed: ArrayExcept, S, `array_except`, This is not 100% compatible with the Spark version because the GPU implementation treats -0.0 and 0.0 as equal; but the CPU implementation currently does not (see SPARK-39845). Also; Apache Spark 3.1.3 fixed issue SPARK-36741 where NaNs in these set like operators were not treated as being equal. We have chosen to break with compatibility for the older versions of Spark in this instance and handle NaNs the same as 3.1.3+, project, result, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, PS, NA, NA, NA, NS, NS
    Notes: This is not 100% compatible with the Spark version because the GPU implementation treats -0.0 and 0.0 as equal; but the CPU implementation currently does not (see SPARK-39845). Also; Apache Spark 3.1.3 fixed issue SPARK-36741 where NaNs in these set like operators were not treated as being equal. We have chosen to break with compatibility for the older versions of Spark in this instance and handle NaNs the same as 3.1.3+ -> None
Row is changed: ArrayIntersect, S, `array_intersect`, This is not 100% compatible with the Spark version because the GPU implementation treats -0.0 and 0.0 as equal; but the CPU implementation currently does not (see SPARK-39845). Also; Apache Spark 3.1.3 fixed issue SPARK-36741 where NaNs in these set like operators were not treated as being equal. We have chosen to break with compatibility for the older versions of Spark in this instance and handle NaNs the same as 3.1.3+, project, array1, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, PS, NA, NA, NA, NS, NS
    Notes: This is not 100% compatible with the Spark version because the GPU implementation treats -0.0 and 0.0 as equal; but the CPU implementation currently does not (see SPARK-39845). Also; Apache Spark 3.1.3 fixed issue SPARK-36741 where NaNs in these set like operators were not treated as being equal. We have chosen to break with compatibility for the older versions of Spark in this instance and handle NaNs the same as 3.1.3+ -> None
Row is changed: ArrayIntersect, S, `array_intersect`, This is not 100% compatible with the Spark version because the GPU implementation treats -0.0 and 0.0 as equal; but the CPU implementation currently does not (see SPARK-39845). Also; Apache Spark 3.1.3 fixed issue SPARK-36741 where NaNs in these set like operators were not treated as being equal. We have chosen to break with compatibility for the older versions of Spark in this instance and handle NaNs the same as 3.1.3+, project, array2, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, PS, NA, NA, NA, NS, NS
    Notes: This is not 100% compatible with the Spark version because the GPU implementation treats -0.0 and 0.0 as equal; but the CPU implementation currently does not (see SPARK-39845). Also; Apache Spark 3.1.3 fixed issue SPARK-36741 where NaNs in these set like operators were not treated as being equal. We have chosen to break with compatibility for the older versions of Spark in this instance and handle NaNs the same as 3.1.3+ -> None
Row is changed: ArrayIntersect, S, `array_intersect`, This is not 100% compatible with the Spark version because the GPU implementation treats -0.0 and 0.0 as equal; but the CPU implementation currently does not (see SPARK-39845). Also; Apache Spark 3.1.3 fixed issue SPARK-36741 where NaNs in these set like operators were not treated as being equal. We have chosen to break with compatibility for the older versions of Spark in this instance and handle NaNs the same as 3.1.3+, project, result, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, PS, NA, NA, NA, NS, NS
    Notes: This is not 100% compatible with the Spark version because the GPU implementation treats -0.0 and 0.0 as equal; but the CPU implementation currently does not (see SPARK-39845). Also; Apache Spark 3.1.3 fixed issue SPARK-36741 where NaNs in these set like operators were not treated as being equal. We have chosen to break with compatibility for the older versions of Spark in this instance and handle NaNs the same as 3.1.3+ -> None
Row is changed: ArrayUnion, S, `array_union`, This is not 100% compatible with the Spark version because the GPU implementation treats -0.0 and 0.0 as equal; but the CPU implementation currently does not (see SPARK-39845). Also; Apache Spark 3.1.3 fixed issue SPARK-36741 where NaNs in these set like operators were not treated as being equal. We have chosen to break with compatibility for the older versions of Spark in this instance and handle NaNs the same as 3.1.3+, project, array1, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, PS, NA, NA, NA, NS, NS
    Notes: This is not 100% compatible with the Spark version because the GPU implementation treats -0.0 and 0.0 as equal; but the CPU implementation currently does not (see SPARK-39845). Also; Apache Spark 3.1.3 fixed issue SPARK-36741 where NaNs in these set like operators were not treated as being equal. We have chosen to break with compatibility for the older versions of Spark in this instance and handle NaNs the same as 3.1.3+ -> None
Row is changed: ArrayUnion, S, `array_union`, This is not 100% compatible with the Spark version because the GPU implementation treats -0.0 and 0.0 as equal; but the CPU implementation currently does not (see SPARK-39845). Also; Apache Spark 3.1.3 fixed issue SPARK-36741 where NaNs in these set like operators were not treated as being equal. We have chosen to break with compatibility for the older versions of Spark in this instance and handle NaNs the same as 3.1.3+, project, array2, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, PS, NA, NA, NA, NS, NS
    Notes: This is not 100% compatible with the Spark version because the GPU implementation treats -0.0 and 0.0 as equal; but the CPU implementation currently does not (see SPARK-39845). Also; Apache Spark 3.1.3 fixed issue SPARK-36741 where NaNs in these set like operators were not treated as being equal. We have chosen to break with compatibility for the older versions of Spark in this instance and handle NaNs the same as 3.1.3+ -> None
Row is changed: ArrayUnion, S, `array_union`, This is not 100% compatible with the Spark version because the GPU implementation treats -0.0 and 0.0 as equal; but the CPU implementation currently does not (see SPARK-39845). Also; Apache Spark 3.1.3 fixed issue SPARK-36741 where NaNs in these set like operators were not treated as being equal. We have chosen to break with compatibility for the older versions of Spark in this instance and handle NaNs the same as 3.1.3+, project, result, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, PS, NA, NA, NA, NS, NS
    Notes: This is not 100% compatible with the Spark version because the GPU implementation treats -0.0 and 0.0 as equal; but the CPU implementation currently does not (see SPARK-39845). Also; Apache Spark 3.1.3 fixed issue SPARK-36741 where NaNs in these set like operators were not treated as being equal. We have chosen to break with compatibility for the older versions of Spark in this instance and handle NaNs the same as 3.1.3+ -> None
Row is changed: ArraysOverlap, S, `arrays_overlap`, This is not 100% compatible with the Spark version because the GPU implementation treats -0.0 and 0.0 as equal; but the CPU implementation currently does not (see SPARK-39845). Also; Apache Spark 3.1.3 fixed issue SPARK-36741 where NaNs in these set like operators were not treated as being equal. We have chosen to break with compatibility for the older versions of Spark in this instance and handle NaNs the same as 3.1.3+, project, array1, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, PS, NA, NA, NA, NS, NS
    Notes: This is not 100% compatible with the Spark version because the GPU implementation treats -0.0 and 0.0 as equal; but the CPU implementation currently does not (see SPARK-39845). Also; Apache Spark 3.1.3 fixed issue SPARK-36741 where NaNs in these set like operators were not treated as being equal. We have chosen to break with compatibility for the older versions of Spark in this instance and handle NaNs the same as 3.1.3+ -> None
Row is changed: ArraysOverlap, S, `arrays_overlap`, This is not 100% compatible with the Spark version because the GPU implementation treats -0.0 and 0.0 as equal; but the CPU implementation currently does not (see SPARK-39845). Also; Apache Spark 3.1.3 fixed issue SPARK-36741 where NaNs in these set like operators were not treated as being equal. We have chosen to break with compatibility for the older versions of Spark in this instance and handle NaNs the same as 3.1.3+, project, array2, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, PS, NA, NA, NA, NS, NS
    Notes: This is not 100% compatible with the Spark version because the GPU implementation treats -0.0 and 0.0 as equal; but the CPU implementation currently does not (see SPARK-39845). Also; Apache Spark 3.1.3 fixed issue SPARK-36741 where NaNs in these set like operators were not treated as being equal. We have chosen to break with compatibility for the older versions of Spark in this instance and handle NaNs the same as 3.1.3+ -> None
Row is changed: ArraysOverlap, S, `arrays_overlap`, This is not 100% compatible with the Spark version because the GPU implementation treats -0.0 and 0.0 as equal; but the CPU implementation currently does not (see SPARK-39845). Also; Apache Spark 3.1.3 fixed issue SPARK-36741 where NaNs in these set like operators were not treated as being equal. We have chosen to break with compatibility for the older versions of Spark in this instance and handle NaNs the same as 3.1.3+, project, result, S, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NS, NS
    Notes: This is not 100% compatible with the Spark version because the GPU implementation treats -0.0 and 0.0 as equal; but the CPU implementation currently does not (see SPARK-39845). Also; Apache Spark 3.1.3 fixed issue SPARK-36741 where NaNs in these set like operators were not treated as being equal. We have chosen to break with compatibility for the older versions of Spark in this instance and handle NaNs the same as 3.1.3+ -> None
Row is changed: InitCap, S, `initcap`, This is not 100% compatible with the Spark version because the Unicode version used by cuDF and the JVM may differ; resulting in some corner-case characters not changing case correctly., project, input, NA, NA, NA, NA, NA, NA, NA, NA, NA, S, NA, NA, NA, NA, NA, NA, NA, NA, NS, NS
    Notes: This is not 100% compatible with the Spark version because the Unicode version used by cuDF and the JVM may differ; resulting in some corner-case characters not changing case correctly. -> None
Row is changed: InitCap, S, `initcap`, This is not 100% compatible with the Spark version because the Unicode version used by cuDF and the JVM may differ; resulting in some corner-case characters not changing case correctly., project, result, NA, NA, NA, NA, NA, NA, NA, NA, NA, S, NA, NA, NA, NA, NA, NA, NA, NA, NS, NS
    Notes: This is not 100% compatible with the Spark version because the Unicode version used by cuDF and the JVM may differ; resulting in some corner-case characters not changing case correctly. -> None
Row is changed: Lower, S, `lcase`; `lower`, This is not 100% compatible with the Spark version because the Unicode version used by cuDF and the JVM may differ; resulting in some corner-case characters not changing case correctly., project, input, NA, NA, NA, NA, NA, NA, NA, NA, NA, S, NA, NA, NA, NA, NA, NA, NA, NA, NS, NS
    Notes: This is not 100% compatible with the Spark version because the Unicode version used by cuDF and the JVM may differ; resulting in some corner-case characters not changing case correctly. -> None
Row is changed: Lower, S, `lcase`; `lower`, This is not 100% compatible with the Spark version because the Unicode version used by cuDF and the JVM may differ; resulting in some corner-case characters not changing case correctly., project, result, NA, NA, NA, NA, NA, NA, NA, NA, NA, S, NA, NA, NA, NA, NA, NA, NA, NA, NS, NS
    Notes: This is not 100% compatible with the Spark version because the Unicode version used by cuDF and the JVM may differ; resulting in some corner-case characters not changing case correctly. -> None
Row is changed: StringTranslate, S, `translate`, This is not 100% compatible with the Spark version because the GPU implementation supports all unicode code points. In Spark versions < 3.2.0; translate() does not support unicode characters with code point >= U+10000 (See SPARK-34094), project, input, NA, NA, NA, NA, NA, NA, NA, NA, NA, S, NA, NA, NA, NA, NA, NA, NA, NA, NS, NS
    Notes: This is not 100% compatible with the Spark version because the GPU implementation supports all unicode code points. In Spark versions < 3.2.0; translate() does not support unicode characters with code point >= U+10000 (See SPARK-34094) -> None
Row is changed: StringTranslate, S, `translate`, This is not 100% compatible with the Spark version because the GPU implementation supports all unicode code points. In Spark versions < 3.2.0; translate() does not support unicode characters with code point >= U+10000 (See SPARK-34094), project, from, NA, NA, NA, NA, NA, NA, NA, NA, NA, PS, NA, NA, NA, NA, NA, NA, NA, NA, NS, NS
    Notes: This is not 100% compatible with the Spark version because the GPU implementation supports all unicode code points. In Spark versions < 3.2.0; translate() does not support unicode characters with code point >= U+10000 (See SPARK-34094) -> None
Row is changed: StringTranslate, S, `translate`, This is not 100% compatible with the Spark version because the GPU implementation supports all unicode code points. In Spark versions < 3.2.0; translate() does not support unicode characters with code point >= U+10000 (See SPARK-34094), project, to, NA, NA, NA, NA, NA, NA, NA, NA, NA, PS, NA, NA, NA, NA, NA, NA, NA, NA, NS, NS
    Notes: This is not 100% compatible with the Spark version because the GPU implementation supports all unicode code points. In Spark versions < 3.2.0; translate() does not support unicode characters with code point >= U+10000 (See SPARK-34094) -> None
Row is changed: StringTranslate, S, `translate`, This is not 100% compatible with the Spark version because the GPU implementation supports all unicode code points. In Spark versions < 3.2.0; translate() does not support unicode characters with code point >= U+10000 (See SPARK-34094), project, result, NA, NA, NA, NA, NA, NA, NA, NA, NA, S, NA, NA, NA, NA, NA, NA, NA, NA, NS, NS
    Notes: This is not 100% compatible with the Spark version because the GPU implementation supports all unicode code points. In Spark versions < 3.2.0; translate() does not support unicode characters with code point >= U+10000 (See SPARK-34094) -> None
Row is changed: Upper, S, `ucase`; `upper`, This is not 100% compatible with the Spark version because the Unicode version used by cuDF and the JVM may differ; resulting in some corner-case characters not changing case correctly., project, input, NA, NA, NA, NA, NA, NA, NA, NA, NA, S, NA, NA, NA, NA, NA, NA, NA, NA, NS, NS
    Notes: This is not 100% compatible with the Spark version because the Unicode version used by cuDF and the JVM may differ; resulting in some corner-case characters not changing case correctly. -> None
Row is changed: Upper, S, `ucase`; `upper`, This is not 100% compatible with the Spark version because the Unicode version used by cuDF and the JVM may differ; resulting in some corner-case characters not changing case correctly., project, result, NA, NA, NA, NA, NA, NA, NA, NA, NA, S, NA, NA, NA, NA, NA, NA, NA, NA, NS, NS
    Notes: This is not 100% compatible with the Spark version because the Unicode version used by cuDF and the JVM may differ; resulting in some corner-case characters not changing case correctly. -> None
Row is changed: ApproximatePercentile, S, `approx_percentile`; `percentile_approx`, This is not 100% compatible with the Spark version because the GPU implementation of approx_percentile is not bit-for-bit compatible with Apache Spark, aggregation, input, NA, S, S, S, S, S, S, NS, NS, NA, S, NA, NA, NA, NA, NA, NA, NA, NS, NS
    Notes: This is not 100% compatible with the Spark version because the GPU implementation of approx_percentile is not bit-for-bit compatible with Apache Spark -> None
Row is changed: ApproximatePercentile, S, `approx_percentile`; `percentile_approx`, This is not 100% compatible with the Spark version because the GPU implementation of approx_percentile is not bit-for-bit compatible with Apache Spark, aggregation, percentage, NA, NA, NA, NA, NA, NA, S, NA, NA, NA, NA, NA, NA, NA, S, NA, NA, NA, NS, NS
    Notes: This is not 100% compatible with the Spark version because the GPU implementation of approx_percentile is not bit-for-bit compatible with Apache Spark -> None
Row is changed: ApproximatePercentile, S, `approx_percentile`; `percentile_approx`, This is not 100% compatible with the Spark version because the GPU implementation of approx_percentile is not bit-for-bit compatible with Apache Spark, aggregation, accuracy, NA, NA, NA, S, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NS, NS
    Notes: This is not 100% compatible with the Spark version because the GPU implementation of approx_percentile is not bit-for-bit compatible with Apache Spark -> None
Row is changed: ApproximatePercentile, S, `approx_percentile`; `percentile_approx`, This is not 100% compatible with the Spark version because the GPU implementation of approx_percentile is not bit-for-bit compatible with Apache Spark, aggregation, result, NA, S, S, S, S, S, S, NS, NS, NA, S, NA, NA, NA, PS, NA, NA, NA, NS, NS
    Notes: This is not 100% compatible with the Spark version because the GPU implementation of approx_percentile is not bit-for-bit compatible with Apache Spark -> None
Row is changed: ApproximatePercentile, S, `approx_percentile`; `percentile_approx`, This is not 100% compatible with the Spark version because the GPU implementation of approx_percentile is not bit-for-bit compatible with Apache Spark, reduction, input, NA, S, S, S, S, S, S, NS, NS, NA, S, NA, NA, NA, NA, NA, NA, NA, NS, NS
    Notes: This is not 100% compatible with the Spark version because the GPU implementation of approx_percentile is not bit-for-bit compatible with Apache Spark -> None
Row is changed: ApproximatePercentile, S, `approx_percentile`; `percentile_approx`, This is not 100% compatible with the Spark version because the GPU implementation of approx_percentile is not bit-for-bit compatible with Apache Spark, reduction, percentage, NA, NA, NA, NA, NA, NA, S, NA, NA, NA, NA, NA, NA, NA, S, NA, NA, NA, NS, NS
    Notes: This is not 100% compatible with the Spark version because the GPU implementation of approx_percentile is not bit-for-bit compatible with Apache Spark -> None
Row is changed: ApproximatePercentile, S, `approx_percentile`; `percentile_approx`, This is not 100% compatible with the Spark version because the GPU implementation of approx_percentile is not bit-for-bit compatible with Apache Spark, reduction, accuracy, NA, NA, NA, S, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NS, NS
    Notes: This is not 100% compatible with the Spark version because the GPU implementation of approx_percentile is not bit-for-bit compatible with Apache Spark -> None
Row is changed: ApproximatePercentile, S, `approx_percentile`; `percentile_approx`, This is not 100% compatible with the Spark version because the GPU implementation of approx_percentile is not bit-for-bit compatible with Apache Spark, reduction, result, NA, S, S, S, S, S, S, NS, NS, NA, S, NA, NA, NA, PS, NA, NA, NA, NS, NS
    Notes: This is not 100% compatible with the Spark version because the GPU implementation of approx_percentile is not bit-for-bit compatible with Apache Spark -> None
Row is removed: EphemeralSubstring, S, `substr`; `substring`, None, project, str, NA, NA, NA, NA, NA, NA, NA, NA, NA, S, NA, NA, NS, NA, NA, NA, NA, NA, NS, NS
Row is removed: EphemeralSubstring, S, `substr`; `substring`, None, project, pos, NA, NA, NA, S, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NS, NS
Row is removed: EphemeralSubstring, S, `substr`; `substring`, None, project, len, NA, NA, NA, S, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NA, NS, NS
Row is removed: EphemeralSubstring, S, `substr`; `substring`, None, project, result, NA, NA, NA, NA, NA, NA, NA, NA, NA, S, NA, NA, NS, NA, NA, NA, NA, NA, NS, NS

@amahussein amahussein merged commit 26f5f85 into NVIDIA:dev Apr 12, 2024
15 checks passed
@cindyyuanjiang cindyyuanjiang deleted the spark-rapids-tools-914 branch April 12, 2024 17:55
Successfully merging this pull request may close these issues.

[BUG] Sync supported CSV script should keep tools added operators in generated results