Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Incorrectly escaped capture instance name #83

Closed
HaydenNess opened this issue Sep 23, 2024 · 2 comments · Fixed by #85
Closed

Incorrectly escaped capture instance name #83

HaydenNess opened this issue Sep 23, 2024 · 2 comments · Fixed by #85

Comments

@HaydenNess
Copy link

HaydenNess commented Sep 23, 2024

When using LOG_BASED synchronization, the tap incorrectly reads the minimum as LSN 0x0. It seems this value isn't used beyond logging so the proper behavior remains intact.

2024-09-23T23:21:03.228424Z [info ] time=2024-09-24 09:21:03 name=singer level=INFO message=Data available in cdc table "dbo_SalHistorySumCtl" from lsn 00000000000000000000 cmd_type=elb consumer=False job_name=dev:lbsproductionsyspro-dbo-0-to-target-parquet-lbsproductionsyspro name=lbsproductionsyspro-dbo-0 producer=True run_id=6eb2fba4-6a80-4e16-a1b9-4d9d5c2b59ba stdio=stderr string_id=lbsproductionsyspro-dbo-0 2024-09-23T23:21:03.228665Z [info ] time=2024-09-24 09:21:03 name=singer level=INFO message=The last lsn processed as per the state file 000025100007f570001a, minimum available lsn for extract table 00000000000000000000, and the maximum lsn is 0000254c004852300001. cmd_type=elb consumer=False job_name=dev:lbsproductionsyspro-dbo-0-to-target-parquet-lbsproductionsyspro name=lbsproductionsyspro-dbo-0 producer=True run_id=6eb2fba4-6a80-4e16-a1b9-4d9d5c2b59ba stdio=stderr string_id=lbsproductionsyspro-dbo-0

I've investigated, and found that the minimum LSN is read by a query at log_based.py:87. The query used for the above example is...

SELECT sys.fn_cdc_get_min_lsn ( '"dbo_SalHistorySumCtl"' ) lsn_from , sys.fn_cdc_get_max_lsn () lsn_to ;

and the capture_instance_name used to format the function argument is "dbo_SalHistorySumCtl".

The function works correctly with the double quotes removed. The quotes are appended by common.escape.

On the topic, I suspect it might be best to separate the escaping of function parameters from SQL object names and/or to use the params arguments supported by pymssql.cursor.execute.

@mjsqu
Copy link
Collaborator

mjsqu commented Sep 24, 2024

Thanks for raising this I understand off the first read of this that the capture instance argument value provided to fn_cdc_get_min_lsn does not need to be escaped in the same way as it does when it is part of an object name.

I raised this on the prior PR, but I don't think I phrased the comment that implied it required action:
#79 (comment)

@s7clarke10 recalls using or creating an "unescaped" variable of the capture instance name at some point...

@s7clarke10
Copy link
Collaborator

Apologies for introducing this breaking change and the delay in resolution - I have raised a PR ready for review to resolve this issue.

@mjsqu mjsqu closed this as completed in #85 Oct 9, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants