Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Non-versioned fields should display latest value #51

Closed
midenok opened this issue Oct 18, 2016 · 9 comments
Closed

Non-versioned fields should display latest value #51

midenok opened this issue Oct 18, 2016 · 9 comments

Comments

@midenok
Copy link

midenok commented Oct 18, 2016

For historical SELECT:

  1. At setup fields phase:
    • replace non-versioned fields with AS_OF_NOW(field) func;
    • store non-nonversioned fields as keys in now_fields map in select_lex context.
  2. at some later phase (must be discovered when):
    • translate now_fields to separate SELECT;
    • execute that SELECT and store results to original historical select_lex context into now_fields map.
  3. When AS_OF_NOW is executed it selects field value from now_fields map.

AS_OF_NOW(field)

This function is non-interactive, so no need to modify parser. Function is local to select_lex context, so it holds pointer to now_fields inside itself as well as field param which will be key for now_fields. When executed, function searches value by field key in now_fields map.

now_fields map

Key: Field pointer, stored at p.1
Value: Item pointer, NULL at p.1, field values at p.2

Actual name in select_lex struct: vers_now_fields.

Questions

Do we need to store now_fields pointer into AS_OF_NOW() or we can store only reference to value from now_fields.

@midenok
Copy link
Author

midenok commented Oct 18, 2016

Please, think over. Should be scheduled to 0.5 or later.

@kevgs
Copy link

kevgs commented Oct 23, 2016

Case 1

create table t1(
  A int with system versioning,
  B int
);

insert into t1 values(1, 1);
set @after_first_insert=now(6);
update t1 set A=2, B=2;
select * from t1 for system_time as of timestamp @after_first_insert;

Returned row should be: 1, 2. It's obvious and simple case.

Case 2

create or replace table t1(
  A int with system versioning,
  B int
);

insert into t1 values(1, 1);
set @after_first_insert=now(6);
select *, sys_trx_start, sys_trx_end from t1;
begin;
delete from t1;
insert into t1 values (2, 2);
commit;
select * from t1 for system_time as of timestamp @after_first_insert;

Resulted row should be: 1, 1. Also simple.

The difficult part is how to distinguish between these two cases if tables have basically the same contents in both cases? I don't know the answer.

@midenok
Copy link
Author

midenok commented Oct 23, 2016

Case 3

create or replace table t1(
  A int with system versioning,
  B int
);

insert into t1 values(1, 1);
set @after_first_insert=now(6);
delete from t1;
select * from t1 for system_time as of timestamp @after_first_insert;

Expected: 1, NULL.

Explain your thought on case 2.

@kevgs
Copy link

kevgs commented Oct 23, 2016

In case 1 we updated row. It's clear when we update row result row is the same but with some differences.

In case 2 we deleted row and then inserted a different one. And it doesn't matter that the new row has the same values.

Case 4

create or replace table t1(
  A int with system versioning,
  B int
);

insert into t1 values(1, 1);
set @after_first_insert=now(6);
select *, sys_trx_start, sys_trx_end from t1;
delete from t1;
insert into t1 values (2, 2);
select * from t1 for system_time as of timestamp @after_first_insert;

1, NULL or 1, 1? Or why not 1, 2?

I see no way to distinguish between different rows and different versions of a row. I don't know how to get all versions of some row.

@midenok
Copy link
Author

midenok commented Oct 23, 2016

If there is no way to distinguish, then there is no way to use non-versioned columns. So, tables with non-versioned columns must have PRIMARY KEY. You may code a requirement for such tables to have it. But it's too complex to implement such requirement properly. So, you can just return NULL for tables without PK and track latest value for tables with PK. Again, it's too complex. Maybe there is no reason to implement such functionality.

At least, you may just return NULL in non-versioned columns for historical queries. This is simple and may be enough to maintain semantic consistency.

@kevgs
Copy link

kevgs commented Oct 23, 2016

Even PRIMARY KEY can't fix DELETE/INSERT case. Or we need to push another restriction and say user that DB treats rows with the same PK the same.

@midenok
Copy link
Author

midenok commented Oct 23, 2016

Usually people distinguish rows by ID. Agree, it's hard to implement (my prev comment).

@kevgs
Copy link

kevgs commented Oct 23, 2016

Another option is simple SELECT but complex UPDATE: when updating unversioned field we could find all historical versions of a row and update them too.

@midenok
Copy link
Author

midenok commented Oct 24, 2016

As we already discussed this will negate sense of optimization.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

2 participants