Include indices of duplicate rows in the `Non_Unique_Primary_Key` error #6437

radeusgd · 2023-04-26T15:10:17Z

As discussed, it could be useful to include the indices of duplicate rows in the error indicating that the selected primary key was not unique.

To avoid stalling that PR, and since this has a lower priority, it is split out into this separate task.

- Add a `duplicate_rows : Vector Integer` field to `Non_Unique_Primary_Key.Error`.
- Adapt its `to_display_text` to display which rows were duplicated.
- Fill in this field when reporting the error.
  - We should fill in the row indices of one group that has duplicate entries. See below.

If we have a table

| # | X |
|---|---|
| 0 | a |
| 1 | b |
| 2 | a |
| 3 | a |
| 4 | b |
| 5 | c |

and set the primary key to be `X`, the values `a` and `b` are duplicated, so `duplicate_rows` should contain the rows of one of these groups: it should be either `[0, 2, 3]` or `[1, 4]` (which one is unspecified). It shouldn't be `[0, 1, 2, 3, 4]`: if we included duplicates from all groups, in some cases we would just list all rows of the table (if every row has some duplicate), and that would not be helpful.
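A minimal sketch of the "one group only" rule described above, written in Python rather than Enso purely for illustration. The function name `one_duplicate_group` and the plain-list representation of the key column are assumptions, not part of the actual library. This sketch happens to pick the first duplicated key in scan order; the task leaves the choice unspecified.

```python
from collections import defaultdict


def one_duplicate_group(keys):
    """Return the row indices of one group of rows sharing a key.

    Returns an empty list when every key is unique.
    (Hypothetical helper; not the Enso implementation.)
    """
    rows_by_key = defaultdict(list)
    for index, key in enumerate(keys):
        rows_by_key[key].append(index)

    # Dicts preserve insertion order, so this returns the group whose
    # key appears earliest in the table.
    for indices in rows_by_key.values():
        if len(indices) > 1:
            return indices
    return []


# Column X of the example table.
print(one_duplicate_group(["a", "b", "a", "a", "b", "c"]))  # [0, 2, 3]
```

Stopping at a single group keeps the result small even when every row of the table has a duplicate.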


radeusgd · 2023-05-05T13:13:09Z

After further thought and discussion with @jdunkerley, we decided to display the values associated with the primary key of the first row that has a clashing primary key. These values can easily be used to find the related rows, are easy to display, and are portable (row ids were not viable in Database, as they are ill-defined there).

So for the table above, the result would be `clashing_primary_key=["a"]` and the following message: `The primary key [X] is not unique. For example, the key [a] corresponds to more than one row.`
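A rough Python sketch of this revised approach, again only for illustration. The names `first_clashing_key` and `display_text`, and the representation of rows as dicts, are assumptions. "First row that has a clashing primary key" is read here as the first row whose key was already seen earlier in the scan; the actual implementation may choose differently.

```python
def first_clashing_key(rows, key_columns):
    """Return the key values of the first row whose key was already seen.

    `rows` is a list of dicts mapping column name to value.
    Returns None when the key is unique.
    (Hypothetical helper; not the Enso implementation.)
    """
    seen = set()
    for row in rows:
        key = tuple(row[column] for column in key_columns)
        if key in seen:
            return list(key)
        seen.add(key)
    return None


def display_text(key_columns, clashing_key):
    """Format the message in the shape quoted in the comment above."""
    columns = ", ".join(key_columns)
    values = ", ".join(str(value) for value in clashing_key)
    return (
        f"The primary key [{columns}] is not unique. "
        f"For example, the key [{values}] corresponds to more than one row."
    )


table = [{"X": value} for value in ["a", "b", "a", "a", "b", "c"]]
clash = first_clashing_key(table, ["X"])
print(clash)  # ['a']
print(display_text(["X"], clash))
```

Because the result is a list of key values rather than row positions, the same shape works for database tables, where row order is not well defined.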

…nto read/write, improve SQLite format detection (#6604)

Closes #6437
Related to #6410

- Add example duplicate row to `Non_Unique_Primary_Key`.
- Ensure `File.read` fails if the file does not exist, always.
- Ensure SQLite fails if file is empty or nonexistent or malformed.
- Split file format detection into read and write modes, so that the read mode can depend on actual file _contents_.

enso-bot · 2023-05-09T18:56:28Z

Radosław Waśko reports a new STANDUP for yesterday (2023-05-08):

Progress: Improved Non_Unique_Primary_Key error. Improved SQL File Format detection and in general format detection. It should be finished by 2023-05-09.

Next Day: Next day I will be working on the #6543 task. Work on Date_Range

github-project-automation bot added this to Issues Board Apr 26, 2023

github-project-automation bot moved this to ❓New in Issues Board Apr 26, 2023

radeusgd added the p-lowest (Should be completed at some point), -libs (Libraries: New libraries to be implemented), and l-db-write (Libraries: database writer) labels Apr 26, 2023

radeusgd mentioned this issue Apr 26, 2023

Create database table from memory #6429

Merged

5 tasks

jdunkerley added this to the Beta Release milestone May 2, 2023

jdunkerley assigned radeusgd May 2, 2023

jdunkerley moved this from ❓New to 📤 Backlog in Issues Board May 2, 2023

jdunkerley mentioned this issue May 3, 2023

Design for Write Database #5161

Closed

enso-bot bot mentioned this issue May 4, 2023

Ability to connect and read from S3 #5777

Closed

3 tasks

radeusgd moved this from 📤 Backlog to 🔧 Implementation in Issues Board May 5, 2023

radeusgd moved this from 🔧 Implementation to 👁️ Code review in Issues Board May 8, 2023

radeusgd mentioned this issue May 8, 2023

Improve Non_Unique_Primary_Key error, split file format detection into read/write, improve SQLite format detection #6604

Merged

5 tasks

mergify bot closed this as completed in #6604 May 9, 2023

github-project-automation bot moved this from 👁️ Code review to 🟢 Accepted in Issues Board May 9, 2023
