Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Feat: data list metadata #2529

Open
wants to merge 3 commits into
base: master
Choose a base branch
from
Open

Conversation

chrismclarke
Copy link
Member

@chrismclarke chrismclarke commented Nov 15, 2024

PR Checklist

  • PR title descriptive (can be used in release notes)

Description

Add methods to parser to add a _metadata column to data_lists and track data types of non-string columns

This is designed to improve on the type infer methods used in #2513 (can update post-merge), so that set_data operations can correctly use the expected column data types.

Author Notes

Use
There are 2 methods now included to infer data types within the parser

  1. If the author includes an @metadata row can manually specify type: number/boolean within the corresponding column cell.
  2. Find first non-empty data entry and use type from that (will check all rows if required).

Note - only number or boolean types need to be specified as assume string by default, an if using nested object types these will automatically be detected when creating the data_list (columns would be formatted with . notation, e.g. time.hour and time.minute.

Data Changes
The corresponding generated _metadata will be included in the parsed data_lists github. Metadata is only included if specific entries exist (so will not appear for simple data_lists where every column is represented by string values)

Quality Control
I had originally planned to include errors or warnings when the data_type could not be inferred (e.g. no data in the column), however realised this is probably a fairly common case and having lots of warnings would not be so helpful (example screenshot)

Example - warning log output - not currently in use
image

So instead, when data types cannot be inferred it will simply be assumed that they are designed to be strings. The data types shouldn't really be all that important unless using set_data operation to update data from action list strings (or in the future when adding new rows).

Similarly metadata is only included for columns that are not formatted as strings, to avoid overload of potentially less helpful information.

Future Plans
With the use of the metadata column we could look to expand in the future to include more things features such as data validation or default values

Review Notes

The easiest way to examine the new metadata is to sync and review the deployment repo diffs generated

yarn workflow sync;
yarn workflow repo open

Dev Notes

If this happens to be merged before #2513 I can update, but I don't think it should block it in any way (non-breaking changes, can just be implemented in future update when reviewing add_data action)

Git Issues

Closes #

Screenshots/Videos

Example - metadata now included in data_lists for all non-string columns (as inferred from data)
image

@github-actions github-actions bot added feature Work on app features/modules scripts Work on backend scripts and devops labels Nov 15, 2024
@chrismclarke chrismclarke marked this pull request as ready for review November 15, 2024 01:31
@github-actions github-actions bot added feature Work on app features/modules and removed feature Work on app features/modules labels Nov 15, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature Work on app features/modules scripts Work on backend scripts and devops
Projects
None yet
Development

Successfully merging this pull request may close these issues.

1 participant