Lightweight JS dataframe library #75

Open
paddymul opened this issue Oct 24, 2023 · 1 comment
Labels
enhancement (New feature or request), Future-exploration, good first issue (Good for newcomers), JS (requires js work to fix), Lightweight-JS-Dataframe

Comments

@paddymul
Owner

paddymul commented Oct 24, 2023

I would like to have better frontend dataframe manipulation for simple tasks.

  1. A standard API for sending data to ag-grid that abstracts over the serialization format.
  2. An API similar to https://github.com/data-apis/dataframe-api (it's a good baseline).
  3. Some syntax for combining raw columns into arrays, so you can build [cleanedVal, origVal, annotation] from raw columns, with the combination done in JS. See "Auto-cleaning affordance in the table" #74.
  4. A very advanced feature: lazy loading of additional data.
  5. Filtering.
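
To make items 3 and 5 concrete, here is a rough TypeScript sketch of what column combination and filtering might look like on a plain column-oriented object. Nothing here is an existing API: `ColumnFrame`, `getColumn`, `combineColumns`, `filterRows`, and the `price_*` column names are all hypothetical, chosen only for illustration.

```typescript
// Hypothetical column-oriented frame: each column is an array of equal length.
type ColumnFrame = Record<string, unknown[]>;

// Look up a column by name, failing loudly on an unknown name.
function getColumn(frame: ColumnFrame, name: string): unknown[] {
  const col = frame[name];
  if (col === undefined) throw new Error("unknown column: " + name);
  return col;
}

// Build one combined column whose cells are arrays of the source columns'
// values, e.g. [cleanedVal, origVal, annotation] for every row.
function combineColumns(frame: ColumnFrame, sources: string[]): unknown[][] {
  const cols = sources.map((name) => getColumn(frame, name));
  const n = cols.length === 0 ? 0 : cols[0].length;
  const out: unknown[][] = [];
  for (let i = 0; i < n; i++) {
    out.push(cols.map((col) => col[i]));
  }
  return out;
}

// Keep only the rows for which the predicate on one column returns true.
function filterRows(
  frame: ColumnFrame,
  column: string,
  keep: (value: unknown) => boolean
): ColumnFrame {
  const mask = getColumn(frame, column).map((value) => keep(value));
  const result: ColumnFrame = {};
  for (const name of Object.keys(frame)) {
    result[name] = getColumn(frame, name).filter((_, i) => mask[i]);
  }
  return result;
}

// Made-up example data (assumption, not taken from any real table).
const raw: ColumnFrame = {
  price_orig: ["$10", "n/a", "$30"],
  price_cleaned: [10, null, 30],
  price_note: ["stripped $", "unparseable", "stripped $"],
};

const combined = combineColumns(raw, ["price_cleaned", "price_orig", "price_note"]);
const valid = filterRows(raw, "price_cleaned", (v) => v !== null);
```

Both helpers return new arrays and leave the input untouched, which keeps them independent of whatever serialization produced the columns.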

I want to research the existing JS dataframe-like libraries. I don't think they offer this functionality, but I want to check.

Once the API is decided on and implemented, this library will enable performance increases through better serialization.

@paddymul added the enhancement, good first issue, JS, Future-exploration, and Lightweight-JS-Dataframe labels on Oct 24, 2023
@paddymul
Owner Author

For the record, here are the thoughts on this that I sent to @MarcoGorelli:


Having looked through the dataframe standard: the next time I improve the JS deserialization of dataframes, I'm going to implement the dataframe_standard in JS.

This will allow me to decouple the application side from the serialization side. Then I can change the core serialization. I'm specifically thinking of higher performance serializations based on TypedArrays and base64. I notice that the standard doesn't have a "get_row_by_id" or "get_as_list_of_dicts". I understand why those aren't important for numeric python/C programming, but they are very common in JS.
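
As a rough illustration of the TypedArray and base64 idea, the TypeScript sketch below decodes base64 column payloads into `Float64Array`s and layers the two row-oriented accessors on top. The wire format (an `index` of string ids plus one base64 string of little-endian float64 values per column) is purely an assumption, and every name (`EncodedFrame`, `decodeFrame`, `getRowById`, `getAsListOfDicts`) is hypothetical rather than part of the dataframe standard or any existing library. The base64 decoder is written by hand only to keep the sketch self-contained.

```typescript
// Assumed wire format (hypothetical): row ids plus one base64 string per
// numeric column, each holding little-endian float64 values.
interface EncodedFrame {
  index: string[];
  columns: Record<string, string>;
}

interface DecodedFrame {
  index: string[];
  columns: Record<string, Float64Array>;
}

type Row = Record<string, number | string>;

const B64_ALPHABET =
  "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Minimal standard-alphabet base64 decoder.
function base64ToBytes(b64: string): Uint8Array {
  const clean = b64.replace(/=+$/, "");
  const bytes = new Uint8Array(Math.floor((clean.length * 6) / 8));
  let buffer = 0;
  let bits = 0;
  let pos = 0;
  for (let i = 0; i < clean.length; i++) {
    const v = B64_ALPHABET.indexOf(clean.charAt(i));
    if (v < 0) throw new Error("invalid base64 character at position " + i);
    buffer = ((buffer << 6) | v) & 0xffff; // keep only the low bits we need
    bits += 6;
    if (bits >= 8) {
      bits -= 8;
      bytes[pos++] = (buffer >> bits) & 0xff;
    }
  }
  return bytes;
}

// Read the bytes as little-endian float64 values, independent of platform endianness.
function decodeFloat64Column(b64: string): Float64Array {
  const bytes = base64ToBytes(b64);
  const view = new DataView(bytes.buffer, bytes.byteOffset, bytes.byteLength);
  const out = new Float64Array(Math.floor(bytes.byteLength / 8));
  for (let i = 0; i < out.length; i++) out[i] = view.getFloat64(i * 8, true);
  return out;
}

function decodeFrame(encoded: EncodedFrame): DecodedFrame {
  const columns: Record<string, Float64Array> = {};
  for (const name of Object.keys(encoded.columns)) {
    columns[name] = decodeFloat64Column(encoded.columns[name]);
  }
  return { index: encoded.index, columns };
}

// Materialize one row as a plain object; a column literally named "index"
// would overwrite the id field in this sketch.
function rowAt(frame: DecodedFrame, i: number): Row {
  const row: Row = { index: frame.index[i] };
  for (const name of Object.keys(frame.columns)) {
    row[name] = frame.columns[name][i];
  }
  return row;
}

// JS-side analogue of "get_row_by_id" (hypothetical name and semantics).
function getRowById(frame: DecodedFrame, id: string): Row | undefined {
  const i = frame.index.indexOf(id);
  return i < 0 ? undefined : rowAt(frame, i);
}

// JS-side analogue of "get_as_list_of_dicts": an array of row objects.
function getAsListOfDicts(frame: DecodedFrame): Row[] {
  return frame.index.map((_, i) => rowAt(frame, i));
}

const frame = decodeFrame({
  index: ["a", "b"],
  columns: { price: "AAAAAAAA+D8AAAAAAAAEQA==" }, // float64 values 1.5 and 2.5
});
const rows = getAsListOfDicts(frame);
```

Application code only ever sees `DecodedFrame` and the accessors, so the encoding could be swapped out without touching callers. That is the decoupling between the application side and the serialization side that I'm after.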

FWIW I have written multiple versions of DataFrame-to-JSON-to-JS serialization over the past decade. All of them were janky and slow. I want to build one properly, including the Python side.

No promises on when I will build this, but I will use your code as the basis for it next time I do.
