Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix MemoryDataSet not displaying on metadata panel #1113

Merged
merged 12 commits into from
Oct 3, 2022
Merged

Conversation

rashidakanchwala
Copy link
Contributor

@rashidakanchwala rashidakanchwala commented Oct 3, 2022

Description

Resolves #1093

Development notes

@deepyaman , @tynandebold -- I have made an attempt to fix this issue at the backend. Let me know if this is good approach.

Basically, in the kedro-viz backend we add datasets to the node based on the catalog entries. If there's no catalog entry for a dataset, we just leave it blank. I have fixed this part by adding MemoryDataSet() to a dataset without any catalog entry.

Also MemoryDataSets don't have any file path which was a sort of required field in the ResponseAPI for DataNodes (which is why u got the pydantic error), so I have made that optional.

QA notes

I will add the necessary tests once everyone's ok with this fix.

Checklist

  • Read the contributing guidelines
  • Opened this PR as a 'Draft Pull Request' if it is work-in-progress
  • Updated the documentation to reflect the code changes
  • Added new entries to the RELEASE.md file
  • Added tests to cover my changes

@rashidakanchwala rashidakanchwala removed the request for review from limdauto October 3, 2022 10:51
Copy link
Contributor

@antonymilne antonymilne left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This feels like the right fix to me but is exactly what @deepyaman tried here and apparently didn't solve it? #1093 (comment)

modifying https://github.com/kedro-org/kedro-viz/blob/v5.1.1/package/kedro_viz/data_access/repositories/catalog.py#L59 to dataset_obj = MemoryDataSet() causes the flicker regardless of whether or not the MemoryDataSet is explicitly defined in the catalog.

Comment on lines 59 to 62
dataset_obj = None
# if dataset has no catalog entry, it is a MemoryDataSet
if not dataset_obj:
dataset_obj = MemoryDataSet()
Copy link
Contributor

@antonymilne antonymilne Oct 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this any different from the below? Can dataset_obj ever evaluate to false if the exception doesn't get thrown?

Suggested change
dataset_obj = None
# if dataset has no catalog entry, it is a MemoryDataSet
if not dataset_obj:
dataset_obj = MemoryDataSet()
dataset_obj = MemoryDataSet()

Copy link
Contributor Author

@rashidakanchwala rashidakanchwala Oct 3, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it work's if you just add it in the exception. the reason why it flickered was because filepath is none for MemoryDataSets and it is a required field in DataNodeMetaDataResponseAPI which now I have made Optional.

I will make the above change :)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy with this. Yeah, I just didn't understand/get to finding what was happening with the validation error, but this solves it.

@@ -2,8 +2,9 @@
centralise access to Kedro data catalog."""
# pylint: disable=missing-class-docstring,missing-function-docstring,protected-access
from typing import Dict, Optional
from joblib import Memory
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
from joblib import Memory

Comment on lines 5 to 7

X_train:
type: MemoryDataSet
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
X_train:
type: MemoryDataSet

I guess we can revert it?

@@ -55,8 +56,9 @@ def get_dataset(self, dataset_name: str) -> Optional[AbstractDataSet]:
dataset_obj = self._catalog._get_dataset(dataset_name, suggest=False)
else: # pragma: no cover
dataset_obj = self._catalog._get_dataset(dataset_name)
# if dataset has no catalog entry, it is a MemoryDataSet
except DataSetNotFoundError: # pragma: no cover
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wonder if this should be covered in tests, given that it does make a functional difference.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yes, i will write the tests now !

@@ -55,8 +56,9 @@ def get_dataset(self, dataset_name: str) -> Optional[AbstractDataSet]:
dataset_obj = self._catalog._get_dataset(dataset_name, suggest=False)
else: # pragma: no cover
dataset_obj = self._catalog._get_dataset(dataset_name)
# if dataset has no catalog entry, it is a MemoryDataSet
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: I think this is bordering on a redundant comment (the code seems really self-explanatory), but I suppose one could argue that it adds a bit of value. Not strong feelings, just a hunch.

Copy link
Member

@deepyaman deepyaman left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM!

@deepyaman deepyaman changed the title Fix MemoryDataSet Fix MemoryDataSet not displaying on metadata panel Oct 3, 2022
Copy link
Member

@tynandebold tynandebold left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's working nicely. Great job!

@deepyaman deepyaman merged commit d3a76ba into main Oct 3, 2022
@deepyaman deepyaman deleted the fix-memory-dataset branch October 3, 2022 14:18
@tynandebold tynandebold mentioned this pull request Jan 18, 2023
5 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
Status: Done
Development

Successfully merging this pull request may close these issues.

Give MemoryDataSet as dataset type
4 participants