-
Notifications
You must be signed in to change notification settings - Fork 127
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Update copy analysis result and experiment data behavior #1432
base: main
Are you sure you want to change the base?
Update copy analysis result and experiment data behavior #1432
Conversation
cb2091b
to
3c26036
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I had a few questions but this looks like the right fix.
@@ -204,16 +204,26 @@ def clear(self): | |||
with self._lock: | |||
self._data = pd.DataFrame(columns=self.DEFAULT_COLUMNS) | |||
|
|||
def copy(self): | |||
def copy(self, new_ids: bool = True): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nothing uses new_ids=False
, so should we just not include that option for now? It's a question of why a user is calling copy
. If making a copy, it seems likely the user would want to change something and then in that case likely that new IDs should be used?
new_ids: Whether to generate new IDs for copied entries. Defaults to True. | ||
|
||
Returns: | ||
A new shallow copied DataFrame object. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't like the use of DataFrame
here because the DataFrame
is an internal container of the data not directly exposed to the user. Also, maybe the point about being shallow is better in a .. note::
block like it was and here it could just say A copy of the analysis result table
? I think the original note was not clear enough. The default columns of the data frame are all strings and numbers which are all immutable so the only difference between being shallow and deep is saving some memory. However, I think it's possible that the user could add a column with mutable objects? The concern there is that then mutating the objects in the copied table would also mutate the ones in the original. Maybe this is a pretty minor concern -- would a user add mutable objects to an analysis result table?
exp_data.add_data(self._get_job_result(1)) | ||
copied = exp_data.copy(copy_results=False) | ||
self.assertEqual(exp_data.data(), copied.data()) | ||
self.assertFalse(copied.analysis_results()) | ||
self.assertEqual(exp_data.provider, copied.provider) | ||
|
||
def test_copy_analysis_results(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is good. Should there be a test of copy in test_analysis_results_table.py
as well?
Summary
Fixes #1396. When an
ExperimentData
object was copied, it didn't update the analysis result IDs and their associated experiment IDs, resulting in the copied experiment entry not having analysis results in the cloud service.Details and comments
AnalysisResultTable
isn't very efficient because it has to loop through the rows and call_create_unique_hash()
for each ID. Maybe this logic can be refactored.