Metrics calculations change data structure #82
Comments
Thanks for posting this, @vcloitre. So it appears that this DataFrame-to-Series "conversion" happens whether or not you use the no-write evaluator, which would have been my first suspect. It also happens for both metrics and root metrics, which may be irrelevant, because something funny has to be happening in the Evaluator class and/or in how it handles pandas. There's a lot going on in the Evaluator class that I don't understand, so I'm wondering if @scopatz could weigh in here?
Just a clarification: resetting the index of the 'resources' DataFrame in my example partly fixes the problem. It is not a perfect fix, because I then end up with two index columns.
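For readers following along, the workaround can be sketched as below. This is only an illustration under assumptions: the frame is a hypothetical stand-in for the 'Resources' metric, and the level names `SimId` and `ResourceId` are made up for the example, not taken from the gist.

```python
import pandas as pd

# Hypothetical stand-in for a 'Resources' frame that ended up with a MultiIndex.
# The level names below are illustrative assumptions.
indexed = pd.DataFrame(
    {"Quantity": [10.0, 20.0, 30.0]},
    index=pd.MultiIndex.from_tuples(
        [("a", 1), ("a", 2), ("b", 3)],
        names=["SimId", "ResourceId"],
    ),
)

# reset_index() moves the index levels back into ordinary columns
# and gives the frame a fresh default integer index.
flat = indexed.reset_index()

# Selecting the column now yields a Series with a flat index again.
quantity = flat["Quantity"]
```

Note that the former index levels come back as regular columns next to the new default index, so the result is usable but not necessarily identical to the original frame.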
Status update, @opotowsky? Is this blocking for 1.4?
@opotowsky - if you can get to it within the next ~day, then please do - otherwise we'll 1.5 it.
I'll have to move it to 1.5. There are other, more pressing issues (testing failures).
I noticed that whenever I calculate a metric (let's take the 'Materials' metric as an example), the structure of the dependencies used to calculate it is changed (for 'Materials', the dependencies are the 'Resources' and 'Compositions' metrics).
Here is a gist that shows what changes occur:
https://gist.github.com/vcloitre/b3b55df38568708be4a6
As you can see, resources['Quantity'] (which is basically the 'Quantity' column of the 'Resources' metric) does not have the same structure before and after calculating the 'Materials' metric.
Instead of the 'Quantity' column being a plain pandas Series, resources['Quantity'] is a Series with a MultiIndex after the 'Materials' metric is calculated. The problem is that the same operations no longer work on this new structure, which can lead to errors when manipulating metrics represented as pandas DataFrames.
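To make the difference concrete without opening the gist, here is a minimal sketch of the two structures. It assumes the change amounts to some of the frame's columns being moved into the index; the frame itself and the column names `SimId` and `ResourceId` are hypothetical, and the sketch does not claim to reproduce what the Evaluator actually does.

```python
import pandas as pd

# Hypothetical stand-in for the 'Resources' metric (column names are assumptions).
resources = pd.DataFrame({
    "SimId": ["a", "a", "b"],
    "ResourceId": [1, 2, 3],
    "Quantity": [10.0, 20.0, 30.0],
})

# "Before": the column is a Series with a flat default integer index.
before = resources["Quantity"]
before_is_multi = isinstance(before.index, pd.MultiIndex)

# "After" (assumed cause): the same data with columns moved into the index.
indexed = resources.set_index(["SimId", "ResourceId"])
after = indexed["Quantity"]
after_is_multi = isinstance(after.index, pd.MultiIndex)

# The integer label 0 exists in the flat index but not in the MultiIndex,
# so code written against the first structure can break on the second.
label_in_before = 0 in before.index
label_in_after = 0 in after.index

# With the MultiIndex, rows are addressed by a tuple of level values instead.
first_quantity = after.loc[("a", 1)]
```

The values are the same in both Series; only the index differs, which is why label-based lookups and alignments written for the flat index stop working.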