What to do when no variance in weight bounds #218
Conversation
Can you please give a bit more context for this issue? Why/when does this happen?
If you are synthesising a small network, you might end up with, e.g., all edges having the same slope (this is more likely than it seems because of the method we use to penalise different slopes - see https://doi.org/10.1016/j.compenvurbsys.2019.101370). Even if that weren't the case, it is entirely plausible that other types of weights added in the future could suffer from the same problem. Weights are all normalised between 0 and 1, so if every edge has the same slope the normalisation will throw a divide by zero.
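A minimal sketch of that failure mode, assuming a networkx-style graph where each edge carries a slope attribute (the attribute name and min-max normalisation below are illustrative, not the project's actual code):

```python
import networkx as nx

G = nx.Graph()
# On a small, flat network every edge can end up with the same slope.
G.add_edge("a", "b", slope=0.01)
G.add_edge("b", "c", slope=0.01)

slopes = [d["slope"] for _, _, d in G.edges(data=True)]
lo, hi = min(slopes), max(slopes)

# Min-max normalisation: (x - lo) / (hi - lo).
# With no variance, hi == lo and the denominator is zero.
for _, _, d in G.edges(data=True):
    d["weight"] = (d["slope"] - lo) / (hi - lo)  # raises ZeroDivisionError
```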
I have a comment in relation to the tests - which also affects other tests. Otherwise, if this works, that's OK.
In my experience, this sort of manipulation is tricky and might lead to unexpected results, since dividing by something really small will produce something very big that still looks meaningful. It might be more robust to address the issue at the point where the division by zero takes place.
In graph theory, it's common practice to set edge weights to values higher than 1. It depends on the application, but in this case, where the weight doesn't carry a specific physical meaning and is just a cost measure for path traversal, why normalise between 0 and 1 rather than, say, 1 and 5? It doesn't even have to be a float; it can be an integer that acts as a classifier. So, as @dalonsoa said, it's better to prevent this numerical issue from occurring rather than relying on ad-hoc solutions.
@dalonsoa I guess you are suggesting making the check here: weight = max((d[attr] - bds[0]) / (bds[1] - bds[0]), eps). Actually, perhaps the tidiest alternative would be that if the bounds are the same, just drop that weight.
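A sketch contrasting the two options, where d, attr, and bds follow the snippet above; the stand-in values and the eps floor are illustrative, not the PR's actual implementation:

```python
# Illustrative stand-ins for the variables in the snippet above.
d = {"slope": 0.01}
attr = "slope"
bds = (0.01, 0.01)  # degenerate bounds: no variance in the base data
eps = 1e-9          # illustrative floor, not a value from the project

lo, hi = bds
if hi == lo:
    # Tidier alternative: the weight carries no information, so drop it
    # from the calculation instead of clamping it.
    weight = None
else:
    # Suggested check: clamp the normalised value so it never hits zero.
    weight = max((d[attr] - lo) / (hi - lo), eps)
```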
tidy up
Right, my main point is that normalizing weight is not as straightforward as just using a min-max normalizer. Since the weight is not a physical property, you'd have to be careful about how the normalizer interprets different values. In my methodology, I opted for a discrete normalizer, so I can clearly assign a physical meaning to the weight. For example, for slope, we can use low, mild, and steep as descriptors for physical interpretation, then bin the values based on a pre-defined range and assign weights of 1, 2, and 3. This way, when the slope difference is on the order of 1e-3, the values might fall into the same category.
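A minimal sketch of that discrete binning; the slope thresholds below are placeholders, not values from either methodology:

```python
def discrete_slope_weight(slope_pct: float) -> int:
    """Map a slope (%) to an integer weight: 1 = low, 2 = mild, 3 = steep.

    The bin edges here are illustrative placeholders.
    """
    if slope_pct < 2.0:
        return 1  # low
    if slope_pct < 5.0:
        return 2  # mild
    return 3      # steep

# Slopes differing by ~1e-3 fall into the same bin, so their weights match.
assert discrete_slope_weight(3.001) == discrete_slope_weight(3.002) == 2
```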
Is that not exactly what is being done in this paper - albeit with continuous numerical values rather than discrete categories? All the slopes that are between - I can't quite tell, but say 0.5% and 2% - are assigned a weight of 0 and thus treated equally. Regardless, I'll copy your comment into the other issue: #221. Even using discrete weights, the code still needs to know how to handle no variation in the base data used for the weight calculation - which is what this issue handles. And I think the way it's now done following the edits to this PR (i.e., don't include that weight in the shortest path calculation) makes sense.
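A sketch of what "don't include that weight" could look like when combining several normalised weights into a single edge cost for shortest-path routing; the weight names and the simple sum used to combine them are assumptions, not the PR's actual code:

```python
import networkx as nx

G = nx.Graph()
# Two hypothetical normalised weights per edge; "slope" had no variance,
# so it was dropped (None) rather than normalised.
G.add_edge("a", "b", length=0.2, slope=None)
G.add_edge("b", "c", length=0.9, slope=None)
G.add_edge("a", "c", length=1.0, slope=None)

def edge_cost(u, v, d):
    # Sum only the weights that were actually computed; degenerate ones
    # (all-equal base data) are excluded from the path cost.
    return sum(w for w in (d.get("length"), d.get("slope")) if w is not None)

path = nx.shortest_path(G, "a", "c", weight=edge_cost)
print(path)  # ['a', 'c'] because 1.0 < 0.2 + 0.9
```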
You're having an issue with normalizing data and weight calculation, and I am suggesting a more robust, physically based normalization approach than a simple min-max normalizer. If you prefer using a min-max normalizer and think it gives you good results as a general solution, that's fine.
Don't worry - it's on the radar (#221) - I just don't have time to solve everything for the paper.
Description
fix and test bug
Fixes #215