Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Issue with thread-safety #76

Open
fabgeyer opened this issue May 9, 2019 · 7 comments
Open

Issue with thread-safety #76

fabgeyer opened this issue May 9, 2019 · 7 comments

Comments

@fabgeyer
Copy link

fabgeyer commented May 9, 2019

I have some code using the DNC library where I perform analyses of multiple networks in different threads. This leads to errors in the analyses which do not appear when only one thread is used.

Some debugging lead me to ArrivalBoundDispatch.java and the arrivalbound classes. Singletons are used for doing the computations, and use setServerGraph(server_graph) and setConfiguration(configuration) to configure it, before calling computeArrivalBound. The issue is that if there are multiple threads, you might have the case that setServerGraph is executed by one thread, while another thread is still executing computeArrivalBound. I validated my theory by creating a new arrivalbound instance for each computation, resulting in no errors. ArrivalBoundCache, in its current implementation, is also not thread-safe.

Hence, I suggest to modify DNC library in order to use thread-safe data structures, and thus enable parallelization of (some) computations.

@matyesz
Copy link
Collaborator

matyesz commented May 9, 2019

First of all, thanks for the feedback, we are happy to see the first users.
When we were creating v2.5 we recognized that singleton usage. Fortunatelly in none of the docs we mention that the code is thread-safe. Be aware that the config itself is singleton, so you can compute only with the same config on different threads. I do not know whether we plan any parallelism at the moment, would leave that to Steffen.

@sbondorf
Copy link
Collaborator

To me, the wrap-up of the problem is: Having an internal state in a singleton is very error prone.

There are already thoughts on my side, inspired by the way we moved Calculator.java "to the top-most level" in v2.5. It probably requires a considerable rework:

  • Analysis.java is currently just an interface for tandem analyses and it is therefore located in the package org.networkcalculus.dnc.tandem. It should be renamed to TandemAnalysis.java.
  • Then, there should be a new Analysis.java that really does what it’s name implies. An Analysis instance (or, in a more refined hierarchy, an instance of the intermediate AlgebraicAnalysis.java) would then have member variables for everything that is currently configured in AnalysisConfig.java: the Server Graph instance, instances of the ArrivalBoundings, and an instance of the Cache.
  • AnalysisConfig.java would then be "degraded" to an easy storage mechanism for the analysis setting, being used as the argument for the (Algebraic)Analysis.java constructor. Therefore, the existing code that executes the analysis should stop checking an AnalysisConfig.java instance. Instead, it should operate on an instance of (Algebraic)Analysis.java: work with the instances of analysis-parts stored in there, do not check the AnalysisConfig’s flag and try to use a singleton whose internal state you may manipulate concurrently without recognizing it.

@matyesz
Copy link
Collaborator

matyesz commented May 10, 2019

Agreed. We should remove those singletons and take a task to check classes for thread-safety. Hopefully no other issues.

@matyesz
Copy link
Collaborator

matyesz commented May 16, 2019

The problem as I see after checking in ArrivalBound Dispatch.computeArrivalBounds:
not thread safe ->
case AGGR_PBOO_PER_SERVER: AggregatePboo_PerServer aggr_pboo_per_server = AggregatePboo_PerServer.getInstance(); aggr_pboo_per_server.setServerGraph(server_graph); aggr_pboo_per_server.setConfiguration(configuration); arrival_bounds_tmp = aggr_pboo_per_server.computeArrivalBound(turn, flows_to_bound, flow_of_interest); break;

Same switch but thread-safe:
case SEGR_PBOO: for (Flow flow : flows_to_bound) { SeparateFlowAnalysis sfa = new SeparateFlowAnalysis(server_graph); sfa.performAnalysis(flow, flow.getSubPath(flow.getSource(), turn.getSource())); arrival_bounds_tmp = getPermutations(arrival_bounds_tmp, singleFlowABs(configuration, flow.getArrivalCurve(), sfa.getLeftOverServiceCurves())); }

Why do we have 2 approaches? What is the difference between the 2?
I do not see how AnalysisConfig would solve the problem of getInstance approach, here I would just switch to new instances for each call. I do not understand as point of design why getinstance is good here.

@sbondorf
Copy link
Collaborator

These are two different ways to compute an arrival curve bounding the arrivals of flows in flows_to_bound coming via turn. The former tries to aggregate flows as much as possible but to do so, it cannot analyze these flows in an end-to-end (end-to-point of interference) fashion. The latter does things vice versa.

Employing only the PBOO principle, the SEGR_PBOO alternative is always worse. See

@inproceedings{BS15-2,
  author    = {Steffen Bondorf and Jens B. Schmitt},
  title     = {Calculating Accurate End-to-End Delay Bounds – You Better Know Your Cross-Traffic},
  booktitle = {Proceedings of the 9th International Conference on Performance Evaluation Methodologies and Tools (VALUETOOLS 2015)},
  year      = {2015}
}

When we can make use of PMOO,, however, the SEGR_PMOO alternative can be better in some corner cases. See

@inproceedings{BNS18,
	Author = {Steffen Bondorf and Paul Nikolaus and Jens B. Schmitt},
	Booktitle = {Proceedings of 19th International GI/ITG Conference on Measurement, Modelling and Evaluation of Computing Systems (MMB)},
	Title = {{Catching Corner Cases in Network Calculus -- Flow Segregation Can Improve Accuracy}},
	Year = {2018}
}

As to why the design is inconsistent w.r.t. thread-safety, the answer is simply because the code grew organically with the understanding of this matter and thread-safety had never been in the focus.
Also, the tool was never (re-)built based on a full insight on the topic (maay it be the best computation or doing them in a thread-safe way). In particular, for the AGGR_* arrival bounds, at the beginning hopes were high that we just need to call the static computation of a tandem left-over service curve. But then the backtracking of flow aggregates had to be add in between the dispatcher and the left-over service curve computation. That intermediate layer, in turn, needed an internal state (in contrast to the SEGR_* code) and all this led to today's code.

For the proposed new Analysis.java class I intended to have it store a single config and one server graph only. Additionally, it can store instances of arrival bounding methods. When the Analysis.java object's config or server graph changes, then the stored arrival bounding instances get updated (above: AggregatePboo_PerServer.getInstance(); and aggr_pboo_per_server.setServerGraph(server_graph);). The switch-case statement in the dispatcher will not query the list of enum items in AnalysisConfig anymore. Instead it will iterate over the instances arrival bounding in the Analysis.java object and that them with the current argument list turn, flows_to_bound, flow_of_interest. These adaptations make different Analysis.java object independent from each other; they cannot interfere with each other by changing a singleton's state anymore, right? Last, in order to really offer a thread-safe version, we need to lock the Analysis.java object for the time the computations are executed in order to prevent a modifying anything half-way through an analysis.

@matyesz
Copy link
Collaborator

matyesz commented May 16, 2019

In my opinion the analysisconfig will not solve the issue as the problem is you update the references/values in a a class that has only one instance in the memory while another thread is using it. The solution is to remove singleton and create new instances with each calculation as it is done for the second example I gave. That is why I asked whether it was a must to use singleton in the first one. Singleton classes with instance variables are not thread-safe.

@sbondorf
Copy link
Collaborator

Sure, the singleton will be removed (sorry if that wasn't clearly stated). Creating a new instance every time we reach this location in the code introduces unnecessary overhead, too. I think one instance per Analysis object is sufficient because the concurrent modifications of the singleton come from different analysis configurations. So I propose to embed the instance the an analysis uses into it and shield it from the other potential analysis instances. The same holds for the arrival bounds cache that I intend to merge into the v2.6-dev branch soon.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants