Add solver parallelism support #2404

Closed
montyly opened this issue Mar 24, 2021 · 5 comments · Fixed by #2420
montyly commented Mar 24, 2021

dev-evm-experiments contains experimental support for solver parallelism.

The solution I followed was to create an ALLSolver object inheriting from SMTLIBSolver, which uses a Smtlib2Proc instead of a SmtlibProc.

Smtlib2Proc follows SmtlibProc, but:

  • It keeps a list, _procs, which contains the solver processes.
  • When a command is sent, it is sent to all the solvers still alive.
  • When recv is called, it is a bit more complex:

# Excerpt from Smtlib2Proc's recv: spawn one watcher process per live
# solver; each _recv worker blocks on its solver's stdout and records
# the answer in the shared dict `res`.
all_process = {}
for idx, proc in enumerate(self._procs):
    if proc:
        p = Process(target=_recv, args=(proc, res, names[idx]))
        p.start()
        all_process[names[idx]] = p

start = time.time()
while True:
    for key, p in all_process.items():
        if not p.is_alive():
            # This solver answered: kill the stragglers, restart them,
            # and replay the buffered formulas so they are ready for
            # the next query.
            won = True
            for other_key, other_p in all_process.items():
                if other_key == key:
                    continue
                if other_p.is_alive():
                    other_idx = names_to_idx[other_key]
                    other_p.terminate()
                    self._stop(self._procs[other_idx].popen)
                    self._procs[other_idx] = self._start(self._commands[other_idx])
                    self._procs[other_idx].popen.stdout.flush()
                    self._procs[other_idx].popen.stdin.write(f"{self._buffer}\n")  # type: ignore
                    self._procs[other_idx].popen.stdin.write(f"{self._last_cmd}\n")  # type: ignore
                    if self._last_cmd:
                        self._procs[other_idx].is_late = True
                else:
                    # Another solver also finished: no single winner.
                    won = False
            # Record per-solver win statistics.
            if res[key] == "unknown":
                SOLVERS_RESULTS["unknown"] += 1
            elif won:
                SOLVERS_RESULTS[key] += 1
            print(SOLVERS_RESULTS)
            return res[key]
    time.sleep(0.01)
    # Give up if no solver has answered after 2.5 minutes.
    if time.time() - start > 2.5 * 60:
        break

The main idea is to wrap each solver in a Process (from multiprocessing.context) and then loop until one of the solvers replies.
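The _recv worker target is not shown in the snippet; a minimal sketch of what it could look like, assuming proc exposes the solver's Popen handle as proc.popen and res is a multiprocessing.Manager().dict() shared with the parent:

def _recv(proc, res, name):
    # Block until the solver prints its answer (sat/unsat/unknown),
    # then record it under this solver's name for the parent to read.
    res[name] = proc.popen.stdout.readline().strip()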

Once a solver replies, all the other processes that did not reply are killed. If no solver finds a solution within 2.5 minutes, the process stops.

Once a process is killed, we start a new process and re-send the formulas. One trick here is to NOT re-send check-sat to solvers that timed out:

if cmd != '(check-sat)' and "get-value" not in cmd:
    self._buffer += cmd
    self._last_cmd = ""
else:
    self._last_cmd = '(check-sat)'

The reasoning behind this choice is that we have a lot of situations where we loop over the solver results, like:

- build path
- check-sat
- get-value A
- assert A != last value returned
- check-sat
- get-value A

In this situation, we do not want to re-send the first check-sat, but the solver might perform better once assert A != last value returned has been added.
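As an illustration only, a minimal sketch of that enumeration loop driving a single solver over SMT-LIB (the z3 -in command line and the bit-vector sort of A are placeholders):

import subprocess

solver = subprocess.Popen(
    ["z3", "-in"], stdin=subprocess.PIPE, stdout=subprocess.PIPE, text=True
)

def send(cmd):
    solver.stdin.write(cmd + "\n")
    solver.stdin.flush()

send("(declare-const A (_ BitVec 8))")  # build path
send("(check-sat)")
for _ in range(5):  # enumerate up to 5 models
    if solver.stdout.readline().strip() != "sat":
        break
    send("(get-value (A))")
    reply = solver.stdout.readline().strip()  # e.g. ((A #x2a))
    value = reply.split()[-1].rstrip(")")
    # Exclude the last value returned and ask again; only this new
    # check-sat would need to be replayed to a restarted solver.
    send(f"(assert (distinct A {value}))")
    send("(check-sat)")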

We also probably want a way to select which solvers are run in parallel. Currently the user passes --smt.solver all, but we should support something like --smt.solver boolector,yices.
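A hypothetical sketch of how such a flag value could be parsed (the option name comes from the issue; the set of known solvers and the function name are illustrative):

KNOWN_SOLVERS = {"z3", "yices", "cvc4", "boolector"}  # illustrative set

def parse_solver_flag(value: str) -> list:
    # "all" expands to every known solver; otherwise accept a
    # comma-separated subset, e.g. "boolector,yices".
    if value == "all":
        return sorted(KNOWN_SOLVERS)
    chosen = [s.strip() for s in value.split(",") if s.strip()]
    unknown = set(chosen) - KNOWN_SOLVERS
    if unknown:
        raise ValueError(f"unknown solver(s): {', '.join(sorted(unknown))}")
    return chosen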

Overall, the code in dev-evm-experiments is neither clean nor robust enough. It has a lot of hardcoded values and TODOs that need to be addressed.

@ggrieco-tob (Contributor) commented:

pysmt has a similar feature, implemented with multiprocessing and, in particular, with Pipes:

https://github.com/pysmt/pysmt/blob/master/pysmt/solvers/portfolio.py

Any comments on this approach? (perhaps why it was not used?)
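For reference, a minimal sketch of that Pipe-based pattern, assuming each solver reads an SMT-LIB query on stdin (the solver names and command lines are placeholders):

import subprocess
from multiprocessing import Pipe, Process
from multiprocessing.connection import wait

def _solve(name, cmd, query, conn):
    # Run one solver on the query and send its answer back over the pipe.
    out = subprocess.run(cmd, input=query, capture_output=True, text=True)
    conn.send((name, out.stdout.strip()))
    conn.close()

def portfolio_check_sat(query, solvers):
    readers, procs = [], []
    for name, cmd in solvers.items():
        r, w = Pipe(duplex=False)
        p = Process(target=_solve, args=(name, cmd, query, w))
        p.start()
        readers.append(r)
        procs.append(p)
    # wait() blocks until at least one worker has sent its answer.
    winner = wait(readers)[0].recv()
    for p in procs:
        if p.is_alive():
            p.terminate()
    return winner  # (solver name, "sat"/"unsat"/"unknown")

# Placeholder invocation; adjust the command lines to the installed solvers:
# portfolio_check_sat(query, {"z3": ["z3", "-in"], "cvc4": ["cvc4", "--lang", "smt2"]})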

montyly commented Mar 26, 2021

In favor of implementing this ourselves:

  • It's not so difficult to implement
  • Interfacing with pysmt might require some refactoring in our codebase, which defeats the purpose
  • Relying on an external dependency can slow us down

But on the other hand, pysmt can be a good solution if:

  • It turns out this feature is difficult to implement (which I don't think it is, judging from my initial tests in dev-evm-experiments)
  • pysmt provides optimizations that we don't have and won't have in the short term
  • It provides a better framework to manipulate our formulas

I don't have a strong opinion, but I would be in favor of not adding another dependency here unless it brings strong benefits.

@ggrieco-tob (Contributor) commented:

Just to be clear, I'm not suggesting using pysmt, but re-using their Pipe-based approach (and probably some of their code, if the licenses are compatible).

@ggrieco-tob (Contributor) commented:

@ekilmer himself worked on the Portfolio class for pysmt, so perhaps he has something to recommend 😃

ekilmer commented Mar 26, 2021

I was only playing around with the portfolio feature for some light experimentation and found an inconsistency that I was able to pretty quickly fix.

I didn't dig too deep into the code itself, but it's a relatively short file with what looks to be a good and working design using standard Python libraries, so I think implementing something similar shouldn't be too hard 🙂

It would be a good excuse to look more into Python's handling of concurrency.

@ehennenfent linked a pull request on Apr 6, 2021 that will close this issue.