Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Plan for external communication / isolation #800

Closed
4 of 5 tasks
RalfJung opened this issue Jun 29, 2019 · 23 comments
Closed
4 of 5 tasks

Plan for external communication / isolation #800

RalfJung opened this issue Jun 29, 2019 · 23 comments
Labels
A-shims Area: This affects the external function shims C-project Category: a larger project is being tracked here, usually with checkmarks for individual steps

Comments

@RalfJung
Copy link
Member

RalfJung commented Jun 29, 2019

Currently, the Miri interpreter only allows one kind of communication between the executed program and the outside world: printing to stdout/stderr. In the long run, that will not be enough. There were already requests for access to the system time, and it seems reasonable to ask for file system or network access. Getting proper randomness into the program might also be interesting.

This issue is basically the current status of #653, now that we [soon will] have a deterministically seeded RNG available in Miri all the time -- so much of the discussion there no longer applies.

It probably makes sense to allow external communication per default, just to get more programs running. But isolation is also a useful property, so I propose a -Zmiri-isolate flag to "turn off" external communication.

Assuming that's not very controversial, we have to decide what to do with the two existing systems we have in place that try hard to avoid allowing external communication:

  • Environment variables. Should we just forward setting/getting env vars to the outside OS per default, and only keep using our current env var emulation layer when -Zmiri-isolate is set? That would resolve Support for accessing host environment variables #670.
  • getrandom. Should we just ask the OS for "real" randomness per default, and only ask our internal RNG when -Zmiri-isolate is set?

Current status

  • Add off-by-default "external communication" mode. Tentative name of the flag: ´-Zmiri-enable-communication.
  • Allow interpreted program to access host env vars if external communication is enabled.
  • Allow interpreted program to access host randomness if external communication is enabled.
  • Come up with a less clumsy name for the flag that enables communication. Maybe -Zmiri-disable-isolation? Is that really less clumsy? -Zmiri-no-isolate?
  • Turn external communication on by default?
@RalfJung RalfJung added A-shims Area: This affects the external function shims C-proposal Category: a proposal for something we might want to do, or maybe not; details still being worked out labels Jun 29, 2019
@pvdrz
Copy link
Contributor

pvdrz commented Aug 6, 2019

I started working on the environment variables part of this issue. I added a new -Zmiri-enable-communication flag for this and I'm updating the get/set/unset env shims.

I am going to try to write the raw bytes of each env-var value and see what happens.

@RalfJung
Copy link
Member Author

RalfJung commented Aug 6, 2019

I'm updating the get/set/unset env shims.

I suggest introducing a new module, shims/env.rs, and moving all the env-related logic there.

@pvdrz
Copy link
Contributor

pvdrz commented Aug 6, 2019

Will do, I haven't decided what's the best way of handle allocations for the host environment variables. Currently I am creating temporary allocations for the variables and then deallocating them after writing.

Is it there any way to write some bytes directly into a PlaceTy instead of having to call write_scalar?

@oli-obk
Copy link
Contributor

oli-obk commented Aug 6, 2019

Can't you invoke the existing setenv logic?

@pvdrz
Copy link
Contributor

pvdrz commented Aug 6, 2019

What do you mean?

@RalfJung
Copy link
Member Author

RalfJung commented Aug 6, 2019

There's already code that does all the work of "given a C string, create an allocation for it". You should be able to re-use that. It's in the setenv shim.

The new thing here is that instead of doing this on setenv, it'll have to happen on getenv. So each getenv creates a new allocation that is then leaked. We should probably find a way to collect those, but for a first implementation that would be enough.

@pvdrz
Copy link
Contributor

pvdrz commented Aug 6, 2019

Oh ok yeah, That is what I am doing right now. I am deallocating each temporary allocation after using it.

@oli-obk
Copy link
Contributor

oli-obk commented Aug 6, 2019

Ah, I thought we'd just iterate over the env at startup and clone it into the miri env.

@pvdrz
Copy link
Contributor

pvdrz commented Aug 6, 2019

Well that might be easier and it would work unles someone decides to modify an env var while miri is running or something.

@RalfJung
Copy link
Member Author

RalfJung commented Aug 6, 2019

Ah, I thought we'd just iterate over the env at startup and clone it into the miri env.

Ah. That would fail to reflect later changes from "the outside" -- which is possible in principle. But I would be fine with this strategy as well.

@oli-obk
Copy link
Contributor

oli-obk commented Aug 6, 2019

Uh... I didn't know you could change env vars of a program while it was running

@RalfJung
Copy link
Member Author

RalfJung commented Aug 6, 2019

Hm, maybe you're not supposed to. But on Linux I suppose you could try to write to /proc/$PID/environ...

EDIT: nope, that file is read-only. I guess you are right.

@pvdrz
Copy link
Contributor

pvdrz commented Aug 6, 2019

Apparently you can do it using gdb, but that seems to be an unlikely scenario.

@RalfJung
Copy link
Member Author

RalfJung commented Aug 6, 2019

Agreed. Scanning the host env on initialization is fine. That's great as it makes the "communicate" and "dont communicate" cases much more similar: the only difference is that one starts with an empty env, the other with the real one.

@pvdrz
Copy link
Contributor

pvdrz commented Aug 6, 2019

Ok I will start working on this

Edit: In order to be able to allocate the env vars, we would need access to Memory and TyCtxt. Should I do this inside create_ecx instead of doing it inside Evaluator::new?

bors added a commit that referenced this issue Aug 14, 2019
@RalfJung
Copy link
Member Author

Status update: accessing host env vars is now possible with -Zmiri-enable-communication.

The name for that flag is preliminary, as is the fact that the current default for communication is off.

Also, IMO that flag should also make getrandom calls from the program use host randomness. Any objections to that?

@oli-obk
Copy link
Contributor

oli-obk commented Aug 15, 2019

Having a bunch of env vars given at startup feel very different to arbitrary communication at runtime, since everything is still deterministic given the same env vars

@RalfJung
Copy link
Member Author

RalfJung commented Aug 15, 2019

As far as I am concerned, the purpose of this "external communication" flag is to give up on determinism.

If we want to allow more things than "total isolation" while still being deterministic, I am open for that, but that's a gray area. For example, you could argue that allowing file access is also still deterministic if all accessed files are the same.

@pvdrz
Copy link
Contributor

pvdrz commented Aug 15, 2019

I think one could exploit something like the order of variables in #756 to do stuff in a non deterministic way.

Edit: nvm, if we are using the host's randomness this is not even a real concern

@oli-obk
Copy link
Contributor

oli-obk commented Aug 15, 2019

True. We can always add that back later if it seem desirable.

So..

Any objections to that?

Nope, bring on the cosmic rays

@RalfJung
Copy link
Member Author

@christianpoveda do you want to try implementing that?

@pvdrz
Copy link
Contributor

pvdrz commented Aug 16, 2019

I'm working on it.

Edit: Now we are using the host's RNG when the communication flag is enabled

bors added a commit that referenced this issue Aug 20, 2019
Use host's rng when communication is enabled

This uses the host's randomness when the communication enabled flag is used. I am not sure about the error handling. I was thinking about fallbacking to `rand` if `getrandom` fails and also print something so the user knows miri is not using the host's rng because it failed. Let me know what you think.

Related issue: #800.

r? @RalfJung @oli-obk
bors added a commit that referenced this issue Aug 23, 2019
Use host's rng when communication is enabled

This uses the host's randomness when the communication enabled flag is used. I am not sure about the error handling. I was thinking about fallbacking to `rand` if `getrandom` fails and also print something so the user knows miri is not using the host's rng because it failed. Let me know what you think.

Related issue: #800.

r? @RalfJung @oli-obk
@RalfJung RalfJung added C-project Category: a larger project is being tracked here, usually with checkmarks for individual steps and removed C-proposal Category: a proposal for something we might want to do, or maybe not; details still being worked out labels Aug 23, 2019
@RalfJung
Copy link
Member Author

Closing this: we now have working external communication, and when/if we want to enable that by default, that should be a new discussion IMO. For now, off-by-default seems to still work fine.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
A-shims Area: This affects the external function shims C-project Category: a larger project is being tracked here, usually with checkmarks for individual steps
Projects
None yet
Development

No branches or pull requests

3 participants