Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Complex case leads to free before allocation #997

Open
scolsen opened this issue Nov 22, 2020 · 21 comments
Open

Complex case leads to free before allocation #997

scolsen opened this issue Nov 22, 2020 · 21 comments
Labels
bug haskell memory Borrow checker and lifetimes

Comments

@scolsen
Copy link
Contributor

scolsen commented Nov 22, 2020

I've run into a case where a String is freed before it's actually used, here's the code:

(defn binary-partial [f x]
      (fn [y]
        (f x y)))
(doc lift-binary
  "Lifts a binary function into an applicative structure.")
    (defn lift-binary [f cx cy]
      (sequence (fmap &(binary-partial binary-partial f) cx) cy))

(deftype (Box a) [of a])

(defmodule Box
  (implements fmap fmap)
  (defn fmap [f b]
    (let [val (~f @(Box.of &b))]
    (Box.init val)))

  (implements fmap! fmap!)
  (defn fmap! [f b]
    (let [val (~f @(Box.of b))]
    (Box.set-of! b val)))

  (implements lift Box.init)

  (implements sequence seq)
  (defn seq [box-f box]
    (let [f @(Box.of &box-f)
          v @(Box.of &box)]
      (Box.init (f v))))

  (implements sequence! seq!)
  (defn seq! [box-f box]
    (let [f @(Box.of box-f)
          v @(Box.of box)]
      (Box.set-of! box (f v))))

)

Loading up this code and then running:

鲤 (defn foo [] (Applicative.lift-binary (fn [x y] (String.append &x &y)) (Box.init @"foo") (Box.init @"bar")))
鲤 (foo)
Compiled to 'out/Untitled' (executable)
Untitled(3776,0x10c0a7dc0) malloc: *** error for object 0x7fc502c05830: pointer being freed was not allocated
Untitled(3776,0x10c0a7dc0) malloc: *** set a breakpoint in malloc_error_break to debug
[RUNTIME ERROR] '"out/Untitled"' exited with return value -6.

It emits a free before allocation here:

void FP__Lambda_binary_MINUS_partial__String_String_String_12_env_delete(FP__Lambda_binary_MINUS_partial__String_String_String_12_env* p) {
    Function_delete__String_String_String(p->f);
    String_delete(p->x);
}

commenting out String_delete(p->x); makes it run fine:

./foo
(Box @"foobar")

I thought this might have to do with capturing vars, but it perhaps doesn't since this works fine:

鲤 (defn foo [] (Applicative.lift-binary (fn [x y] (+ x y)) (Box.init 1) (Box.init 2)))
鲤 (foo)
Compiled to 'out/Untitled' (executable)
(Box 3)
=> 0

Seems specifically related to Strings or String.appendas well, not references in general:

鲤 (defn foo [] (Applicative.lift-binary (fn [x y] (add-ref &x &y)) (Box.init 1) (Box.init 2)))
鲤 (foo)
Compiled to 'out/Untitled' (executable)
(Box 3)
=> 0
@scolsen scolsen added the bug label Nov 22, 2020
@scolsen
Copy link
Contributor Author

scolsen commented Nov 22, 2020

Ah, the key is managed v. non-managed members:

void FP__Lambda_binary_MINUS_partial__int_int_int_12_env_delete(FP__Lambda_binary_MINUS_partial__int_int_int_12_env* p) {
    Function_delete__int_int_int(p->f);
    /* Ignore non-managed member 'x' : Int */
}

@scolsen
Copy link
Contributor Author

scolsen commented Nov 22, 2020

Relevant code is in Concretize.hs lines 94-170, in particular line 147 which generates the deleter

@eriksvedang
Copy link
Collaborator

Interesting find! Do you have a fix in mind?

@eriksvedang eriksvedang added this to the 0.7.0 – No major bugs milestone Nov 22, 2020
@scolsen
Copy link
Contributor Author

scolsen commented Nov 22, 2020

no great ideas yet--I think we should fix this particular case, but it did give me a more general feature idea that might be interesting. We could add a persist, manual, or leak that turns off management for a particular binding so that one would need to allocate and delete manually and none of the automatic emissions for such things would happen.

Another really interesting way we could support it is by defining a special Mem type that takes a type as an argument and emits no allocations/deallocations unless explicitly called, for example one would have to do something like:

(sig allocate (Fn [a] (Mem a)))
(sig free (Fn [(Mem a)] ()))

(let-do [s (allocate "foo")]
        ...do some things
        (free s))
    

of course, one is just bypassing the borrow checker at that point--it'd be similar to unsafe calls/bypass mechanisms in other languages. But maybe something like this is already possible?

@scolsen
Copy link
Contributor Author

scolsen commented Nov 22, 2020

actually, taking a quick look at which types are managed, something like what I'm describing is potentially already possible with the Ptr type

@scolsen
Copy link
Contributor Author

scolsen commented Nov 22, 2020

I figured out a workaround but it's a bit complicated. Assuming address works in the REPL (I have a branch ready to enable this), one can do the following:

(def a @"")
(def b @"")
(Applicative.lift-binary (fn [x y] (String.append &(Pointer.to-value x) &(Pointer.to-value y))) (fmap &(adr a) (Box.init @"foo")) (fmap &(adr b) (Box.init @"bar")))
Compiled to 'out/Untitled' (executable)
(Box @"foobar")

here's how the adr works:

(defmacro adr [proxy] (list 'fn (array 'x) (list 'do (list 'set! proxy 'x) (list 'address proxy))))

And here's the form it produces:

鲤 (adr a)
=> (fn <> [x] (do (set! a x) (address a)))

The idea here is that we get around the deletion by fmapping the address of a proxy definition that holds the initial value in global scope, ensuring the lambda doesn't reclaim it.

EDIT: So this is a workaround of sorts, but it isn't entirely useful unless we have a form that allows producing global definitions in a local context.

@eriksvedang
Copy link
Collaborator

Huh, interesting workaround – but we really should make the original code work (or not typecheck at all).

Could this be a lifetime error? I don't really see (at a glance) where the problematic situation occurs in the original code.

@eriksvedang
Copy link
Collaborator

Does this lambda need to capture anything? (is that the error?)

(fn [x y] (String.append &x &y)) (Box.init @"foo") (Box.init @"bar"))

@scolsen
Copy link
Contributor Author

scolsen commented Nov 22, 2020

Yeah, I don't fully understand it myself yet -- I think the error stems from delete being called on a reference to some value after the original value is already deleted

@scolsen
Copy link
Contributor Author

scolsen commented Nov 22, 2020

Basically, I think what happens is that binary-partial binary-partial expands into a nested lambda, which closes over an x--but x is deleted once the outer lambda returns, even though it's needed in the inner lambda that's returned...I think that's roughly what's happening but I'm not certain--or rather, the first deletion isn't actually problematic, but since x is managed in both the outer and inner lambda contexts it results in two calls to delete on the same underlying memory.

@scolsen
Copy link
Contributor Author

scolsen commented Nov 22, 2020

I'm actually able to produce a simpler case using address:

This case is actually probably OK since ensuring Pointer.to-value is safe is up to the programmer, but it's a simpler example of the same underlying problem, I think:

(def a "foo")
鲤 (Pointer.to-value (address @a))
Compiled to 'out/Untitled' (executable)
foo
Untitled(2155,0x10c2cbdc0) malloc: *** error for object 0x7fa7a7c05810: pointer being freed was not allocated
Untitled(2155,0x10c2cbdc0) malloc: *** set a breakpoint in malloc_error_break to debug
[RUNTIME ERROR] '"out/Untitled"' exited with return value -6.

Here's the emitted C:

int main(int argc, char** argv) {
    carp_init_globals(argc, argv);
    String _12 = String_copy(a);
    String _14 = Pointer_to_MINUS_value__String(&_12); // _14 is just a dereferenced _12
    String* _15 = &_14; // ref
    String _16 = String_str(_15);
    String* _17 = &_16; // ref
    IO_println(_17);
    int _20 = 0;
    String_delete(_12);
    String_delete(_14); // !!!!! This is the problematic line! _14 is the same as _12, which was freed just above.
    // Commenting out the above line makes the program run fine, since we avoid the double free.
    String_delete(_16);
    return _20;
}

@eriksvedang fyi I think a similar thing to the forced example here happens when working with lambdas that return lambdas that close over some argument from an initial lambda (like binary-partial in this issue)

@scolsen
Copy link
Contributor Author

scolsen commented Nov 22, 2020

So, I think this should be a lifetime error, perhaps? Otherwise I think it'd require us to implement lifetime/value tracking in closures that's quite a bit more complicated than the current impl?

@scolsen
Copy link
Contributor Author

scolsen commented Nov 24, 2020

See also #1012 (comment) which explains in more detail what the issue might be--the functions in the corresponding PR also offer a workaround

@scolsen
Copy link
Contributor Author

scolsen commented Nov 25, 2020

After thinking about this a bit more, the only solution we need here is to allow Refs as members in user defined types.

Ultimately, what's happening is that because it's impossible to have a Ref as a member type, anonymous functions that shouldn't take ownership must take ownership because the definition of a function like fmap has no knowledge of the contents of the type its mapping over and must map back into the same type, which cannot hold a Ref by definition.

Consider the problematic case, slightly simplified to using direct lambdas to better illustrate the issue, we'll use Maybe as an example case:

(defn fmap [f maybe] 
  (match m (Just x) (Just (~f x)
           (Nothing) (Nothing))))

(defn pure [x] (Just x))

(defn sequence [maybe-f maybe]
  (match maybe-f (Just f) (fmap f maybe) 
                 (Nothing) (Nothing)))

(defn bi-lift [f applicative-a applicative-b]
  (let [lam (fn [x] (fn [y] (~f x y)))]                                                                                                                                                                            
       (sequence (lift &lam applicative-a) applicative-b)))

Now, imagine we want to use bi-lift to append the contents of (Just @"foo") and (Just @"bar"):

We can't use String.append directly, because it takes refs:

(bi-lift &String.append (Just @"foo") (Just @"bar")) ;; Won't typecheck!

We can't store refs in the Maybe directly, because that's not permitted:

(bi-lift &String.append (Just "foo") (Just "bar")) ;; Not allowed!

We can't fmap the original Maybe values to Refs since this would still yield a Maybe (Ref) which is not allowed:

(bi-lift &String.append (fmap ref (Just @"foo")) (fmap ref (Just @"bar"))) ;; Not allowed!

So, our only remaining option is to use a wrapper lambda that takes ownership of its arguments and then passes them by reference to the reference taking function String.append:

(bi-lift &(fn [x y] (String.append &x &y)) (Just @"foo") (Just @"bar"))

This type checks, but produces code that contains a double-free. The double free occurs because by the definition of bi-lift, the x argument will be captured and deleted twice:

(defn bi-lift [f applicative-a applicative-b]
  (let [lam (fn [x] (fn [y] (~f x y) ;; x is captured here! Thus, it will be deleted on cleanup of `lam`))]                                                                                                                                                                            
       (sequence (lift &lam applicative-a) applicative-b)))

(bi-lift &(fn [x y] (String.append &x &y)) (Just @"foo") (Just @"bar"))
;; but x is *also* deleted by the anonymous function above, since it took ownership of x! 
;; unfortunately, it is called against' `lams` `env.x` argument--which will first be deleted here, then deleted in `lam`!

This is why allocating a pointer works as a means to get around this limitation. Allocating a pointer makes the value unmanaged which means that even though the anonymous (fn [x y] (String.append &x &y)) takes ownership of the pointer, it won't delete it, allowing the user to call a single deletion in lam:

(defn working-bi-lift [f applicative-a applicative-b]
  (let [lam (fn [x] (fn [y] (let-do [res (~f (Pointer.to-value x) y)] (Pointer.free x) res)))]                                                                                                                     
      (sequence (lift &lam (fmap &Pointer.unsafe-alloc applicative-a)) applicative-b))) 

So, one way to solve this issue is to allow types to have Ref members, so that anonymous functions such as the one in this example that shouldn't take ownership don't need to.

EDIT: To see exactly what's happening here, here's how foo passes through these lambdas:

(bi-lift &(fn [x y] (String.append &x &y)) (Just @"foo") (Just @"bar"))
;; First, bi-lift maps `lam` above into to `Just @foo`, capturing `foo` and the anonymous function
bi-lifting: (fn [x] (fn [y] (~(fn [x y] (String.append &x &y)) [x in lam] [y in lam]))) (Just @"foo") ;; `x` is captured! will be deleted
bi-lifting: => (Just (fn [y] (~(fn [x y] (String.append &x &y)) @"foo"[x in lam] y))
bi-lifting: (sequence (Just (fn [y] (~(fn [x y] (String.append &x &y)) @"foo"[x in lam] y)) (Just @"bar"))
bi-lifting: (fn [y] (~(fn [x y] (String.append &x &y)) @"foo"[x in lam] y) (Just @"bar") ;; `y` is *not* captured! Appears in arg list, and a lambda is not returned--a function is applied.
bi-lifting: (~(fn [x y] (String.append &x &y) @"foo"[x in lam] @"bar" [y in lam]) ;; apply the inner fn `x` is owned! will be deleted
bi-lifting: (String.append &(@"foo")[x and [x in lam]] &(@"bar")[y]) ;; called 

@scolsen
Copy link
Contributor Author

scolsen commented Nov 25, 2020

Casting the captured x in the outer lambda env to a reference before passing it to f also works:

;; bi-lifts body
(let [lam (fn [x] (fn [y] (~f &x y)))]                                                                                                                                                                           
       (sequence (lift &lam applicative-a) applicative-b)))

(Applicative.bi-lift &(fn [x y] (String.append x &y)) (Maybe.Just @"foo") (Maybe.Just @"bar"))
Compiled to 'out/Untitled' (executable)
(Just @"foobar")

So we don't actually need to change anything here other than maybe documentation to help people out in some of these thornier cases.

@scolsen
Copy link
Contributor Author

scolsen commented Nov 25, 2020

Though I suppose the double free shouldn't happen regardless.

probably we should enforce that captured values must be passed by reference? So that a lambda always has ownership over its captured env?

@eriksvedang

@eriksvedang
Copy link
Collaborator

@scolsen Thanks a lot for this very clear and thorough explanation of the problem – I feel pretty confident that we can fix this somewhat quickly now that we know what is up!

Though I suppose the double free shouldn't happen regardless.

For sure, this has to be fixed.

probably we should enforce that captured values must be passed by reference? So that a lambda always has ownership over its captured env?

Yes, this is what has been my intent when implenting the memory checks for lambdas. Apparently I failed :)

@scolsen
Copy link
Contributor Author

scolsen commented Nov 25, 2020

@scolsen Thanks a lot for this very clear and thorough explanation of the problem – I feel pretty confident that we can fix this somewhat quickly now that we know what is up!

Though I suppose the double free shouldn't happen regardless.

For sure, this has to be fixed.

probably we should enforce that captured values must be passed by reference? So that a lambda always has ownership over its captured env?

Yes, this is what has been my intent when implenting the memory checks for lambdas. Apparently I failed :)

Thanks! What's the right place for a fix here? Am I right in thinking it's somewhere in Concretize? My thinking was that we implicitly cast all the captured members of the lambda's env to Refs during concretization. Or is a better approach to handle this downstream in Emit somewhere?

@eriksvedang
Copy link
Collaborator

My intuition is that it should be handled in manageMemory and use similar checks as all other code, if possible.

@eriksvedang
Copy link
Collaborator

Just switching local types that late in the compiler pipeline will create mayhem for sure :)

@eriksvedang eriksvedang added the memory Borrow checker and lifetimes label Jun 8, 2021
@scolsen
Copy link
Contributor Author

scolsen commented Oct 31, 2022

PR #1440 will make it so that the compiler catches the lambda leaking captures ownership issue and prevents double frees. The next step to fully resolve this one involves making it possible to hold refs in types, since we'll need to pass refs to lambdas in cases like this anyway now for code to compile.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug haskell memory Borrow checker and lifetimes
Projects
None yet
Development

No branches or pull requests

2 participants