Jq: update interpreter, merge eval-ast, fix some env errors #685

asarhaddon · 2024-09-29T16:10:25Z

Hello.
This fixes #657 for jq, as well as a few regressions (even step0 fails with the current interpreter version).
I have also done some source cleaning, but env.jq and steps 3 to 6 stil share too much similar and partially broken code.
I have failed to figure how the stuff is supposed to work.
I suggest to first ask help by the original author in removing the code duplications and documenting the design choices,
and only then fix REGRESS=1 and/or self hosting (#662), if stil necessary.

Original issue describing the change and converting the first set of implementations: kanaka#592 Tracking issue for other implementations: kanaka#657 All normal tests pass, but REGRESS and self-hosting fail. Steps: display the results from jq without python simplify/improve quasiquote simplify replenv construction Cosmetic: Update the interpreter from latest Debian/Ubuntu. move first core functions from steps4-A to core.jq simplify interprocess communication between run and utils.jq merge run and rts.py, simplify it

kanaka · 2024-10-01T14:46:32Z

Hi @alimpfard, I know it's been a few years since you did the mal implementation in jq. We are working on refactoring all the implementations with some important simplifications to the process/guide (and often taking the opportunity to upgrade to newer versions of the target language). Any chance you would be willing to refresh your memory on this code base to help us with those changes? @asarhaddon has made a first pass in this PR do do that refactoring and gotten the basic tests to work but there is a failure during regression testing (running tests for earlier steps using stepA). Neither of us are jq experts and could use some assistance if you're willing. I think mostly it would be helping to understand and/or cleanup some of the environment related code in step3 and step6.

alimpfard · 2024-10-01T14:48:13Z

Sure! I'll take a look later today.

alimpfard

As for self-hosting...well...I'll let the diff speak for its own mysteries:

diff --git a/impls/mal/env.mal b/impls/mal/env.mal
index dc0ee2ef..48216eb6 100644
--- a/impls/mal/env.mal
+++ b/impls/mal/env.mal
@@ -29,12 +29,6 @@
 (def! env-as-map (fn* [env]
   (dissoc @env :outer)))
 
-(def! env-get-or-nil (fn* [env k]
-  (let* [ks (str k)
-         e (env-find-str env ks)]
-    (if e
-      (get @e ks)))))
-
 ;;  Private helper for env-get and env-get-or-nil.
 (def! env-find-str (fn* [env ks]
   (if env
@@ -43,6 +37,12 @@
         env
         (env-find-str (get data :outer) ks))))))
 
+(def! env-get-or-nil (fn* [env k]
+  (let* [ks (str k)
+         e (env-find-str env ks)]
+    (if e
+      (get @e ks)))))
+
 (def! env-get (fn* [env k]
   (let* [ks (str k)
          e (env-find-str env ks)]
diff --git a/impls/mal/step7_quote.mal b/impls/mal/step7_quote.mal
index b1c6fb8f..a4c9efda 100644
--- a/impls/mal/step7_quote.mal
+++ b/impls/mal/step7_quote.mal
@@ -11,16 +11,6 @@
 
 ;; eval
 
-(def! qq-loop (fn* [elt acc]
-  (if (if (list? elt) (= (first elt) 'splice-unquote)) ; 2nd 'if' means 'and'
-    (if (= 2 (count elt))
-      (list 'concat (nth elt 1) acc)
-      (throw "splice-unquote expects 1 argument"))
-    (list 'cons (QUASIQUOTE elt) acc))))
-(def! qq-foldr (fn* [xs]
-  (if (empty? xs)
-    ()
-    (qq-loop (first xs) (qq-foldr (rest xs))))))
 (def! QUASIQUOTE (fn* [ast]
   (cond
     (vector? ast)            (list 'vec (qq-foldr ast))
@@ -31,6 +21,16 @@
                                (nth ast 1)
                                (throw "unquote expects 1 argument"))
     "else"                   (qq-foldr ast))))
+(def! qq-loop (fn* [elt acc]
+  (if (if (list? elt) (= (first elt) 'splice-unquote)) ; 2nd 'if' means 'and'
+    (if (= 2 (count elt))
+      (list 'concat (nth elt 1) acc)
+      (throw "splice-unquote expects 1 argument"))
+    (list 'cons (QUASIQUOTE elt) acc))))
+(def! qq-foldr (fn* [xs]
+  (if (empty? xs)
+    ()
+    (qq-loop (first xs) (qq-foldr (rest xs))))))
 
 (def! LET (fn* [env binds form]
   (if (empty? binds)
diff --git a/impls/mal/step8_macros.mal b/impls/mal/step8_macros.mal
index e1c88392..4df91755 100644
--- a/impls/mal/step8_macros.mal
+++ b/impls/mal/step8_macros.mal
@@ -11,16 +11,6 @@
 
 ;; eval
 
-(def! qq-loop (fn* [elt acc]
-  (if (if (list? elt) (= (first elt) 'splice-unquote)) ; 2nd 'if' means 'and'
-    (if (= 2 (count elt))
-      (list 'concat (nth elt 1) acc)
-      (throw "splice-unquote expects 1 argument"))
-    (list 'cons (QUASIQUOTE elt) acc))))
-(def! qq-foldr (fn* [xs]
-  (if (empty? xs)
-    ()
-    (qq-loop (first xs) (qq-foldr (rest xs))))))
 (def! QUASIQUOTE (fn* [ast]
   (cond
     (vector? ast)            (list 'vec (qq-foldr ast))
@@ -31,6 +21,16 @@
                                (nth ast 1)
                                (throw "unquote expects 1 argument"))
     "else"                   (qq-foldr ast))))
+(def! qq-loop (fn* [elt acc]
+  (if (if (list? elt) (= (first elt) 'splice-unquote)) ; 2nd 'if' means 'and'
+    (if (= 2 (count elt))
+      (list 'concat (nth elt 1) acc)
+      (throw "splice-unquote expects 1 argument"))
+    (list 'cons (QUASIQUOTE elt) acc))))
+(def! qq-foldr (fn* [xs]
+  (if (empty? xs)
+    ()
+    (qq-loop (first xs) (qq-foldr (rest xs))))))
 
 (def! LET (fn* [env binds form]
   (if (empty? binds)
diff --git a/impls/mal/step9_try.mal b/impls/mal/step9_try.mal
index 54403bfd..6e8351cb 100644
--- a/impls/mal/step9_try.mal
+++ b/impls/mal/step9_try.mal
@@ -11,16 +11,6 @@
 
 ;; eval
 
-(def! qq-loop (fn* [elt acc]
-  (if (if (list? elt) (= (first elt) 'splice-unquote)) ; 2nd 'if' means 'and'
-    (if (= 2 (count elt))
-      (list 'concat (nth elt 1) acc)
-      (throw "splice-unquote expects 1 argument"))
-    (list 'cons (QUASIQUOTE elt) acc))))
-(def! qq-foldr (fn* [xs]
-  (if (empty? xs)
-    ()
-    (qq-loop (first xs) (qq-foldr (rest xs))))))
 (def! QUASIQUOTE (fn* [ast]
   (cond
     (vector? ast)            (list 'vec (qq-foldr ast))
@@ -31,6 +21,16 @@
                                (nth ast 1)
                                (throw "unquote expects 1 argument"))
     "else"                   (qq-foldr ast))))
+(def! qq-loop (fn* [elt acc]
+  (if (if (list? elt) (= (first elt) 'splice-unquote)) ; 2nd 'if' means 'and'
+    (if (= 2 (count elt))
+      (list 'concat (nth elt 1) acc)
+      (throw "splice-unquote expects 1 argument"))
+    (list 'cons (QUASIQUOTE elt) acc))))
+(def! qq-foldr (fn* [xs]
+  (if (empty? xs)
+    ()
+    (qq-loop (first xs) (qq-foldr (rest xs))))))
 
 (def! LET (fn* [env binds form]
   (if (empty? binds)
diff --git a/impls/mal/stepA_mal.mal b/impls/mal/stepA_mal.mal
index 43f69409..c816431b 100644
--- a/impls/mal/stepA_mal.mal
+++ b/impls/mal/stepA_mal.mal
@@ -11,16 +11,6 @@
 
 ;; eval
 
-(def! qq-loop (fn* [elt acc]
-  (if (if (list? elt) (= (first elt) 'splice-unquote)) ; 2nd 'if' means 'and'
-    (if (= 2 (count elt))
-      (list 'concat (nth elt 1) acc)
-      (throw "splice-unquote expects 1 argument"))
-    (list 'cons (QUASIQUOTE elt) acc))))
-(def! qq-foldr (fn* [xs]
-  (if (empty? xs)
-    ()
-    (qq-loop (first xs) (qq-foldr (rest xs))))))
 (def! QUASIQUOTE (fn* [ast]
   (cond
     (vector? ast)            (list 'vec (qq-foldr ast))
@@ -31,6 +21,16 @@
                                (nth ast 1)
                                (throw "unquote expects 1 argument"))
     "else"                   (qq-foldr ast))))
+(def! qq-loop (fn* [elt acc]
+  (if (if (list? elt) (= (first elt) 'splice-unquote)) ; 2nd 'if' means 'and'
+    (if (= 2 (count elt))
+      (list 'concat (nth elt 1) acc)
+      (throw "splice-unquote expects 1 argument"))
+    (list 'cons (QUASIQUOTE elt) acc))))
+(def! qq-foldr (fn* [xs]
+  (if (empty? xs)
+    ()
+    (qq-loop (first xs) (qq-foldr (rest xs))))))
 
 (def! LET (fn* [env binds form]
   (if (empty? binds)

alimpfard · 2024-10-01T16:36:48Z

impls/jq/step5_tco.jq

+            (
+                select(.kind == "vector") |
+                .value |
+                reduce .[] as $x ({expr:[], env:env};


The issue with REGRESS is simple enough, this should be $_menv (env is the initial env, which is why you get "b not found" in the regression test), you use the right value for later steps.

alimpfard · 2024-10-01T16:41:17Z

impls/jq/step5_tco.jq

                . | TCOWrap($_menv; $_orig_retenv; false) # return this unchanged, since it can only be applied to
            ) //
-            (select(.kind == "symbol") |
+            (
+                select(.kind == "symbol") |
                .value | env_get($_menv) | TCOWrap($_menv; null; false)


The other half of REGRESS failures are not your fault, they were here from before.
This should be

Suggested change

.value | env_get($_menv) | TCOWrap($_menv; null; false)

.value | env_get($_menv) | TCOWrap($_menv; $_orig_retenv; false)

(as the evaluation of a symbol does not affect the resulting env)

alimpfard · 2024-10-01T16:53:06Z

I don't quite understand why it's okay for QUASIQUOTE to mention qq-foldr without a definition but not the other way around, maybe something causes eager evaluation that is suppressed by macros, I'll take a look at the jq impl in a bit.

@asarhaddon @kanaka what cleanups are we talking about exactly? I can help out with refactors/cleanups but I'm not familiar with whatever has changed in MAL since I last looked at it 😅

Fix by alimpfard.

instead of abusing env_find

for readability

The procedure is short and its name not explicit.

probably once useful for empty lists

asarhaddon · 2024-10-05T18:57:12Z

Hello.
I have applied your patches, this merge request fixes all regression tests.

Maintenance would be easyer if the obsolete code was removed and the diff between files reduced.
The blocking point for me is that step5 differs a lot from step4 (which is normal) but also from step6 (this causes duplicate efforts). It would be nice if you could reduce the diffs between steps and/or document why duplicated code remains in step5 so that we could clean it.
Some examples:
The difference between env_set and env_set_ is not easy to guess.
The diff between step6 and step5 makes me wonder if
the evaluation of the "if" condition should use $_menv (as in current
step6) instead of $env (as in current step5.
Same question for evaluation of hashmaps (env in step5, $_menv in
step6).

alimpfard · 2024-10-05T19:07:12Z

I believe most of the env differences between step 5 & 6 are due to the existence of atoms, and how they're supposed to interact with different scoping rules (env vs. $_menv is almost entirely about this), e.g. let* wouldn't "leak" atoms if we passed the old env through.

I imagine most of the languages that lack global state will have this difference between step 5 & 6?

kanaka · 2024-10-07T14:00:13Z

Nice work. That was a big lift!

@alimpfard do you think you might be able to add some documentation about the step 4, 5, 6 differences from the typical language pattern? I have an intuitive sense about why that is the case (due to jq's strict immutable nature), but some concrete explanation from the person who wrote it would certainly be appreciated.

alimpfard · 2024-10-07T16:26:20Z

I'll see if I can draft something up (might take a couple days though, a little busy IRL)

asarhaddon added 3 commits September 29, 2024 17:40

jq: fix evaluation of vectors and maps in regression tests

a9f7c88

jq: remove more unused stuff

9dd6884

alimpfard reviewed Oct 1, 2024

View reviewed changes

alimpfard and others added 17 commits October 5, 2024 20:41

jq: fix TCO for vectors, breaking let* in step5

711d393

jq: fix TCO for symbols, breaking let* in steps 5 to A

06a2a69

Fix by alimpfard.

jq: inline some short undocumented functions

5d43f64

jq: use env_req for env searches that should not fail

cd48aab

instead of abusing env_find

jq: start to merge most env_get variants

75cee4a

jq: merge most env_get variants

fff6e7a

jq: merge all addToEnv variants

450a074

jq: remove the addEnv function

36c457e

jq: inline pureChildEnv

6b5bb92

for readability

jq: inline env_step6

b6387d3

The procedure is short and its name not explicit.

jq: remove obsolete fallback clause

7bbd3de

probably once useful for empty lists

jq: improve readability of addToEnv

6c24434

jq: simplify try*

fc1502c

jq: implement TCO for do

db187e0

jq: remove uneeded parenthesis

d0545f1

jq: wrap one more construction with TCOWrap

de1a79a

jq: backport last changes from stepA to step 3-9

f325441

kanaka merged commit 4319594 into kanaka:master Oct 7, 2024
4 checks passed

kanaka mentioned this pull request Oct 7, 2024

Remaining impls to combine eval-ast/macroexpand into eval #657

Open

29 tasks

asarhaddon deleted the jq branch October 8, 2024 21:23

alimpfard mentioned this pull request Oct 21, 2024

jq: Add some design/implementation documentation #700

Merged

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Jq: update interpreter, merge eval-ast, fix some env errors #685

Jq: update interpreter, merge eval-ast, fix some env errors #685

asarhaddon commented Sep 29, 2024

kanaka commented Oct 1, 2024

alimpfard commented Oct 1, 2024

alimpfard left a comment

alimpfard Oct 1, 2024

alimpfard Oct 1, 2024

alimpfard commented Oct 1, 2024 •

edited

Loading

asarhaddon commented Oct 5, 2024

alimpfard commented Oct 5, 2024

kanaka commented Oct 7, 2024

alimpfard commented Oct 7, 2024

	.value \| env_get($_menv) \| TCOWrap($_menv; null; false)
	.value \| env_get($_menv) \| TCOWrap($_menv; $_orig_retenv; false)

Jq: update interpreter, merge eval-ast, fix some env errors #685

Jq: update interpreter, merge eval-ast, fix some env errors #685

Conversation

asarhaddon commented Sep 29, 2024

kanaka commented Oct 1, 2024

alimpfard commented Oct 1, 2024

alimpfard left a comment

Choose a reason for hiding this comment

alimpfard Oct 1, 2024

Choose a reason for hiding this comment

alimpfard Oct 1, 2024

Choose a reason for hiding this comment

alimpfard commented Oct 1, 2024 • edited Loading

asarhaddon commented Oct 5, 2024

alimpfard commented Oct 5, 2024

kanaka commented Oct 7, 2024

alimpfard commented Oct 7, 2024

alimpfard commented Oct 1, 2024 •

edited

Loading