-
Related discussion: #113. The lexer also needs to store all of its tokens in order for nimpretty to work correctly, and this is best done via DOD.
-
Currently, an issue with all these approaches is that we don't know how to collect memory for the purposes of nimsuggest.
-
The sketch PR has been rebased and so far things seem to be mostly working, see: #144
-
Discussion for roadmapping the move of the compiler internals to a data oriented design (DOD) approach.
(this is an evolving summary of the discussion below)
Big Idea
A data oriented design approach to compiler internals, starting first and foremost with the AST, would be a significant step towards cleaning up the code base, speeding up compilation, unlocking IC, fixing bugs, making it more approachable for others, and much much more. But that's a lot of scope for what needs to be humble beginnings.
Where to start
Currently the compiler uses PNode, PSym, PType, and a bunch of other refs to data, which are passed around, forgotten (garbage collected), copied, mutated, etc. as part of semantic analysis and backend code generation. We want to move the memory layout of the AST towards a DOD approach, which would mean a single data structure, with storage packaged/optimized for a Nim project's AST needs. Starting with PNode this would look something like this:
Few key notes:
Why carry around a ref to the "global" AST state instead of just keeping it as a global? This is because we have metaprogramming, which is basically compilation within compilation. The expectation is that the precise data that an ID carries along with it will evolve over time, and instead of pointing to "everything" it will point to the particular compilation it's a part of. This will make it easier to control environments, roll symbol tables forwards and backwards, and enable many other useful features.
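To make the note above a bit more concrete, here is a minimal sketch of a node handle that pairs an id with a ref to the state it belongs to. Everything in it is hypothetical: NodeId, AstState, Node, addIntLit and the two-column layout are names invented for this example, not the compiler's actual types or the shape the real design will take.

```nim
type
  NodeId = distinct int          # hypothetical: opaque index into the columns below

  NodeKind = enum                # tiny stand-in for the compiler's real node kinds
    nkEmpty, nkIntLit

  AstState = ref object          # hypothetical: one instance per compilation
    kinds: seq[NodeKind]         # column of node kinds, indexed by NodeId
    intVals: seq[int64]          # column of literal values, indexed by NodeId

  Node = object                  # passed around by value in place of a PNode ref
    id: NodeId                   # which row
    ast: AstState                # which compilation the row lives in

proc newAstState(): AstState =
  AstState(kinds: @[], intVals: @[])

proc addIntLit(ast: AstState, val: int64): Node =
  ## Appends one row to every column and hands back a handle to it.
  ast.kinds.add nkIntLit
  ast.intVals.add val
  Node(id: NodeId(ast.kinds.high), ast: ast)

# Accessors resolve the id against the state the handle carries.
proc kind(n: Node): NodeKind = n.ast.kinds[int(n.id)]
proc intVal(n: Node): int64 = n.ast.intVals[int(n.id)]

# Two independent compilations, e.g. a macro evaluated inside an outer one.
let outer = newAstState()
let inner = newAstState()
let a = outer.addIntLit(1)
let b = inner.addIntLit(42)

doAssert int(a.id) == int(b.id)              # the raw ids collide...
doAssert a.intVal == 1 and b.intVal == 42    # ...but each resolves in its own state
```

With a single global state the two ids above would be ambiguous. Carrying the ref means an id is only ever interpreted against the compilation that issued it.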
What is Data Oriented Design (DOD)?
It's not quite a formalism, so it's hard to pin down. Here is a definition by analogy, with some relevant references following that.
tl;dr: Data Oriented Design is treating your data layout and storage in memory like a database, with a slight bias towards column oriented over row oriented storage (structs of arrays). An additional consideration: instead of refs or pointers to memory, favour opaque identifiers (array/sequence offsets or keys) that act much like primary and foreign keys would in a database. Following this approach, a module should know all of its data and hand out the aforementioned ids to that data, along with the accessors needed for various work. Put another way, a module encapsulates data and memory management.
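The database analogy can be sketched in a few lines of Nim. This is only an illustration under assumed names: Store, StrId, SymId, intern and addSym are invented here and are not part of the compiler. Each field gets its own seq (a column), ids are plain offsets wrapped in distinct types, and a symbol refers to its name through a StrId, much like a foreign key.

```nim
import std/tables

type
  StrId = distinct int               # "primary key" into the string column
  SymId = distinct int               # "primary key" into the symbol columns

  SymKind = enum
    skVar, skProc

  Store = object                     # hypothetical module-owned "database"
    strings: seq[string]             # string column, indexed by StrId
    strIndex: Table[string, StrId]   # lookup so each distinct string is stored once
    symKinds: seq[SymKind]           # symbol columns: one seq per field
    symNames: seq[StrId]             # "foreign key" into `strings`

proc initStore(): Store =
  Store(strIndex: initTable[string, StrId]())

proc intern(s: var Store, text: string): StrId =
  ## Returns the existing id for `text`, or stores it and returns a new id.
  if text in s.strIndex:
    return s.strIndex[text]
  result = StrId(s.strings.len)
  s.strings.add text
  s.strIndex[text] = result

proc addSym(s: var Store, kind: SymKind, name: string): SymId =
  ## Appends one row across the symbol columns and returns its id.
  result = SymId(s.symKinds.len)
  s.symKinds.add kind
  s.symNames.add(s.intern(name))

# Callers only ever hold ids; the store owns the memory and the accessors.
proc kind(s: Store, id: SymId): SymKind = s.symKinds[int(id)]
proc name(s: Store, id: SymId): string = s.strings[int(s.symNames[int(id)])]

var store = initStore()
let x = store.addSym(skVar, "x")
let f = store.addSym(skProc, "f")
let x2 = store.addSym(skProc, "x")   # a different symbol sharing the name "x"

doAssert store.name(x) == "x" and store.name(x2) == "x"
doAssert store.kind(f) == skProc
doAssert store.strings.len == 2      # "x" was stored once and referenced twice
```

Because the ids are distinct types, a SymId cannot be used by accident where a StrId is expected, and since nothing outside the store holds a pointer into it, the store is free to change how it lays out or reclaims its memory.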