Project Goals
andychu edited this page Apr 1, 2017 · 39 revisions
- Immediate goal: Implement a bash-compatible shell called OSH.
- Long term goal: Design a modern Unix shell language called Oil that can do everything bash/zsh/etc. can do, and more.
Oil treats shell seriously as a programming language, in terms of both its implementation and defining its semantics.
For a more immediate view of the project, see the Oil blog. In particular, this blog entry was written at the same time as this page.
- System Administration
- Building Linux distributions (e.g. Arch Linux uses bash for PKGBUILD).
- Startup scripts
- Configure and build scripts. Reproducible and distributed builds.
- Distributed Computing
- Building containers
- Specifying remote jobs
- Feedback and Monitoring: performance measurement, security testing.
- Data Science / Scientific Computing
- Heterogeneous "big data" and small data pipelines. The language should scale down as well as scale up, i.e. low startup latency for small jobs.
- Incorporate features of "workflow languages" and systems in the MapReduce family.
- Concise data cleaning, transformation, and summarization.
- Reproducible Research.
- Non-goal: mathematical modeling. That should be left to specialized languages like R, Julia, and Matlab. Communicate with those languages through coprocesses (to avoid startup overhead and to allow concurrency).
- Interactive Computing
- A general purpose REPL (terminal and probably a Jupyter kernel).
- Document Publishing
- http://oilshell.org/ and many programming books are built and orchestrated with shell scripts / Makefiles
- Easy upgrade path from bash, the most popular shell in the world.
- To do this, I've written a highly compatible bash parser, which will allow automatic conversion of bash (OSH) to Oil. The Oil language can therefore have a different syntax while keeping a superset of bash semantics.
- Consistent syntax.
- POSIX sh and bash have evolved many quirks.
- Fix sh and bash semantics to be more developer-friendly (in a backward compatible way).
- Proper Arrays
- Strict mode for developer productivity (enhanced set -o errexit, nounset, pipefail)
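For reference, the three options named in the strict mode item already exist in bash, and their effect can be seen in a minimal sketch. It shows only current bash behavior, not whatever enhanced mode Oil ends up with; the variable names are made up for illustration, and each case runs in its own `bash -c` so the options do not leak into the calling shell.

```shell
# pipefail: by default a pipeline's status is that of its LAST command only.
status_without_pipefail=$(bash -c 'false | true; echo $?')
status_with_pipefail=$(bash -c 'set -o pipefail; false | true; echo $?')

# errexit: the script stops at the first failing command, so "reached"
# is never printed. The "|| true" keeps the calling shell going.
errexit_output=$(bash -c 'set -o errexit; false; echo reached' || true)

# nounset: expanding an unset variable is an error instead of an empty string.
if bash -c 'set -o nounset; echo "$undefined_variable"' 2>/dev/null; then
  nounset_result=ok
else
  nounset_result=error
fi

echo "pipeline status without pipefail: $status_without_pipefail"
echo "pipeline status with pipefail:    $status_with_pipefail"
echo "output after failure (errexit):   '$errexit_output'"
echo "unset variable under nounset:     $nounset_result"
```

Because all three options are off by default, scripts have to opt in line by line, which is the kind of friction a single strict mode would remove.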
- Enhance the shell language; treat it as a real programming language.
- Fill in obvious gaps, like abspath, etc.
- Compound data structures
- Example: Completion functions in bash have a bad API involving globals and are difficult to write. It should feel more like writing completion functions in Python or JavaScript.
- Selected influences: Python, R, Ruby, Perl 6, Lua (API), ML, C, C++, and PowerShell.
- Reduce language cacophony in shell programming by reimplementing tools closely related to the shell.
- Example: combine shell, awk, and make.
- Also combine tools like find (which has its own expression parser and starts processes), and xargs/GNU parallel, which start processes in parallel. GNU parallel is actually mentioned in the bash manual.
- Richer constructs for concurrency and parallelism.
- Folding in `make -j` and `xargs -P` goes a long way.
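As a rough illustration of what `xargs -P` provides today, the sketch below runs up to four jobs at once. It assumes an `xargs` that supports `-P` (GNU and BSD both do, but it is not in POSIX); the squaring job is only a stand-in for real work.

```shell
# Feed eight inputs to xargs, one argument per job (-n 1),
# with at most four jobs running concurrently (-P 4).
# Jobs may finish in any order, so the results are sorted afterwards.
squares=$(printf '%s\n' 1 2 3 4 5 6 7 8 |
  xargs -P 4 -n 1 sh -c 'echo $(( $1 * $1 ))' _ |
  sort -n)

echo "$squares"
```

The parallelism here lives in an external tool with its own calling conventions (note the `sh -c '...' _` idiom needed just to pass one argument), which is the gap that richer built-in constructs would close.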
- Allow secure programs to be written.
- In emitting strings: escaping
- In reading strings: error checking should be easy, better control over "read" delimiters, etc.
- Fix issues with globs and flags, i.e. untrusted file system and untrusted variables
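The glob and flag problem can be reproduced with nothing more than a file whose name starts with a dash. The following is a small sketch using a scratch directory; the file names and variable names are invented for the example.

```shell
# Create a scratch directory holding a file whose name looks like an option.
workdir=$(mktemp -d)
touch -- "$workdir/-l" "$workdir/notes.txt"

# Unsafe: the glob expands to "-l notes.txt", so ls treats "-l" as a flag
# and prints a single long-format line for notes.txt.
unsafe_lines=$(cd "$workdir" && ls * | wc -l | tr -d ' ')

# Safer: "--" ends option parsing, so both names are treated as files.
safe_lines=$(cd "$workdir" && ls -- * | wc -l | tr -d ' ')

# Also safer: a "./" prefix means no expanded word can begin with a dash.
prefixed=$(cd "$workdir" && printf '%s\n' ./*)

echo "lines from 'ls *':    $unsafe_lines"
echo "lines from 'ls -- *': $safe_lines"
echo "$prefixed"

rm -rf -- "$workdir"
```

Both workarounds depend on the script author remembering them at every call site, which is why this is listed as something the language itself should fix.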
- C and C++ bindings
- Provide access to advanced Linux kernel features such as namespaces, cgroups, seccomp, tracing, and /proc, while remaining portable to other Unices.
- It should be possible to write a busybox in oil.
- Should be the best language for writing quick command line tools.
- In particular, replace the getopt interface in bash with something much better.
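To show what is being replaced, here is a minimal sketch of option parsing with the `getopts` builtin as it works in bash and POSIX sh today. The tool, its `-v` and `-o FILE` options, and the function name `parse_args` are hypothetical.

```shell
# Parse "-v" (verbose flag) and "-o FILE" (output path), then collect
# whatever positional arguments are left.
parse_args() {
  verbose=0
  output=""
  OPTIND=1  # reset so the function can be called more than once
  while getopts "vo:" opt; do
    case "$opt" in
      v) verbose=1 ;;
      o) output="$OPTARG" ;;
      *) return 2 ;;  # unknown option or missing argument
    esac
  done
  shift $((OPTIND - 1))  # drop the options that were consumed
  remaining="$*"
}

parse_args -v -o out.txt input1 input2
echo "verbose=$verbose output=$output remaining=$remaining"
```

Even this small case needs a loop, a `case` statement, global variables for results, and manual `OPTIND` bookkeeping, and `getopts` handles only single-letter options.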
- Expand the range of things that can be done with the "polyglot" model.
- Coprocesses
- Built-in serialization formats like CSV, JSON, maybe HTML
- Maybe some binary formats as libraries
- No extra "macro processing" on top of the parser. History substitution will be built in, but disabled in batch mode. procs can be used instead of aliases.
- Imperative on the scale of code, but declarative/functional/concurrent on the scale of architecture, not unlike `sh` itself.
- Proper error messages like Clang/Swift. Static Parsing.
- Provide end-to-end tracing and profiling tools (e.g. for pipelines that run for hours)
- Library-based design like LLVM. Example: the same parser is used in batch mode as well as completion mode, which is not true of all shell implementations. The parser can be used for auto-formatting and linting, which is also not true of other implementations.
- Few dependencies so it can be used in bootstrapping Unix systems and clusters. (e.g. distributed as a C++ file and optional oil source.)
- Much of oil should be written in oil (which means the VM needs to be fast enough for this).
- Expose our toolkit for little languages -- lexing, parsing, AST representation, etc. So that other languages can be built in the same way.
- Metaprogramming with ASTs as first class data structures.
- FastCGI Scripts on shared hosting (using strict input validation and hygienic text generation).