- Proposal: HXP-0010
- Author: Aurel Bílý
- Status: to be implemented
Improved API for both synchronous and asynchronous filesystem operations; improved networking API; improved threading and process API; asynchrony primitives; I/O streams.
There is currently no good way to perform many `sys`-related tasks asynchronously (without manually creating `Thread`s). Two basic primitives are added to the library:
- signals (and listeners)
- unified callback style
The current Haxe API contains `haxe.io.Input` and `haxe.io.Output` for input and output streams. These lack:
- ability to express a read and write stream (`sys.io.File` has two separate streams rather than one RW stream)
- pipelining without manual chunking
- proper asynchronous operations
- automatically pacing streams with different data emission / consumption rates
The current filesystem APIs in Haxe lack a number of important features:
- asynchronous tasks
- changing permissions, owners of files
- symlink operations
- watching for changes
Non-blocking socket operations are inconvenient to use in the current API even though they are the only (non-`Thread`) solution to some real-time network communication problems. IPC is not possible, UDP sockets are not fully featured, and DNS lookup is always synchronous and not fully featured.
There is a lack of proper unit testing of the networking APIs. Certain platforms also miss full implementations of various parts of the networking API. (See HaxeFoundation/haxe#6933, HaxeFoundation/haxe#6816)
Some Haxe targets (e.g. eval) have problematic implementations of threads which can result in unexpected deadlocks or crashes. It is not possible to pass handles (sockets or open files) to open processes (IPC); there is no standardised message passing for child processes.
The APIs will be implemented as direct wrappers of libuv (which is the foundation of Node.js APIs) on targets which allow this, i.e. eval, Neko, HashLink, hxcpp, and Lua. The hxnodejs library will be updated to map Node.js APIs to the new `sys` APIs.
Java, C#, PHP, and Python may at first expose the new `sys` APIs by requiring a native library (`dll`, `so`, `dylib`). Proper target-native APIs can be added over time, particularly after an in-depth test suite is available.
The full implementation status is available in the haxe-sys repository.
A `haxe.Error` class is added to unify error reporting in the system APIs. It has a `message` field which contains the human-readable description of the error. It also includes a `type` field which can be `switch`-ed on.
```haxe
try {
    sys.FileSystem.someOperation();
} catch (err:haxe.Error) {
    trace("error!", err);
}

// or

try {
    sys.FileSystem.someOperation();
} catch (err:haxe.Error) {
    switch (err.type) {
        case FileNotFound: // it's fine
        case _: throw err;
    }
}
```
Unresolved question:
There are multiple ways of expressing proper type-safe errors for the filesystem API:
- errors represented by a single `enum` (`sys.FileSystemError`), with the individual cases containing all the information of that particular error
  - awkward to catch individual errors (any `catch` would need a `switch`)
  - fewer classes to maintain, less work to throw errors (the case names the error, so no message is needed)
- errors represented by sub-classes of a single base class
  - possible to catch individual subclasses in separate `catch` blocks
  - many classes in the package (could be moved into a sub-package for errors?)
- base class `Error` + enum for types, as implemented in the draft now

The primary aim for any solution is to be able to catch specific types of errors without having to rely on string comparison.
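
The third option (a base class carrying an enum of error types) can be sketched roughly as follows. This is only an illustration of the pattern, not the draft's implementation: `SketchError`, `SketchErrorType`, its cases, and the `openFile` helper are hypothetical names introduced for this example.

```haxe
// Hypothetical error categories; the draft defines its own set.
enum SketchErrorType {
    FileNotFound;
    PermissionDenied;
    Other(code:Int);
}

// Hypothetical stand-in for the proposed haxe.Error class.
class SketchError {
    public final message:String;
    public final type:SketchErrorType;

    public function new(type:SketchErrorType, message:String) {
        this.type = type;
        this.message = message;
    }
}

class Main {
    // Hypothetical operation that always fails, for demonstration only.
    static function openFile(path:String):Void {
        throw new SketchError(FileNotFound, 'no such file: $path');
    }

    // Returns true if the file was missing; any other error is rethrown.
    public static function isMissing(path:String):Bool {
        try {
            openFile(path);
        } catch (err:SketchError) {
            switch (err.type) {
                case FileNotFound: return true;
                case _: throw err;
            }
        }
        return false;
    }

    static function main():Void {
        trace(isMissing("file.txt"));
    }
}
```

The error category is matched structurally through the enum, so no comparison against `message` is needed.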
A type-safe system for emitting signals (similar to events) is added, similar to `tink_core`. A `Signal<T>` is simply an abstract over an array of listeners (`Listener<T>`). A signal-emitting object has a number of `final` signal instances.
```haxe
class Example {
    public final fooSignal = new Signal<NoData>();
    public final barSignal = new Signal<String>();

    public function new() {}

    public function emit() {
        fooSignal.emit(new NoData());
        barSignal.emit("hello");
    }
}

class Main {
    static function main():Void {
        var example = new Example();
        example.fooSignal.on(() -> trace("signal foo"));
        example.barSignal.on(str -> trace("signal bar", str));
        example.emit();
    }
}
```
Currently, no effort has been made to "hide" the `emit` method (as is done by the `Signal` and `SignalTrigger` distinction in `tink_core`).
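
To make "an abstract over an array of listeners" concrete, here is a minimal sketch of how such a type could be written. It is an assumption-laden illustration rather than the draft's code: `SketchSignal` and `SketchListener` are hypothetical names, and the listener is modelled as a plain function.

```haxe
// Hypothetical listener type: a plain function receiving the emitted value.
typedef SketchListener<T> = T->Void;

// Sketch of a signal as an abstract over an array of listeners.
abstract SketchSignal<T>(Array<SketchListener<T>>) {
    public inline function new() {
        this = [];
    }

    // Registers a listener to be called on every emit.
    public inline function on(listener:SketchListener<T>):Void {
        this.push(listener);
    }

    // Calls every registered listener, in registration order.
    public function emit(value:T):Void {
        for (listener in this) {
            listener(value);
        }
    }
}

class Main {
    // Emits one value to two listeners and returns what they recorded.
    public static function run():Array<String> {
        var received:Array<String> = [];
        var signal = new SketchSignal<String>();
        signal.on(str -> received.push("first: " + str));
        signal.on(str -> received.push("second: " + str));
        signal.emit("hello");
        return received;
    }

    static function main():Void {
        trace(run());
    }
}
```

Because the abstract exposes `emit` publicly, any holder of the signal can trigger it, which is the trade-off noted above.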
Asynchronous methods are identical to their synchronous counterparts, except:
- their return type is `Void`
- they have an additional, required `callback` argument of type `Callback<DataType>` or `Callback<NoData>`
  - first argument passed to the callback is a `haxe.Error`, or `null` if no error occurred
  - any additional arguments represent the data returned by the call, analogous to the return type of the synchronous method; if the synchronous method has a `Void` return type, the callback takes no additional arguments
  - `Callback<T>` is an abstract which has some `from` methods, allowing a callback to be created from functions with a simpler signature (e.g. a `Callback<NoData>` from `(err:Error)->Void`)
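
The callback convention above can be sketched as an abstract over the full `(error, data)` function type, with a `@:from` conversion for error-only functions. All names here (`SketchError`, `SketchCallback`, `SketchCallbackData`, `readGreetingAsync`, `touchAsync`) are hypothetical, `NoData` is a local stand-in for the type mentioned in the text, and the "asynchronous" methods complete immediately for the sake of a self-contained example.

```haxe
// Hypothetical stand-ins for the proposed haxe.Error and NoData types.
class SketchError {
    public final message:String;

    public function new(message:String) {
        this.message = message;
    }
}

class NoData {
    public function new() {}
}

// Full callback signature: the error (or null) first, then the data.
typedef SketchCallbackData<T> = (Null<SketchError>, Null<T>) -> Void;

// Sketch of a callback abstract with a `from` conversion for simpler functions.
@:callable
abstract SketchCallback<T>(SketchCallbackData<T>) from SketchCallbackData<T> {
    // Lets a function that only inspects the error act as a no-data callback.
    @:from static function fromErrorOnly(f:(Null<SketchError>) -> Void):SketchCallback<NoData> {
        return function(err:Null<SketchError>, data:Null<NoData>):Void {
            f(err);
        };
    }
}

class Main {
    // Hypothetical synchronous method.
    static function readGreeting():String {
        return "hello";
    }

    // Asynchronous counterpart: returns Void and reports through the callback.
    // It completes immediately here; a real implementation would defer the call.
    static function readGreetingAsync(callback:SketchCallback<String>):Void {
        callback(null, readGreeting());
    }

    // Asynchronous method whose synchronous counterpart would return Void.
    static function touchAsync(callback:SketchCallback<NoData>):Void {
        callback(null, new NoData());
    }

    public static function run():Array<String> {
        var log:Array<String> = [];
        readGreetingAsync(function(err:Null<SketchError>, data:Null<String>):Void {
            log.push(err == null ? "data: " + data : "error: " + err.message);
        });
        // A function taking only the error is converted by fromErrorOnly.
        touchAsync(function(err:Null<SketchError>):Void {
            log.push(err == null ? "done" : "error: " + err.message);
        });
        return log;
    }

    static function main():Void {
        trace(run());
    }
}
```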
Several methods in the API accept constants or a combination of flags. Constants (where the argument is exactly one of a set of options) have been converted to an `enum` or `enum abstract`. Flags (where the argument is zero or more of a set of options) have been converted to an `abstract` over `Int`, with an overloaded `|` operator.
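
As a rough sketch of the two conversions, the following uses an `enum abstract` for a constant and an `abstract` over `Int` with an overloaded `|` for flags. The type names, the individual options, and the `describe` function are hypothetical and do not correspond to the actual API.

```haxe
// Hypothetical constant: exactly one access mode is chosen.
enum abstract SketchAccessMode(Int) {
    var Read = 0;
    var Write = 1;
    var ReadWrite = 2;
}

// Hypothetical flags: zero or more options combined with `|`.
abstract SketchOpenFlags(Int) {
    public static final None = new SketchOpenFlags(0);
    public static final Create = new SketchOpenFlags(1 << 0);
    public static final Truncate = new SketchOpenFlags(1 << 1);
    public static final Append = new SketchOpenFlags(1 << 2);

    inline function new(value:Int) {
        this = value;
    }

    // Combines two flag sets with a bitwise OR of the underlying integers.
    @:op(A | B) public inline function or(other:SketchOpenFlags):SketchOpenFlags {
        return new SketchOpenFlags(this | other.toInt());
    }

    // Checks whether every bit of `flag` is set in this value.
    public inline function has(flag:SketchOpenFlags):Bool {
        return (this & flag.toInt()) == flag.toInt();
    }

    public inline function toInt():Int {
        return this;
    }
}

class Main {
    // Hypothetical function accepting one constant and a combination of flags.
    public static function describe(mode:SketchAccessMode, flags:SketchOpenFlags):String {
        var modeName = switch (mode) {
            case Read: "read";
            case Write: "write";
            case ReadWrite: "read-write";
        };
        var parts = [modeName];
        if (flags.has(SketchOpenFlags.Create)) parts.push("create");
        if (flags.has(SketchOpenFlags.Truncate)) parts.push("truncate");
        if (flags.has(SketchOpenFlags.Append)) parts.push("append");
        return parts.join(",");
    }

    static function main():Void {
        trace(describe(Write, SketchOpenFlags.Create | SketchOpenFlags.Truncate));
    }
}
```

Keeping the flags behind an abstract means a plain `Int` cannot be passed by accident, while the runtime representation stays a single integer.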
At the core of a lot of Node.js APIs lie streams, which are abstractions for data consumers (`Writable`), data producers (`Readable`), or a mix of both (`Duplex` or `Transform`). Streams enable better composition of data operations with methods such as `pipeline`. There is also a mechanism to minimise buffering of data in memory (`highWaterMark`, `drain`) when combining streams.
The libuv API has a concept of file descriptors, represented by a single integer. To avoid issues with platforms without explicit file descriptor numbers, `sys.io.File` is an `abstract`, similar to the new threading API.
Various methods which take a file descriptor as their first argument are moved into their own methods in the `File` abstract.
To avoid the `someMethod` + `someMethodSync` naming scheme present in Node.js, the two versions are more clearly split:
- `asys.FileSystem` and `asys.AsyncFileSystem` (static methods)
- `asys.io.File` and `asys.io.AsyncFile` (instance methods)

`asys.io.File` exposes an `async` field to access the `asys.io.AsyncFile` corresponding to a particular file.
```haxe
// synchronously:
var file = asys.FileSystem.open("file.txt", Read);
var data = file.readFile();

// asynchronously:
asys.AsyncFileSystem.open("file.txt", Read, (err, file) -> {
    file.async.readFile((err, data) -> {
        // ...
    });
});
```
In libuv, wherever a path is expected as an argument, a `char *` can be provided, equivalent to `haxe.io.Bytes`. Similarly, whenever paths are to be returned, a `char *` is returned.
It would be awkward to require `Bytes` objects as file paths in Haxe, so instead, the assumption is made that filepaths will be valid Unicode most of the time, and `haxe.io.FilePath` (an `abstract` over `String`) is used consistently in the new API. In the rare cases that non-Unicode paths are returned, they are escaped into a Unicode string. The original `Bytes` can be obtained with `FilePath.decode(path)`. There is also the inverse `FilePath.encode(bytes)`.
The new APIs reserved for system targets will be available in a new top-level package `asys`. Some cross-platform types will be added to the `haxe` package. A `sys-compat` library will be provided to map the old `sys` APIs to the new `asys` package for easier transitioning and testing, although the old `sys` APIs will remain untouched when the library is not used.
The majority of tests for the current `sys` classes should be adapted and reused. It may be worthwhile to adapt the existing tests to test both implementations (with a forced synchronous operation on `sys.async`) so tests are not duplicated. Additional tests should be written to test async-specific features, such as writing multiple files in parallel.
For methods that were not present in the original APIs, some tests may be based on the extensive libuv test suite or the Node.js test suite.
Existing code should not be affected, unless it uses an `asys` package.
Wrapping libuv makes it easy to support new APIs without several separate implementations. This approach may reduce portability on some of our targets; see the detailed design.
There are currently no alternatives in Haxe libraries with a similar feature range. It might be possible on some Haxe targets to back the new APIs with target-native features, but it would also seriously increase the complexity of this project.
- better haxelib
- libuv available in the OCaml code of the compiler - threading and parallelisation may be possible
- error reporting style
- currently all filesize and file position arguments are `Int`, but this only allows sizes of up to 2 GiB
  - use `haxe.Int64`? (dependent on better support on all sys targets, e.g. HashLink)
  - use