Conversion of VarName to/from string (#100)
* Fix typo in comment

* Conversion of VarName to/from string

* Add one more test

* More thorough serialisation

* Add doctests

* Add API docs

* Add warning to docstring

Co-authored-by: Tor Erlend Fjelde <[email protected]>

* Add alternate implementation with StructTypes

* Reduce calls to Meta.parse()

It's only called for ConcretizedSlice now, which could potentially be
removed too.

* Restrict allowed ranges for ConcretizedSlice

* Fix name of wrapper type

* More tests

* Remove unneeded extra method for ConcretizedSlice

* Add StepRange support

* Support arrays of integers as indices

* Simplify implementation even more

* Bump to 0.9.0

* Clean up old code, add docs

* Allow de/serialisation methods to be extended

* Update docs

* Name functions more consistently

---------

Co-authored-by: Tor Erlend Fjelde <[email protected]>
penelopeysm and torfjelde authored Oct 1, 2024
1 parent a77e247 commit 68ad707
Showing 5 changed files with 225 additions and 3 deletions.
3 changes: 2 additions & 1 deletion Project.toml
@@ -3,12 +3,13 @@ uuid = "7a57a42e-76ec-4ea3-a279-07e840d6d9cf"
keywords = ["probablistic programming"]
license = "MIT"
desc = "Common interfaces for probabilistic programming"
-version = "0.8.4"
+version = "0.9.0"

[deps]
AbstractMCMC = "80f14c24-f653-4e6a-9b94-39d6b0f70001"
Accessors = "7d9f7c33-5ae7-4f3b-8dc6-eff91059b697"
DensityInterface = "b429d917-457f-4dbc-8f4c-0cc954292b1d"
JSON = "682c06a0-de6a-54ab-a142-c8b1cf79cde6"
Random = "9a3f8284-a2c9-5f02-9a11-845980a1fd5c"

[compat]
9 changes: 9 additions & 0 deletions docs/src/api.md
@@ -14,6 +14,15 @@ vsym
@vsym
```

## VarName serialisation

```@docs
index_to_dict
dict_to_index
varname_to_string
string_to_varname
```

## Abstract model functions

```@docs
6 changes: 5 additions & 1 deletion src/AbstractPPL.jl
@@ -10,7 +10,11 @@ export VarName,
varname,
vsym,
@varname,
-    @vsym
+    @vsym,
+    index_to_dict,
+    dict_to_index,
+    varname_to_string,
+    string_to_varname


# Abstract model functions
147 changes: 146 additions & 1 deletion src/varname.jl
@@ -1,5 +1,6 @@
using Accessors
using Accessors: ComposedOptic, PropertyLens, IndexLens, DynamicIndexLens
using JSON: JSON

const ALLOWED_OPTICS = Union{typeof(identity),PropertyLens,IndexLens,ComposedOptic}

@@ -302,7 +303,7 @@ subsumes(t::ComposedOptic, u::ComposedOptic) =
# If `t` is still a composed lens, then there is no way it can subsume `u` since `u` is a
# leaf of the "lens-tree".
subsumes(t::ComposedOptic, u::PropertyLens) = false
-# Here we need to check if `u.outer` (i.e. the next lens to be applied from `u`) is
+# Here we need to check if `u.inner` (i.e. the next lens to be applied from `u`) is
# subsumed by `t`, since this would mean that the rest of the composition is also subsumed
# by `t`.
subsumes(t::PropertyLens, u::ComposedOptic) = subsumes(t, u.inner)
@@ -752,3 +753,147 @@ function vsym(expr::Expr)
        error("Malformed variable name `$(expr)`!")
    end
end

# String constants for each index type that we support serialisation /
# deserialisation of
const _BASE_INTEGER_TYPE = "Base.Integer"
const _BASE_VECTOR_TYPE = "Base.Vector"
const _BASE_UNITRANGE_TYPE = "Base.UnitRange"
const _BASE_STEPRANGE_TYPE = "Base.StepRange"
const _BASE_ONETO_TYPE = "Base.OneTo"
const _BASE_COLON_TYPE = "Base.Colon"
const _CONCRETIZED_SLICE_TYPE = "AbstractPPL.ConcretizedSlice"
const _BASE_TUPLE_TYPE = "Base.Tuple"

"""
    index_to_dict(::Integer)
    index_to_dict(::AbstractVector{Int})
    index_to_dict(::UnitRange)
    index_to_dict(::StepRange)
    index_to_dict(::Colon)
    index_to_dict(::ConcretizedSlice{T, Base.OneTo{I}}) where {T, I}
    index_to_dict(::Tuple)

Convert an index `i` to a dictionary representation.
"""
index_to_dict(i::Integer) = Dict("type" => _BASE_INTEGER_TYPE, "value" => i)
index_to_dict(v::Vector{Int}) = Dict("type" => _BASE_VECTOR_TYPE, "values" => v)
index_to_dict(r::UnitRange) = Dict("type" => _BASE_UNITRANGE_TYPE, "start" => r.start, "stop" => r.stop)
index_to_dict(r::StepRange) = Dict("type" => _BASE_STEPRANGE_TYPE, "start" => r.start, "stop" => r.stop, "step" => r.step)
index_to_dict(r::Base.OneTo{I}) where {I} = Dict("type" => _BASE_ONETO_TYPE, "stop" => r.stop)
index_to_dict(::Colon) = Dict("type" => _BASE_COLON_TYPE)
index_to_dict(s::ConcretizedSlice{T,R}) where {T,R} = Dict("type" => _CONCRETIZED_SLICE_TYPE, "range" => index_to_dict(s.range))
index_to_dict(t::Tuple) = Dict("type" => _BASE_TUPLE_TYPE, "values" => map(index_to_dict, t))

"""
    dict_to_index(dict)
    dict_to_index(symbol_val, dict)

Convert a dictionary representation of an index `dict` to an index.

Users can extend the functionality of `dict_to_index` (and hence `VarName`
de/serialisation) by extending this method along with [`index_to_dict`](@ref).
Specifically, suppose you have a custom index type `MyIndexType` and you want
to be able to de/serialise a `VarName` containing this index type. You should
then implement the following two methods:

1. `AbstractPPL.index_to_dict(i::MyModule.MyIndexType)` should return a
   dictionary representation of the index `i`. This dictionary must contain the
   key `"type"`, and the corresponding value must be a string that uniquely
   identifies the index type. Generally, it makes sense to use the name of the
   type (perhaps prefixed with module qualifiers) as this value to avoid
   clashes. The remainder of the dictionary can have any structure you like.

2. Suppose the value of `index_to_dict(i)["type"]` is `"MyModule.MyIndexType"`.
   You should then implement the corresponding method
   `AbstractPPL.dict_to_index(::Val{Symbol("MyModule.MyIndexType")}, dict)`,
   which should take the dictionary representation as the second argument and
   return the original `MyIndexType` object.

To see an example of this in action, you can look in the AbstractPPL test
suite, which contains a test for serialising OffsetArrays.
"""
function dict_to_index(dict)
    t = dict["type"]
    if t == _BASE_INTEGER_TYPE
        return dict["value"]
    elseif t == _BASE_VECTOR_TYPE
        return collect(Int, dict["values"])
    elseif t == _BASE_UNITRANGE_TYPE
        return dict["start"]:dict["stop"]
    elseif t == _BASE_STEPRANGE_TYPE
        return dict["start"]:dict["step"]:dict["stop"]
    elseif t == _BASE_ONETO_TYPE
        return Base.OneTo(dict["stop"])
    elseif t == _BASE_COLON_TYPE
        return Colon()
    elseif t == _CONCRETIZED_SLICE_TYPE
        return ConcretizedSlice(Base.Slice(dict_to_index(dict["range"])))
    elseif t == _BASE_TUPLE_TYPE
        return tuple(map(dict_to_index, dict["values"])...)
    else
        # Will error if the method is not defined, but this hook allows users
        # to extend this function
        return dict_to_index(Val(Symbol(t)), dict)
    end
end

optic_to_dict(::typeof(identity)) = Dict("type" => "identity")
optic_to_dict(::PropertyLens{sym}) where {sym} = Dict("type" => "property", "field" => String(sym))
optic_to_dict(i::IndexLens) = Dict("type" => "index", "indices" => index_to_dict(i.indices))
optic_to_dict(c::ComposedOptic) = Dict("type" => "composed", "outer" => optic_to_dict(c.outer), "inner" => optic_to_dict(c.inner))

function dict_to_optic(dict)
    if dict["type"] == "identity"
        return identity
    elseif dict["type"] == "index"
        return IndexLens(dict_to_index(dict["indices"]))
    elseif dict["type"] == "property"
        return PropertyLens{Symbol(dict["field"])}()
    elseif dict["type"] == "composed"
        return dict_to_optic(dict["outer"]) ∘ dict_to_optic(dict["inner"])
    else
        error("Unknown optic type: $(dict["type"])")
    end
end

varname_to_dict(vn::VarName) = Dict("sym" => getsym(vn), "optic" => optic_to_dict(getoptic(vn)))

dict_to_varname(dict::Dict{<:AbstractString, Any}) = VarName{Symbol(dict["sym"])}(dict_to_optic(dict["optic"]))

"""
    varname_to_string(vn::VarName)

Convert a `VarName` to a string, via an intermediate dictionary. This differs
from `string(vn)` in that concretised slices are faithfully represented (rather
than being pretty-printed as colons).

For `VarName`s which index into an array, this function will only work if the
indices can be serialised. This is true for all standard Julia index types, but
if you are using custom index types, you will need to implement the
`index_to_dict` and `dict_to_index` methods for those types. See the
documentation of [`dict_to_index`](@ref) for instructions on how to do this.

```jldoctest
julia> varname_to_string(@varname(x))
"{\\"optic\\":{\\"type\\":\\"identity\\"},\\"sym\\":\\"x\\"}"

julia> varname_to_string(@varname(x.a))
"{\\"optic\\":{\\"field\\":\\"a\\",\\"type\\":\\"property\\"},\\"sym\\":\\"x\\"}"

julia> y = ones(2); varname_to_string(@varname(y[:]))
"{\\"optic\\":{\\"indices\\":{\\"values\\":[{\\"type\\":\\"Base.Colon\\"}],\\"type\\":\\"Base.Tuple\\"},\\"type\\":\\"index\\"},\\"sym\\":\\"y\\"}"

julia> y = ones(2); varname_to_string(@varname(y[:], true))
"{\\"optic\\":{\\"indices\\":{\\"values\\":[{\\"range\\":{\\"stop\\":2,\\"type\\":\\"Base.OneTo\\"},\\"type\\":\\"AbstractPPL.ConcretizedSlice\\"}],\\"type\\":\\"Base.Tuple\\"},\\"type\\":\\"index\\"},\\"sym\\":\\"y\\"}"
```
"""
varname_to_string(vn::VarName) = JSON.json(varname_to_dict(vn))

"""
    string_to_varname(str::AbstractString)

Convert a string representation of a `VarName` back to a `VarName`. The string
should have been generated by `varname_to_string`.
"""
string_to_varname(str::AbstractString) = dict_to_varname(JSON.parse(str))
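
The two exported functions above are meant to be inverses of each other. A minimal round-trip sketch, assuming AbstractPPL 0.9.0 (the version this commit bumps to) is installed; the variable names `x`, `y`, `vn` and `vn_slice` are illustrative and not part of the commit:

```julia
using AbstractPPL  # assumption: version 0.9.0 or a compatible release

# A VarName combining a property access and a range index.
vn = @varname(x.a[1:3])
str = varname_to_string(vn)            # JSON string built from the intermediate dictionary
@assert string_to_varname(str) == vn   # the round trip recovers an equal VarName

# A concretised slice keeps its underlying range in the serialised form.
y = ones(2)
vn_slice = @varname(y[:], true)
str_slice = varname_to_string(vn_slice)
@assert occursin("AbstractPPL.ConcretizedSlice", str_slice)
@assert string_to_varname(str_slice) == vn_slice
```
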
63 changes: 63 additions & 0 deletions test/varname.jl
@@ -137,4 +137,67 @@ end
        @inferred get(c, @varname(b.a[1]))
        @inferred Accessors.set(c, @varname(b.a[1]), 10)
    end

    @testset "de/serialisation of VarNames" begin
        y = ones(10)
        z = ones(5, 2)
        vns = [
            @varname(x),
            @varname(ä),
            @varname(x.a),
            @varname(x.a.b),
            @varname(var"x.a"),
            @varname(x[1]),
            @varname(var"x[1]"),
            @varname(x[1:10]),
            @varname(x[1:3:10]),
            @varname(x[1, 2]),
            @varname(x[1, 2:5]),
            @varname(x[:]),
            @varname(x.a[1]),
            @varname(x.a[1:10]),
            @varname(x[1].a),
            @varname(y[:]),
            @varname(y[begin:end]),
            @varname(y[end]),
            @varname(y[:], false),
            @varname(y[:], true),
            @varname(z[:], false),
            @varname(z[:], true),
            @varname(z[:][:], false),
            @varname(z[:][:], true),
            @varname(z[:,:], false),
            @varname(z[:,:], true),
            @varname(z[2:5,:], false),
            @varname(z[2:5,:], true),
        ]
        for vn in vns
            @test string_to_varname(varname_to_string(vn)) == vn
        end

        # For this VarName, the {de,}serialisation works correctly but we must
        # test in a different way because equality comparison of structs with
        # vector fields (such as Accessors.IndexLens) compares the memory
        # addresses rather than the contents (thus vn_vec == vn_vec2 returns
        # false).
        vn_vec = @varname(x[[1, 2, 5, 6]])
        vn_vec2 = string_to_varname(varname_to_string(vn_vec))
        @test hash(vn_vec) == hash(vn_vec2)
    end

    @testset "de/serialisation of VarNames with custom index types" begin
        using OffsetArrays: OffsetArrays, Origin
        weird = Origin(4)(ones(10))
        vn = @varname(weird[:], true)

        # This won't work as we don't yet know how to handle OffsetArray
        @test_throws MethodError varname_to_string(vn)

        # Now define the relevant methods
        AbstractPPL.index_to_dict(o::OffsetArrays.IdOffsetRange{I, R}) where {I,R} = Dict("type" => "OffsetArrays.OffsetArray", "parent" => AbstractPPL.index_to_dict(o.parent), "offset" => o.offset)
        AbstractPPL.dict_to_index(::Val{Symbol("OffsetArrays.OffsetArray")}, d) = OffsetArrays.IdOffsetRange(AbstractPPL.dict_to_index(d["parent"]), d["offset"])

        # Serialisation should now work
        @test string_to_varname(varname_to_string(vn)) == vn
    end
end
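
The extension hook described in the `dict_to_index` docstring can also be exercised without OffsetArrays. The following is a sketch only, assuming AbstractPPL 0.9.0 is installed; `MyIndex`, its field `n`, and the type string `"Main.MyIndex"` are hypothetical names chosen for illustration and do not appear in the commit:

```julia
using AbstractPPL  # assumption: version 0.9.0 or a compatible release

# Hypothetical custom index type, for illustration only.
struct MyIndex
    n::Int
end

# Step 1: serialise. The dictionary must contain a "type" key whose value
# uniquely identifies the index type; the other keys are free-form.
AbstractPPL.index_to_dict(i::MyIndex) = Dict("type" => "Main.MyIndex", "n" => i.n)

# Step 2: deserialise. Dispatch on Val{Symbol(...)} of the same "type" string.
AbstractPPL.dict_to_index(::Val{Symbol("Main.MyIndex")}, d) = MyIndex(d["n"])

# The index value is stored in the VarName as-is.
i = MyIndex(3)
vn = @varname(x[i])
@assert string_to_varname(varname_to_string(vn)) == vn
```

Because `MyIndex` has only an `Int` field, the default `==` compares it by value, so the round-tripped `VarName` compares equal; an index type holding a mutable field such as a `Vector` would need the `hash`-based comparison used in the test above or its own `==` method.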

2 comments on commit 68ad707

@penelopeysm

@JuliaRegistrator

Registration pull request created: JuliaRegistries/General/116385

Tip: Release Notes

Did you know you can add release notes too? Just add markdown formatted text underneath the comment after the text
"Release notes:" and it will be added to the registry PR, and if TagBot is installed it will also be added to the
release that TagBot creates. i.e.

@JuliaRegistrator register

Release notes:

## Breaking changes

- blah

To add them here just re-invoke and the PR will be updated.

Tagging

After the above pull request is merged, it is recommended that a tag is created on this repository for the registered package version.

This will be done automatically if the Julia TagBot GitHub Action is installed, or can be done manually through the github interface, or via:

git tag -a v0.9.0 -m "<description of version>" 68ad70702a13ff9356fce9401a50ff0ad774dd0b
git push origin v0.9.0
