Integration with Tables.jl #355

Closed
s-celles opened this issue Jan 4, 2024 · 8 comments · Fixed by #437
@s-celles

s-celles commented Jan 4, 2024

Hello,

I'd like to know whether integration with Tables.jl (https://tables.juliadata.org/dev/) has been considered, to export a slice of a YAXArray to DataFrames.DataFrame, TimeSeries.TimeArray, TSFrames.TSFrame...
Maybe a YAXArray could be both a source and a sink.
Any opinions?

Kind regards

@lazarusA
Collaborator

lazarusA commented Jan 4, 2024

It looks like this is already supported: https://rafaqz.github.io/DimensionalData.jl/dev/reference/?h=dimtable#tablesjltabletraitsjl-interface. Maybe we could just try it out with some examples and, if it works, add them to the docs? What simple examples do you have in mind?
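
The DimensionalData.jl table interface linked above can be tried on a small array. The following is a minimal sketch, not taken from the YAXArrays.jl docs: it uses a plain `DimArray` with made-up dimensions `X` and `Y` and a made-up name `:price`, and it assumes DataFrames.jl is installed. Since a YAXArray is also an `AbstractDimArray`, the same `DimTable` call is expected to apply to it, but that is an assumption to verify.

```julia
using DimensionalData
using DataFrames

# A small 3×2 array with two named dimensions (illustrative values only).
A = DimArray(collect(reshape(1.0:6.0, 3, 2)), (X(1:3), Y([:a, :b])); name=:price)

# DimTable wraps the array as a Tables.jl-compatible table:
# one row per array element, one column per dimension plus one for the values.
table = DimTable(A)

# Any Tables.jl sink can consume it; DataFrame is used here as an example.
df = DataFrame(table)
```

Other Tables.jl sinks should accept `table` in the same way, which is the "source" direction asked about in this issue.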

@s-celles
Author

s-celles commented Jan 5, 2024

I see two kinds of examples.

YAXArray as sink
Download 3 symbols data from MarketData.jl (for example) and get a "cube".

YAXArray as source
Take the previously obtained cube, swap 2 dimensions, and get a DataFrame of OHLCV values at a given date, or a TSFrame of close prices with one column per symbol...

This lib shouldn't be added to YAXArray so you will probably have to deal with package extensions
https://youtu.be/TiIZlQhFzyk?si=Lvm6RSp3WjuqtV-o

Another idea, if you don't want to rely on remote data, would be to generate similar data with a random walk.

@femtotrader

femtotrader commented Apr 25, 2024

Here is some random data to build a 3D cube

julia> using MarketData

julia> data = Dict("Stock1" => random_ohlcv(), "Stock2" => random_ohlcv(), "Stock3" => random_ohlcv())
Dict{String, TimeArray{Float64, 2, DateTime, Matrix{Float64}}} with 3 entries:
  "Stock2" => 500×5 TimeArray{Float64, 2, DateTime, Matrix{Float64}} 2020-01-01T00:00:00 to 2020-01-21T19:00:00
  "Stock3" => 500×5 TimeArray{Float64, 2, DateTime, Matrix{Float64}} 2020-01-01T00:00:00 to 2020-01-21T19:00:00
  "Stock1" => 500×5 TimeArray{Float64, 2, DateTime, Matrix{Float64}} 2020-01-01T00:00:00 to 2020-01-21T19:00:00

julia> data["Stock1"]
500×5 TimeArray{Float64, 2, DateTime, Matrix{Float64}} 2020-01-01T00:00:00 to 2020-01-21T19:00:00
(table of hourly rows with columns Open, High, Low, Close and Volume; 24 rows shown, 476 rows omitted)

julia> data["Stock2"]
500×5 TimeArray{Float64, 2, DateTime, Matrix{Float64}} 2020-01-01T00:00:00 to 2020-01-21T19:00:00
(table of hourly rows with columns Open, High, Low, Close and Volume; 24 rows shown, 476 rows omitted)

julia> data["Stock3"]
500×5 TimeArray{Float64, 2, DateTime, Matrix{Float64}} 2020-01-01T00:00:00 to 2020-01-21T19:00:00
(table of hourly rows with columns Open, High, Low, Close and Volume; 24 rows shown, 476 rows omitted)

Unfortunately, I don't know how to get this into YAXArrays.jl.

@felixcremer
Member

felixcremer commented Apr 25, 2024

You could construct a YAXArray from every separate stock with this:

julia> s = data["Stock1"]
julia> d = (Ti(timestamp(s)), Dim{:colnames}(colnames(s)))

julia> YAXArray(d, values(s));

This would construct a two-dimensional YAXArray from the data in the TimeArray.
If you would like a three-dimensional YAXArray with a dimension for the stocks, you could use cat(yaxlist, dims=Dim{:Stock}(["1", "2", "3"])), or you could use a Dataset, which behaves more like a Dict and can hold arrays with different dimensions.

@femtotrader

femtotrader commented Apr 26, 2024

using YAXArrays
d = (Ti(timestamp(s)), Dim{:colnames}(colnames(s)))

is broken. It raises

ERROR: UndefVarError: `Ti` not defined

@femtotrader

using DimensionalData: DimensionalData as DD

and using DD.Ti should help

@felixcremer
Member

Yes, sorry, I forgot the import of DD. Is this what you had in mind?

@femtotrader

What I had in mind was to provide a full example like this:

using MarketData
using DataStructures
using YAXArrays
using DimensionalData: DimensionalData as DD

d_data = OrderedDict("Stock1" => random_ohlcv(), "Stock2" => random_ohlcv(), "Stock3" => random_ohlcv())

yaxlist = YAXArray[]
for (stock, stock_data) in d_data
    d = (DD.Ti(timestamp(stock_data)), Dim{:colnames}(colnames(stock_data)))
    yax = YAXArray(d, values(stock_data))
    push!(yaxlist, yax)
end
data = cat(yaxlist, dims=Dim{:Stock}(keys(d_data)))

but the last line fails:

ERROR: MethodError: no method matching iterate(::Dim{:Stock, Base.KeySet{String, OrderedDict{String, TimeArray{Float64, 2, DateTime, Matrix{Float64}}}}})

Closest candidates are:
  iterate(::Base.AsyncGenerator, ::Base.AsyncGeneratorState)
   @ Base asyncmap.jl:362
  iterate(::Base.AsyncGenerator)
   @ Base asyncmap.jl:362
  iterate(::DataStructures.TrieIterator)
   @ DataStructures C:\Users\femto\.julia\packages\DataStructures\95DJa\src\trie.jl:112
  ...

same for

data = cat(yaxlist, dims=Dim{:Stock}(collect(keys(d_data))))
ERROR: MethodError: no method matching isless(::String, ::Int64)

Closest candidates are:
  isless(::Missing, ::Any)
   @ Base missing.jl:87
  isless(::Any, ::Missing)
   @ Base missing.jl:88
  isless(::ForwardDiff.Dual{Tx}, ::Integer) where Tx
   @ ForwardDiff C:\Users\femto\.julia\packages\ForwardDiff\PcZ48\src\dual.jl:144
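
Both errors above are consistent with `yaxlist` being passed to `cat` as a single vector, so that the call falls through to the generic `Base.cat` method, which tries to iterate `dims`. The sketch below is one possible way around this, under stated assumptions: it uses synthetic random data instead of MarketData.jl, the variable names are illustrative, and it relies on `concatenatecubes` from YAXArrays.jl to add the new `Stock` dimension. Splatting the list, as in `cat(yaxlist...; dims=Dim{:Stock}(stock_names))`, may also work, but that is untested here.

```julia
using YAXArrays
using DimensionalData
using Dates

# Synthetic stand-in for the OHLCV data (assumption: 5 hourly rows, 5 columns).
times = DateTime(2020, 1, 1):Hour(1):DateTime(2020, 1, 1, 4)
cols = [:Open, :High, :Low, :Close, :Volume]
stock_names = ["Stock1", "Stock2", "Stock3"]

# One two-dimensional YAXArray (time × colnames) per stock, all with identical axes.
yaxlist = [
    YAXArray((Ti(times), Dim{:colnames}(cols)), rand(length(times), length(cols)))
    for _ in stock_names
]

# Stack the per-stock arrays along a new :Stock dimension.
cube = concatenatecubes(yaxlist, Dim{:Stock}(stock_names))
```

With real data, `stock_names` would be `collect(keys(d_data))`, so that the new dimension gets a plain vector of strings rather than a `KeySet`.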
