Un-documented behaviour of splitobs(...; at=1) #166

Open
mcabbott opened this issue Sep 19, 2023 · 0 comments

Comments

@mcabbott
Contributor

The keyword at is described as a proportion, but it silently behaves quite differently when given an integer: the value is then treated as a number of observations rather than a fraction of them. IMO it would be clearest if these had distinct names, but if both are called at, the two paths should both be clearly documented.

julia> splitobs(100, at=1.0)
(1:100, 101:100)

julia> splitobs(100, at=1)
(1:1, 2:100)
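
To make the two paths concrete, here is a minimal sketch that reproduces the outputs above by dispatching on the type of the second argument. The name split_indices and the argument checks are hypothetical and chosen only for illustration; this is not the MLUtils implementation, which may round or clamp differently.

```julia
# Hypothetical helper, NOT the MLUtils implementation.
# A floating-point value is read as a proportion of the n observations.
function split_indices(n::Int, at::AbstractFloat)
    0 <= at <= 1 || throw(ArgumentError("proportion must lie in [0, 1]"))
    k = round(Int, at * n)      # number of observations in the first subset
    return (1:k, k+1:n)
end

# An integer value is read as a count of observations instead.
function split_indices(n::Int, at::Integer)
    0 <= at <= n || throw(ArgumentError("count must lie in 0:n"))
    return (1:at, at+1:n)
end

split_indices(100, 1.0)   # (1:100, 101:100), everything in the first subset
split_indices(100, 1)     # (1:1, 2:100), a single observation in the first subset
```

Under this reading, at=1.0 and at=1 differ only in type yet produce opposite splits, which is why the integer path deserves its own sentence in the docstring.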

help?> splitobs
search: splitobs splitext splitdir split split_rest splitpath splitdrive splice! splat rsplit

  splitobs(n::Int; at) -> Tuple

  Compute the indices for two or more disjoint subsets of the range 1:n with splits given by at.

  Examples
  ≡≡≡≡≡≡≡≡

  julia> splitobs(100, at=0.7)
  (1:70, 71:100)
  
  julia> splitobs(100, at=(0.1, 0.4))
  (1:10, 11:50, 51:100)

  ────────────────────────────────────────────────────────────────────────────────────────────────

  splitobs(data; at, shuffle=false) -> Tuple

  Split the data into multiple subsets proportional to the value(s) of at.

  If shuffle=true, randomly permute the observations before splitting.

  Supports any datatype implementing the numobs and getobs interfaces.

  Examples
  ≡≡≡≡≡≡≡≡

  # A 70%-30% split
  train, test = splitobs(X, at=0.7)
  
  # A 50%-30%-20% split
  train, val, test = splitobs(X, at=(0.5, 0.3))
  
  # A 70%-30% split with multiple arrays and shuffling
  train, test = splitobs((X, y), at=0.7, shuffle=true)
  Xtrain, Ytrain = train