Renaming of normalization to standardization and implementation of actual normalization #123

Open
DoktorMike opened this issue Sep 26, 2022 · 6 comments
Labels
good first issue Good for newcomers

Comments

@DoktorMike

The function currently named normalise actually performs standardization, i.e., it transforms the data to have mean 0 and standard deviation 1. I propose that we:

  1. Rename normalise to standardise
  2. Implement normalization and name that normalise.

As a reminder:

Normalization (Min-Max Scaling): $\hat{X} = (X - X_{min})/(X_{max} - X_{min})$
Standardization (Z-Score Normalization): $\hat{X} = (X - \mu_X)/\sigma_X$

I know this might be nitpicking, but since I think we should have a function that does normalization, it seems odd to give it a different name.
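To make the two formulas above concrete, here is a minimal sketch of both transforms in Julia. The function name `minmax_normalise`, the `dims` keyword default, and the use of the uncorrected standard deviation are assumptions for illustration, not the current MLUtils API.

```julia
using Statistics

# Hypothetical helper (name is an assumption): min-max scaling.
# Maps each slice along `dims` onto the interval [0, 1].
function minmax_normalise(x::AbstractArray; dims=ndims(x))
    lo = minimum(x; dims=dims)
    hi = maximum(x; dims=dims)
    return (x .- lo) ./ (hi .- lo)
end

# Hypothetical helper (name taken from the proposal): z-score standardization.
# Shifts each slice along `dims` to mean 0 and scales it to standard deviation 1.
# `corrected=false` (population std) is an assumption made here.
function standardise(x::AbstractArray; dims=ndims(x))
    μ = mean(x; dims=dims)
    σ = std(x; dims=dims, mean=μ, corrected=false)
    return (x .- μ) ./ σ
end

x = [1.0, 2.0, 3.0, 4.0, 5.0]
n = minmax_normalise(x)   # values now span 0 to 1
z = standardise(x)        # values now have mean 0 and std 1
```

Both versions use only reductions and broadcasting. A constant slice makes the denominator zero, so a real implementation would probably need a small epsilon or an explicit check.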

@darsnack darsnack added the good first issue Good for newcomers label Sep 26, 2022
@darsnack
Member

Seems like a good idea to me. First we can deprecate normalise for standardise, then release real normalization in a breaking release. We'll need PRs to downstream libraries like Flux as well.
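The first step of that plan could look roughly like the sketch below: the old name keeps working but emits a deprecation warning and forwards to the new one. The body of `standardise` and the warning text are assumptions for illustration; only `Base.depwarn` is an existing Julia function.

```julia
using Statistics

# Assumed new name for the existing behaviour (z-score standardization).
standardise(x::AbstractArray; dims=ndims(x)) =
    (x .- mean(x; dims=dims)) ./ std(x; dims=dims, corrected=false)

# The old name forwards to the new one and warns when depwarns are enabled
# (e.g. `julia --depwarn=yes`). It would be freed up for min-max scaling
# only in a later breaking release.
function normalise(x::AbstractArray; dims=ndims(x))
    Base.depwarn("`normalise` is deprecated, use `standardise` instead.", :normalise)
    return standardise(x; dims=dims)
end

x = [1.0 2.0; 3.0 4.0]
old = normalise(x; dims=1)    # deprecated path
new = standardise(x; dims=1)  # replacement; same result
```

Writing the forwarding method by hand rather than with `@deprecate` is a choice made here to keep keyword-argument handling explicit.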

@ToucheSir
Contributor

I don't think Flux even uses normalise from MLUtils (it still has its own copy, which should itself be moved to NNlib), so there may not be much to fix up downstream.

@mcabbott
Contributor

mcabbott commented Sep 26, 2022

Flux has its own definition
https://github.com/FluxML/Flux.jl/blob/d4f1d816563edd5f953a3fd1ef7dd960d507ed22/src/layers/stateless.jl#L32
and issue FluxML/Flux.jl#1952 about the name / location.

Julia already uses "normalize" to mean something different to either of those:

help?> LinearAlgebra.normalize
  normalize(a, p::Real=2)

  Normalize a so that its p-norm equals unity, i.e. norm(a, p) == 1. For scalars, this is similar
  to sign(a), except normalize(0) = NaN. See also normalize!, norm, and sign.
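As a quick illustration of how the three meanings differ on the same input, here is a small sketch. Only `LinearAlgebra.normalize` is an existing function; the min-max and z-score lines are written out inline rather than calling any package API, and the uncorrected standard deviation is an assumption.

```julia
using LinearAlgebra, Statistics

v = [3.0, 4.0]

# LinearAlgebra.normalize: rescale so that the 2-norm is 1.
u = normalize(v)                                      # ≈ [0.6, 0.8]

# Min-max scaling: map the smallest value to 0 and the largest to 1.
m = (v .- minimum(v)) ./ (maximum(v) - minimum(v))    # [0.0, 1.0]

# Z-score standardization: mean 0, standard deviation 1.
z = (v .- mean(v)) ./ std(v; corrected=false)         # [-1.0, 1.0]
```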

I recently saw this package https://github.com/brendanjohnharris/Normalization.jl#normalization-methods but have not investigated it closely. Thoughts on whether farming this out might be better?

Edit: that package uses JuliennedArrays for dims, so I think it won't work well with GPU / AD. But there might be other packages. It is perhaps also one data point on what to name things.

@ToucheSir
Contributor

The Flux function ought to be renamed to something, anything else. The existing name is just confusing. We can probably discuss that over in NNlib though.

@DoktorMike
Author

> Flux has its own definition https://github.com/FluxML/Flux.jl/blob/d4f1d816563edd5f953a3fd1ef7dd960d507ed22/src/layers/stateless.jl#L32 and issue FluxML/Flux.jl#1952 about the name / location.
>
> Julia already uses "normalize" to mean something different to either of those:
>
> help?> LinearAlgebra.normalize
>   normalize(a, p::Real=2)
>
>   Normalize a so that its p-norm equals unity, i.e. norm(a, p) == 1. For scalars, this is similar
>   to sign(a), except normalize(0) = NaN. See also normalize!, norm, and sign.
>
> I recently saw this package https://github.com/brendanjohnharris/Normalization.jl#normalization-methods but have not investigated closely. Thought on whether farming this out might be better?
>
> Edit: that uses JuliennedArrays for dims, thus I think it won't work well with GPU / AD. But there might be other packages. And perhaps one datapoint on what to name things, too.

I think Normalization.jl looks promising. I haven't investigated it, but I like the idea of collecting many array normalization methods in one package.

@CarloLucibello
Member

fastai and PyTorch also use the name Normalize for standardization:
https://docs.fast.ai/data.transforms.html#normalize
https://pytorch.org/vision/stable/generated/torchvision.transforms.Normalize.html#torchvision.transforms.Normalize
so we are in good company.

I'm more worried about the proximity of LinearAlgebra.normalize and MLUtils.normalise lamented in FluxML/Flux.jl#1952; maybe that's reason enough to rename to standardize.


5 participants