Implement HOGER homogenization #240
base: develop
Conversation
```python
# Postprocess all the data to get the main jumps for each wind turbine
d2 = d.copy()
d2["Class"] = discretize(d["Knot"], threshold=100)
```
The value of the threshold argument for the discretize function should be the same as the one used for the decision tree regressor. It was a mistake on my part. Thus,
d2["Class"] = discretize(d["Knot"], threshold=threshold)
would be more appropriate.
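For context, here is a minimal sketch of what an index-distance-based discretize could look like. This is a hypothetical stand-in, not the PR's actual implementation; in particular, the merge-short-runs behavior is an assumption based on the discussion below:

```python
import numpy as np

def discretize_sketch(values, threshold=100):
    """Hypothetical stand-in for the PR's `discretize`: start a new
    class only when the value changes AND the current class already
    spans at least `threshold` rows (indices); shorter runs are
    merged into the current class."""
    values = np.asarray(values)
    classes = np.empty(len(values), dtype=int)
    current, start = 0, 0
    classes[0] = current
    for i in range(1, len(values)):
        if values[i] != values[i - 1] and (i - start) >= threshold:
            current += 1  # far enough from the last knot: new class
            start = i
        classes[i] = current
    return classes
```

With `threshold=100`, two knot levels 150 rows apart become two distinct classes, while a change only 50 rows after the previous one does not open a new class.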
Thanks for catching this, fixed!
@rglope, I have a question: in writing the tests, I realized I wasn't clear on the units of threshold. In the discretize function, it defines the minimum distance between nodes to differentiate classes. Here the units are number of indices, right? It is also used as min_samples_split, which would mean the number of points for a leaf, right? I think I understand then, the two are consistent: if a leaf must have N points, then it stands to reason that two different knots should be N points apart. Do I have that right? Would you mind checking whether you agree with the tests I've added?
That's right. The threshold is the number of indices or registries in the SCADA. In the case of the discretize function, as you say, it sets the distance between the nodes. For the Decision Tree Regressor it is considered the same way to create a node (or knot, which is what I call them) and a leaf. The only thing is that to create a leaf, only half the number of indices is needed.
More information can be found here: https://scikit-learn.org/dev/modules/generated/sklearn.tree.DecisionTreeRegressor.html. The original R function only had a minimum length argument, so during the translation to Python it seemed a good proportion.
As I said, the minimum number of indices to create a leaf is `threshold // 2`, so maybe the argument for the discretize function should also be `threshold // 2`. That way it would follow the reasoning you wrote (and the one I was actually going for).
I'll try them as soon as I can! Finally, just an observation about the ccp_alpha argument, if it helps. It is a pruning parameter to reduce the complexity of the regression. In the example it was set to a specific value since it made the regression cleaner, but I don't think there is necessarily a default value that fits every case. More information can be found here: https://scikit-learn.org/1.5/auto_examples/tree/plot_cost_complexity_pruning.html
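Putting these comments together, a small sketch of how the threshold and ccp_alpha settings interact on a signal with a single offset jump. The synthetic data and the specific parameter values are assumptions for illustration only:

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)
n, threshold = 600, 100
idx = np.arange(n).reshape(-1, 1)
# a 10-degree offset jump at index 300, plus mild noise
y = np.where(np.arange(n) < 300, 0.0, 10.0) + rng.normal(0.0, 0.5, n)

tree = DecisionTreeRegressor(
    min_samples_split=threshold,      # a split needs `threshold` registries
    min_samples_leaf=threshold // 2,  # a leaf needs half that, as discussed
    ccp_alpha=0.5,                    # pruning keeps only the clear jump
)
tree.fit(idx, y)
pred = tree.predict(idx)
# the pruned tree collapses to two plateaus, split near the true jump
jump_at = int(np.flatnonzero(np.diff(pred))[0]) + 1
```

Without the ccp_alpha pruning, the tree may add extra knots that only chase noise; with it, the fit reduces to the single real change point.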
Thanks for these responses, @rglope! I implemented the following change:
OK, tests pass and I confirmed the docs all build locally. Good to merge for me once a few reviews are in, thanks all!
I can't add a comment on 03_northing_calibration_hoger.ipynb, but there is an "apparant" which I think should be "apparent". Besides that detail, the tests work and check all the main features of the algorithm, and the example shows exactly what it does. Everything looks good on my end!
examples_artificial_data/01_raw_data_processing/01_northing_calibration.ipynb (outdated review thread, resolved)
All of the examples here use a wind direction that is steadily ramping (with a very small amount of noise added). How difficult/feasible would it be to move to a more "reasonable" wind direction signal (e.g., a Gaussian wind direction with a standard deviation of, say, 10 degrees and a non-zero mean to provide bias)? I think that could also help make the jump clearer.
It could also be interesting to compare the northing calibration procedure before and after the jump is detected by HOGER; i.e., demonstrate that the northing calibration doesn't work well if there is a jump in the bias.
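A sketch of the suggested signal; the mean direction, standard deviation, and jump magnitudes here are placeholder values, not anything prescribed by the PR:

```python
import numpy as np

rng = np.random.default_rng(42)
n = 1000
# Gaussian wind direction: non-zero mean, 10-degree standard deviation
wd_true = rng.normal(270.0, 10.0, n)
# northing bias that jumps mid-record, the kind of change HOGER targets
bias = np.where(np.arange(n) < n // 2, 3.0, 15.0)
wd_measured = (wd_true + bias) % 360.0
```

Because the true direction is stationary rather than ramping, the 12-degree step in the measured channel stands out clearly against the 10-degree natural spread.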
I can for sure change the yaw signal from ramping to fixed; that's no problem. Running the calibration procedure twice could be instructive, but at the cost of a much longer notebook, since it generates a lot of text and plots, so I'd vote against that step.
I think we should also add documentation describing the HOGER algorithm a bit (even just a paragraph would help a lot for new users).
Fixed!
You mean beyond what is in the docstring? This will appear in autodocs...
Implement HOGER homogenization
This PR implements the HOGER algorithm for homogenization of yaw/wind direction signals. It detects and removes changes in the calibration of these channels.
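To illustrate the idea (detect a change in the yaw/wind-direction offset and remove it), here is a self-contained sketch. This is not the actual API of northing_offset_change_hoger.py; the change-point scan and all variable names are assumptions for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 800
wd_ref = rng.normal(200.0, 10.0, n)              # trusted reference direction
offset = np.where(np.arange(n) < 500, 2.0, 9.0)  # calibration change at 500
wd_turbine = wd_ref + offset + rng.normal(0.0, 0.3, n)

# locate the change point by minimizing within-segment variance
resid = wd_turbine - wd_ref
cut = min(range(50, n - 50),
          key=lambda i: resid[:i].var() * i + resid[i:].var() * (n - i))
step = resid[cut:].mean() - resid[:cut].mean()

# homogenize: subtract the detected step from the later segment
wd_fixed = wd_turbine.copy()
wd_fixed[cut:] -= step
```

After the correction, the turbine channel carries a single constant offset again, which the usual northing calibration can then remove.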
HOGER was developed by Paul Poncet (@engie-paul-poncet) and Thomas Duc (@engie-thomas-duc) of Engie, and Rubén González-Lope (@rglope) and Alvaro Gonzalez Salcedo (@alvarogonzalezsalcedo) of CENER within the TWAIN project.
This PR contains a new module, northing_offset_change_hoger.py. A new example, 03_northing_calibration_hoger.ipynb, is also included. This new example will be included in the online documentation, as it shows the combined usage of different yaw/wd calibration modules.

Some notes on connections:
This functionality implements something akin to what is proposed in #148 (@ejsimley).
It could also be seen to address issue #106.
It also mirrors some features included in wind-up (@aclerc).
@Bartdoekemeijer, you'll see in the new example how the new module can fit within the current calibration workflow; I'll be curious for your opinions there.
@engie-paul-poncet, @engie-thomas-duc, @rglope, @alvarogonzalezsalcedo, @misi9170, @ejsimley, @Bartdoekemeijer, @aclerc: please use this PR to provide any feedback or comments.
Open Questions:
Todo
Related issue, if one exists
#148
#106