-
Notifications
You must be signed in to change notification settings - Fork 135
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Implements data transformations in TSA module #2121
Conversation
@dylanjm and @GabrielSoto-INL , I'd love to get your input on how this is implemented. |
… adds the ability to use custom transformers from a separate python file
…se classes, makes transformer XML tags explicit
… classes are not included in User Manual
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few minor initial changes to consider; I haven't carefully checked all the implementation, but I did notice a few things.
I'm still not convinced fit
increases readability over characterize
, but I'm not completely against it. I do have questions about what it means to be a generator
, however.
@@ -192,24 +246,6 @@ def getParamsAsVars(self, params): | |||
rlz[f'{baseWave}__{stat}'] = value | |||
return rlz | |||
|
|||
|
|||
def generate(self, params, pivot, settings): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
why did this need renaming?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With Fourier being implemented as a TimeSeriesTransformer instead of a TimeSeriesGenerator, I wanted to avoid using the same generate()
method name used in TimeSeriesGenerator. If we revert this back to being a generator (and relaxing the restriction on generators being stochastic), I'll revert the name change.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, help me understand that, and maybe talking this over in a meeting makes more sense. Fourier projects the signal from signal space to a Hilbert space, but the result is not a transformation of the data, but a characterization of the data, as it seems to me. Help me see how you're seeing Fourier as a transformer.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I see it as a transformer in the sense that you give it a signal and it returns a modified signal. It certainly characterizes as well, but I see the characterization of the signal as separate thing from the operation of modifying the signal based on the results of that characterization. I'd certainly be open to discussing this and some of the broader design decisions in a meeting instead of through a Github PR review..
@@ -209,7 +191,8 @@ def generate(self, params, pivot, settings): | |||
@ In, settings, dict, additional settings specific to algorithm | |||
@ Out, synthetic, np.array(float), synthetic estimated model signal | |||
""" | |||
|
|||
# FIXME This method isn't currently tested or used anywhere. Trying to call this method results | |||
# in an error due to a mismatch of array shapes when calculating sigMatSynthetic. Remove this method? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
tests/framework/ROM/TimeSeries/SyntheticHistory/TrainingData/LogARMA_0.csv
Outdated
Show resolved
Hide resolved
###################################### | ||
# CONSTRUCTION # | ||
###################################### | ||
def createZeroFilter(targets): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add docstrings
if printComment: | ||
print(f'checking {str(dtype)} {comment} | {value} != {expected}') | ||
|
||
def checkFloat(comment, value, expected, tol=1e-10, update=True): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW, I've really wanted to move to using pytest and standardize all these checks across RAVEN unit tests, but have never had the bandwidth to do it. It's not required (by any means) for this work, but if there's spare bandwidth and interest, it would improve the unit tests dramatically.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Only one discussion point, on the Venn diagram of algorithms. Thanks for making those changes.
Can you convert this from WIP so the automated tests can run? I assume that's what's holding them back. |
Job Mingw Test on b7d80db : invalidated by @wangcj05 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One comment to check.
@@ -6,6 +6,7 @@ | |||
type = OrderedCSV | |||
output = 'Basic/chz.csv' | |||
windows_gold = 'Basic/windowsChz.csv' | |||
mac_gold = 'Basic/windowsChz.csv' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so windows and mac agree but linux differs now?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, seems that the difference is just a phase of one of the Fourier modes being pi on some OSes and -pi on others. Do you want me to change that file name now that that gold file is being used for the Mac test, too?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope, I'm fine with this.
Pull Request Description
What issue does this change request address? (Use "#" before the issue to link it, i.e., #42.)
#2120
What are the significant changes in functionality due to this change request?
A new "Transformer" class is added to the TSA module which can apply any transformer class whose interface is defined by the sklearn.base.TransformerMixin base class to any target or group of target variables. The specifics of how the residual and composite signals are assembled are changed slightly to allow for non-additive signal composition so that nonlinear transformation functions can be accommodated.
For Change Control Board: Change Request Review
The following review must be completed by an authorized member of the Change Control Board.
<internalParallel>
to True.raven/tests/framework/user_guide
andraven/docs/workshop
) have been changed, the associated documentation must be reviewed and assured the text matches the example.