ColumnMapper Quality of Life #352

dougbrn · 2024-01-18T21:45:26Z

The concept of column mappings was an early component of TAPE design, with the idea being that an Ensemble will know which columns map to known timeseries quantities. The benefit is that any (internal) operation that leverages these quantities can use the mapped columns without the user needing to specify a column name. We use this most heavily in Ensemble.batch with TAPE analysis functions, where the difference looks like this:

Without Column Mapping:

def tape_analysis_function(time, flux, error, band):
    return result

ensemble.batch(tape_analysis_function, flux="flux_col", time="time_col", error="error_col", band="band_col")

With Column Mapping:

def tape_analysis_function(time, flux, error, band):
    return result

ensemble.batch(tape_analysis_function)
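For anyone following along, here is a minimal sketch of the idea behind the two calls above: a mapping from timeseries quantity to column name lets a batch-style call fill in the columns from the function signature, while explicit keyword arguments still win. Everything in it (`COLUMN_MAP`, `resolve_columns`, `toy_batch`, the column names, the dict-of-lists table) is hypothetical and only illustrates the concept. It is not how TAPE implements `Ensemble.batch`.

```python
import inspect

# Hypothetical mapping from timeseries quantity to column name.
# Illustrative only; this is not TAPE's internal representation.
COLUMN_MAP = {"time": "mjd", "flux": "flux", "error": "err", "band": "band"}


def resolve_columns(func, column_map, **overrides):
    """Return {argument name: column name} for every parameter of func."""
    resolved = {}
    for name in inspect.signature(func).parameters:
        if name in overrides:
            # An explicitly passed column name takes precedence.
            resolved[name] = overrides[name]
        elif name in column_map:
            # Otherwise fall back to the mapped column.
            resolved[name] = column_map[name]
        else:
            raise KeyError(f"no column mapped for argument {name!r}")
    return resolved


def toy_batch(func, table, column_map, **overrides):
    """Call func on the columns of table (a dict of lists) that its arguments resolve to."""
    cols = resolve_columns(func, column_map, **overrides)
    return func(**{arg: table[col] for arg, col in cols.items()})


def mean_flux(time, flux, error, band):
    # Stand-in for an analysis function; only flux is used here.
    return sum(flux) / len(flux)


table = {
    "mjd": [1.0, 2.0],
    "flux": [10.0, 20.0],
    "flux_alt": [1.0, 3.0],
    "err": [0.1, 0.2],
    "band": ["g", "r"],
}

mapped = toy_batch(mean_flux, table, COLUMN_MAP)  # no column names needed
overridden = toy_batch(mean_flux, table, COLUMN_MAP, flux="flux_alt")  # manual choice
```

The second call shows that a mapping does not have to take away the option of naming a column by hand for a single operation.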

Of course, the above only applies to TAPE analysis functions, not to externally defined user functions. A further downside is that column mapping requires users to set up the mapping manually up front, before any data is loaded. It's probably the most cumbersome part of setting up a new Ensemble workflow, with code looking like this:

ens = Ensemble()

column_mapper = ColumnMapper(id_col="object_id", time_col="mjd", flux_col="flux", err_col="err", band_col="band")

ens.from_parquet(..., column_mapper=column_mapper)

This ticket is really just asking whether this is adding value to TAPE or not. @nevencaplar has had issues with it from a usability perspective. Do we think users would prefer to manually specify which columns each operation uses in all cases? We have had plans to build out a suite of default mappings for the major surveys, so that users of ZTF, LSST, or PS1 data would only need to specify:

column_mapper = ColumnMapper().use_known_map("ZTF")

Even in this case, we would still only be choosing one column set per survey, and it's possible that users will want to map different columns.
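One way to picture that trade-off is a survey default that users can override column by column. The sketch below is purely hypothetical: `KNOWN_MAPS`, `build_mapping`, and the column names are placeholders invented for illustration, not the real survey schemas or TAPE's known-map implementation.

```python
# Placeholder defaults; the real survey column names are not given in this thread.
KNOWN_MAPS = {
    "ZTF": {
        "id_col": "ztf_id",
        "time_col": "ztf_mjd",
        "flux_col": "ztf_mag",
        "err_col": "ztf_magerr",
        "band_col": "ztf_band",
    },
}


def build_mapping(survey, **overrides):
    """Start from a survey's default mapping and apply user overrides on top."""
    base = KNOWN_MAPS[survey]
    unknown = set(overrides) - set(base)
    if unknown:
        raise ValueError(f"unknown mapping keys: {sorted(unknown)}")
    # Later keys win, so overrides replace the survey defaults.
    return {**base, **overrides}


default_map = build_mapping("ZTF")
custom_map = build_mapping("ZTF", flux_col="my_flux")
```

With something like this, the default covers the common case and a user who wants a different flux column changes only that one entry.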


dougbrn · 2024-01-25T18:50:29Z

From LINCC-UP: the decision was made to deprecate column mapping. @dougbrn will investigate the scope of work needed in the internals to move away from it. Edit: It's possible we may actually keep this...

dougbrn added the "question" (Further information is requested) label Jan 18, 2024

dougbrn self-assigned this Jan 25, 2024

dougbrn mentioned this issue Feb 14, 2024

ColumnMapper Known Mappings #185

Closed

dougbrn removed their assignment Aug 9, 2024
