Add plot configuration class and time profile plots #871

Merged on May 11, 2022 (90 commits)

Conversation

@IndrajeetPatil IndrajeetPatil commented Apr 11, 2022

This PR comes from a branch that doesn't have an issue number because it was supposed to be short-lived and was an offshoot of #858, but it has now become the primary branch and #858 has been closed in favour of the current one.

TODO:

  • createPlotConfiguration() (removed after team discussion, which deemed it unnecessary)
  • DefaultPlotConfiguration
  • rename ospPlotConfiguration to something else: DefaultPlotConfiguration (Create DefaultPlotConfiguration class #877)
  • rely on tlf defaults; no need to choose default arguments for parameters ourselves
  • get rid of constructor
  • include units in the plot configuration
  • choose sensible defaults for initialization
  • individual time profile plot (plotIndividualTimeProfile())
  • population time profile plot (plotPopulationTimeProfile())
  • both of these profile plots work for only observed or only simulated dataset contexts
  • make sure datasets which are not part of any grouping get their own grouping in profile plots
  • add tests for plotIndividualTimeProfile()
  • add tests for plotPopulationTimeProfile()
  • add tests for plot config class
  • don't set mappings globally; this needs to be done on plot-by-plot basis
  • add examples to manual (Documenting DataCombined #756)
  • update NEWS.md
  • check comments once implementation is finalized
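
As an illustration of the TODO item about datasets that are not part of any grouping, here is a minimal sketch of how such datasets could be given their own group. It is not the package's implementation: the function name `addMissingGroupings` and the column names `name` and `group` are assumptions made for this example.

```r
# Hypothetical stand-in for the internal helper; the column names `name`
# and `group` are assumptions made for illustration.
addMissingGroupings <- function(df) {
  # Rows without a group get their dataset name as the group label,
  # so each ungrouped dataset is plotted as a group of its own.
  ungrouped <- is.na(df$group)
  df$group[ungrouped] <- df$name[ungrouped]
  df
}

df <- data.frame(
  name = c("sim1", "obs1", "obs2"),
  group = c("A", "A", NA),
  stringsAsFactors = FALSE
)

result <- addMissingGroupings(df)
print(result)
```

In this toy data frame, `obs2` has no group, so its `group` entry becomes its own name, while the grouped rows are left untouched.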

Part of #674

Related to #815
Related to #832
Related to #833

Closes #877
Closes #878
Closes #920

@IndrajeetPatil IndrajeetPatil marked this pull request as draft April 11, 2022 11:57

Yuri05 commented May 10, 2022

I wonder how this (still incomplete) picture would look like in black-white? 😄

But in fact I asked myself already 20 years ago: why would somebody print papers/reports in BW?

@IndrajeetPatil

I wonder how this (still incomplete) picture would look like in black-white? 😄

I know this is a rhetorical question, but here you go nonetheless 😅

[image: black-and-white version of the plot]

@IndrajeetPatil

Can this be resolved?

@StephanSchaller Good catch. Yes, this is easy to resolve; I will take care of it.

Yuri05 commented May 10, 2022

I know this is a rhetorical question, but here you go nonetheless 😅

Nice. As a color-blind person I cannot even handle the coloured version of it, so the BW-version - well... 😄

@msevestre

Argument: If you print a report/manuscript on paper and print in B&W (what many (most?) people still do)

It would be interesting to see if this is the case. I would have implicitly said otherwise, without any data to back my claim.

That being said, the fact that some people may still print their report, and when they do, print in BW, should have no bearing on how we create our default plots.

At any rate, it should be very easy to create a plot that does not overload grouping criteria such as line types + colors when this is logically not required.

@StephanSchaller

At any rate, it should be very easy to create a plot that does not overload grouping criteria such as line types + colors when this is logically not required.

Yes, as I understand it now, everyone will be able to set their preferences so that everyone will be happy :-)

@PavelBal PavelBal left a comment

Some changes to the documentation are required.

Also some code refactoring suggestions and questions.

R/default-plot-configuration.R (outdated)
#' R6 class defining the configuration for `{ospsuite}` plots. It holds values
#' for all relevant plot properties.
#'
#' Note that the values this objects contains are of general-purpose utility. In

Member

Please remove this. It is confusing because it sounds as if these values can be used directly in, e.g., the "plot()" function.

Member Author

Removed.

R/default-plot-configuration.R (outdated)
#' - font face: `tlf::FontFaces`
#' - linetype: `tlf::Linetypes`
#'
#' The available units for `x`-and `y`-axes variables depend on the dimensions

Member

"Variables": I don't understand this term in this context. Suggested wording:

"Available units for x and y axes depend on the dimensions of data sets to be plotted".

Member Author

Changed accordingly.

# plot -----------------------------

# Which dataset types are present?
datasetTypePresent <- .extractPresentDatasetTypes(dataCombined)

Member

From here on, same comments as for the "individual" function.

Lots of code duplication; I think most of it can be packed into an internal function. Actually, only a few lines of code differ.

dataTypeUnique <- unique(dataCombined$groupMap$dataType)

if (length(dataTypeUnique) == 2L && all(dataTypeUnique %in% c("observed", "simulated"))) {
datasetTypePresent <- .presentDataTypes$Both

Member

Return early

if (length(dataTypeUnique) == 2L && all(dataTypeUnique %in% c("observed", "simulated"))) {
datasetTypePresent <- .presentDataTypes$Both
} else if (length(dataTypeUnique) == 1L && dataTypeUnique == "observed") {
datasetTypePresent <- .presentDataTypes$Observed

Member

Return

datasetTypePresent <- .presentDataTypes$Both
} else if (length(dataTypeUnique) == 1L && dataTypeUnique == "observed") {
datasetTypePresent <- .presentDataTypes$Observed
} else if (length(dataTypeUnique) == 1L && dataTypeUnique == "simulated") {

Member

Last else if not required if you return early.
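
To make the "return early" suggestion concrete, here is a minimal sketch of how the chain of `else if` branches could be flattened. The function name, the `presentDataTypes` list, and its values are stand-ins assumed for this example; the real `.presentDataTypes` enum and `.extractPresentDatasetTypes()` are internal to the package and may look different.

```r
# Stand-in for the package-internal enum; the values are assumptions.
presentDataTypes <- list(
  Observed = "observed",
  Simulated = "simulated",
  Both = "both"
)

# Hypothetical version that takes the vector of data types directly.
extractPresentDatasetTypes <- function(dataTypes) {
  dataTypeUnique <- unique(dataTypes)

  if (length(dataTypeUnique) == 2L && all(dataTypeUnique %in% c("observed", "simulated"))) {
    return(presentDataTypes$Both)
  }

  if (length(dataTypeUnique) == 1L && dataTypeUnique == "observed") {
    return(presentDataTypes$Observed)
  }

  # No further condition is needed here. This assumes the input has already
  # been validated to contain only "observed" and/or "simulated".
  presentDataTypes$Simulated
}

print(extractPresentDatasetTypes(c("observed", "simulated", "observed")))
```

Because each branch returns immediately, no intermediate `datasetTypePresent` variable is needed and the last case needs no condition of its own.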

#' @field pointsColor,pointsShape,pointsSize,pointsAlpha A selection key or values for choice of color, fill, shape, size, linetype, alpha, respectively, for points.
#' @field ribbonsFill,ribbonsSize,ribbonsLinetype,ribbonsAlpha A selection key or values for choice of color, fill, shape, size, linetype, alpha, respectively, for ribbons.
#' @field errorbarsSize,errorbarsLinetype,errorbarsAlpha A selection key or values for choice of color, fill, shape, size, linetype, alpha, respectively, for errorbars.
#' @field plotSaveFileName,plotSaveFileFormat,plotSaveFileWidth,plotSaveFileHeight,plotSaveFileDimensionUnits,plotSaveFileDpi File name (without extension) format to which the plot needs to be saved, and the specifications for saving the plot.

Member

Also, how does one switch between output to a file and output to the standard device (e.g. the "Plots" tab in RStudio)?

Member Author

Yes, this is now mentioned in the new "Saving plot" section.

@PavelBal

I also wonder - why is it that slow? Creating a simple individual time profile takes like two seconds, the same plot in esqlabsR is generated almost instantaneously. Is it ggplot, or TLF, or..?

@IndrajeetPatil

Here is what the default plots look like.

I am not too happy with the fourth color and shape (yellow with downward triangle), but that can be changed in tlf itself and the plots will adjust accordingly.

Individual: [image: indivProfilePlot]

Population: [image: popProfilePlot]

@PavelBal

Last refactoring:

plotIndividualTimeProfile and plotPopulationTimeProfile are 95% identical.

Create one internal function. Only two sections differ between individual and population, and they can be identified by checking whether quantiles is NULL.

.plotTimeProfile <- function(dataCombined,
                             defaultPlotConfiguration = NULL,
                             quantiles = NULL) {
  # validation -----------------------------
  
  validateIsOfType(defaultPlotConfiguration, "DefaultPlotConfiguration", nullAllowed = TRUE)
  defaultPlotConfiguration <- defaultPlotConfiguration %||% DefaultPlotConfiguration$new()
  validateIsOfType(dataCombined, "DataCombined")
  validateIsSameLength(objectCount(dataCombined), 1L) # only single instance is allowed

  # data frames -----------------------------
  
  df <- dataCombined$toDataFrame()
  
  # Getting all units on the same scale
  df <- .unitConverter(df, defaultPlotConfiguration$xUnit, defaultPlotConfiguration$yUnit)
  
  # Datasets which haven't been assigned to any group will be plotted as a group
  # on its own. That is, the `group` column entries for them will be their names.
  df <- .addMissingGroupings(df)
  
  # `TimeProfilePlotConfiguration` object -----------------------------
  
  # Create an instance of `TimeProfilePlotConfiguration` class by doing a
  # one-to-one mapping of internal plot configuration object's public fields
  populationTimeProfilePlotConfiguration <- .convertGeneralToSpecificPlotConfiguration(
    data = df,
    specificPlotConfiguration = tlf::TimeProfilePlotConfiguration$new(),
    generalPlotConfiguration = defaultPlotConfiguration
  )
  
  # plot -----------------------------
  
  obsData <- as.data.frame(dplyr::filter(df, dataType == "observed"))
  
  if (nrow(obsData) == 0) {
    obsData <- NULL
  }
  
  simData <- as.data.frame(dplyr::filter(df, dataType == "simulated"))
  
  if (nrow(simData) == 0) {
    simData <- NULL
  } else if (!is.null(quantiles)) {
    # Extract aggregated simulated data
    simData <- as.data.frame(.extractAggregatedSimulatedData(simData, quantiles))
  }
  
  if (!is.null(quantiles)) {
    dataMapping <- tlf::TimeProfileDataMapping$new(
      x = "xValues",
      y = "yValuesCentral",
      ymin = "yValuesLower",
      ymax = "yValuesHigher",
      group = "group"
    )
  } else {
    dataMapping <- tlf::TimeProfileDataMapping$new(
      x = "xValues",
      y = "yValues",
      group = "group"
    )
  }
  
  profilePlot <- tlf::plotTimeProfile(
    data = simData,
    dataMapping = dataMapping,
    observedData = obsData,
    observedDataMapping = tlf::ObservedDataMapping$new(
      x = "xValues",
      y = "yValues",
      group = "group",
      error = "yErrorValues"
    ),
    plotConfiguration = populationTimeProfilePlotConfiguration
  )
  
  # Extract current mappings in the legend (which are going to be incorrect).
  legendCaptionData <- tlf::getLegendCaption(profilePlot)
  
  # Update the legend data frame to have the correct mappings.
  newLegendCaptionData <- .updateLegendCaptionData(legendCaptionData, populationTimeProfilePlotConfiguration)
  
  # Update plot legend using this new data frame.
  profilePlot <- tlf::updateTimeProfileLegend(profilePlot, caption = newLegendCaptionData)
  
  return(profilePlot)
}

The exposed functions just forward to the internal one:

plotIndividualTimeProfile <- function(dataCombined,
                                      defaultPlotConfiguration = NULL) {
  
  .plotTimeProfile(dataCombined, defaultPlotConfiguration)
}

plotPopulationTimeProfile <- function(dataCombined,
                                      defaultPlotConfiguration = NULL,
                                      quantiles = c(0.05, 0.5, 0.95)) {
  # validation -----------------------------
  
  validateIsNumeric(quantiles, nullAllowed = FALSE)
  validateIsOfLength(quantiles, 3L)
  
  .plotTimeProfile(dataCombined, defaultPlotConfiguration, quantiles)
}
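
The body of `.extractAggregatedSimulatedData()` is not shown in this thread. As a rough sketch of the idea only, the base-R function below computes three quantiles of `yValues` per group and time point and stores them in the columns that the population data mapping above expects (`yValuesLower`, `yValuesCentral`, `yValuesHigher`). The function name `aggregateSimulatedData` and the toy data are assumptions, not the package's code.

```r
# Hypothetical sketch; not the package's `.extractAggregatedSimulatedData()`.
aggregateSimulatedData <- function(simData, quantiles = c(0.05, 0.5, 0.95)) {
  stopifnot(is.numeric(quantiles), length(quantiles) == 3L)

  # One piece per combination of group and time point
  pieces <- split(simData, list(simData$group, simData$xValues), drop = TRUE)

  rows <- lapply(pieces, function(piece) {
    q <- stats::quantile(piece$yValues, probs = quantiles, names = FALSE)
    data.frame(
      group = piece$group[1],
      xValues = piece$xValues[1],
      yValuesLower = q[1],
      yValuesCentral = q[2],
      yValuesHigher = q[3],
      stringsAsFactors = FALSE
    )
  })

  result <- do.call(rbind, rows)
  rownames(result) <- NULL
  result
}

# Toy data: one group, two time points, five simulated individuals each
simData <- data.frame(
  group = "A",
  xValues = rep(c(0, 1), each = 5),
  yValues = c(1, 2, 3, 4, 5, 10, 20, 30, 40, 50)
)

aggregated <- aggregateSimulatedData(simData)
print(aggregated)
```

With the default quantiles, the central column holds the median at each time point and the lower and higher columns bound the ribbon drawn around it.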

@IndrajeetPatil

Great idea! Done.

@PavelBal PavelBal merged commit 94d6f6a into develop May 11, 2022
@PavelBal PavelBal deleted the old_plot_config branch May 11, 2022 15:49
@msevestre

I did not have time to review the code at all. I will comment on this later today or tomorrow

@PavelBal

I did not have time to review the code at all. I will comment on this later today or tomorrow

Sorry, this has been hanging here for so long that I thought you had already looked into it.
