Added geom_col and added color parameter to geom_smooth #5069
Changes from 5 commits
2420180
bb7b1ed
6b66db3
f645faf
390c4c3
7ccc696
@@ -119,6 +119,7 @@ Public Class ucrGeom
Dim clsgeom_boxplot As New Geoms
Dim clsgeom_contour As New Geoms
Dim clsgeom_count As New Geoms
Dim clsgeom_col As New Geoms
Dim clsgeom_crossbar As New Geoms
Dim clsgeom_curve As New Geoms
Dim clsgeom_density As New Geoms
@@ -158,7 +159,6 @@ Public Class ucrGeom
Dim clsgeom_violin As New Geoms
Dim clsgeom_vline As New Geoms

'I am adding this here for ease of working; it can be moved later to the alphabetically correct position
Dim clsgeom_statsummary As New Geoms

@@ -376,6 +376,32 @@ Public Class ucrGeom

lstAllGeoms.Add(clsgeom_boxplot)

clsgeom_col.SetGeomName("geom_col")
'Mandatory Aesthetics
clsgeom_col.AddAesParameter("x", strIncludedDataTypes:=({"factor"}), bIsMandatory:=True)
clsgeom_col.AddAesParameter("y", strIncludedDataTypes:=({"numeric"}), bIsMandatory:=True)
'Optional aesthetics
clsgeom_col.AddAesParameter("alpha", strIncludedDataTypes:=({"factor", "numeric"}))
clsgeom_col.AddAesParameter("fill", strIncludedDataTypes:=({"factor", "numeric"}))
clsgeom_col.AddAesParameter("colour", strIncludedDataTypes:=({"factor", "numeric"}))
clsgeom_col.AddAesParameter("linetype", strIncludedDataTypes:=({"factor"})) 'Warning: This distinguishes bars by varying the outline; however, the bars only look visibly different if the colour and the fill aesthetics take different values.
clsgeom_col.AddAesParameter("size", strIncludedDataTypes:=({"factor", "numeric"}))
'Geom_col layer parameters
clsgeom_col.AddLayerParameter("width", "numeric", "0.90", lstParameterStrings:={2, 0, 1}) 'The width of the bars is given as a proportion of the data resolution.
'Global Layer parameters
clsgeom_col.AddLayerParameter("stat", "list", Chr(34) & "count" & Chr(34), lstParameterStrings:={Chr(34) & "count" & Chr(34), Chr(34) & "identity" & Chr(34)})
clsgeom_col.AddLayerParameter("show.legend", "list", "TRUE", lstParameterStrings:={"NA", "TRUE", "FALSE"})
clsgeom_col.AddLayerParameter("position", "list", Chr(34) & "stack" & Chr(34), lstParameterStrings:={Chr(34) & "stack" & Chr(34), Chr(34) & "dodge" & Chr(34), Chr(34) & "identity" & Chr(34), Chr(34) & "jitter" & Chr(34), Chr(34) & "fill" & Chr(34)})
'See global comments about position.
'Aesthetics as layer parameters... Used to fix colour, transparency, ... of the geom on that Layer.
clsgeom_col.AddLayerParameter("fill", "colour", Chr(34) & "white" & Chr(34))
clsgeom_col.AddLayerParameter("colour", "colour", Chr(34) & "black" & Chr(34))
clsgeom_col.AddLayerParameter("linetype", "numeric", "1", lstParameterStrings:={0, 0, 6})
clsgeom_col.AddLayerParameter("alpha", "numeric", "1", lstParameterStrings:={2, 0, 1}) 'Note: alpha only acts on the fill for bars. The outline does not become transparent.
clsgeom_col.AddLayerParameter("size", "numeric", "0.5", lstParameterStrings:={1, 0}) 'Varies the size of the outline. Note: a negative size gives size 0 in general. Warning: it sometimes gives errors...

lstAllGeoms.Add(clsgeom_col)

'clsgeom_contour.SetGeomName("geom_contour")
''Mandatory
'clsgeom_contour.AddAesParameter("x", bIsMandatory:=TRUE)
@@ -1240,6 +1266,7 @@ Public Class ucrGeom
'formula has to be an input and we don't have that currently. It's passed in like this: formula = y ~ x or formula = y ~ poly(x, 2) or formula = y ~ log(x), so the user has to type it in
'clsgeom_smooth.AddLayerParameter("formula",)
clsgeom_smooth.AddLayerParameter("se", "boolean", "TRUE")
clsgeom_smooth.AddLayerParameter("colour", "colour", Chr(34) & "black" & Chr(34))
clsgeom_smooth.AddLayerParameter("na.rm", "boolean", "FALSE")
clsgeom_smooth.AddLayerParameter("show.legend", "list", "TRUE", lstParameterStrings:={"NA", "TRUE", "FALSE"})
clsgeom_smooth.AddLayerParameter("inherit.aes", "boolean", "FALSE")
@@ -1438,6 +1465,7 @@ Public Class ucrGeom
clsgeom_vline.AddLayerParameter("size", "numeric", "0.5", lstParameterStrings:={1, 0}) 'Varies the size of the outline. Note: a negative size gives size 0 in general. Warning: it sometimes gives errors...

lstAllGeoms.Add(clsgeom_vline)

End Sub
@Ogik99 This needs to be moved to the correct alphabetical position, else it will always be the last on the list and we do not want that.
geom_col is now correctly placed.
Great, this looks good.

Public Event GeomChanged()
I just tried a modified example from the `geom_col` help with x as numeric and y as factor and it worked. So this doesn't look correct yet.
I have tested this using the Dodoma data. For the x variable, for example, both character and numeric data types are accepted. The y variable only accepts numeric data types. This conforms to the above code.
My example shows that the y variable can be a factor.
This is true, however it does not give meaningful results.
I've taken your comment as a challenge! Here's an example which I claim is meaningful: a data frame with the ages of 10 objects and their quality, rated low, medium or high. To compare age and quality I have produced a bar chart with age (numeric) on the x axis and quality (factor) on the y axis. This looks perfectly meaningful and sensible to me.
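A minimal sketch of a chart like the one described, using made-up values (the original data frame is not shown in this thread, so `df`, `age` and `quality` are illustrative assumptions). The plotting step is guarded so the snippet still runs where ggplot2 is not installed, and how the bars are drawn may differ between ggplot2 versions.

```r
# Made-up data for illustration: ages of 10 objects and a quality rating.
df <- data.frame(
  age = c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10),
  quality = factor(
    c("low", "low", "medium", "low", "medium",
      "high", "medium", "high", "high", "high"),
    levels = c("low", "medium", "high")
  )
)

class(df$age)      # numeric column mapped to x
class(df$quality)  # factor column mapped to y

# Numeric x and factor y passed to geom_col.
if (requireNamespace("ggplot2", quietly = TRUE)) {
  p <- ggplot2::ggplot(df, ggplot2::aes(x = age, y = quality)) +
    ggplot2::geom_col()
  # print(p) would render the chart.
}
```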
Even if we couldn't think of an example which we thought was "meaningful" or even sensible, what we are trying to do when we implement geoms from ggplot2 is to allow them to be used as flexibly as possible (like in R) but without causing problems. So, if having x as numeric meant that ggplot2 gave an error then we would prevent x being numeric. However, if we only thought that it wasn't sensible to have x as numeric then we would still allow it for two reasons:
So for the general system of using geoms we are making it as flexible as possible, and we don't need to consider what might or might not be sensible. The basic rule is that if ggplot2 allows it then we try to allow it.
For main dialogs it might be different as our main aim is to make them easy to use, and therefore we might restrict functionality to make it easier to get something "sensible", but we would often try to keep the flexibility through the (more advanced) sub dialogs.
My understanding is that a factor variable can be a character string or a numeric value. In the above example, age is a numeric value but it's actually a factor (a categorical variable). So in this case having x (age) as a numeric value is okay, but in essence it is actually a factor (I stand to be corrected).
I agree that different users may have different needs and what is sensible to one person may not be sensible to another. I am working on removing the restrictions on the data types.
These are good points to bring up and there are lots of different ideas in here that are mixed up together. Let's be clear on variable types. When you define the aesthetics for a geom you are setting the types of variables that are valid for each parameter. These are the R variable types. Each variable has a type. It can be factor, numeric or character (or others) but it can't be more than one of those 3 types at the same time, so you're not quite correct to say that a factor can be a character string or numeric value. A factor is a factor only.
A factor could be created from a character or numeric column but then it is no longer numeric or character, it's factor. R will only recognise it as factor even if it originally came from a numeric. For example, in R these two columns are exactly the same type:
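A minimal sketch of such a pair of columns (the values are illustrative assumptions, not the ones from the original comment): `x` is built from a numeric vector and `y` from a character vector, and both end up with the same class.

```r
# Hypothetical vectors for illustration only.
x <- factor(c(1, 2, 3, 2, 1))                           # built from numeric values
y <- factor(c("low", "high", "medium", "high", "low"))  # built from character values

class(x)  # "factor"
class(y)  # "factor"

# Neither is numeric or character any more.
is.numeric(x)    # FALSE
is.character(y)  # FALSE
```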
To R, x and y are the same type of variable: `factor`. You can check this by doing `class(x)` or `class(y)`. Even though x was made from a numeric vector and y from a character vector, that is unknown to R now.
You are right that in my example I was treating x as a categorical variable because there is one bar for each distinct value of x. And there would be lots of numeric columns where treating them as categorical wouldn't make sense. However, categorical is not a variable type in R. What we've discovered is that `geom_col` will treat the x aesthetic as categories even if it isn't a factor. So if I want to use age on the x axis I don't need to convert it to factor. So in essence age is being treated as categorical for this graph but it is definitely not a factor.
Categorical and factor are very similar terms, but in R factor is a technical term for a type of variable it stores. Any variable is either a factor or not, which can be checked with `class()`. Anything can be made into a factor, even if it's not sensible to do so. Categorical, by contrast, is a statistical term. You could say that age is categorical even though it's stored in R as numeric. There is no way in R to say that a variable is categorical without converting it to a factor. So when we restrict something to factors only we need to remember that this will exclude any non-factor columns, even if they are thought of as "categorical", like the age.
For implementing geoms it's much easier because we use the simple rule that if ggplot2 allows it, then we try to allow it. For geoms, we want maximum functionality and so we don't need to consider what is sensible and what is not (which is much harder to decide).
I'm not sure if that will clear things up or make things more confusing, but this was a good point so please keep raising these when you have thoughts.
This definitely clears up a lot of things. Thank you Danny.