Why assigning to names works #9
Comments
Yes, but I think that's part of the complaint, the 'aRrgh', if you will. "names" is an attribute of the object, but if we do recognize that 'names' is an attribute, why do we set the value w/
Don't think of names() as a getter function that accesses the "names" attribute of an R object. Think of it as a convenience wrapper around attr(x, "names"). In fact, don't think in terms of getter and putter functions at all; I think this is one of the sources of your aRrgh. You are thinking of R like a programmer. Not just any programmer, but a programmer from other languages that use other schemas, for example object orientation. R's object-oriented systems (S3, S4, proto, etc.) are afterthoughts, so R and R code are much more like a functional programming language than an object-oriented one. From a non-programming mindset, or the mindset of a functional programmer, R makes a reasonable amount of sense (I come from a rudimentary functional programming background and have seen non-programmers pick up and excel at R).
Consider that you really just have this thing over here and you want it to be over there. You shouldn't need to call that spot on your bookshelf by a different name depending on whether you are picking up a book or putting it back, should you? Of course you do in object-oriented programming, because you want to verb that object. Of course you don't in functional programming, because it is just one piece of memory moving over to another piece.
So, when using base R, ask yourself: how would a functional programmer make something like R work? (This almost certainly isn't how it works under the hood; it is just a thought experiment.) Imagine, for example, that names is just an array that is index-matched to an array of every other object in R's workspace. Many of the values in that array are NULL because the corresponding objects have no names attribute. However, you could easily poke values in and out of that array. If you poke a NULL in, you erase the attribute. If you poke in any other value, you set the attribute. If you are lucky, R will sanity-check it for you, but maybe it won't, e.g. names(iris) <- 1:3.
This might seem horrible, but you can add the names attribute to /any/ object in the workspace simply by (implicitly) erasing the NULL and providing an alternate value for that R object. Consider:
x <- "bob"
names(x) <- "hi"
x
names(x) <- NULL
x
You didn't have to redefine the object or write a getter or putter function. You just did it. This flexibility is very nice when you are working with R in interactive mode, trying to manipulate and play with data (really one of R's strengths). It is pretty common in R because that is R's schema. You can assign things to other things and overwrite many of the /basic/ operations in R. For example, if you thought being able to assign the value of F to T was bad, wait and try:
`+` <- function(e1, e2) {e1 - e2}
5+3
Of course, all of this is the good and bad of functional programming. Writing small code is easy. Writing big code gets pretty difficult (near impossible), and that is part of why programming languages started moving to more object-based schemas. You'll note that some of the better maintained and developed packages in R make use of R's class system (or an alternate one like proto); there are certainly reasons for that. The ease of writing small programs and the difficulty of writing large ones is also part of why R suffers from a bash mindset, i.e. lots of small, simple tools that are powerful. For this reason, many experienced R programmers write code that has many functions nested inside many other functions.
Ease of use in interactive mode is also the nice part about dynamic typing. Again, from a programming standpoint it may very well be a nightmare (it definitely comes up in the Inferno text). However, from the standpoint of a non-programmer playing with data, it is easier to be sloppy about typing and fix the places where it turns out to be a problem than it is to always select the type in advance. Another example is vector/array indexing. Things like 0-indexed arrays make sense to programmers but not to non-programmers (another gripe is that you say base 1 when you mean 1-indexed; base 1 sounds like you are talking about a number system, e.g. base 2 or base 10).
R is funky but I like it. I understand and do empathize with your aRrgh. I hope that the analogies I gave above help you think about it and that you will continue to revise aRrgh to help others who are suffering the pain you are suffering.
I should note I'm just a guy with this repo watched, not the maintainer. :)
Absolutely. But I think there are places where that functional abstraction breaks down pretty clearly and where it doesn't. Or points where it's just orthogonal to the language problems. Here we're expecting that If we were looking for a clear way to indicate that the properties of an object are updated, there are examples everywhere of bare object properties being updated by assignment. Either in a language w/ object literals like JS or within R. When we assign
Oh, sorry about the misattribution. I guess I should say the original author/maintainer's aRrgh. I think that there are places where the functional abstraction breaks down, but that they are informative of R's proclivity for simplicity (for the most part). In a functional programming language, functions only return values. In R, for the most part, functions return values too, but they can also lay bare the memory space for writing. You are right that this breaks the analogy to functional programming by violating our (programmers') expectations about how scoping should work. However, if you didn't know about scoping, it would make sense from the "you have this thing over here and you want it to be over there" / "I just want to move this piece of memory over to another place" approach. From a naive standpoint, if...
x <- LETTERS
x[x == "A"] <- "Z"
x
...works, then why wouldn't names(x) <- "foo"? Of course, in many places that logic won't work. Then again, I think that because I am at peace with the idea that functions have their own scope separate from global, I've never been tempted to assign something to a function result that didn't actually have an implicit replacement function written for it. However, I'd never stopped to think about it before... that R has the ability to have names, and I think with names and attr we are operating at kind of a meta-level inside R. There is a reason why names is not just a named list inside of each R object: we want the attribute to be able to operate outside of the structure of the object itself (e.g. class). I do see your point, but I'm not sure what other approach could have been used in R to indicate that properties of objects were being updated rather than something else. Nearly every other special character on a standard 101-key US keyboard already has some meaning in R or is an escape character of some kind. I'm sure that is why some of the basic operations are multi-character, e.g. matrix multiplication, modulus, etc. I suppose that another multi-character operator could have been added for that purpose, and if someone really wanted to address attributes in R that way I don't see any reason why they wouldn't be able to. Although I haven't been able to quite crack it yet myself.
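To make the "implicit replacement function" idea concrete, here is a minimal sketch. As far as I understand the R Language Definition, an assignment like names(x) <- value is evaluated roughly as a call to the function `names<-`, with the result assigned back to x. The helper `second<-` below is hypothetical (it is not part of base R); it is only there to show that any function named this way can appear on the left of <-.

```r
x <- c(1, 2, 3)

# Sugar form: looks like assigning into a function result.
names(x) <- c("a", "b", "c")

# Roughly equivalent explicit form: call the replacement function
# `names<-` and assign its return value back to a variable.
y <- c(1, 2, 3)
y <- `names<-`(y, value = c("a", "b", "c"))

identical(x, y)

# Hypothetical user-defined replacement function (not part of base R):
# it sets the second element of a vector and returns the modified copy.
`second<-` <- function(x, value) {
  x[2] <- value
  x
}

z <- c(10, 20, 30)
second(z) <- 99
z
```

Seen this way, nothing is being written "into" the result of names(x); the left-hand side is just an unusual spelling of a function call whose return value replaces x.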
I can see the similarities if we're willing to think very hard about what It's possible that this is a consequence of the subset/assignment pattern as a whole. While I love it, there is some strangeness to this notion...
The subsetting and extraction function will return (if in the REPL, assigned to a name or at the bottom of a function) the extracted vector as a new object in the environment. If we 'intercept' it, so to speak, it represents some sort of slot for mutable state, iff we do nothing else with it. We get used to it because the pattern is truly very, very handy. But it's weird. That same weirdness is apparent with
From our paradigm within R, these differences can make sense. As you mentioned, it's natural to think about accessor functions like the extraction function and operate with them similarly. But there's enough to justify a complaint.
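The contrast described above, between extraction as something that returns a new object and the same expression acting as a slot for mutable state, can be sketched as follows. This is only an illustration with made-up variable names, not a reconstruction of the examples in the comment.

```r
x <- c(a = 1, b = 2, c = 3)

# Extraction returns a new object; changing it leaves x alone.
y <- x[1:2]
y[1] <- 100
x[["a"]]

# The same bracket expression on the left of <- is a different operation:
# it calls the replacement function `[<-` and rebinds x to the result.
x[1:2] <- c(100, 200)

# Spelled out explicitly, on a fresh copy:
w <- c(a = 1, b = 2, c = 3)
w <- `[<-`(w, 1:2, c(100, 200))

identical(x, w)
```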
I now agree that there is enough to justify a complaint. I just think that there is enough clarity that the complaint can be addressed in a way that will help others who have the same complaint understand what R is doing. I've probably been in the notation too long to see it. Although, if you had replaced your first code example with equals lines it would seem maybe a bit more confusing. One nice thing about the <- notation is that it tells you what is going where. A novice can look at each side of that operator and determine what the 'what' is that is going to the 'where'. It's the mutable state/dynamic typing/dynamic sizing of data structures part of it again: great stuff for playing with data, but potentially dangerous and confusing. I'd especially grant that what R does when data-structure sizes mismatch is confusing, e.g. your last line of the second code example here, in addition to what the original author mentioned in regard to recycling.
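For anyone following along, a small sketch of the size-mismatch behaviour mentioned here. The vectors are made up for illustration; the point is that subset assignment recycles a short replacement, while names<- pads with NA instead.

```r
x <- 1:6

# Replacement length divides the target length: recycled silently.
x[1:4] <- c(10L, 20L)
x

# Replacement length does not divide evenly: R raises a warning,
# which is captured here so its message can be inspected.
y <- 1:3
res <- tryCatch(
  { y[1:3] <- c(7L, 8L); "no warning" },
  warning = function(w) conditionMessage(w)
)
res

# names<- does not recycle at all: a too-short value is padded with NA.
z <- 1:3
names(z) <- c("a", "b")
names(z)
```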
This is an interesting conversation; thanks! I had worked out some of my personal angst on this point after someone explained "You are thinking of R like a programmer. Not just any programmer, but a programmer from other languages that use other schemas." This is absolutely my audience, so that's an appropriate perspective from which to approach the guide, if not the language. :)
You write that: "You can see a list of columns with names(frame). You rename columns by, spookily, assigning into names(frame). Do you know how and why this works? Please educate me."
It works because 'names' is an attribute of the data.frame that is being accessed by, e.g. names(iris). This yields the same value as attr(iris,"names")... and you can use either to retrieve or assign names to the columns in a data.frame.
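A quick sketch of that equivalence, using a small made-up data frame (df and its column names are just for illustration) so that iris itself is left untouched:

```r
df <- data.frame(a = 1:2, b = c("x", "y"))

# Reading: both routes return the same character vector.
identical(names(df), attr(df, "names"))

# Writing through names<- ...
names(df) <- c("id", "label")

# ... or through attr<- changes the same underlying attribute.
attr(df, "names") <- c("ID", "LABEL")
names(df)

# For a data frame, colnames() reports the same values.
identical(colnames(df), names(df))
```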