Help page rewording for "named character vectors" join #2198

Closed
jealie opened this issue Jun 8, 2017 · 1 comment

Comments

jealie commented Jun 8, 2017

I just ran into the same mistake as in issue #2094.

Briefly, the current help text on joining with a named character vector (the on= argument) is confusing. It currently states:

As a named character vector, e.g., X[Y, on=c(x="a", y="b")]. This is useful when column names to join by are different between the two tables.

But the example is ambiguous. Those used to the syntax of merge would expect x="a" to be the equivalent of by.x="a" and y="b" to be the equivalent of by.y="b". In reality, column x in data.table X is matched to column a in data.table Y, and column y in X to column b in Y.

The error message one gets when making this mistake is accurate but not very helpful to the newcomer (Column(s) [x,y] not found in X).
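To make the direction of the mapping concrete, here is a minimal sketch. The tables X and Y and their columns (x1, x2, xval, y1, y2, yval) are made up for illustration and do not come from the help page. The names of the on= vector refer to columns of X and the values refer to columns of Y.

```r
library(data.table)

# Hypothetical tables: the join columns have different names in X and Y.
X <- data.table(x1 = c("a", "b", "c"), x2 = 1:3, xval = c(10, 20, 30))
Y <- data.table(y1 = c("a", "c"), y2 = c(1L, 3L), yval = c(100, 300))

# Names (x1, x2) are columns of X; values ("y1", "y2") are columns of Y.
res <- X[Y, on = c(x1 = "y1", x2 = "y2")]

# Swapping the roles (names taken from Y, values from X) raises an error,
# because y1 and y2 are not columns of X. The message text is captured
# here instead of being printed; its exact wording may vary by version.
err <- tryCatch(
  X[Y, on = c(y1 = "x1", y2 = "x2")],
  error = function(e) conditionMessage(e)
)
```

Here res has one row per row of Y, with xval and yval side by side, while err holds the message from the failed join.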

Hence a couple of suggestions:

  • use a simpler naming convention in the example and describe plainly the outcome:
As a named character vector, e.g., X[Y, on=c(x1="y1", x2="y2")] to join X and Y by matching the columns named "x1" and "x2" from X with the columns "y1" and "y2" from Y. This is useful when column names to join by are different between the two tables.
  • this syntax has no example further down in the help page. I suggest providing one, for example by adding to the # joins as subsets section:
DT[X, , on=c(y="v")]  # join using column "y" of DT with column "v" of X
@arunsrinivasan
Member

@jealie, agreed on both points. Would you be willing to make a PR?

HughParsonage added a commit to HughParsonage/data.table that referenced this issue Dec 24, 2017