unique and duplicated's default value for 'by=' #1284

arunsrinivasan · 2015-08-20T17:42:02Z

I am not particularly happy with the current option of unique/duplicated having by = key(x) as the default, especially because there's no message or warning as to which columns are being used to compute unique values on.

Now that the on= argument is also in place, and we are entertaining the idea of having to set keys less often, maybe it's the right time to change this default so that it matches base R, i.e., by = seq_along(x) (all columns). On renaming by to cols, see #1283.
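To make the difference concrete, here is a minimal sketch on a made-up keyed table (`DT` and its columns are illustrative assumptions, not taken from this thread). Passing `by=` explicitly, as below, should give the same result whichever default the installed version uses.

```r
library(data.table)

# Toy keyed table; names and values are illustrative only
DT <- data.table(id = c(1L, 1L, 2L), val = c("a", "b", "c"), key = "id")

# Key columns only: what a default of by = key(x) amounts to
by_key <- unique(DT, by = key(DT))

# All columns, which is how base R's unique() treats a data.frame
by_all <- unique(DT, by = seq_along(DT))

nrow(by_key)  # number of rows with a distinct id
nrow(by_all)  # number of rows distinct across id and val together

# duplicated() accepts the same by= argument
dup_key <- duplicated(DT, by = key(DT))
```

With the key-only default, the row `(1, "b")` is silently dropped even though it is not a duplicate of `(1, "a")`, which is the surprise described above.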

franknarf1 · 2015-08-20T17:58:56Z

I agree.

(I'm guessing a "question" issue is soliciting opinions..? I've no opinion on the by->cols issue; I like by, but could live with cols. I have never written the cols arg of the other functions out, rather passing by position every time. In contrast, I always write out by=.)

arunsrinivasan · 2015-08-20T18:08:37Z

Meant it to be internals, sorry about that.

MichaelChirico · 2015-09-13T22:07:55Z

I agree--I rarely use the default value for by in my current usage--even if x is keyed, more often I do something equivalent to by=key(x)[1:2], i.e., only using a subset of the assigned keys.
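A small sketch of that usage, assuming a hypothetical table keyed on three columns (`a`, `b`, `c` are made-up names):

```r
library(data.table)

# Hypothetical table keyed on three columns
DT <- data.table(
  a = c(1L, 1L, 1L, 2L),
  b = c("x", "x", "y", "x"),
  c = c(10L, 20L, 30L, 40L),
  key = c("a", "b", "c")
)

# Use only the first two key columns for the uniqueness check
first_two <- key(DT)[1:2]
sub <- unique(DT, by = first_two)

# The first row of each (a, b) combination is kept
sub
```

Writing the subset out like this keeps the columns being compared visible at the call site, whatever the default is.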

jangorecki · 2016-04-13T01:15:34Z

That would also apply to uniqueN.

cderv · 2016-11-29T10:53:29Z

The help documentation for these functions does not seem to have changed. It still says that by = key(x) is used by default. I think we should update the documentation, as it is not very clear right now.

I discovered this reading through the NEWS for all changes in 1.9.8.

arunsrinivasan added the question label Aug 20, 2015

arunsrinivasan added internals and removed question labels Aug 20, 2015

arunsrinivasan added this to the v1.9.8 milestone Sep 22, 2015

arunsrinivasan modified the milestones: v2.0.0, v1.9.8 Apr 10, 2016

arunsrinivasan added a commit that referenced this issue Jul 21, 2016

Info message in place for duplicated's 'by' argument default value, #1284.

98d624c

arunsrinivasan added a commit that referenced this issue Jul 21, 2016

suppress messages in tests, #1284.

1435707

mattdowle modified the milestones: v1.9.8, v2.0.0 Sep 14, 2016

mattdowle closed this as completed in 11e6497 Sep 14, 2016

mattdowle added a commit that referenced this issue Sep 14, 2016

News item added about unique, duplicated and uniqueN default by= change. #1284

b9e6ea7

mattdowle added a commit that referenced this issue Sep 14, 2016

Fixes to pass full R CMD check for #1284

f12c833

mattdowle mentioned this issue Sep 14, 2016

duplicate warning in 1.9.7 - how to switch this off #1841

Closed

MichaelChirico mentioned this issue Nov 29, 2016

Unable to use data.table's duplicated function #1940

Closed
