Low hanging fruit and the jump to R v4 #159

Merged
merged 41 commits into from
Sep 23, 2020

@nathancday nathancday commented Jun 23, 2020

This is a PR in progress with remedies for small open issues and for new functionality:

@nathancday
Member Author

Adds support for #110

@nathancday nathancday changed the title WIP: picking low hanging fruit WIP: picking low hanging fruit and upgrade to R v4 Jul 12, 2020
@nathancday
Member Author

Hey @ijlyttle this branch covers a lot of open issues and brings us up to speed for R 4.0.

I think it's time to do a CRAN release and I would like to submit it by the end of July.

Not all of this code has to go into the release, but most of the changes are in new functions, so there is little risk in including them. Ideally, I'd say we merge this and #155, then cut it.

What do you think?

@nathancday nathancday requested a review from ijlyttle July 13, 2020 00:15
@nathancday nathancday changed the title WIP: picking low hanging fruit and upgrade to R v4 Low hanging fruit and the jump to R v4 Jul 14, 2020
Member

@ijlyttle ijlyttle left a comment

Hi again Nate - I know I'm very late again in responding.

I like where you're going with this, I just want to make sure we get an idea of a function-naming convention before creating new functions.

I think, as well, this could use a version-bump and additions to the pkgdown function index.

#' along with a helpful message.
#'
#' The returned `data.frame` contains a variable, `file_version_id`,
#' which you can use with [box_dl()].
#' * `box_current_version()` returns a `integer`, starting from 1, which can be passed
Member

How about box_get_current_version()? At some point, we could also rename (and soft-deprecate) box_previous_versions() to box_file_get_info() (I'd have to take a look at what gets returned).

Member Author

I shortened this to box_version(file_id) in my responses. It feels succinct and fits with box_previous_versions().

Member

@ijlyttle ijlyttle Aug 22, 2020

Working through all this this afternoon. I fiddled with the documentation a little bit, just to get my mind right, but made no operational changes other than adding an as.integer() and using glue() in the API call.

How about:

  • box_version_info() as new name for (to-be-deprecated) box_previous_versions().
  • box_version_number() for box_version().

I like putting version in the second slot of each name for autocomplete reasons.

That being proposed, how about a quiet argument (default to FALSE) for box_version()?
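To make the proposed quiet argument concrete, here is a minimal sketch. It is not the boxr implementation: `version_number_sketch()` and its `previous_versions` input are hypothetical stand-ins, and it assumes the current version number is the count of previous versions plus one, so that numbering starts from 1.

```r
# Hypothetical helper: `previous_versions` stands in for the entries that the
# Box API would return for a file's earlier versions.
version_number_sketch <- function(previous_versions, quiet = FALSE) {
  # Assumption: the current version is one more than the number of previous
  # versions, so a file with no earlier versions is version 1.
  version_no <- as.integer(length(previous_versions) + 1L)

  # Report the result unless the caller asked for silence.
  if (!quiet) {
    message("Current version: ", version_no)
  }

  version_no
}

previous <- list(list(id = "101"), list(id = "102"))

v_loud  <- version_number_sketch(previous)                # emits a message
v_quiet <- version_number_sketch(previous, quiet = TRUE)  # stays silent
v_first <- version_number_sketch(list(), quiet = TRUE)    # no earlier versions
```

Defaulting `quiet` to `FALSE` keeps the helpful message for interactive use while letting scripts switch it off.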

Member Author

Sure, that sounds good on names. No rush to deprecate box_previous_versions() (at least as an alias) because it's been around since the beginning

Member

Tidyverse-style soft-deprecate, then, i.e. "superseded". Sounds good to me!

Member Author

Hey, thinking about the naming again, how does box_version_history() sound instead? My thinking is that version_info is a bit ambiguous and doesn't say anything directly about the current version, whereas version_history feels more like what the function does: it only reports info about past versions.

Member

Sounds good! I wonder if this is a good opportunity to supersede, rather than to deprecate.

Let me see if I can take a whack at it - here's an example of the idea: https://github.com/tidyverse/dplyr/blob/master/R/sample.R
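The difference between superseding and deprecating can be sketched in base R. This is only an illustration under stated assumptions: the `*_sketch` functions are hypothetical stand-ins rather than boxr code, and the data frame is mock data instead of an API result.

```r
# Stand-in for the new function; the real one would call the Box API.
box_version_history_sketch <- function(file_id) {
  data.frame(file_id = file_id, version_no = 1:2, stringsAsFactors = FALSE)
}

# Superseded: the old name keeps working and signals nothing at run time.
# Only its documentation would point readers to the replacement.
box_previous_versions_sketch <- function(file_id) {
  box_version_history_sketch(file_id)
}

# Deprecated, by contrast: the old name still works but warns when called.
box_deprecated_sketch <- function(file_id) {
  .Deprecated("box_version_history_sketch")
  box_version_history_sketch(file_id)
}

new_result <- box_version_history_sketch("123")
old_result <- box_previous_versions_sketch("123")  # same result, no warning
```

Superseding suits a function that has been around since the beginning: existing scripts keep running unchanged, and new users are steered to the new name through the docs.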

@nathancday
Member Author

I'm going to start moving these edits to a new PR, but will leave this one open until that is complete.

@ijlyttle
Member

ijlyttle commented Aug 7, 2020

I was able to use usethis::pr_pull_upstream() then sort through the conflicts.

If you want to use this PR, great - if you want to move to a new PR, great.

@nathancday
Member Author

Hey @ijlyttle I think I responded to your edits on this PR.

There are a few that are still open for discussion, and I tried to leave a specific comment on each above. All of these revolve around naming functions, our new favorite boxr pastime LOL

@ijlyttle
Member

Hi @nathancday - I made a few fiddly changes, and have hit the ball back on our game of API tennis... have I missed responding to anything?

I think we're getting closer. Happy to chat whenever!

@nathancday
Member Author

Thanks for review no. 2, I don't think you missed anything. I totally agree with your suggested API tweaks; I'll push those changes up tonight.

@codecov-commenter

codecov-commenter commented Aug 23, 2020

Codecov Report

Merging #159 into master will increase coverage by 3.10%.
The diff coverage is 67.31%.

@@            Coverage Diff             @@
##           master     #159      +/-   ##
==========================================
+ Coverage   55.88%   58.99%   +3.10%     
==========================================
  Files          24       25       +1     
  Lines        1555     1724     +169     
==========================================
+ Hits          869     1017     +148     
- Misses        686      707      +21     
Impacted Files Coverage Δ
R/boxr_add_description.R 0.00% <ø> (ø)
R/boxr_comment.R 0.00% <0.00%> (ø)
R/boxr_misc.R 46.49% <0.00%> (+1.84%) ⬆️
R/boxr_search.R 0.00% <ø> (ø)
R/boxr_write.R 44.44% <ø> (ø)
R/boxr_s3_classes.R 12.62% <5.71%> (-1.49%) ⬇️
R/boxr_auth.R 27.63% <50.00%> (ø)
R/boxr_file_versions.R 86.84% <78.26%> (-13.16%) ⬇️
R/boxr_collab.R 84.84% <84.84%> (ø)
R/boxr__internal_dir_comparison.R 97.43% <100.00%> (+0.02%) ⬆️
... and 8 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 6624083...83b823f. Read the comment docs.

Member

@ijlyttle ijlyttle left a comment

Hi Nate, just a couple of fiddly things. I am looking at the superseding mechanism, so I think we can stay out of each other's way...

@ijlyttle
Member

I ended up following a much-easier path: usethis::use_lifecycle().

I reworked the documentation to separate box_previous_versions(), then split apart the code in case we want to do something different with how box_version_history() behaves.

@nathancday nathancday force-pushed the cur_version branch 3 times, most recently from b4367f8 to 08a5a5b Compare September 5, 2020 21:13
@nathancday
Member Author

I'm not sure I like classing things anymore, after seeing how you'd have to use box_comment_get() with the class changes I just pushed to this branch. It feels like we are making it clunkier for users by requiring an extra as.data.frame() call to get a useful object, without offering any extra utility in the intermediate object.

Here I've layered our box_r class on top of the list from httr::content(), so the intermediate (pre-frame coerce) object can be name-checked, subsetted, etc. If we want to continue classing and as-ing everything, I think we should make boxr_collab classes work like this too.

Member

@ijlyttle ijlyttle left a comment

To answer your question on the usefulness of returning an amorphous list-object of questionable utility, I think we should do so for two reasons:

  • it fits in with how the other functions work.
  • if someone wants a peek at what the API returned, perhaps to debug, it's the only way we can provide that.

boxr was set up somewhat along the lines of what httr suggests.

Clearly, this comes at a cost. But it also provides some flexibility in what someone can use - for a given box object (say a comment or list of comments), we can provide a way to extract:

  • a data frame
  • a tibble, if you like
  • an opinionated optimal version of the return object, using something like box_wrangle() (in the future).

I know I have been burned by other boxr functions when I would print them, then expect a data frame. For this reason, I added the --- printing as data frame --- to the print method for collabs.

I see a lot of the pattern where Box either returns a single object (box_create_comment()) or a collection of objects in the entries element (box_get_comment()) - so we can provide a single-row data frame or a multi-row data frame. The pattern extends to collaborations and to versions, and I'm sure to other stuff.

This is where I thought a set of internal helpers could be reused: stack_row_df|tbl() and stack_rows_df|tbl(). By having these as internal functions, we can modify and test easily when new edge-cases appear.

The print helper print_dispatch() automatically tells you if {tibble} is loaded or not - if a person wanted to always print as.data.frame, we could set an option for that (I'll make a PR).

I think there are better names out there for these internal functions, but being internal functions, we can rename them whenever it suits.

What do you think?
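A print helper along the lines described above might look like the following. This is a guess at the idea, not the actual print_dispatch() code: `print_dispatch_sketch()` and the option name `boxr.print_tibble_sketch` are made up for illustration.

```r
# Hypothetical print helper: prints as a tibble when {tibble} is loaded and the
# (made-up) option allows it; otherwise prints as a plain data frame.
print_dispatch_sketch <- function(x, ...) {
  # `boxr.print_tibble_sketch` is an invented option name used only here.
  want_tibble <- isTRUE(getOption("boxr.print_tibble_sketch", TRUE))

  if (want_tibble && "tibble" %in% loadedNamespaces()) {
    print(tibble::as_tibble(x), ...)
  } else {
    print(as.data.frame(x), ...)
  }

  # Return the input invisibly, as print methods conventionally do.
  invisible(x)
}

comments <- data.frame(
  id = c("1", "2"),
  message = c("first", "second"),
  stringsAsFactors = FALSE
)

print_dispatch_sketch(comments)
```

Checking `loadedNamespaces()` means {tibble} never has to be a hard dependency, and setting the option to `FALSE` would force data-frame printing for anyone who prefers it.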

R/boxr_comment.R Outdated
x <- httr::content(req)

# class it up
x[["file_id"]] <- file_id
Member

I think that our return object should be the parsed response from the API call, and that we should not be adding anything to the object itself. We could add this to the object as an attribute.
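A small sketch of the attribute approach, using a mock list in place of the parsed API content. The class name `boxr_comment_sketch` and the field values are hypothetical.

```r
# Mock of parsed API content; in the PR this comes from httr::content(req).
x <- list(type = "comment", id = "123", message = "Looks good")
file_id <- "456"
original_names <- names(x)

# Keep the extra context alongside the content rather than inside it.
attr(x, "file_id") <- file_id

# Layer a (hypothetical) class on top of the existing one.
class(x) <- c("boxr_comment_sketch", class(x))

names(x)            # still only the fields the API returned
attr(x, "file_id")  # the file id travels with the object
```

Because the list elements are untouched, the object still mirrors what the API returned, and a later as.data.frame() method could read the attribute to add a file_id column.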

R/boxr_comment.R Outdated
x <- httr::content(req)

# class it up
class(x) <- c("boxr_comment_create", class(x))
Member

This class could be called boxr_comment, to mirror the Box docs.

For this endpoint, the Box API returns a comment object; when we get a list of comments, its entries member is an array of comment objects.

We see this pattern elsewhere; it's the motivation behind the stack_row_df|tbl() and stack_rows_df|tbl() helpers.

Member

As with box_comment_get(), we could add the file_id as an attribute of the object.

R/boxr_comment.R Outdated

# class it up
x[["file_id"]] <- file_id
class(x) <- c("boxr_comment_get", class(x))
Member

I think this class could be named boxr_comment_list.


# * Create ----------------------------------------------------------------

# no method for as.data.frame() needed, as.data.frame.list() works fine.
Member

If a NULL appears in a response, it throws a wrench into the stacking process. I work around this in stack_row_df(), which is why I think it could be a useful helper here (and in general).
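One way such a workaround could look, as a sketch only: the helpers below are hypothetical and are not the actual stack_row_df() or stack_rows_df() code. They assume flat records (no nested lists) and replace each NULL with NA before building rows.

```r
# Replace NULL elements with NA so every record keeps the same columns.
null_to_na <- function(record) {
  lapply(record, function(el) if (is.null(el)) NA else el)
}

# One flat record -> a one-row data frame.
stack_row_sketch <- function(record) {
  as.data.frame(null_to_na(record), stringsAsFactors = FALSE)
}

# A list of flat records -> one row per record.
stack_rows_sketch <- function(entries) {
  do.call(rbind, lapply(entries, stack_row_sketch))
}

# Mock entries, each with a NULL field as an API response might contain.
entries <- list(
  list(id = "1", message = "first", tagged_message = NULL),
  list(id = "2", message = "second", tagged_message = NULL)
)

comments <- stack_rows_sketch(entries)
```

Keeping this logic in one internal helper means a new edge case only has to be fixed and tested in one place.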

#' @export
#'
print.boxr_comment_create <- function(x, ...) {
x <- as.data.frame(x)
Member

I think print_dispatch() could be useful here. It could be preceded by the glue::glue() call.

#' @export
#'
as.data.frame.boxr_comment_get <- function(x, ...) {
# stacking the comment records into a frame, one row per comment record
Member

stack_rows_df() should work here. If the file_id were an attribute of x, it could be added here to the data frame.

Member

As well, I think it would be good to have an as_tibble() method.

#' @export
#'
print.boxr_comment_get <- function(x, ...) {
x <- as.data.frame(x)
Member

I think print_dispatch() could work here, too.

"Comment"
)
expect_s3_class(resp, "boxr_comment_create")
expect_s3_class(resp, "list")
Member

I forgot to mention this earlier: in the new testthat this will have to be expect_type(resp, "list"), which works now - I'm sure there will be other stuff that comes up when we switch over (not now!), but it could be useful to get ahead of this one.
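The distinction can be seen in base R without testthat. As I understand it, expect_type() compares against typeof(), while the S3 class is what inherits() looks at. The object below is a mock built to resemble the classed response in the test.

```r
# Mock of the classed response checked in the test above.
resp <- structure(
  list(id = "1", message = "hi"),
  class = c("boxr_comment_create", "list")
)

# S3 class membership: what an S3-class expectation is about.
is_s3_comment <- inherits(resp, "boxr_comment_create")

# Underlying storage type: what a type expectation is about.
storage_type <- typeof(resp)
```

Adding a class does not change the storage type, so a type check on "list" keeps passing however the class vector is arranged.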

@nathancday
Member Author

nathancday commented Sep 7, 2020

I agree about preserving the original API response as much as possible; I just think that response is more useful in a data frame format (and most users probably end up there eventually). box_previous_versions() already follows this direct-to-data-frame pattern. In the future, I could see all boxr functions returning special frames, like googledrive's dribbles, and if folks want to work with API response objects directly, maybe they are the users of cardboard.

I just responded to changes and would really like to get this PR to done. Let me know if you are okay to merge and move on to other tasks.

@ijlyttle
Member

Sorry for the delay - last week was busy at work.

I agree that cardboard is the way to go to support explicit access to the low-level API response, and that there should be an easy way to get a high-level response without as.data.frame() or equivalent.

I agree box_previous_versions() breaks that pattern, and we inherited that break. That said, I am uncomfortable having one set of functions behave one way, another set behave a different way, and no good heuristic to tell which is which.

Now that I think I have a better idea of the (admittedly less-than-ideal) compromise, I can make an effort to document it better, as it was not immediately evident to me.

I should have some time this week to go through this and make any adjustments, so we can move on to other tasks.

@nathancday
Member Author

No worries, Box doesn't pay the bills.

We can leave the behavior as is for previous_versions (I'm hesitant to break long-stable code) and support the list-then-coerce pattern in box_version_history. Does that sound like a fix?

FWIW I was imagining the return object reckoning (everyone gets a data.frame/tibble by default?) in v4.0.

@ijlyttle
Member

I agree we will have the object reckoning leading to v4 - I am in favor of the data.frame/tibble return (although we will have to sort out which one, and nesting, etc.)

For me the trick will be to migrate from current to v4 without breaking anything. testthat's edition approach seems interesting, but I think even the Tidyverse team are still figuring out if/how to extend that approach.

Getting to work now to map out the current situation, and to see how it might relate to some of our other recently-opened issues.

@ijlyttle ijlyttle mentioned this pull request Sep 20, 2020
…f the API response:

- the rest deals with only pagination
- by returning only entries, we can support pagination
- this is what box_ls() does
… amend tests

- uses `version_id` rather than `file_version_id`
@ijlyttle
Member

This is a placeholder for the narrative that I will write Monday.

 - tweak box_read_rds() docs
 - deprecation call for box_previous_versions()
 - amend NEWS
@ijlyttle
Member

ijlyttle commented Sep 21, 2020

Here's what I think I know about the changes:

  • tweaked the documentation for box_browse(), box_save_rds(), and box_read_rds().
  • added on.exit() to clean up after temp files. I suspect there's a lot more of this lurking in the codebase; {withr} is coming out with some tools that could be handy.
  • added {jsonlite} to Imports - I have no idea why that was not already there 😳
  • tweaked the documentation for box_comment_create(), box_comment_get(), added to S3 documentation.
  • box_comment_get() and box_collab_get() return only the content for the entries element of the API response. This is probably the biggest change I made:
    • the rest of the API response has only pagination information
    • to support pagination, we need to aggregate and return the entries
    • this is what we did with box_ls() (which was my first big PR to boxr, so forgetting this was 😳😳)
    • this also splashes on stack_rows_df() and stack_rows_tbl()
    • as a matter of semantics, distinguish between a request, response, the content of a response, and the result that we return.
  • reworked the version functions; I think it is now looking a little bit how I might imagine a {cardboard} implementation (to be discussed, of course):
    • box_version_history() returns a data frame. Not a change-of-code, but rather a change-of-mind. This is the less-disruptive change from where things are now, and it suggests a possible way forward with {cardboard}.
    • exported box_version_api() as an internal function; gives someone the chance to grab and parse the content themselves.
    • added the S3 argle-bargle.
    • made an internal parsing function mutate_version_list() to parse the entries.
      • in the future, I can see some things like parsing datetimes being part of a generalized parser used for all API content.
      • there are some things, like specifying the version_no column, that are particular to this class.
    • harmonized the version_no and version_id to agree with the forthcoming changes to box_ls().
    • also, fixed a bug in the version numbers; it was numbering the rows without adding 1 (that was a bug, right?)
  • other documentation cleanup
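The point about returning only the entries to support pagination can be sketched as follows. Everything here is illustrative: `fetch_page_sketch()` is a mock pager rather than a Box API call, and the response shape (total_count, offset, limit, entries) is an assumption made for this example.

```r
# Mock pager: returns one page of a fake response with five entries in total.
fetch_page_sketch <- function(offset, limit) {
  all_entries <- lapply(1:5, function(i) list(id = as.character(i)))
  idx <- seq_along(all_entries)
  keep <- idx[idx > offset & idx <= offset + limit]

  list(
    total_count = length(all_entries),
    offset = offset,
    limit = limit,
    entries = all_entries[keep]
  )
}

# Walk through the pages, keeping only the entries from each response.
collect_entries_sketch <- function(limit = 2) {
  entries <- list()
  offset <- 0

  repeat {
    page <- fetch_page_sketch(offset, limit)
    entries <- c(entries, page$entries)
    offset <- offset + limit
    # Stop once every entry has been requested.
    if (offset >= page$total_count) break
  }

  entries
}

result <- collect_entries_sketch(limit = 2)
```

Once several pages are combined, the per-page fields no longer describe the result, which is why returning just the aggregated entries is the natural shape.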

I am happy with where this is now - if there's anything I should change or we should discuss, please let me know. Otherwise, I would be pleased for one of us to "squash-and-merge".

🙌

@ijlyttle ijlyttle mentioned this pull request Sep 21, 2020
@nathancday nathancday merged commit 90df3f8 into master Sep 23, 2020
@nathancday nathancday deleted the cur_version branch February 6, 2021 23:02
Development

Successfully merging this pull request may close these issues.

Remedy Travis config warnings
3 participants