Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Datasets #128

Merged
merged 45 commits into from
Nov 11, 2017
Merged

Datasets #128

merged 45 commits into from
Nov 11, 2017

Conversation

masojus
Copy link
Contributor

@masojus masojus commented Oct 26, 2017

See PR #96 for details. This PR is to continue that discussion and merge with changes made in master related to all the .NET Standard work.

Sha added 30 commits July 26, 2017 10:22
Always using outside of namespace.
We don't need to expose this in the client.
Because the sync ones end up calling the async ones. This way all the dataset methods on KeenClient are covered.
Also refactored the tests to be a bit neater
@masojus
Copy link
Contributor Author

masojus commented Nov 4, 2017

Major remaining work is captured in Issues, and we should port this all to NetStandard before release.

@coveralls
Copy link

Coverage Status

Coverage increased (+1.9%) to 59.182% when pulling 7379005 on jm_UpdateCachedDatasetsBranch into b820c2e on master.

- At this point all the tests pass, at least.
@coveralls
Copy link

Coverage Status

Coverage increased (+4.2%) to 61.522% when pulling 74d10c3 on jm_UpdateCachedDatasetsBranch into b820c2e on master.

@masojus
Copy link
Contributor Author

masojus commented Nov 4, 2017

Interesting, some of the new tests failed in Travis CI. They failed in the same way on Linux and MacOS, and it seems to be a UriFormatException so it could be a form of path separator issue between Windows and *nix OSes, or perhaps a permissions issue when trying to read the canned JSON response files, but I'll look into it.

…path in Windows, Linux and MacOS without issue.
@coveralls
Copy link

Coverage Status

Coverage increased (+4.2%) to 61.522% when pulling 81e3f4d on jm_UpdateCachedDatasetsBranch into b820c2e on master.

@Shalelol
Copy link

Shalelol commented Nov 6, 2017

Hey @masojus, thanks for all the feedback, I see you've started working on majority of the problems you pointed out.

Please let me know if you need anything from me. We're currently in the middle of the final stages of a feature release this week, but I will be able to help out if there is anything outstanding after this week.

@masojus
Copy link
Contributor Author

masojus commented Nov 6, 2017

Hi @Shalelol. Yeah I think it's close to ready. If you have any PR feedback on this latest revision, I'd love to hear it. Otherwise I'll probably merge this in the next few days.

@Shalelol
Copy link

Shalelol commented Nov 6, 2017

Hey @masojus,

Just had a quick review of your changes and it's looking really good 🔥. I'm not sure how I missed the SnakeCaseContractResolver already available in Json.NET. I would have picked it up if I had just read the comments on the snippet I used.

Also, did the datasets API always accept group_by as a string? I remember having issues sending them as strings, though it may have been something else. Looking back I wish I hadn't gone through the effort of creating those custom converters and jumping through hoops for nothing 😂.

Just one small thing I noticed, though this might be better as a new issue:

  • Consider consolidating versioning across projects for Newtonsoft.Json, NUnit and Moq. Json.NET for example has both versions 9.0.1 and 10.0.3 in the solution.

@masojus
Copy link
Contributor Author

masojus commented Nov 6, 2017

Thanks @Shalelol!

I'll double check index_by and group_by, but I know the API Reference isn't crystal clear about what's actually accepted (something I'll get fixed as soon as I can), so I had to try it out. I'm not sure if/when it changed.

We went about trying to consolidate versions of Json.NET awhile back (#44), but since then we've started transitioning to .NET Standard 2.0, and bumping versions of some of the dependencies only in some places, knowing we plan to eliminate the 3.5- and 4.5-specific projects. Once those are gone I plan to have all those dependencies at the same version 👍

Copy link
Contributor

@baumatron baumatron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks really good in the context of the existing SDK design! A lot of my comments are about could be done in the future for a better overall SDK design. A few smaller things should probably be addressed though like encoding url parameters and a bit of duplicated code.

var apiResponse = File.ReadAllText($"{GetApiResponsesPath()}/GetDatasetResults.json");
IKeenHttpClientProvider httpClientProvider = null;

if (UseMocks)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not a huge fan of UseMocks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I thought maybe we'd look at this more closely with Issue #37, but I'd say we close that issue as fixed with PR #127 and instead open a more specific issue detailing what to do about UseMocks in the future.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems fine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue #133

/// </summary>
internal class Datasets : IDataset
{
private const int MAX_DATASET_DEFINITION_LIST_LIMIT = 100;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this the typical naming convention for constants in C#? Looks out of place.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

.NET conventions I think tend to suggest PascalCasing for public constants but generally (AFAIK) don't really lay down the law w.r.t. privates. I'm not sure if it's my Java or C++, but I do this too. I'll switch to PascalCasing though, which is also inline with the CoreCLR's editorconfig, so I need to go make sure that matches before I merge the editorconfig.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've fixed this locally along with editorconfig changes.

}

if (string.IsNullOrWhiteSpace(prjSettings.KeenUrl) ||
!Uri.IsWellFormedUriString(prjSettings.KeenUrl, UriKind.Absolute))
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems like this should be in the factory/constructor for project settings, not here. Creating an invalid project settings object shouldn't be allowed.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah I'm actually not sure why I didn't do that during PR #51 or whenever it was that I added these lines of code to Queries/et. al.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was playing with this and realized that part of the problem is you can implement IProjectSettings in 3rd party code. That said, in theory since our impls of Datasets/Queries/Event/etc. have internal ctors, you can't directly construct them, so we could just call it sufficient to validate the project settings in the KeenClient ctor. However, I think this is better off being fixed as part of Issue #56 since we should consolidate a bunch of the code in all those impls.


var datasetResultsUrl = $"{GetDatasetUrl(datasetName)}/results";

var url = $"{datasetResultsUrl}?index_by={indexBy}&timeframe={timeframe}";
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't these need to be encoded?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah here and elsewhere we need to verify this before merging.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

index_by can be any JSOn member name, I think, which in theory can be all sorts of things that need encoding.
timeframe when retrieving results can be an absolute timeframe (not just relative) I think, so it could have things like a colon, so also needs encoding.
limit shouldn't need encoding since it's just an int.

throw new KeenException("An API masterkey is required to get dataset results.");
}

var responseMsg = await _keenHttpClient
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems that a lot of this code could be shared with the method above. That said, almost all resources we request could probably share all of this code, which could be a task for the future.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Issue #55 and #56

/// <summary>
/// Holds information describing the query that is cached within a cached dataset.
/// </summary>
public class QueryDefinition
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to see this eventually become how we generally define a query and pass parameters to and around within the SDK.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agree

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Especially as a part of fixing Issues #32 and #42 my plan has been to do this since once we move to POSTing all queries it'll be cleaner.

var dataset = client.QueryDataset(_datasetName, _indexBy, _timeframe);

Assert.IsNotNull(dataset);
Assert.IsNotNull(dataset["result"]);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would be great to check the values actually returned for the query.

var apiResponse = File.ReadAllText($"{GetApiResponsesPath()}/GetDatasetResults.json");
IKeenHttpClientProvider httpClientProvider = null;

if (UseMocks)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to see UseMocks go away.

Assert.IsNotNull(datasetCollection);
Assert.IsNotNull(datasetCollection.Datasets);
Assert.IsTrue(datasetCollection.Datasets.Any());
Assert.IsTrue(!string.IsNullOrWhiteSpace(datasetCollection.NextPageUrl));
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Again, would be great to verification of values instead of just checking to see that anything is there.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a blurb to Issue #130 about this.

{
var dataset = new DatasetDefinition();

Assert.Throws<KeenException>(() => dataset.Validate());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test seems like it doesn't quite cover all validation. For example it doesn't really test that AnalysisType being unset will ever throw. Maybe separate tests for each invalid condition would be warranted? Maybe start with a valid dataset and then modify only one field at a time per-test and assert that it throws?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added a blurb to Issue #130 about this.

@Shalelol
Copy link

Shalelol commented Nov 7, 2017

@baumatron I agree with your comments regarding the design of this project. It was quite strange to for me to come into and start working, and was quite different from most of the other C# projects I have worked on, but I tried to keep my code consistent with what was already there.

Some really simple changes that could be made to drastically increase the ease of contributing to this project are:

  • Moving all classes that only store data and have no logic into a separate folder (or project) called "Models" or something. At the moment they're mixed in with the code and it's very hard to find where all the actual logic is.
  • Offloading all the work around building the URL and Query Parameters as well as checking the HttpStatus code to the KeenHttpClient and then extending the KeenHttpClient into a QueriesHttpClient, DatasetHttpClient, EventCollectionHttpClient, etc, so things like building urls for the API endpoints are not tangled up with other business logic.

Doing this would result in having a series of classes sitting next to each other (Queries, Datasets, Event, EventCache, EventCollection) whose roles are very clear, and whose code is very straight to the point, with all non-business logic nonsense abstracted away (preferably in some generic way, as you said) via the HttpClients, and potentially some helpers.

@masojus
Copy link
Contributor Author

masojus commented Nov 7, 2017

@Shalelol Agreed. @baumatron and I also inherited the bulk of that code, and we created Issues #55 and #56 to basically cover exactly those points, so we understand your sentiment and are glad to hear our thoughts validated 😄

Copy link

@Shalelol Shalelol left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just added some comments to places where I forgot to add .ConfigureAwait(false) in order to keep the async methods context safe.

/// <returns>An instance of Newtonsoft.Json.Linq.JObject containing query results and metadata defining the cached dataset.</returns>
public async Task<JObject> QueryDatasetAsync(string datasetName, string indexBy, string timeframe)
{
return await Datasets.GetResultsAsync(datasetName, indexBy, timeframe);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to call .ConfigureAwait(false)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will do here and elsewhere. I fixed some down in the Datasets class but missed these 😞

/// <returns>An instance of Keen.Core.Dataset.DatasetDefinition containing metadata about your cached dataset.</returns>
public async Task<DatasetDefinition> GetDatasetDefinitionAsync(string datasetName)
{
return await Datasets.GetDefinitionAsync(datasetName);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to call .ConfigureAwait(false)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here and elsewhere I just return the Task directly since there's no need to await and no need to configure an awaiter--in any case internally our impls configure the awaiter, and if 3rd party impls don't, there's not much we can do about it.

/// <returns>An instance of Newtonsoft.Json.Linq.JObject containing query results and metadata defining the cached dataset.</returns>
public async Task<JObject> QueryDatasetAsync(string datasetName, string indexBy, string timeframe)
{
return await Datasets.GetResultsAsync(datasetName, indexBy, timeframe);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to call .ConfigureAwait(false)

/// <returns>An instance of Keen.Core.Dataset.DatasetDefinition containing metadata about your cached dataset.</returns>
public async Task<DatasetDefinition> GetDatasetDefinitionAsync(string datasetName)
{
return await Datasets.GetDefinitionAsync(datasetName);
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to call .ConfigureAwait(false)

@baumatron
Copy link
Contributor

@Shalelol All great ideas! I also have been trying to keep with the existing design of the SDK until there's an opportunity to do a larger refactor as you describe. Would love to see that in the future.

- Regarding *Async() methods in KeenClient.cs, for simple forwarding methods, just return the task directly. No need to await or configure the awaiter...either way, internally our impls configure awaiters properly everywhere, and a 3rd party impl of, say IQueries could fail to do so and cause blocking on our task to potentially deadlock anyway, depending on the sync ctx. Anywhere there are calls to multiple async APIs we go ahead and await/configure as expected.
- One actual ramification of returning tasks is that in a few places where we validate params, we'll throw synchronously now instead of when the task is awaited, so potentially if you call us and store the task to await later, the client code's try/catch could be in the wrong place, but that seems like it's not that huge a deal as long as we bump the minor version (still pre-v1.0).
- Some of the Datasets code was incorrectly demanding a MasterKey when it really only needs a ReadKey.
- Changing to ReadKey showed me that some of the tests that expect throwing situations wouldn't catch the wrong key, because we throw the same exception for the server returning 500 as for an incorrect key (KeenException). This could mean we should throw different exceptions, which would make testing this easier (and potentially make coding against the SDK easier). It could also be fixed in tests by checking for certain words in the error message, but I figured it was more straightforward and clear to separate that particular throwing condition into its own test for each of the create/result/definition(s)/delete functions.
- Changed GetQueries()->GetQueriesAsync(), since it's public and should indicate that it returns an awaitable.
@coveralls
Copy link

Coverage Status

Coverage increased (+2.7%) to 72.708% when pulling f73aa1d on jm_UpdateCachedDatasetsBranch into 676e52f on master.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Development

Successfully merging this pull request may close these issues.

4 participants