WASB support #161
I saw a comment on a related issue:
'Processes' start in a specific directory called The same way that you start your shell and
$ pwd
/Users/username # this is your current directory
$ touch foo.txt # /Users/username/foo.txt
$ touch /foo.txt # /foo.txt |
@krontogiannis Thanks for feedback :-)
Again, this PR isn't ready for consumption but I'm grateful for your time to tell me some points to look at - I'll continue looking at this next week. |
I've had a think about this over the weekend, and I'm wavering on this. The main problem I have with the current implementation is that it's easy to accidentally start working with files in another location than what you thought, simply by forgetting the leading slash. The idea of the default directory only further serves to confuse matters. Perhaps there's a way to keep the current behaviour and optionally also allow WASB paths. |
I think this is great feature work, though I agree tests are needed
This wouldn't concern me - growing functionality and usage is most important at this point
For MBrace.Azure I'd like to see WASB be the default, per this comment from @eiriktsarpalis:
|
I've made some changes (although there are still some tests I need to write + feedback) so it's essentially backwards compatible now (hence why it's green). The table below should show the different permutations. Default Container means the contextual "default container" of the blob store - this might be something like
This applies for both
There's one hack I've currently had to put in, which revolves around container names when creating a dictionary. Also note that this isn't the full WASB spec (which also allows you to include an account), but it puts us in a much better place if we later address the ability to connect to multiple storage accounts from an MBrace cluster. Feedback welcome. |
I very much agree in retrospect. I recommend we just go ahead with that change? |
Which change - this one or also change MBrace.Core to not put in a "default" directory for user operations? |
this |
I've discovered that the default user directory is a feature that's localised to MBrace.Azure, and it was a relatively painless removal. This means that the above matrix can be simplified when working with user data as follows: -
In other words - this would now be a breaking change, but FWIW I think it's a lot less open to misinterpretation and acknowledges that trying to map conventional file paths to blobs is a somewhat leaky abstraction. |
…ization [<DataMember>] issues?)
Basic testing being done on: -
I'm not sure what else is needed. |
@isaacabraham Do tests pass? Remember CI testing is not enabled on this repo, so we currently have to run tests manually |
Minor nits
My main concern is that no tests have been added or adjusted
|
let ensureRooted (path : string) =
    if isPathRooted path then path else raise <| FormatException(sprintf "Invalid path %A. Paths should start with '/' or '\\'." path)
let splitPath =
Please use a function declaration for a function :)
let splitPath (path:string) = ...
Doh :-) There was some code happening previously between the name and the argument.
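For reference, a minimal sketch of what the function-declaration form could look like. The bodies of `isPathRooted` and `splitPath` are not shown in this diff, so both are assumptions here: a path counts as rooted when it starts with '/' or '\' (matching the error message above), and the first segment is treated as the container with the remainder as the blob name.

```fsharp
open System

// Assumption: "rooted" means the path starts with '/' or '\'.
let isPathRooted (path : string) =
    not (String.IsNullOrEmpty path) && (path.[0] = '/' || path.[0] = '\\')

// Same behaviour as the diff, written without the backward pipe.
let ensureRooted (path : string) =
    if isPathRooted path then path
    else raise (FormatException(sprintf "Invalid path %A. Paths should start with '/' or '\\'." path))

// Hypothetical body: first segment is the container, the rest is the blob name.
let splitPath (path : string) =
    let rooted = ensureRooted path
    let segments = rooted.Split([| '/'; '\\' |], StringSplitOptions.RemoveEmptyEntries)
    match List.ofArray segments with
    | [] -> invalidArg "path" "Path has no container segment."
    | container :: rest -> container, String.concat "/" rest
```

With these assumptions, `splitPath "/data/folder/foo.txt"` yields the pair `("data", "folder/foo.txt")`, while `splitPath "foo.txt"` fails fast with a `FormatException` instead of silently resolving against a default location.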
@@ -213,37 +210,42 @@ type BlobStore private (account : AzureStorageAccount, defaultContainer : string
     member this.Name = "MBrace.Azure.Store.BlobStore"
     member this.Id : string = account.CloudStorageAccount.BlobStorageUri.PrimaryUri.ToString()

-    member this.DefaultDirectory = defaultContainer
+    member this.DefaultDirectory = defaultContainer |> defaultArg <| ""
Please don't use x |> y <| z. A good rule is never to have both |> and <| on the same line. (A better rule is almost never to use <| in code you want to be readable by others.)
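To make the suggestion concrete, here is a small sketch of the equivalent spellings. It assumes `defaultContainer` is a `string option` (which the use of `defaultArg` implies); the value bound below is a hypothetical stand-in, not the PR's actual constructor argument.

```fsharp
// Hypothetical stand-in for the optional container passed to BlobStore.
let defaultContainer : string option = None

// Form flagged in review: mixes |> and <| on one line.
let viaBothPipes = defaultContainer |> defaultArg <| ""

// Direct application of defaultArg, no pipes needed.
let direct = defaultArg defaultContainer ""

// Forward pipe only, using Option.defaultValue from FSharp.Core.
let viaForwardPipe = defaultContainer |> Option.defaultValue ""
```

All three bindings evaluate to the same string; the second and third read left to right without the reader having to work out how the two pipes associate.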
I need to figure out how to run the tests :-) I tried build.cmd a week or so ago but it took absolutely ages to run (and failed half way through). Once that's working I'll add more tests. |
@isaacabraham Getting a subset of the tests running on CI under simulated Azure storage would be absolutely fantastic progress in the long term. It's going to be hard to get contributions without that.
The other thing I think we really need to help enable contributions is a way to co-develop FsPickler+Vagabond+MBrace.Core+MBrace.Azure/AWS+StarterKit (or + one's own data scripting code) in one smooth build+test+deploy+use inner-dev workflow. In some ways I regret how the repos have been split into multiple GitHub projects, as propagating a fix from FsPickler or Mono.Cecil through to actually testing it in your own data scripting code is incredibly painful, currently requiring updates to about 15 NuGet packages.
The future of MBrace surely has to be as a go-to solution for F#-centric data scripters who have Spark-like needs and want to have control over a big data scripting stack that they can easily comprehend and contribute to. This means it must be possible to iterate rapidly from an idea ("MBrace would be better if it just had XYZ") to refining, using and contributing that idea. |
@dsyme agree 100% - I'm feeling that same pain now (and it's exactly why I'm shying away from changes to MBrace.Core here). One of the things I looked at a while ago was dropping the reliance on Service Bus and moving entirely onto storage queues - this should mean that we could just use the storage emulator for all three MBrace.Azure components in a local context (storage = blobs, state = tables, messaging = queues). |
This should close #158 and puts in place the groundwork for a resolution to #65.
This PR is probably not quite ready for accepting but I'd still like some review on this if possible.