Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Module error notification #1

Open
fridex opened this issue Feb 6, 2016 · 20 comments
Open

Module error notification #1

fridex opened this issue Feb 6, 2016 · 20 comments

Comments

@fridex
Copy link
Contributor

fridex commented Feb 6, 2016

In the original proposal bac6372 there are used err return values. I would stick with Python's error handling used across all standard Python modules that means: if something goes wrong, raise an appropriate exception and let callee handle an error.

@ingvagabund
Copy link
Contributor

Exceptions should be used only exceptionally. What is a benefit of using exception here if a callee can test for err != "" or similar test? I would rather log the error and return a generic error saying "RESOURCE resource not found". It does make sense to replace a lot of if-elif-elif-...-else with try-catch. However, as long as there is only one error to test for, I don't see a reason for exceptions.

@ingvagabund
Copy link
Contributor

What would you recommend to read showing that exceptions are the best way for handling any error in python?

@ingvagabund
Copy link
Contributor

Something not just found on the first page of google. Something historical :)

@ingvagabund
Copy link
Contributor

@ingvagabund
Copy link
Contributor

After reading some blog posts it makes sense to raise KeyError in a storage/provider instead of returning err and using try-except-else in a controller. What do you think?

@ingvagabund
Copy link
Contributor

"Raise exceptions, and handle them at the level at which they're exceptional—which is also generally going to be the level where you know how to handle them." [1] I like the reasoning.

[1] http://stupidpythonideas.blogspot.cz/2015/05/if-you-dont-like-exceptions-you-dont.html

@fridex
Copy link
Contributor Author

fridex commented Feb 7, 2016

I didn't read links, but from my POV:

  • there are plenty types of errors that can occur - no internet connection, no space left to proceed the action, remote service is down, no privileges to write to a file, out of memory, and we could go on - do you really want to handle all of this in a module? I guess no...
  • with one variable which is just saying "there was an error" you loose what went wrong - okay, there was some error, but what error? maybe I can handle this special case and omit the error
  • the callee just uses methods, if something goes wrong, the callee implements the logic how to handle what error
  • you will use Python libraries which can raise an error, again - I don't see reasonable to handle all errors in a module
  • when an exception is raised you decide where you handle the exception (and how)
  • I would suggest to do logging on the callee side - you will implement logging once and on one place (just for example Python's logging instance will be passed)
  • ...

I think all of this relates to generic design principles. Let's make "dummy" easy to maintain, easy to debug, easy to support, easy to reuse modules (which can be standalone projects/applications) and connect them like a puzzle.

@ingvagabund
Copy link
Contributor

there are plenty types of errors that can occur - no internet connection, no space left to proceed the
action, remote service is down, no privileges to write to a file, out of memory, and we could go on - do
you really want to handle all of this in a module? I guess no...

  • no internet connection: this is relevant to a provider as the provider is responsible for dealing with retrieving a resource. So if there is no internet connection the provider could try to get the resource from other sources. "Interned connection is down" is important information for a caller but does the caller really need to know the connection is down? He is only interested in if the resource is retrievable or not. Of course it depends on who is the caller.
  • no space left to proceed the action: this is related to a storage. The same reasoning here. Does the caller really needs to know that there is no space left? What can he do about it? He can only skip storing a resource to a storage. It is up to a storage to handle it,e .g. by running a garbage collector.

And there are other exceptions suitable for the storage/resource/controller modules.

with one variable which is just saying "there was an error" you loose what went wrong - okay, there
was some error, but what error? maybe I can handle this special case and omit the error

If a caller can handle the error, than yes, let's provide the caller what he needs. What I wanted to say is that if a storage/resource will raise one exception and one only (read calling raise SMTH), it is worth considering exception vs. error. However, after reading the blog post I like the idea of using exception.

For storage, asking for "is resource in a storage" will be a frequent request. For a provider it will be always "give me a resource" request. I can image following errors/exceptions:

  • resource not foud
  • unable to store a resource
  • unable to retrieve a resorce
    If something in a module raises an exception (read module itself does not call raise) it is considerable what exceptions a module (here storage or provider) should ignore and handle.

Or is there something I don't see?

the callee just uses methods, if something goes wrong, the callee implements the logic how to handle
what error

Nothing against that

you will use Python libraries which can raise an error, again - I don't see reasonable to handle all
errors in a module

Not all, just those that are reasonable for handling. E.g. "unable to connect to db", "unable to read from a directory".

when an exception is raised you decide where you handle the exception (and how)

Every exception? If I am a top most caller, I don't like the idea. Module "must" handle all exceptions/errors he is responsible for. Who is "you" here? If you mean every callee is responsible for handling local exceptions and the caller is left with exceptions none of the callee can handle, I agree.

I would suggest to do logging on the callee side - you will implement logging once and on one place

Agree. If something goes wrong you can read the logs and see what it was.

... I think all of this relates to generic design principles. Let's make "dummy" easy to maintain, easy to
debug, easy to support, easy to reuse modules and connect them like a puzzle.

Agree.

@fridex
Copy link
Contributor Author

fridex commented Feb 7, 2016

  • no internet connection: this is relevant to a provider as the provider is responsible for dealing with retrieving a resource. So if there is no internet connection the provider could try to get the resource from other sources. "Interned connection is down" is important information for a caller but does the caller really need to know the connection is down? He is only interested in if the resource is retrievable or not. Of course it depends on who is the caller.

Exactly, who is the caller... So keep it simple and generic, avoid putting such logic overhead which is counterproductive to a module - let a caller decide how to retrieve a resource - a module will only encapsulate the process, which can change if necessary (generic OOP principles).

  • no space left to proceed the action: this is related to a storage. The same reasoning here. Does the caller really needs to know that there is no space left? What can he do about it? He can only skip storing a resource to a storage. It is up to a storage to handle it,e .g. by running a garbage collector.

Yes, I want to know that there is no space left so I can for example clean my disc drive or I can make cache size bigger. Imagine you have an application, which says "sorry, there was an error" vs. "sorry, you don't have enough space to save FOO".

For storage, asking for "is resource in a storage" will be a frequent request. For a provider it will be always "give me a resource" request.

This will be encapsulated inside the module. It has nothing to do with actual module usage.

when an exception is raised you decide where you handle the exception (and how)

Every exception? If I am a top most caller, I don't like the idea. Module "must" handle all exceptions/errors he is responsible for. Who is "you" here? If you mean every callee is responsible for handling local exceptions and the caller is left with exceptions none of the callee can handle, I agree.

Lets repeat it then... The original proposal had only error variables, so I opened this issue, where I am suggesting to raise an exception when an error occurs. So yes, the caller (in previous context "you") should handle exception logic where necessary or terminate when it is unable to proceed.

@ingvagabund
Copy link
Contributor

  • no internet connection: this is relevant to a provider as the provider is responsible for dealing with retrieving a resource. So if there is no internet connection the provider could try to get the resource from other sources. "Interned connection is down" is important information for a caller but does the caller really need to know the connection is down? He is only interested in if the resource is retrievable or not. Of course it depends on who is the caller.

Exactly, who is the caller... So keep it simple and generic, avoid putting such logic overhead which is counterproductive to a module - let a caller decide how to retrieve a resource - a module will only encapsulate the process, which can change if necessary (generic OOP principles).

Just confirming we are on the same page. E.g. here caller is GithubSourceCodeProvider. The caller gets request to retrieve a resource. Caller calls methods. First method corresponds to retrieving a tarball from upstream repository. Second method corresponds to retrieving tarball from local repository. Here the first method can raise NoInternetConnection exception. The second can raise CommitNotFound exception. So from your point of view, you are saying caller (here GithubSourceCodeProvider) is responsible for deciding what method he is going to use. So none of the methods should not handle NoInternetConnection/CommitNotFound exception. My point of view is GithubSourceCodeProvider is the callee and he should handle both exceptions.

  • no space left to proceed the action: this is related to a storage. The same reasoning here. Does the caller really needs to know that there is no space left? What can he do about it? He can only skip storing a resource to a storage. It is up to a storage to handle it,e .g. by running a garbage collector.

Yes, I want to know that there is no space left so I can for example clean my disc drive or I can make cache size bigger. Imagine you have an application, which says "sorry, there was an error" vs. "sorry, you don't have enough space to save FOO".

Here the same point of view as above. On the other hand if a caller (here TarballStorage) can not free any memory when he catch OutOfMemory exception (because he can not garbage collect any resource or he can not allocate additional memory due to resource limitations), it is reasonable to forward the exception up. Then, it is again a questions who is responsible for what. It makes sense to catch exception and if I am not able to handle them properly to forward them up.

For storage, asking for "is resource in a storage" will be a frequent request. For a provider it will be always "give me a resource" request.

This will be encapsulated inside the module. It has nothing to do with actual module usage.

In that case what do you mean under "module" here? For me module = one python class. Or you mean module as a set of python classes then I agree.

when an exception is raised you decide where you handle the exception (and how)

Every exception? If I am a top most caller, I don't like the idea. Module "must" handle all exceptions/errors he is responsible for. Who is "you" here? If you mean every callee is responsible for handling local exceptions and the caller is left with exceptions none of the callee can handle, I agree.

Lets repeat it then... The original proposal had only error variables, so I opened this issue, where I am suggesting to raise an exception when an error occurs.

Actually, it was a golang syntax and just a high level API saying "some error can occur". It was not intended to define the final form of the API.

So yes, the caller (in previous context "you") should handle exception logic where necessary
or terminate when it is unable to proceed.

Agree.

@fridex
Copy link
Contributor Author

fridex commented Feb 7, 2016

Just confirming we are on the same page. E.g. here caller is GithubSourceCodeProvider. The caller gets request to retrieve a resource. Caller calls methods. First method corresponds to retrieving a tarball from upstream repository. Second method corresponds to retrieving tarball from local repository. Here the first method can raise NoInternetConnection exception. The second can raise CommitNotFound exception. So from your point of view, you are saying caller (here GithubSourceCodeProvider) is responsible for deciding what method he is going to use. So none of the methods should not handle NoInternetConnection/CommitNotFound exception. My point of view is GthubSourceCodeProvider is the callee and he should handle both exceptions.

No, not at all. As far as I know, we are discussing about what will "resources" provide, not how it will be implemented. Moreover, NoInternetConnection and CommitNotFound will be raised to an application, which will import "resources" and let the exception handling logic on its implementation.

Here the same point of view as above. On the other hand if a caller (here TarballStorage) can not free any memory when he catch OutOfMemory exception (because he can not garbage collect any resource or he can not allocate additional memory due to resource limitations), it is reasonable to forward the exception up. Then, it is again a questions who is responsible for what. It makes sense to catch exception and if I am not able to handle them properly to forward them up.

Yes, so "up" is an application, which does "import resources".

In that case what do you mean under "module" here? For me module = one python class. Of you mean module as a set of python classes then I agree.

OK, we missed here. So to be clear, lets call "resources package", which will consist of modules (e.g. files).

@ingvagabund
Copy link
Contributor

Yes, so "up" is an application, which does "import resources".

Agree.

OK, we missed here. So to be clear, lets call "resources package", which will consist of modules (e.g. files

OK.

@ingvagabund
Copy link
Contributor

Just confirming we are on the same page. E.g. here caller is GithubSourceCodeProvider. The caller gets request to retrieve a resource. Caller calls methods. First method corresponds to retrieving a tarball from upstream repository. Second method corresponds to retrieving tarball from local repository. Here the first method can raise NoInternetConnection exception. The second can raise CommitNotFound exception. So from your point of view, you are saying caller (here GithubSourceCodeProvider) is responsible for deciding what method he is going to use. So none of the methods should not handle NoInternetConnection/CommitNotFound exception. My point of view is GthubSourceCodeProvider is the callee and he should handle both exceptions.

No, not at all. As far as I know, we are discussing about what will "resources" provide, not how it will be implemented.

This issue is about error notification and from the going discussion who will handle exceptions.

Moreover, NoInternetConnection and CommitNotFound will be raised to an application, which will import "resources" and let the exception handling logic on its implementation.

Let's agree to disagree. How will the application handle those exceptions? It is not up to the application to deal with them. Primarily, it is up to whoever will work over storages and providers (from here on controller) to handle all those exceptions. Application can deal with them if the controller will not be able to. However, as long as the controller can deal with it, application should not. Or are you saying that the controller is not allowed to handle any exception even if it can? Application importing "resources" package will communicate through controllers with modules inside the package. Or what do you mean under "application" here?

@fridex
Copy link
Contributor Author

fridex commented Feb 7, 2016

Or are you saying that the controller is not allowed to handle any exception even if it can?

Can you be more specific what exceptions do you want to handle inside "resources package"?

@ingvagabund
Copy link
Contributor

Two examples:

CommitNotFound: when retrieving a tarball I will look if a given commit can be found in a local repository. If not, I will download the tarball from upstream repository. I could run 'git pull' of course. However, not always I will have write permissions to do so.

OutOfMemory: when a storage runs out of a memory, I would like to run garbage collector and try to free some space before I give up .

@fridex
Copy link
Contributor Author

fridex commented Feb 7, 2016

CommitNotFound: when retrieving a tarball I will look if a given commit can be found in a local repository. If not, I will download the tarball from upstream repository. I could run 'git pull' of course. However, not always I will have write permissions to do so.

I don't see this as an error, simply requested data are not cached and need to be downloaded.

OutOfMemory: when a storage runs out of a memory, I would like to run garbage collector and try to free some space before I give up .

You want to run garbage collector in Python (python package?) when there is not enough memory? Or do you want to implement your own garbage collector? I don't get this.

@ingvagabund
Copy link
Contributor

CommitNotFound: when retrieving a tarball I will look if a given commit can be found in a local repository. If not, I will download the tarball from upstream repository. I could run 'git pull' of course. However, not always I will have write permissions to do so.

I don't see this as an error, simply requested data are not cached and need to be downloaded.

Data are not cached/stored/available in a repository can raise exception the same was as a key is not found in a dictionary. GitLocalRepositoryProvider will give you the tarball from a local repository, GitRepositoryProvider will give you tarball from upstream repository. So if a commit is not found via GitLocalRepositoryProvider, exception is raised.

OutOfMemory: when a storage runs out of a memory, I would like to run garbage collector and try to free some space before I give up .

You want to run garbage collector in Python when there is not enough memory? Or do you want to implement your own garbage collector? I don't get this.

How does system OutOfMemory relates to resource package? If you hit system error there is not much you can do no matter who the exception catches. What I meant was that store method of any storage can raise its own OutOfMemory exception. If so caller (controller) can catch it and run storage's GC to free some space.

@fridex
Copy link
Contributor Author

fridex commented Feb 7, 2016

Data are not cached/stored/available in a repository can raise exception the same was as a key is not found in a dictionary.

When you are trying to access a key in a dict, which does not exist that leads to an error (you shouldn't do it) - thus an appropriate exception is raised. So you want to write code like:

d = {}
try:
    print d['foo']
except KeyError:
    d['foo'] = 'bar'
    print d['foo']

Instead of:

d = {}
if 'foo' not in d.keys():
    d['foo'] = 'bar'
print d['foo']

Or for dicts you can use a wrapper:

d = { }
print d.get('foo', 'bar')

GitLocalRepositoryProvider will give you the tarball from a local repository, GitRepositoryProvider will give you tarball from upstream repository. So if a commit is not found via GitLocalRepositoryProvider, exception is raised.

The same goes here. It is an error. But I don't see any point why are you dealing with it. If you are trying to get something, which does not exist in a storage/dict, it's an error and it means you are doing something wrong. BUT this error should not occur outside of resources package - if so then there is an issue in resource package code. BTW, communicating with objects based on exceptions they rise would lead to, at least, unreadable source code. If you rise an exception, it should signal, that something went wrong and you can debug it. Python provides methods and they are friendly.

How does system OutOfMemory relates to resource package? If you hit system error there is not much you can do no matter who the exception catches. What I meant was that store method of any storage can raise its own OutOfMemory exception. If so caller (controller) can catch it and run storage's GC to free some space.

So you were speaking about out of space, not out of memory issues and clean up, not garbage collecting.

@fridex
Copy link
Contributor Author

fridex commented Feb 8, 2016

Based on our meeting today, I prepared a skeleton proposal. I am open to discussion 12985a8

You can try TarballProvider, it will guide you through the main:

$ python tarballProvider.py

some notes

  • a receiver was introduced to avoid naming collisions and misleading - an object which encapsulates how are things received
  • I would rather implement {set,get}_storage() and {set,get}_receiver() instead of directly accessing instances
  • package is exporting only *Proivder classes
  • added one level of inheritance for FilesystemStorage, DatabaseStorage which could be extended with SshStorage, FtpStorage, ... see sources for more info

@ingvagabund
Copy link
Contributor

Constracts for python.

[1] https://www.python.org/dev/peps/pep-0316/

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants