AttributeError: 'NoneType' object has no attribute 'span' #1080

Open
raphaelrubrice opened this issue Dec 19, 2024 · 6 comments

Comments

@raphaelrubrice

raphaelrubrice commented Dec 19, 2024

Hello,

I am a new user of skorch. When trying to import NeuralNetRegressor, I got the error: AttributeError: 'NoneType' object has no attribute 'span'.

This error stems from the documentation retrieval. Is it caused by a change in the documentation format, or by a change in the function that retrieves the documentation? Here is the error I got:

Cell In[37], line 63
     61 import torch
     62 import torch.nn as nn
---> 63 from skorch import NeuralNetRegressor
     64 from torch.optim import Adam
     66 class MAPNet(nn.Module):

File ~/miniconda3/envs/myenv/lib/python3.13/site-packages/skorch/__init__.py:10
      8 from .history import History
      9 from .net import NeuralNet
---> 10 from .classifier import NeuralNetClassifier
     11 from .classifier import NeuralNetBinaryClassifier
     12 from .regressor import NeuralNetRegressor

File ~/miniconda3/envs/myenv/lib/python3.13/site-packages/skorch/classifier.py:54
     50     return doc
     53 # pylint: disable=missing-docstring
---> 54 class NeuralNetClassifier(NeuralNet, ClassifierMixin):
     55     __doc__ = get_neural_net_clf_doc(NeuralNet.__doc__)
     57     def __init__(
     58             self,
     59             module,
   (...)
     64             **kwargs
     65     ):

File ~/miniconda3/envs/myenv/lib/python3.13/site-packages/skorch/classifier.py:55, in NeuralNetClassifier()
     54 class NeuralNetClassifier(NeuralNet, ClassifierMixin):
---> 55     __doc__ = get_neural_net_clf_doc(NeuralNet.__doc__)
     57     def __init__(
     58             self,
     59             module,
   (...)
     64             **kwargs
     65     ):
     66         super(NeuralNetClassifier, self).__init__(
     67             module,
     68             *args,
   (...)
     71             **kwargs
     72         )

File ~/miniconda3/envs/myenv/lib/python3.13/site-packages/skorch/classifier.py:47, in get_neural_net_clf_doc(doc)
     45 doc = neural_net_clf_doc_start + " " + doc.split("\n ", 4)[-1]
     46 pattern = re.compile(r'(\n\s+)(criterion .*\n)(\s.+){1,99}')
---> 47 start, end = pattern.search(doc).span()
     48 doc = doc[:start] + neural_net_clf_additional_text + doc[end:]
     49 doc = doc + neural_net_clf_additional_attribute

AttributeError: 'NoneType' object has no attribute 'span'

As it was not a functionality issue, I bypassed the error by adding a check to the docstring functions in the files that define NeuralNetClassifier, NeuralNetRegressor and NeuralNetBinaryClassifier. For example, I changed get_neural_net_reg_doc from this:

def get_neural_net_reg_doc(doc):
    doc = neural_net_reg_doc_start + " " + doc.split("\n ", 4)[-1]
    pattern = re.compile(r'(\n\s+)(criterion .*\n)(\s.+){1,99}')    
    start, end = pattern.search(doc).span()
    doc = doc[:start] + neural_net_reg_criterion_text + doc[end:]
    return doc

To this:

def get_neural_net_reg_doc(doc):
    doc = neural_net_reg_doc_start + " " + doc.split("\n ", 4)[-1]
    pattern = re.compile(r'(\n\s+)(criterion .*\n)(\s.+){1,99}')    
    match = pattern.search(doc)
    if match:
        start, end = match.span()
        doc = doc[:start] + neural_net_reg_criterion_text + doc[end:]
    else:
        doc = "No documentation found."
        print("Pattern not found in the doc string.")
    return doc

I hope this helps. This is my first time posting an issue on GitHub; if this was not helpful, do not hesitate to tell me so.
Have a good day!
Raphaël.

@BenjaminBossan
Collaborator

Thank you for reporting this error. Could you please tell us which Python version you're using?

@raphaelrubrice
Author

I am using Python version 3.13.1

@BenjaminBossan
Collaborator

Okay, so this is a problem with Python 3.13. As skorch does not officially support 3.13 yet*, we didn't catch that problem.

The issue is a bit subtle. When we print the start of the NeuralNet.__doc__, we get different results depending on Python version.

Below is 3.12:

NeuralNet base class.

    The base class covers more generic cases. Depending on your use
    case, you might want to use :class:`.NeuralNetClassifier` or
    :class:`.NeuralNetRegressor`.

    In addition to the parameters listed below, there are parameters
    with specific prefixes that are handled separately. To illustrate
    this, here is an example:

    >>> net = NeuralNet(
    ...    ...,
    ...    optimizer=torch.optimizer.SGD,
    ...    optimizer__momentum=0.95,
    ...)

vs 3.13:

NeuralNet base class.

The base class covers more generic cases. Depending on your use
case, you might want to use :class:`.NeuralNetClassifier` or
:class:`.NeuralNetRegressor`.

In addition to the parameters listed below, there are parameters
with specific prefixes that are handled separately. To illustrate
this, here is an example:

>>> net = NeuralNet(
...    ...,
...    optimizer=torch.optimizer.SGD,
...    optimizer__momentum=0.95,
...)

This is because

The compiler now strips common leading whitespace from every line in a docstring

(link)

This messes with our parsing code of the docstring. The correct solution is thus to fix the parsing code so that it works both with and without leading whitespace on each line.
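
As a minimal sketch of what "works both with and without leading whitespace" could look like, the docstring can be normalized before it is parsed. This uses the standard-library `inspect.cleandoc`; the `Example` class is made up for illustration, and this is not necessarily the approach skorch ends up using:

```python
import inspect


class Example:
    """Summary line.

    Body line one.
      Indented detail.
    """


# On Python <= 3.12 the raw docstring keeps the 4-space source indentation;
# on 3.13 the compiler has already stripped it.
raw = Example.__doc__

# cleandoc removes the common indentation (if any) and the surrounding blank
# lines, so the result is the same on both versions.
normalized = inspect.cleandoc(raw)
print(normalized)
```

Relative indentation (the extra two spaces before "Indented detail.") is preserved, so parsing code that relies on it keeps working.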

Regarding your problem @raphaelrubrice, I would recommend downgrading to Python 3.12 if possible. Also, let me know if you're interested in creating a PR to fix the issue.

*PyTorch now works with Python 3.13, so we could indeed add support in skorch.

@raphaelrubrice
Author

Thank you for identifying the source of the issue. Yes, I think I'd be interested; I'll try to find a real fix then.

@raphaelrubrice
Author

raphaelrubrice commented Dec 27, 2024

Okay, so I think I found a solution. There were two issues:

1) The regexp was no longer valid once the leading whitespace was removed, as @BenjaminBossan previously stated.

With Python 3.12 (the criterion block is correctly processed by the regexp):

    criterion : torch criterion (class)
      The uninitialized criterion (loss) used to optimize the
      module.

With 3.13, because of the whitespace removal, the block was not recognized: the (\s.+) group at the end of the regexp requires whitespace that is no longer there.
I corrected this by changing the regexp from (\n\s+)(criterion .*\n)(\s.+){1,99} to (\n\s+)(criterion .*\n)(\s.+|.){1,99}, which allows a match even without leading whitespace while still matching only the criterion block.

2) The first line of the function that modifies the documentation had a small mistake in it:

def get_neural_net_reg_doc(doc):
    doc = neural_net_reg_doc_start + " " + doc.split("\n ", 4)[-1]

The maxsplit argument of split should be 3 instead of 4: otherwise the criterion block was removed, so even with the proper regexp there was no criterion block left to match. This was fixed by setting:

def get_neural_net_reg_doc(doc):
    doc = neural_net_reg_doc_start + " " + doc.split("\n ", 3)[-1]
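
A minimal sketch of this second point, using a made-up docstring in the Python 3.13 layout. The three-line `module` entry is an assumption about the shape of the real docstring, not a copy of it. With a limit of 4, the split consumes the `criterion` header line, so the pattern has nothing to match; with 3, the header survives:

```python
import re

# Pattern quoted in the traceback of this issue.
pattern = re.compile(r'(\n\s+)(criterion .*\n)(\s.+){1,99}')

# Made-up docstring in the Python 3.13 layout: the common indentation is
# already stripped, so only the description lines still start with spaces.
doc = (
    "Summary line.\n"
    "\n"
    "Parameters\n"
    "----------\n"
    "module : hypothetical entry\n"
    "  description line one\n"
    "  description line two\n"
    "  description line three\n"
    "\n"
    "criterion : torch criterion (class)\n"
    "  The uninitialized criterion (loss) used to optimize the\n"
    "  module.\n"
)

tail_4 = doc.split("\n ", 4)[-1]  # fourth split falls right after the criterion header
tail_3 = doc.split("\n ", 3)[-1]  # last split falls inside the module entry

print(pattern.search(tail_4))              # None: the header was split away
print(pattern.search(tail_3) is not None)  # True
```

Calling `.span()` on the `None` result of the first search is what raises the AttributeError from the title.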

I tested imports from 3.13 and 3.12 and it now works fine.

I made a PR

PS: I also fixed it for NeuralNetClassifier and NeuralNetBinaryClassifier, as the issue was exactly the same.

Thanks again,
Raphaël.

@raphaelrubrice
Author

raphaelrubrice commented Jan 4, 2025

Update on this issue:
The previous fix was not good because it dropped a documentation paragraph and still had incorrect indentation.

The missing paragraph stems from one "\n " becoming a "\n" without trailing whitespace in Python 3.13, which causes one paragraph to be skipped before the next occurrence. This was fixed by setting the split separator to "\n".

However, this change leads to :class:`.NeuralNetRegressor`. being included in the documentation. Using .split("\n", 5) instead of .split("\n", 4) now retrieves the text properly.

The other issue was incorrect indentation, which I was able to fix by using textwrap as suggested by @BenjaminBossan. The branch for the pull request should now be correct.
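
For readers who want to see the general idea, here is a hedged sketch of a version-independent replacement built on `textwrap.dedent`. It is not the code from the pull request: the function name `replace_criterion_block`, the simplified pattern, and the sample docstrings are all assumptions made for illustration.

```python
import re
import textwrap

# Simplified, hypothetical pattern: a "criterion ..." header at the start of
# a line, followed by one or more indented description lines.
CRITERION_BLOCK = re.compile(r"^criterion .*\n(?:[ \t]+.*\n?)+", re.MULTILINE)


def replace_criterion_block(doc, new_text):
    """Replace the criterion entry, whatever the docstring's indentation."""
    first, _, rest = doc.partition("\n")
    # Dedent everything after the summary line; this is a no-op when the
    # interpreter (Python 3.13) has already stripped the indentation.
    body = textwrap.dedent(rest)
    match = CRITERION_BLOCK.search(body)
    if match is None:
        return first + "\n" + body  # nothing to replace
    start, end = match.span()
    return first + "\n" + body[:start] + new_text + body[end:]


STRIPPED = (
    "Summary.\n"
    "\n"
    "Parameters\n"
    "----------\n"
    "criterion : torch criterion (class)\n"
    "  The uninitialized criterion (loss) used to optimize the\n"
    "  module.\n"
    "\n"
    "optimizer : torch optim (class)\n"
    "  Some description.\n"
)
# Same text with the 4-space indentation that Python <= 3.12 keeps.
INDENTED = "Summary.\n" + textwrap.indent(STRIPPED.split("\n", 1)[1], "    ")

new_entry = "criterion : replaced\n  New text.\n"
out_stripped = replace_criterion_block(STRIPPED, new_entry)
out_indented = replace_criterion_block(INDENTED, new_entry)
print(out_stripped == out_indented)  # True
```

Normalizing first means the pattern only has to handle one layout, instead of being loosened to accept both.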
