This repository has been archived by the owner on Feb 7, 2024. It is now read-only.

AttributeError: 'Submission' object has no attribute 'd_' #82

Open
AmeyHengle opened this issue Oct 5, 2020 · 1 comment

Comments

@AmeyHengle

The API documentation describes a way to convert a submission object directly into a DataFrame using the special attribute 'd_'.
In practice, however, I get an error saying that no such attribute exists.

Posting the error below:

AttributeError                            Traceback (most recent call last)
in <module>
     24 ))
     25
---> 26 df = pd.DataFrame([obj.d_ for obj in submissions])
     27 df.to_csv('../Mental_Health/AskReddit.csv')

in <listcomp>(.0)
     24 ))
     25
---> 26 df = pd.DataFrame([obj.d_ for obj in submissions])
     27 df.to_csv('../Mental_Health/AskReddit.csv')

~\Anaconda3\lib\site-packages\praw\models\reddit\base.py in __getattr__(self, attribute)
     33         if not attribute.startswith("_") and not self._fetched:
     34             self._fetch()
---> 35             return getattr(self, attribute)
     36         raise AttributeError(
     37             "{!r} object has no attribute {!r}".format(

~\Anaconda3\lib\site-packages\praw\models\reddit\base.py in __getattr__(self, attribute)
     36         raise AttributeError(
     37             "{!r} object has no attribute {!r}".format(
---> 38                 self.__class__.__name__, attribute
     39             )
     40         )

AttributeError: 'Submission' object has no attribute 'd_'

@dmarx
Owner

dmarx commented Oct 6, 2020

Interesting! It's hard to debug this without seeing the code for how you requested obj. My suspicion here is that you instantiated the psaw.PushshiftAPI instance with a praw.Reddit instance, right? If that's the case, obj is an instance of praw.models.Submission (you can check this with type(obj)). That special d_ attribute is specific to the psaw Submission model. If you instantiate the psaw.PushshiftAPI instance without passing it a praw.Reddit instance, it will return objects with the d_ attribute. If you give it a praw.Reddit instance, it will return praw objects, which don't have this attribute.

Assuming this is what's going on, I think there are two main reasons to instantiate psaw with a praw instance: to ensure you are fetching the current state of the reddit Thing rather than the snapshot in pushshift's archive (e.g. if you need the current score on the object or want to ignore deleted items), or because you are passing the results to code that was designed around praw objects and you want to ensure compatibility. If you don't fall into either of these use cases, you can probably just instantiate psaw without passing it a praw instance and you'll get the magic d_ attribute.
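If the same script has to cope with both setups, one option is a small helper that takes whichever object comes back and returns a plain dict. This is only a sketch: `to_record` is a hypothetical name, not part of psaw or praw, and the fallback for praw objects assumes that the attributes already loaded on the instance are what you want in the DataFrame.

```python
def to_record(obj):
    """Return a plain dict for either a psaw result or a praw object."""
    # psaw's own results carry the raw Pushshift payload in `d_`.
    # On a praw object that has not been fetched yet, this lookup may
    # trigger a network fetch before the default value is returned.
    payload = getattr(obj, "d_", None)
    if isinstance(payload, dict):
        return dict(payload)
    # Assumed fallback for praw objects: keep the public attributes that are
    # already set on the instance and skip private ones such as `_reddit`.
    return {k: v for k, v in vars(obj).items() if not k.startswith("_")}
```

With that in place, something like pd.DataFrame([to_record(obj) for obj in submissions]) should behave the same whether or not the psaw.PushshiftAPI instance was given a praw.Reddit instance.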

I haven't tested this, but per the praw docs I suspect you could do something like this:

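import pandas as pd
# vars() exposes the attributes already loaded on each praw object; skip private ones such as _reddit
subm_dicts = [{k: v for k, v in vars(praw_obj).items() if not k.startswith("_")} for praw_obj in submissions]
df = pd.DataFrame(subm_dicts)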
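A variation that may give a tidier CSV is to pull only a named set of fields, since dumping every attribute also drags in nested objects such as the author and subreddit. Again a sketch, not tested against a live praw session: `pick_fields` and `FIELDS` are hypothetical names, and the field list is just an illustrative selection of common Submission attributes.

```python
# Illustrative selection only; adjust to the columns you actually need.
FIELDS = ("id", "title", "score", "num_comments", "created_utc")


def pick_fields(obj, fields=FIELDS):
    """Build a dict of the requested attributes, using None where one is missing."""
    return {name: getattr(obj, name, None) for name in fields}
```

The rows can then be built with [pick_fields(praw_obj) for praw_obj in submissions] and passed to pd.DataFrame as before.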

If my suspicion is off target here, I'm going to need more information to help you figure out what's going on. Let me know if this answers your question. If you still need help, please share enough code to permit me to replicate the issue on my end.
