Example #33

jgarciab · 2024-02-29T11:07:49Z

Improved examples and README

jgarciab · 2024-02-29T11:08:45Z

@modhurita, please review and add yourself to the citation. Once we have a new version, we can put it on Zenodo and add your name there too.


modhurita · 2024-03-22T11:34:07Z

README.md

-scraper.close()
-```
+To download data from GoogleArt it is necessary to install 
+[Firefox](https://www.mozilla.org/en-US/firefox/new/) and `geckodriver`. Geckodriver is installed automatically when you run the code for the first time.


Is it not confusing to first say that geckodriver needs to be installed, and then say that it is installed automatically? Maybe it is better to leave out any mention of geckodriver?

Yes, let's remove geckodriver.

modhurita · 2024-03-22T14:26:33Z

examples/example_collect_all_artworks.ipynb

   ]
  },
  {
   "cell_type": "code",
-   "execution_count": null,
-   "id": "54afc420",
+   "execution_count": 5,


Cell 5 is empty and needs to be removed.
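Empty cells like this one can also be stripped programmatically, since an `.ipynb` file is plain JSON with a top-level `cells` list. The following is a minimal sketch, not part of this repository; the helper name `drop_empty_code_cells` is hypothetical, and it assumes the standard notebook layout in which each cell has a `cell_type` and a `source` that is either a string or a list of strings.

```python
import copy


def drop_empty_code_cells(notebook: dict) -> dict:
    """Return a copy of a notebook dict without empty code cells.

    Hypothetical helper for illustration; it is not part of artscraper.
    """

    def is_empty_code_cell(cell: dict) -> bool:
        if cell.get("cell_type") != "code":
            return False
        source = cell.get("source", "")
        # "source" may be stored as one string or as a list of lines.
        text = "".join(source) if isinstance(source, list) else source
        return text.strip() == ""

    cleaned = copy.deepcopy(notebook)
    cleaned["cells"] = [
        cell for cell in cleaned.get("cells", []) if not is_empty_code_cell(cell)
    ]
    return cleaned
```

To apply it to a file, load the notebook with `json.load`, pass the resulting dict through the helper, and write it back with `json.dump`. Markdown cells and non-empty code cells are left untouched.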

modhurita · 2024-03-22T14:42:37Z

Hi @jgarciab :

Thanks for your work on improving this repository!

I have provided some comments above, and changed some things myself online. I hope they look okay to you.

I could run the Google Arts & Culture parts of the example notebooks, but not the WikiArt ones. I obtained API keys by creating a WikiArt account, and placed them in the examples directory, as per the instructions in the README. However, the very first example cell didn't execute successfully. This is the error I got:

FileNotFoundError                         Traceback (most recent call last)
~/ResearchEngineering/artscraper/artscraper/wikiart.py in __init__(self, output_dir, skip_existing, min_wait, timeout)
     23         try:
---> 24             with open(".wiki_session", "r", encoding="utf-8") as f:
     25                 self.session_key = f.read()

FileNotFoundError: [Errno 2] No such file or directory: '.wiki_session'

During handling of the above exception, another exception occurred:

KeyError                                  Traceback (most recent call last)
/tmp/ipykernel_13612/1175252453.py in <module>
      3 art_url = "https://www.wikiart.org/en/edvard-munch/anxiety-1894"
      4 
----> 5 with WikiArtScraper() as scraper:
      6     scraper.load_link(art_url)
      7     metadata = scraper.get_metadata()

~/ResearchEngineering/artscraper/artscraper/wikiart.py in __init__(self, output_dir, skip_existing, min_wait, timeout)
     25                 self.session_key = f.read()
     26         except FileNotFoundError:
---> 27             self._new_session()
     28             with open(".wiki_session", "w", encoding="utf-8") as f:
     29                 f.write(self.session_key)

~/ResearchEngineering/artscraper/artscraper/wikiart.py in _new_session(self)
     66                                 },
     67                                 timeout=self.timeout)
---> 68         self.session_key = json.loads(response.text)["SessionKey"]
     69         self.last_request = time.time()
     70 

KeyError: 'SessionKey'

Finally, at which position in the names list in the citation should I add my name?

Thanks,
Modhurita
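In the traceback above, `_new_session` indexes the parsed response with `["SessionKey"]` directly, so any response that lacks that field surfaces as a bare `KeyError`. Below is a minimal sketch of a more defensive parse, assuming only that the response body is JSON text. The helper `extract_session_key` and the exception `SessionKeyError` are hypothetical names introduced for illustration, not part of artscraper, and the sketch makes no claim about why the server omitted the key in this case.

```python
import json


class SessionKeyError(RuntimeError):
    """Raised when a login response does not contain a session key."""


def extract_session_key(response_text: str) -> str:
    """Return the "SessionKey" field from a JSON login response.

    Hypothetical helper for illustration; it is not part of artscraper.
    """
    try:
        payload = json.loads(response_text)
    except json.JSONDecodeError as exc:
        raise SessionKeyError(
            f"Login response is not valid JSON: {response_text[:200]!r}"
        ) from exc

    if not isinstance(payload, dict) or "SessionKey" not in payload:
        # Include the start of the body so the message shows what came back.
        raise SessionKeyError(
            f"Login response has no 'SessionKey' field: {response_text[:200]!r}"
        )
    return payload["SessionKey"]
```

With a check like this, a rejected login would report the start of the response body instead of only the missing key name, which may make it easier to distinguish a credentials problem from a transient server error.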

jgarciab · 2024-04-08T09:57:01Z

Hi Modhurita, I only tried the interactive version. Could you try to figure it out? That was Raoul's part; if you don't understand how it works, maybe you could ask him.

You can add your name in second place in the citation if you are okay with that.

modhurita · 2024-04-08T11:48:43Z

Hi @jgarciab :

The WikiArt part seems to work now; I'm not sure why I got that error earlier. I have made the other changes. I approve the pull request; you can now merge this branch into main.

jgarciab added 3 commits February 29, 2024 12:03

close firefox after crashing

6804d91

remove unused requirements

3bbe841

improved readme and examples

2654f73

jgarciab requested a review from modhurita February 29, 2024 11:07

modhurita reviewed Mar 22, 2024

modhurita reviewed Mar 22, 2024

wikimedia -> wikidata

e0482ad

modhurita reviewed Mar 22, 2024

Update Google A&C link to point to Edvard Munch's "Anxiety", to match the saved image filename

cb8b6cc

modhurita reviewed Mar 22, 2024

modhurita added 2 commits April 8, 2024 13:37

Remove geckodriver reference, add name to citation

fb065b4

Remove empty cell

142f21b

modhurita merged commit 1ca4ab6 into main Apr 8, 2024
6 checks passed
