KeyError: MD5Sum #14

Closed
trond7 opened this issue Jul 21, 2015 · 9 comments

@trond7

trond7 commented Jul 21, 2015

I have been using your script to upload around 2000 images spread over multiple folders.
For most folders it works fine, but for a few I got an error after some of the images had been uploaded.
When I run the script again for these folders, I always get the same error.
On the rerun the script does not print any "skipping image" messages for the images it had already uploaded; it throws the error instantly.

Complete Error message:

Traceback (most recent call last):
  File "/home/thh/bin/smugline/smugline.py", line 303, in <module>
    file_filter)
  File "/home/thh/bin/smugline/smugline.py", line 137, in upload_folder
    self._upload(images, album_name, album)
  File "/home/thh/bin/smugline/smugline.py", line 148, in _upload
    images = self._remove_duplicates(images, album)
  File "/home/thh/bin/smugline/smugline.py", line 202, in _remove_duplicates
    md5_sums = self._get_md5_hashes_for_album(album)
  File "/home/thh/bin/smugline/smugline.py", line 168, in _get_md5_hashes_for_album
    md5_sums = [x['MD5Sum'] for x in remote_images['Album']['Images']]
KeyError: 'MD5Sum'

My environment:

  • Ubuntu 12.04 LTS
  • Python 2.7.3
  • smugline = Already up-to-date

More comments:
It does not stop on strange filenames
--> All image filenames have the same structure, like “DSC_0626.JPG”
It is not caused by strange folder names
--> It does upload some of the images
It happens only sometimes
--> 5 out of approximately 120 folders have this error
All images in the folder have the same ownership and permissions
--> Owner is me (permission rwx,rwx,rwx)

@gingerlime
Owner

Thanks for the detailed report. I think SmugMug has a limit on the number of images in a single album. Could you be hitting this limit?

@jremes-foss

Hi. It seems that there is a gallery cap in SmugMug:

http://news.smugmug.com/2013/03/20/uploading-to-smugmug-what-how-big-and-how-many/

Officially, the documentation says that the cap is 5000 images per gallery, but there are reports that the actual cap is 2000.

A KeyError exception is raised when a mapping key is not found in the set of existing keys, so this would make sense if the cap is being hit.
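
To illustrate what the traceback is saying, here is a minimal sketch using a made-up image record (the dict contents are hypothetical, not real API output). Indexing a dict with a key it does not contain raises KeyError, whatever the reason the key is absent.

```python
# Hypothetical image record that lacks the 'MD5Sum' field.
image = {"id": 1, "Key": "abc123"}


def read_md5(record):
    """Return record['MD5Sum'], or None when the key is absent."""
    try:
        return record["MD5Sum"]
    except KeyError:
        # Same exception as in the traceback: the key is simply not there.
        return None


missing = read_md5(image)
present = read_md5({"id": 2, "Key": "def456", "MD5Sum": "0" * 32})
```

This only shows the mechanism; it says nothing about why the field would be missing from the response.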

@zgrega
Contributor

zgrega commented Jul 27, 2015

Hi, I don't think the gallery cap is the problem. I am getting this error with an almost empty gallery. After the error I tried to upload the images via the SmugMug web interface, and it worked.

@gingerlime
Owner

Are you able to share the folders / images that cause this issue somehow, so I can try to reproduce this error? Otherwise, it's very difficult to investigate this.

@sfrostick

I'm actually seeing the same issue. To me it looks like SmugMug isn't returning the MD5Sum in the response. I added a print statement just below line 167 in the _get_md5_hashes_for_album function:

167         remote_images = self._get_remote_images(album, 'MD5Sum')
168         print remote_images

And the response is (I have removed my SmugMug account and album details from the response):

{u'Album': {u'Images': [{u'id': 4227296512, u'Key': u'b23W3ht'}, {u'id': 4227298744, u'Key': u'BNbn33d'}, {u'id': 4227301704, u'Key': u'XkWqhTT'}, {u'id': 4227305106, u'Key': u'jskCCfx'}, {u'id': 4227307225, u'Key': u'LkLtCWj'}, {u'id': 4227311603, u'Key': u'RqT64H3'}, {u'id': 4227312621, u'Key': u'HxK5MCX'}, {u'id': 4227316227, u'Key': u'Zkp8M8V'}, {u'id': 4227318270, u'Key': u'mKWP9qS'}, {u'id': 4227320820, u'Key': u'Zb4swhS'}, {u'id': 4227323151, u'Key': u'DHPc7gG'}, {u'id': 4227325130, u'Key': u'QCZZ4Dn'}, {u'id': 4227331881, u'Key': u'XqVL2C4'}, {u'id': 4227336419, u'Key': u'HGSGqRG'}, {u'id': 4227342670, u'Key': u'W9shvVS'}, {u'id': 4227347405, u'Key': u'nmGMT6D'}, {u'id': 4227352469, u'Key': u'ZvR7kPs'}, {u'id': 4227353199, u'Key': u'7f6qfNb'}, {u'id': 4227359025, u'Key': u'S5NMKWn'}, {u'id': 4227359096, u'Key': u'mt53VJZ'}, {u'id': 4227364054, u'Key': u'MFTCSSR'}, {u'id': 4227364251, u'Key': u'35fZwCq'}, {u'id': 4227369602, u'Key': u'nvXpX7H'}, {u'id': 4227369655, u'Key': u'82d4S5n'}, {u'id': 4227375536, u'Key': u'5vN7Wg6'}, {u'id': 4227375634, u'Key': u'6HgQMng'}, {u'id': 4227380580, u'Key': u'2FJFppd'}, {u'id': 4227381365, u'Key': u'bnKLmnF'}, {u'id': 4227385860, u'Key': u'HF8cQBL'}], u'URL': u'https://REMOVED', u'ImageCount': 29, u'id': 50837046, u'Key': u'75G8gC'}, u'stat': u'ok', u'method': u'smugmug.images.get'}
Traceback (most recent call last):
  File "/usr/bin/smugline.py", line 304, in <module>
    file_filter)
  File "/usr/bin/smugline.py", line 137, in upload_folder
    self._upload(images, album_name, album)
  File "/usr/bin/smugline.py", line 148, in _upload
    images = self._remove_duplicates(images, album)
  File "/usr/bin/smugline.py", line 203, in _remove_duplicates
    md5_sums = self._get_md5_hashes_for_album(album)
  File "/usr/bin/smugline.py", line 169, in _get_md5_hashes_for_album
    md5_sums = [x['MD5Sum'] for x in remote_images['Album']['Images']]
KeyError: 'MD5Sum'

I have noticed this happening on albums that already contain some pictures when I try to add more.
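
One way to keep the list comprehension from crashing on a response like the one above could look like the following. This is only a sketch, not smugline's actual code: the function name `md5_sums_from_response` and the second sample record (including its hash value) are made up for illustration.

```python
def md5_sums_from_response(remote_images):
    """Collect MD5 sums, skipping records with a missing or empty 'MD5Sum'."""
    images = remote_images["Album"]["Images"]
    return [x["MD5Sum"] for x in images if x.get("MD5Sum")]


# Trimmed-down response modelled on the output above; the second record
# and its hash value are invented for the example.
response = {
    "Album": {
        "Images": [
            {"id": 4227296512, "Key": "b23W3ht"},
            {"id": 1, "Key": "example", "MD5Sum": "0" * 32},
        ]
    }
}

sums = md5_sums_from_response(response)
```

The trade-off is that images without a hash are no longer detected as duplicates, so they may be uploaded again.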

@gingerlime
Owner

I also bumped into this and even reported to the SmugMug API team, but they weren't particularly helpful unfortunately. To be fair, this should improve with the latest version of the API, but Smugline relies on Smugpy, which still isn't using the latest version of the API.

However, I'm still not particularly happy that SmugMug broke backwards compatibility and started returning empty MD5Sum for some images. And they're not willing to fix it by the looks of it. If you'd like to contact their api team, maybe they'll explain things better, but I didn't manage to convince them to fix this.

A few workarounds that I found help avoid this (you'd have to pre-process before using smugline):

  • auto-rotating images, e.g. on Linux you can use exifautotran
  • stripping metadata from images, e.g. using ImageMagick's convert -strip
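
As a rough sketch of that pre-processing step, the two tools could be run on a temporary copy so the originals stay untouched. It assumes `exifautotran` and ImageMagick's `convert` are installed and on the PATH; the helper names and the `run` parameter are made up for this example and are not part of smugline.

```python
import os
import shutil
import subprocess
import tempfile


def preprocess_commands(path):
    """Build the two commands from the list above; both rewrite `path` in place."""
    return [
        ["exifautotran", path],
        ["convert", path, "-strip", path],
    ]


def preprocess_copy(src, run=subprocess.check_call):
    """Copy `src` into a fresh temporary directory and pre-process the copy.

    `run` executes one command (a list of arguments). It defaults to
    subprocess.check_call and can be replaced, e.g. for a dry run.
    """
    tmp_dir = tempfile.mkdtemp(prefix="smugline-")
    dst = os.path.join(tmp_dir, os.path.basename(src))
    shutil.copy2(src, dst)
    for cmd in preprocess_commands(dst):
        run(cmd)
    return dst
```

Passing `run=print` gives a dry run that only shows the commands. The temporary directory is not cleaned up here; that is left to the caller once the upload has finished.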

I don't think those transformations belong inside smugline, and even if I could embed them (which might be tricky), it feels dirty to modify the source images just to upload them. Doing it in-memory or via a temporary copy makes things much more complicated, and my knowledge and time are limited.

Unfortunately, beyond that my hands are tied.

@sfrostick

Thanks for the response. It's a little poor from SmugMug; I noticed some other fields also aren't actually returned. For the moment, as I don't really want to do any pre-processing and like the workflow with smugline, I have hacked some code onto yours to compare the EXIF DateTimeOriginal value along with the current filename. For me at least, this is unique enough to weed out duplicates in place of an MD5 sum, as the date/time and filename together will always be a unique pairing, and these fields are also correctly returned by SmugMug per image with smugmug.images_getEXIF.
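
The idea can be sketched on plain dicts, without touching the API. This is not the actual patch: the record layout, the `FileName` key and the function names are assumptions for illustration; only DateTimeOriginal comes from the description above.

```python
def match_key(record):
    """Identify an image by its filename plus its EXIF DateTimeOriginal."""
    return (record["FileName"], record["DateTimeOriginal"])


def remove_duplicates(local_images, remote_images):
    """Keep only the local images whose key is not already in the album."""
    remote_keys = {match_key(r) for r in remote_images}
    return [img for img in local_images if match_key(img) not in remote_keys]


# Invented sample data.
remote = [{"FileName": "DSC_0626.JPG", "DateTimeOriginal": "2015:07:01 10:00:00"}]
local = [
    {"FileName": "DSC_0626.JPG", "DateTimeOriginal": "2015:07:01 10:00:00"},
    {"FileName": "DSC_0627.JPG", "DateTimeOriginal": "2015:07:01 10:00:05"},
]

to_upload = remove_duplicates(local, remote)
```

Two different images that share both a filename and a timestamp would be treated as duplicates, so this relies on that pairing being unique in practice.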

@gingerlime
Owner

Perhaps that's a good alternative until we can upgrade the API? I can imagine building some kind of heuristic for duplicate detection based not solely on MD5 but also on other parameters like filename, timestamp or maybe some other metadata. Perhaps a combination of a few elements can achieve more than relying solely on MD5 hashes? It's a little counter-productive, because the primary (perhaps only) reason for having those MD5 hashes in the first place is to uniquely identify images... but if the API is broken, maybe that's the best we can do under the circumstances.
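
A combined heuristic along these lines might trust the MD5 sum wherever the API returns one and consult metadata only for remote records whose hash is missing. This is a sketch under assumptions: the record keys `FileName` and `DateTimeOriginal` and the function name are hypothetical, and nothing like it exists in smugline as far as this thread shows.

```python
def is_duplicate(local, remote_images):
    """Return True if `local` appears to be in the album already.

    The MD5 sum decides whenever a remote record has one; filename and
    timestamp are compared only against records without a hash.
    """
    remote_md5 = {r["MD5Sum"] for r in remote_images if r.get("MD5Sum")}
    if local["MD5Sum"] in remote_md5:
        return True
    fallback = {
        (r.get("FileName"), r.get("DateTimeOriginal"))
        for r in remote_images
        if not r.get("MD5Sum")
    }
    return (local["FileName"], local["DateTimeOriginal"]) in fallback


# Invented sample data: one remote record with a hash, one without.
remote = [
    {"FileName": "a.jpg", "DateTimeOriginal": "2015:07:01 10:00:00", "MD5Sum": "a" * 32},
    {"FileName": "b.jpg", "DateTimeOriginal": "2015:07:01 10:00:05"},
]
by_hash = is_duplicate(
    {"FileName": "renamed.jpg", "DateTimeOriginal": "", "MD5Sum": "a" * 32}, remote
)
by_meta = is_duplicate(
    {"FileName": "b.jpg", "DateTimeOriginal": "2015:07:01 10:00:05", "MD5Sum": "b" * 32},
    remote,
)
new_image = is_duplicate(
    {"FileName": "c.jpg", "DateTimeOriginal": "2015:07:01 10:00:09", "MD5Sum": "c" * 32},
    remote,
)
```

Restricting the metadata check to hash-less records keeps the exact comparison authoritative wherever it is available.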

Would you mind sharing a snippet of your patch via gist or a pull request @sfrostick? I'll have a think about it and see if it makes sense to implement something within smugline (also as long as I have some spare time to play with it... perhaps over the holiday season)

@sfrostick

Submitted a pull request with my hack, which I wouldn't use but which shows my thinking. Something along the lines of another command line argument to switch between md5sum and EXIF matching might be useful.
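
Such a switch could be sketched with the standard library's argparse. The option name `--match-by` and its values are hypothetical, and smugline's real command line handling may be structured quite differently.

```python
import argparse


def build_parser():
    """Parser with a hypothetical option for choosing the duplicate check."""
    parser = argparse.ArgumentParser(prog="smugline-sketch")
    parser.add_argument(
        "--match-by",
        choices=["md5", "exif"],
        default="md5",
        help="how to detect images that are already in the album",
    )
    return parser


default_args = build_parser().parse_args([])
exif_args = build_parser().parse_args(["--match-by", "exif"])
```

Keeping md5 as the default would leave existing behaviour unchanged for anyone who does not pass the new option.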
