Managing files from AWS s3 #146

Open
Libisch opened this issue Apr 6, 2017 · 4 comments

Libisch commented Apr 6, 2017

Overview

Images scraped using Scrapy (from Bagnowka, for example) are uploaded to AWS S3, not to Google Cloud Storage (where all the photos are currently stored). Even though the correct main_image_url is specified for these items, the API still returns a Google Cloud Storage URL.
In /bhs_api/item.py:

def get_image_url(image_id, bucket):
    return 'https://storage.googleapis.com/{}/{}.jpg'.format(bucket, image_id)

Expected

The main image URL should not refer to GCS if other storage is specified in main_image_url.

"main_image_url": "https://s3-us-west-2.amazonaws.com/bagnowka-scraped/full/95eb0e2ccf2a62ad46939994718ba150.jpg"

Actual

The API returns the PictureId appended to the storage.googleapis.com URI:

"main_image_url": "https://storage.googleapis.com/bhs-flat-pics/95eb0e2ccf2a62ad46939994718ba150.jpg"

#131
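
A minimal reproduction of the mismatch, using the get_image_url function quoted above and the values from the Expected and Actual examples. The stored_url and returned_url names are only illustrative; how the item actually stores its scraped URL is not shown in this issue.

```python
def get_image_url(image_id, bucket):
    # Same logic as the function quoted from /bhs_api/item.py
    return 'https://storage.googleapis.com/{}/{}.jpg'.format(bucket, image_id)


# URL the scraped item is expected to expose (from the Expected example)
stored_url = ('https://s3-us-west-2.amazonaws.com/bagnowka-scraped/full/'
              '95eb0e2ccf2a62ad46939994718ba150.jpg')

# URL built from the picture id and the bucket seen in the Actual example
returned_url = get_image_url('95eb0e2ccf2a62ad46939994718ba150', 'bhs-flat-pics')

print(returned_url)
print(returned_url == stored_url)  # the two URLs differ
```

The function always builds a Google Cloud Storage URL, so any URL already stored on the item is ignored.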

Libisch commented Apr 6, 2017

@OriHoch

OriHoch commented Apr 12, 2017

@Libisch following our discussion -

  1. Bagnowka items should have some kind of indication that we should use their url attribute as-is
  2. the enrich_item function in item.py should check this attribute
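
The two steps above could be sketched roughly as follows. This is not the project's actual implementation: the use_url_as_is flag, the PictureId lookup, the DEFAULT_BUCKET constant, and the simplified enrich_item signature are all assumptions made for illustration.

```python
DEFAULT_BUCKET = 'bhs-flat-pics'  # bucket name taken from the example in this issue


def get_image_url(image_id, bucket):
    return 'https://storage.googleapis.com/{}/{}.jpg'.format(bucket, image_id)


def enrich_item(item, bucket=DEFAULT_BUCKET):
    """Hypothetical, simplified stand-in for enrich_item in item.py."""
    # Assumed flag: scraped items (e.g. Bagnowka) mark their URL as final,
    # so it is returned untouched.
    if item.get('use_url_as_is') and item.get('main_image_url'):
        return item
    # Otherwise build the Google Cloud Storage URL from the picture id.
    picture_id = item.get('PictureId')
    if picture_id:
        item['main_image_url'] = get_image_url(picture_id, bucket)
    return item


# A scraped item keeps its S3 URL; a regular item gets a GCS URL.
s3_url = ('https://s3-us-west-2.amazonaws.com/bagnowka-scraped/full/'
          '95eb0e2ccf2a62ad46939994718ba150.jpg')
scraped = enrich_item({'use_url_as_is': True, 'main_image_url': s3_url})
regular = enrich_item({'PictureId': '12345'})
print(scraped['main_image_url'])
print(regular['main_image_url'])
```

An explicit flag keeps the decision with whoever creates the item, rather than having the API guess from the shape of the URL.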

@nuritgazit nuritgazit added this to the High Priority Backend milestone Apr 27, 2017

Libisch commented May 11, 2017

Done. See #165

OriHoch commented May 15, 2017

@Libisch if it's done, please assign it to @TheGrandVizier for QA (if needed).
If it doesn't require QA, or it's very technical, you should assign it to yourself for QA.
