Add API support for downloading git based urls #5

Closed
wants to merge 1 commit into from

Conversation

TG1999
Collaborator

@TG1999 TG1999 commented Nov 24, 2019

Partially solves issue #1
Signed-off-by: unknown <tushar.goel.dav.gmail.com>

@TG1999
Collaborator Author

TG1999 commented Nov 25, 2019

@pombredanne please have a look at it

@TG1999 TG1999 closed this Nov 25, 2019
@TG1999 TG1999 reopened this Nov 25, 2019
Member

@pombredanne pombredanne left a comment

Thanks, and sorry for the time it took to review this!

fetchcode/giturl.py (outdated, resolved)
fetchcode/giturl.py (outdated, resolved)
fetchcode/giturl.py (outdated, resolved)
fetchcode/giturl.py (resolved)
if dest:
dest_dir = os.path.join(dest, branch_name)
else:
dest_dir = os.path.join(os.environ.get('CHARM_DIR'), "fetched",
Member

What is CHARM_DIR?

Member

Since you copied this code from https://github.com/juju/charm-helpers/blob/master/charmhelpers/fetch/giturl.py or similar, we absolutely need to track its origin and license AND keep all original notices.

We never borrow or copy code without tracking it and documenting where it comes from.

Now, why not reuse charm-helpers as a library directly?

Collaborator Author

I think that, instead of using their module, I should write my own with a few small tweaks. How can I track its origin? Please help me with that.

fetchcode/giturl.py (outdated, resolved)
return True

def clone(source, dest, branch='master', depth=None):
if not canHandle(source):
Member

Maybe a small docstring would help? After all, you are not only cloning but also pulling.

fetchcode/giturl.py (outdated, resolved)
fetchcode/giturl.py (outdated, resolved)
fetchcode/giturl.py (outdated, resolved)
@TG1999
Collaborator Author

TG1999 commented Jan 26, 2020

I got your point @pombredanne will do the changes in 1-2 days

Member

@pombredanne pombredanne left a comment

Thank you and sorry for taking so long to review this!
See my comments inline. You also really want to start adding some tests too.

fetchcode/giturl.py (outdated, resolved)
fetchcode/giturl.py (resolved)
fetchcode/giturl.py (outdated, resolved)
fetchcode/giturl.py (outdated, resolved)
@TG1999
Collaborator Author

TG1999 commented Feb 21, 2020

@pombredanne changes are done, please check them.

Contributor

@steven-esser steven-esser left a comment

See my feedback.

Additionally, you need many more test URLs to make sure that your code will work in all cases. We need tests for Null URLs, '' URLs, git:// structured URLs, svn:// structured URLs, ftp:// structured URLs and many others.
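As a rough illustration of the kind of coverage requested above, here is a minimal table-driven sketch. The `can_handle` helper and the example.com URLs are hypothetical stand-ins, not the PR's actual code; only the tuple of accepted schemes is taken from the snippet under review.

```python
from urllib.parse import urlparse

# Schemes accepted by the snippet under review in this PR.
GIT_SCHEMES = ('https', 'git', 'git+ssh', 'ssh', 'git+https')


def can_handle(source):
    """Return True if `source` looks like a URL with a supported git scheme.

    Hypothetical helper: None, empty and non-string inputs are rejected
    before any parsing takes place.
    """
    if not source or not isinstance(source, str):
        return False
    return urlparse(source).scheme in GIT_SCHEMES


# One (url, expected) pair per kind of input named in the review.
CASES = [
    (None, False),
    ('', False),
    ('git://example.com/owner/repo.git', True),
    ('git+ssh://git@example.com/owner/repo.git', True),
    ('svn://example.com/owner/repo', False),
    ('ftp://example.com/owner/repo.tar.gz', False),
]

for url, expected in CASES:
    assert can_handle(url) is expected, url
```

Keeping the cases in one table makes it cheap to add a new URL shape whenever a reviewer asks for one.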

fetchcode/giturl.py (outdated, resolved)
fetchcode/giturl.py (outdated, resolved)
Returns destination directory
"""
url_parts = urlparse(source)
repo_name = url_parts.path.strip('/').split('/')[-1].split('.')[0]
Contributor

This is very ugly and I do not know what it does. You should use urllib.parse to parse URLs instead of homemade string manipulation.

Collaborator Author

I am already using urllib.parse, but it only gives me the path; I then have to parse that path into something meaningful. I agree it looks ugly, which is why I have now split it across multiple lines and explained each step. I hope that will work :)

Contributor

Hmm, maybe you could use https://docs.python.org/3/library/pathlib.html to handle the various pieces of the path?

The point here being, parsing of paths has been done before by others. I would much rather use a tested library instead of splitting path strings, when possible.
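A minimal sketch of what the pathlib suggestion could look like, assuming the repository name is the last path segment of the URL. The helper name `repo_name_from_url` and the example.com URL are illustrative only; `PurePosixPath` is used here so the result does not depend on the host operating system.

```python
from pathlib import PurePosixPath
from urllib.parse import urlparse


def repo_name_from_url(source):
    """Return the last path segment of a git URL without its final suffix."""
    return PurePosixPath(urlparse(source).path).stem


def repo_name_by_splitting(source):
    """The string-splitting approach from the snippet under review, for comparison."""
    path = urlparse(source).path
    return path.strip('/').split('/')[-1].split('.')[0]


url = 'https://example.com/owner/my.repo.git'
# .stem drops only the final ".git"; split('.')[0] also drops ".repo".
assert repo_name_from_url(url) == 'my.repo'
assert repo_name_by_splitting(url) == 'my'
```

One caveat with `.stem`: a URL without the `.git` suffix whose repository name contains a dot would still lose its last dotted part, so tests for that shape are worth adding.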

@TG1999
Collaborator Author

TG1999 commented Feb 29, 2020

Hi @MaJuRG, can you give me some sample git URLs that I have to handle in the test cases?

@steven-esser
Contributor

@TG1999 Here is a list of possible git urls combos: https://stackoverflow.com/questions/31801271/what-are-the-supported-git-url-formats

You can craft samples from these skeletons.

@TG1999
Collaborator Author

TG1999 commented Mar 3, 2020

Hey @MaJuRG, thanks for your guidance. I think this PR is now good to go:

  • All the test cases are covered (if any is left by chance, please give me the URL for that case and I will add it too).
  • I now use pathlib to extract the repo name.

Contributor

@steven-esser steven-esser left a comment

Left some formatting comments.

A quick note: Our convention is to indent with spaces, not tabs. Our default is 4 spaces per line of indentation.

On a side note, have you run the entire test suite for fetchcode? Since we have no CI at the moment, I will have to check later and see if everything passes.

"""
Testing https based URLs
"""

Contributor

Remove this line break

"""
Testing git based URLs
"""

Contributor

Remove this line break

"""
Testing git+ssh based URLs
"""

Contributor

Remove this line break

"""
Testing git+https based URLs
"""

Contributor

Remove this line break

"""
Testing ssh based URLs
"""

Contributor

Remove this line break

"""
url_parts = urlparse(source)
if url_parts.scheme in ('https', 'git', 'git+ssh', 'ssh', 'git+https'):
return True
Contributor

This line is out of line with the rest of the file.

You need to indent in increments of 4 spaces per line of indentation. This line should be indented 8 spaces in this case.

if os.path.exists(dest):
print('Can not clone since repository already exists')
else:
os.mkdir(dest)
Contributor

This line is out of line with the rest of the file.

You need to indent in increments of 4 spaces per line of indentation. This line should be indented 8 spaces in this case.

os.mkdir(dest)
cmd = ['git', 'clone', source, dest, '--branch', branch]
if depth:
cmd.extend(['--depth', depth])
Contributor

This line is out of line with the rest of the file.

You need to indent in increments of 4 spaces per line of indentation. This line should be indented 8 spaces in this case.

repo_name = Path(url_parts.path).stem

if dest:
dest_dir = os.path.join(dest, repo_name)
Contributor

This line is out of line with the rest of the file.

You need to indent in increments of 4 spaces per line of indentation. This line should be indented 8 spaces in this case.

if dest:
dest_dir = os.path.join(dest, repo_name)
else:
dest_dir = os.path.join('./', repo_name)
Contributor

This line is out of line with the rest of the file.

You need to indent in increments of 4 spaces per line of indentation. This line should be indented 8 spaces in this case.

@TG1999
Collaborator Author

TG1999 commented Mar 3, 2020

Yes, I have run the test suite for it :) I will fix the indentation as per your comments.

@TG1999
Collaborator Author

TG1999 commented Mar 3, 2020

@MaJuRG Done 👍

@TG1999
Collaborator Author

TG1999 commented Mar 5, 2020

@MaJuRG @pombredanne please look into this PR :D

Member

@pombredanne pombredanne left a comment

@TG1999 Thank you for the updates. See the few nitpicks from my review before I can merge!

import os

from urllib.parse import urlparse
from pathlib import Path
Member

Can you sort your imports here and everywhere?
os, pathlib, urllib

print('Can not clone since repository already exists')
else:
os.mkdir(dest)
cmd = ['git', 'clone', source, dest, '--branch', branch]
Member

This is OK for now, but we would want to have something much more robust based on pip code in the future.
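For illustration only, here is one small step toward robustness. It is not how pip does it: the command is built as a list of strings in a separate function so it can be tested without running git. The function name `build_clone_command` is hypothetical; `str(depth)` is there because `subprocess` expects every argument to be a string, and `--` marks the end of options so a URL starting with a dash is not read as a flag.

```python
import subprocess


def build_clone_command(source, dest, branch='master', depth=None):
    """Build the argument list for `git clone` without executing it."""
    cmd = ['git', 'clone', '--branch', branch]
    if depth:
        # depth may arrive as an int; subprocess needs strings.
        cmd.extend(['--depth', str(depth)])
    # "--" ends option parsing, so source and dest are always positional.
    cmd.extend(['--', source, dest])
    return cmd


def clone(source, dest, branch='master', depth=None):
    """Run the clone and raise CalledProcessError if git exits non-zero."""
    subprocess.run(build_clone_command(source, dest, branch, depth), check=True)


print(build_clone_command('https://example.com/owner/repo.git', '/tmp/repo', depth=1))
```

Separating command construction from execution means the tests can assert on the exact arguments instead of mocking `os.system`.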


clone(source, dest_dir, branch, depth)

return dest_dir
Member

Please add a trailing LF.

Collaborator Author

Hey, what is LF :D

Member

Sorry for the lack of clarity. I meant a line feed, i.e. a newline character.
The convention is to always have one at the end of text files.

@mock.patch('os.system')
def test_git_https(mock_os):
"""
Testing ssh based URLs
Member

test_git_https makes me believe that this is a test for git and https,
but the docstring says "Testing ssh based URLs".
It would be simpler and cleaner to:

  1. use verbose and descriptive test function names
  2. not use docstrings for tests

So your function could be instead:
test_fetch_with_ssh_git_url_returns_a_response(mock_os)

"""
url = 'ssh://[email protected]:EastCloud/node-websockets.git'
with mock.patch('os.mkdir') as mocked_file:
response = giturl.fetch(source=url,branch='master',depth=None,dest='/home/tg')
Member

Please make sure that your spacing is correct.
This should be:
(source=url, branch='master', depth=None, dest='/home/tg')

@TG1999
Collaborator Author

TG1999 commented Mar 6, 2020

@pombredanne @MaJuRG good to go ?

url = 'https://github.com/TG1999/fetchcode.git'
with mock.patch('os.mkdir') as mocked_file:
response = giturl.fetch(source=url, branch='master', depth=None, dest='/home/tg')
assert response is not None
Contributor

Can you add to your tests? I would like to see what this response object contains for each of these url schemes.

Collaborator Author

Add what, @MaJuRG? I do not understand 😅

Contributor

@steven-esser steven-esser Mar 9, 2020

Well, you only test if response is not None. This is not very descriptive as to what fetching a git URL gets you. What are the attributes to this response object? What are the values for these various attributes for each of these test urls?
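To make the request concrete, here is a sketch of a test that pins down each attribute rather than only checking for `None`. The `Response` tuple and the `describe_git_url` helper are hypothetical stand-ins for whatever `giturl.fetch` actually returns; the attribute names `dest_dir`, `url_scheme` and `domain` are assumed for the example, and nothing is cloned.

```python
import posixpath
from collections import namedtuple
from pathlib import PurePosixPath
from urllib.parse import urlparse

# Assumed response shape, for illustration only.
Response = namedtuple('Response', ['dest_dir', 'url_scheme', 'domain'])


def describe_git_url(source, dest):
    """Derive the response fields from the URL alone (no network access)."""
    parts = urlparse(source)
    repo_name = PurePosixPath(parts.path).stem
    return Response(
        dest_dir=posixpath.join(dest, repo_name),
        url_scheme=parts.scheme,
        domain=parts.hostname,
    )


def test_https_git_url_reports_scheme_domain_and_dest_dir():
    response = describe_git_url('https://github.com/TG1999/fetchcode.git', dest='/home/tg')
    assert response.url_scheme == 'https'
    assert response.domain == 'github.com'
    assert response.dest_dir == '/home/tg/fetchcode'


test_https_git_url_reports_scheme_domain_and_dest_dir()
```

Each assertion documents one attribute of the result, so a reader can see from the tests what fetching a given URL scheme produces.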

Collaborator Author

Okay I am getting you, doing the changes :)

@TG1999
Collaborator Author

TG1999 commented Mar 9, 2020

@MaJuRG I have added tests for the scheme and domain of each URL, please check them :) If any case is not handled, please tell me and I will try to resolve those cases too 😄

with mock.patch('os.mkdir') as mocked_file:
response = giturl.fetch(source=url, branch='master', depth=None, dest='/home/tg')
assert response.url_scheme == 'ssh'
assert response.domain == '[email protected]:EastCloud'
Contributor

Shouldn't the domain be github.com as well for this case?

Collaborator Author

Okay, I get your point. Can you suggest a way to parse the domain for ssh based URLs? Currently I am using urllib.parse, which gives me this kind of domain name. What approach should I use, or should I handle the ssh case separately and do some string manipulation of my own?

Contributor

@TG1999 Well, the first step would be to look at the urllib docs to see exactly how things like this address get parsed. If there is nothing there, then we probably want to handle this as a special case. I haven't looked in depth at the docs myself, so I do not know the exact way to approach this.
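As a small experiment along those lines, the sketch below shows how `urllib.parse.urlparse` splits an ssh URL that uses a colon before the owner segment. The URL is a made-up example.com address, not the one from the test. In this shape `netloc` keeps the user and the text after the colon, while the `hostname` attribute strips both.

```python
from urllib.parse import urlparse

# Hypothetical URL with the same shape as the ssh test case.
url = 'ssh://git@example.com:owner/repo.git'
parts = urlparse(url)

# netloc is everything between "//" and the next "/".
assert parts.scheme == 'ssh'
assert parts.netloc == 'git@example.com:owner'
assert parts.path == '/repo.git'

# hostname drops the "git@" user part and the ":owner" part.
assert parts.hostname == 'example.com'
```

Note that `parts.port` should not be read for such a URL: the text after the colon is not a number, so accessing it raises `ValueError`.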

Collaborator Author

Hi @MaJuRG, I have handled it separately now, and all test cases are passing too. Good to go?

@TG1999 TG1999 closed this Mar 9, 2020
@TG1999 TG1999 reopened this Mar 9, 2020
@TG1999
Collaborator Author

TG1999 commented Mar 10, 2020

@pombredanne only your approval is left, please approve it 😅

@TG1999 TG1999 requested a review from pombredanne March 27, 2020 15:33
@TG1999 TG1999 closed this Jul 28, 2020
3 participants