[fix] Add base URL extraction method to GithubRepositoryReader and update version to 0.4.0 #16926
Add this suggestion to a batch that can be applied as a single commit.
This suggestion is invalid because no changes were made to the code.
Suggestions cannot be applied while the pull request is closed.
Suggestions cannot be applied while viewing a subset of changes.
Only one suggestion per line can be applied in a batch.
Add this suggestion to a batch that can be applied as a single commit.
Applying suggestions on deleted lines is not supported.
You must change the existing code in this line in order to create a valid suggestion.
Outdated suggestions cannot be applied.
This suggestion has been applied or marked resolved.
Suggestions cannot be applied from pending reviews.
Suggestions cannot be applied on multi-line comments.
Suggestions cannot be applied while the pull request is queued to merge.
Suggestion cannot be applied right now. Please check back later.
Description
This PR introduces a new method to extract the base URL from various types of GitHub blob URLs in the GithubRepositoryReader. The change addresses an issue where URL parsing resulted in incorrect base URLs for GitHub Enterprise domains, leading to inaccurate links when interacting with repositories hosted on custom or enterprise GitHub instances.
Changes
Fixes # (issue)
New Package?
Did I fill in the
tool.llamahub
section in thepyproject.toml
and provide a detailed README.md for my new integration or package?Version Bump?
Did I bump the version in the
pyproject.toml
file of the package I am updating? (Except for thellama-index-core
package)Type of Change
Please delete options that are not relevant.
How Has This Been Tested?
Your pull-request will likely not be merged unless it is covered by some form of impactful unit testing.
Suggested Checklist:
make format; make lint
to appease the lint gods