API: please make commit hash/timestamp available with /repos/{owner}/{repo}/contents/{filepath} #12840

Closed
1 of 7 tasks
IzzySoft opened this issue Sep 14, 2020 · 23 comments · Fixed by #20398
Labels
modifies/api This PR adds API routes or modifies them type/feature Completely new functionality. Can only be merged if feature freeze is not active.

Comments

@IzzySoft

IzzySoft commented Sep 14, 2020

Description

I'm trying to fetch commit timestamps for a given file via the API. Using /repos/{owner}/{repo}/contents/{filepath} (e.g. https://try.gitea.io/api/v1/repos/magnusja/libaums/contents/app%2Fbuild.gradle) reports details for the file just fine, but obviously lists a wrong sha: using the sha given there as {ref} for /repos/{owner}/{repo}/commits/{ref}/statuses gives an empty result, and for https://try.gitea.io/api/v1/repos/magnusja/libaums/commits?sha=049bdf7bbc8ff39411b618ff855a14ed3446dcc1&limit=1 it even results in a 503 server error. Trying the sha on /repos/{owner}/{repo}/git/trees/{sha} (where it explicitly says "sha") results in a "bad request: sha not found".

Reproducible for other repos/files, e.g. on Codeberg.org – I could not find a single valid commit hash in the values given for sha, and no way to get the correct commit hashes.

UPDATE: as pointed out by @mrsdizzie: "this is the file SHA and not a bug" – hence converting this from a "bug report" to a feature request: As currently there's no way to get the file's timestamp/commit-hash, please add it at this level (https://try.gitea.io/api/swagger#/repository/repoGetContents – as suggested by @mrsdizzie). Ideally both. More details on the commit itself could be obtained easily using /repos/{owner}/{repo}/commits/{ref}/statuses, for example, once the commit hash is known.

@6543 6543 added modifies/api This PR adds API routes or modifies them type/bug labels Sep 15, 2020
@6543 6543 added this to the 1.13.0 milestone Sep 15, 2020
@IzzySoft
Author

IzzySoft commented Sep 15, 2020

@6543 thanks for digging that up! I'm no Go dev, but tried to figure it out myself in the hope of giving a pointer. Ended up at the very same place, but was unsure if I might have misread it. Thanks for confirming!

Problem is, as there's no way to figure out the correct SHA via the API, all methods requiring it will fail – unless there's a second source to get the SHA from. If someone has a work-around for getting the latest commit SHA for a given path (apart from checking out the entire repo), I'd welcome the hint!

@johanvdw
Contributor

johanvdw commented Sep 28, 2020

I don't think your analysis is correct:

if I go to https://try.gitea.io/api/v1/repos/magnusja/libaums/contents/app%2Fbuild.gradle
I get "sha":"1c5ce3a124eb0acc1ddda0da47d108a89e99fbde"

I get the same sha when checking on the command line:

johan@x1:/tmp/libaums(develop)$ git ls-files -s app/build.gradle
100644 1c5ce3a124eb0acc1ddda0da47d108a89e99fbde 0	app/build.gradle

or also

johan@x1:/tmp/libaums(develop)$ git hash-object app/build.gradle
1c5ce3a124eb0acc1ddda0da47d108a89e99fbde

You are confusing the file sha with the commit sha
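To make the distinction concrete, here is a minimal sketch of how the file (blob) SHA reported by `git hash-object` can be reproduced by hand. It assumes a repository using SHA-1 object IDs; the function name `git_blob_sha` is only illustrative and not part of git or Gitea.

```python
import hashlib


def git_blob_sha(content: bytes) -> str:
    """Return the SHA-1 object ID git would assign to a blob with this content."""
    # Git hashes a small header ("blob <size in bytes>\0") followed by the raw
    # file bytes. No path, author or date goes into it, which is why this
    # value identifies file content and never a commit.
    header = b"blob " + str(len(content)).encode("ascii") + b"\0"
    return hashlib.sha1(header + content).hexdigest()


if __name__ == "__main__":
    # The well-known ID of the empty blob.
    print(git_blob_sha(b""))  # e69de29bb2d1d6434b8b29ae775ad8c2e48c5391
```

Two files with identical bytes get the same blob SHA regardless of which commit introduced them, so this value cannot be used as a `{ref}` on commit routes.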

@johanvdw
Contributor

Note the behaviour of github is the same here:
https://api.github.com/repos/go-gitea/gitea/contents/LICENSE

  "sha": "a8d4b49dd073a4a38a7e58385eeff7cc52568697"
johan@x1:~/git/gitea(master)$ git ls-files -s LICENSE 
100644 a8d4b49dd073a4a38a7e58385eeff7cc52568697 0	LICENSE

@johanvdw
Contributor

FYI, it looks like the github api does support specifying a path
https://developer.github.com/v3/repos/commits/#list-commits-on-a-repository

Gitea does not, but that does not seem a bug but rather a feature request.
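For comparison, a rough sketch of how the GitHub "list commits" route can be narrowed to one path. The `path`, `sha` and `per_page` query parameters are documented by GitHub; the helper names here are hypothetical, and the sketch only builds the URL and reads a decoded response rather than performing the request.

```python
from urllib.parse import quote, urlencode


def github_last_commit_url(owner, repo, path, ref=None):
    """Build the URL that lists only the newest commit touching `path`."""
    params = {"path": path, "per_page": 1}
    if ref:
        # Optional branch, tag or commit SHA to start listing from.
        params["sha"] = ref
    return (
        "https://api.github.com/repos/"
        f"{quote(owner, safe='')}/{quote(repo, safe='')}/commits?{urlencode(params)}"
    )


def extract_last_commit(commits):
    """Return (commit_sha, committer_date) from the decoded JSON list, or None."""
    if not commits:
        return None
    newest = commits[0]
    return newest["sha"], newest["commit"]["committer"]["date"]


if __name__ == "__main__":
    print(github_last_commit_url("go-gitea", "gitea", "LICENSE"))
```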

@IzzySoft
Author

You are confusing the file sha with the commit sha

I only get one SHA reported. If you can tell me where to find the commit SHA, I'd immediately take that, of course!

Maybe I started with wrong assumptions having used the APIs of Github/GitLab before. But which approach would you suggest then if I want to get the timestamp for the last commit of a given file?

@mrsdizzie
Member

The comment above should be correct: this is the file SHA and not a bug.

For the feature request itself I don't think we currently provide a way to get the last timestamp of a file by name via API -- Maybe it should just be added as a response field to https://try.gitea.io/api/swagger#/repository/repoGetContents

Or add last commit SHA there as well and be able to look it up with that

@mrsdizzie mrsdizzie removed this from the 1.13.0 milestone Sep 29, 2020
@IzzySoft
Author

Maybe it should just be added as a response field to https://try.gitea.io/api/swagger#/repository/repoGetContents

Would be a good place indeed. Could be named commit (or commit_hash / commit_sha) to make a clear connection. As the result represents the file, having timestamp details available directly at this place would of course be welcome, too – even if it would just be the commit_timestamp (unix time). More details on the commit itself could be obtained easily using /repos/{owner}/{repo}/commits/{ref}/statuses, for example, once the commit hash is known.

@6543
Member

6543 commented Sep 29, 2020

@IzzySoft can you change issue title?

@6543 6543 added the type/feature Completely new functionality. Can only be merged if feature freeze is not active. label Sep 29, 2020
@IzzySoft IzzySoft changed the title API: /repos/{owner}/{repo}/contents/{filepath} reports wrong sha API: please make commit hash/timestamp available with /repos/{owner}/{repo}/contents/{filepath} Sep 29, 2020
@IzzySoft
Author

@6543 sure, done (hope that fits now). Shall I edit my original post as well (striking the wrong assumptions, adding a summary of "the real thing")?

@6543
Member

6543 commented Sep 29, 2020

thanks - if you like ...

@IzzySoft
Author

Done as well. Makes it easier to see what's open without scanning the entire thread 😉

@IzzySoft
Author

Just a short heads-up (no pressure/whining/…): Any (at least vague) ETA on when this could be expected?

johanvdw added a commit to johanvdw/gitea that referenced this issue Nov 16, 2020
@IzzySoft
Author

It's been almost 2 years now, so just a friendly reminder this issue still exists. May I kindly ask for an ETA – provided there is one meanwhile?

Gusted pushed a commit to Gusted/gitea that referenced this issue Jul 18, 2022
- When requesting the contents of a filepath, add the latest commit's
SHA to the requested file.
- Resolves go-gitea#12840
@Gusted
Contributor

Gusted commented Jul 18, 2022

#20398 will do the trick for the latest commit's SHA. Timestamps are quite a bit trickier, as there are multiple timestamps that a user could want (so it's better to leave that out and let another API figure that out).

@IzzySoft
Author

Thanks @Gusted – SHA of the latest commit is totally fine! IIRC that's the same way Github and GitLab are handling that (I just remember when implementing that in my tool I always had to go via the commit). As soon as that's released and implemented at Codeberg I can verify and confirm.

@Gusted
Contributor

Gusted commented Jul 18, 2022

As soon as that's released and implemented at Codeberg I can verify and confirm.

It's an enhancement, so it is not being backported to 1.17 (which is the next release), which means it can take a while... Given it's small, it might be fine to backport it.

@IzzySoft
Author

It waited for almost 2 years now, so I can be patient for another few weeks I guess 😄 Whenever it arrives I'll check. Need to update my library then to be able to confirm the path is complete 😉

@zeripath
Contributor

IIRC that's the same way Github and GitLab are handling that

Are you sure that Github is providing this last commit information on the /repos/.../contents/{filepath} endpoint?

I don't see it here: https://docs.github.com/en/rest/repos/contents

Which Github API is exposing this information?

My problem with this FR is that although it sounds simple, it is not. There is a (potentially substantial) cost to calculating this information, which we tolerate for the UI through caching, time-deferred calculation and plain old user demand, but it's not necessarily a good idea to provide this on the API because it is computationally expensive.

What is your use-case here?


PS I know that git log -1 <refish/commitish> -- <path> will calculate this information - but this is not free on large repos and on old unchanged paths. It's substantially cheaper with commit-graphs (especially bloom filtered graphs) - which is why these represent such a huge improvement for git but... it's really not free.
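As a rough illustration of the `git log -1` lookup described above, the sketch below wraps it in a few helpers. The helper names are hypothetical; `%H` (commit hash) and `%ct` (committer date as a Unix timestamp) are standard `git log` format placeholders. It needs a local clone and carries the same cost caveat on large repositories.

```python
import subprocess


def last_commit_command(ref, path):
    """Argument list for: git log -1 --format='%H %ct' <ref> -- <path>."""
    return ["git", "log", "-1", "--format=%H %ct", ref, "--", path]


def parse_last_commit(output):
    """Parse '<sha> <unix timestamp>' into (sha, int); None if nothing matched."""
    line = output.strip()
    if not line:
        # git prints nothing when no commit on <ref> touches <path>.
        return None
    sha, timestamp = line.split()
    return sha, int(timestamp)


def last_commit(repo_dir, ref, path):
    """Run the lookup inside a local clone located at repo_dir."""
    result = subprocess.run(
        last_commit_command(ref, path),
        cwd=repo_dir,
        capture_output=True,
        text=True,
        check=True,
    )
    return parse_last_commit(result.stdout)
```

Git has to walk history until it finds a commit that changed the path, which is why old, rarely touched files are the slow case.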

@IzzySoft
Author

@zeripath I'm pretty sure, as I use it daily – also with GitLab's API, just Gitea is missing it. Use-case is fetching files only if they do not yet exist locally in their recent version (i.e. not at all, or were updated; and no, checking out all the thousand repos this runs on is no option). Basically, whenever my updater fetches a new release APK from a project, it checks specific metadata files (such as description and screenshots) for whether they were updated as well. Only changed files shall be pulled for post-processing.

@zeripath
Contributor

I have been able to find the Gitlab API https://docs.gitlab.com/ee/api/repository_files.html#get-file-from-repository but I cannot find the Github one.

Please show me the Github API that you are using - we need to make sure that we don't call whatever we do something that would conflict.


I don't understand why you aren't using the sha of the file for this. That is the unique identifier of the file. You don't need the last commit hash to detect whether a file has changed - the sha tells you that. For the last commit SHA to be usable in this context you'd have to be storing it against the file path - in which case storing the SHA of the file instead would be better as it is the literal cache object.
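The caching approach suggested here could look roughly like the following sketch. It assumes the client keeps a mapping from file path to the blob `sha` it last downloaded; `files_to_fetch` and the sample values are hypothetical.

```python
def files_to_fetch(cached_shas, remote_shas):
    """Return the paths whose remote blob SHA differs from the cached one.

    cached_shas: path -> blob SHA stored after the previous download.
    remote_shas: path -> "sha" field reported by the contents endpoint now.
    """
    return sorted(
        path
        for path, sha in remote_shas.items()
        # A path missing from the cache yields None and therefore counts as changed.
        if cached_shas.get(path) != sha
    )


if __name__ == "__main__":
    cached = {"description.txt": "aaa111", "icon.png": "bbb222"}
    remote = {"description.txt": "aaa111", "icon.png": "ccc333", "new.png": "ddd444"}
    print(files_to_fetch(cached, remote))  # ['icon.png', 'new.png']
```

This detects that content changed, but says nothing about when it changed.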

@IzzySoft
Author

Please show me the Github API that you are using

See at the top of the method, it's in the comments 😉 https://developer.github.com/v3/repos/commits/ – and of course https://developer.github.com/v3/repos/contents/ for the contents themselves.

I don't understand why you aren't using the sha of the file for this.

Because all other sub-routines (Github, GitLab) use the commit date, and I want to keep the class consistent. Data is collected into a unified structure looking the same independent of the source – to be processed at a different place, again independent from the source. The routine taking care to replace the needed files doesn't know of the API; it basically just gets a list of files and the corresponding time stamps. I cannot mix SHAs into that. Further there are other routines relying on the same structure (see below).

You don't need the last commit hash to detect whether a file has changed

But apart from unification, I'd need it to tell what time a file was "touched" last (aka "this file has not been updated for 7 years"). Among other things. Or to compare if the file I have locally is newer than the latest commit (and thus was e.g. modified locally and should not be overwritten). The SHA doesn't tell me that.
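A minimal sketch of that timestamp comparison, assuming the commit date arrives as an ISO 8601 string with a timezone (for example `2020-09-14T10:00:00Z`) and the local side is a file modification time in seconds since the epoch. Both helper names are hypothetical.

```python
from datetime import datetime, timezone


def parse_commit_date(value):
    """Parse an ISO 8601 timestamp; a trailing 'Z' is rewritten as '+00:00'."""
    # datetime.fromisoformat() only accepts the 'Z' suffix from Python 3.11 on.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


def local_copy_is_newer(local_mtime, commit_date):
    """True if the local file was modified after the latest commit on the path."""
    local = datetime.fromtimestamp(local_mtime, tz=timezone.utc)
    return local > parse_commit_date(commit_date)
```

In practice `local_mtime` could come from `os.path.getmtime(path)`; a `True` result would signal a locally modified file that should not be overwritten.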

@Gusted
Contributor

Gusted commented Jul 20, 2022

Gitlab calls it last_commit_id. Any strong opinion on switching to that name versus keeping the current last_commit_sha?

@IzzySoft
Author

From my point of view: as you please. The important thing is one can get the SHA to connect to the commit. As this part of the code is specific to the service, any name should be fine 😉
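A client-side sketch of how the new field might be consumed, assuming the response keeps the `last_commit_sha` name discussed above next to the existing `sha`. The helper names are hypothetical, and `fetch_contents` performs a real HTTP request, so it is defined but not called here.

```python
import json
from urllib.parse import quote
from urllib.request import urlopen


def gitea_contents_url(base_url, owner, repo, filepath):
    """Build /api/v1/repos/{owner}/{repo}/contents/{filepath} for a Gitea host."""
    # Encode the whole file path as one segment ("/" becomes "%2F"),
    # mirroring the example URL in the original report.
    return (
        f"{base_url.rstrip('/')}/api/v1/repos/"
        f"{quote(owner, safe='')}/{quote(repo, safe='')}/contents/{quote(filepath, safe='')}"
    )


def shas_from_contents(entry):
    """Return (file_sha, last_commit_sha); the latter is None on older servers."""
    return entry.get("sha"), entry.get("last_commit_sha")


def fetch_contents(url):
    """Download and decode the contents response (network access required)."""
    with urlopen(url) as response:
        return json.load(response)
```

Using `.get()` keeps the client working against instances that do not yet return the field.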

lunny pushed a commit that referenced this issue Jul 30, 2022
* Add latest commit's SHA to content response

- When requesting the contents of a filepath, add the latest commit's
SHA to the requested file.
- Resolves #12840

* Add swagger

* Fix NPE

* Fix tests

* Hook into LastCommitCache

* Move AddLastCommitCache to a common nogogit and gogit file

Signed-off-by: Andrew Thornton <[email protected]>

* Prevent NPE

Co-authored-by: Andrew Thornton <[email protected]>
Co-authored-by: wxiaoguang <[email protected]>
vsysoev pushed a commit to IntegraSDL/gitea that referenced this issue Aug 10, 2022
@github-actions github-actions bot locked as resolved and limited conversation to collaborators May 5, 2023