Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Integrate cml publish with cml send-comment #1026

Merged
merged 23 commits into from
Jun 23, 2022
Merged

Conversation

0x2b3bfa0
Copy link
Member

@0x2b3bfa0 0x2b3bfa0 commented May 27, 2022

Use npm install --global github:iterative/cml#7bd9257 to use this feature.

Usage

tee report.md <<END
# Report
![cat](cat.jpg)
END

npx github:iterative/cml#7bd9257 send-comment --publish --watch report.md &

while sleep 30; do
  curl --location https://thecatapi.com/api/images/get?format=src > cat.jpg
done

When using locally (as opposed to running from CI/CD), provide also --repo https://github.com/user/repository, --token ghp_personal_access_token and --commit-sha a1b2c3d pointing to a commit on that repository, preferably part of an open pull request.

Behavior

  • --publish — uploads and replaces all the local paths on report.md (e.g. links & images)
    • e.g. ![description](outputs/plot.png) becomes ![description](https://assets.cml.dev/...)
  • --watch — watches report.md and all the local paths it contains
    • i.e. when any of them changes, updates the comment in the forge

Experimental

  • --trigger-file — specify a trigger file with the same behavior as DVC checkpoint file-based API1
    • only effective along with --watch

Using a trigger file

cml send-comment --publish --watch --trigger-file=example report.md &

while true; do
  date > report.md # modify the report
  touch example # trigger an update
  while test -f example; do sleep 1; done # wait for the update to finish
done

Pending

Reverted

Questions

  • is there any use case for calling cml publish directly after we implement this?
    • maybe (e.g. publish for use outside markdown reports?), though probably fine to keep for backward-compat? — @casperdcl dixit

    • fine to keep hidden, perhaps, as any other deprecated command? — @0x2b3bfa0 respondebat

    • p3-whatever@casperdcl 🙃

  • How does this work outside CI? Are we asking users to run this before using DVCLive?

Footnotes

  1. As the DVC team may know, a file-based API needs synchronization to avoid all sorts of pitfalls; e.g. like lost events, rate limits, corrupted files... et cetera. This makes me question whether we should follow this approach or not. 2

@0x2b3bfa0 0x2b3bfa0 temporarily deployed to internal May 27, 2022 02:31 Inactive
@0x2b3bfa0 0x2b3bfa0 temporarily deployed to internal May 27, 2022 03:00 Inactive
@casperdcl
Copy link
Contributor

btw shouldn't cml publish --update be a prerequisite (i.e. overwriting existing files on assets.cml.dev)?

@0x2b3bfa0
Copy link
Member Author

0x2b3bfa0 commented May 27, 2022

@casperdcl, cml publish --update is not necessary, because storage is content-addressed.

$ echo one > file
$ cml publish file
https://host/4355a46b19d348dc2f57c046f8ef63d4538ebb936000f3c9ee954a27460dd865
$ echo two > file
$ cml publish file
https://host/53c234e5e8472b6ac51c1ae1cab3fe06fad053beb8ebfd8977b010655bfdd3c3
$ echo two > file
$ cml publish file # same CONTENT, thus same URL
https://host/53c234e5e8472b6ac51c1ae1cab3fe06fad053beb8ebfd8977b010655bfdd3c3

Location-addressed storage is probably what you’re looking for with cml publish --update

$ echo one > file
$ cml publish file
https://host/file
$ echo two > file
$ cml publish file
https://host/file
$ echo three > file
$ cml publish --update file # same PATH, thus same URL
https://host/file

However, paths generated this second way are:

  • Insecure, i.e. easy to IDOR without authentication
  • Prone to undesirable collisions across users, repositories and workflow runs
  • Unable to bypass GitHub user content cache; it rewrites all the image source URLs to camo.githubusercontent.com1

Footnotes

  1. If dynamic status badges work despite this, it may be a non-issue (?)

@0x2b3bfa0
Copy link
Member Author

An intermediate solution to avoid clutter and preserve desirable properties of storage is:

  • Generate a UUIDv4 when the daemon starts
  • Use it as a prefix for location-based storage

@casperdcl
Copy link
Contributor

casperdcl commented May 27, 2022

I mean this:

$ echo one > file
$ ONE_URL=$(cml publish file)
$ curl -I $ONE_URL | head -n1
HTTP/1.1 200 OK
$ echo two > file
$ TWO_URL=$(cml publish file)
warn: CML detected subsequent publish of same filename in same session. Deleting old file.
$ curl -I $TWO_URL | head -n1
HTTP/1.1 200 OK
$ curl -I $ONE_URL | head -n1
HTTP/1.1 404 Not Found

To avoid us hosting 1M images per training run.

@0x2b3bfa0
Copy link
Member Author

CML detected subsequent publish of same filename in same session

Sounds like #1026 (comment)?

@0x2b3bfa0
Copy link
Member Author

Related to iterative/dvclive#91 (comment)

src/cml.js Outdated Show resolved Hide resolved
@0x2b3bfa0 0x2b3bfa0 temporarily deployed to internal May 29, 2022 23:32 Inactive
@0x2b3bfa0 0x2b3bfa0 temporarily deployed to internal May 29, 2022 23:35 Inactive
@0x2b3bfa0 0x2b3bfa0 force-pushed the comment-publish-report branch from 30d35c0 to eaf33bc Compare May 30, 2022 00:00
@0x2b3bfa0 0x2b3bfa0 temporarily deployed to internal May 30, 2022 00:00 Inactive
@0x2b3bfa0 0x2b3bfa0 force-pushed the comment-publish-report branch from eaf33bc to 85374e4 Compare May 30, 2022 00:19
@0x2b3bfa0 0x2b3bfa0 temporarily deployed to internal May 30, 2022 00:19 Inactive
@0x2b3bfa0 0x2b3bfa0 force-pushed the comment-publish-report branch from 85374e4 to 88d15df Compare May 30, 2022 01:46
@0x2b3bfa0 0x2b3bfa0 temporarily deployed to internal May 30, 2022 01:46 Inactive
@0x2b3bfa0 0x2b3bfa0 force-pushed the comment-publish-report branch from 88d15df to 869ad75 Compare May 30, 2022 02:06
@0x2b3bfa0 0x2b3bfa0 temporarily deployed to internal May 30, 2022 02:06 Inactive
@0x2b3bfa0
Copy link
Member Author

0x2b3bfa0 commented May 30, 2022

After #1026 (comment) and some other limitations1 of the file event interface, I wonder if this is the right approach to integrate CML with other tools. 🤔

Footnotes

  1. Namely, the inability of synchronizing events with rate limits, and the issues with corrupted files during write.

@casperdcl casperdcl mentioned this pull request May 30, 2022
7 tasks
@casperdcl casperdcl added cml-publish Subcommand cml-comment Subcommand labels May 30, 2022
@0x2b3bfa0 0x2b3bfa0 temporarily deployed to internal June 12, 2022 20:01 Inactive
dacbd
dacbd previously approved these changes Jun 12, 2022
Copy link
Contributor

@dacbd dacbd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code LGTM, functionality I'll defer to others. 😁 🥼

@0x2b3bfa0 0x2b3bfa0 linked an issue Jun 13, 2022 that may be closed by this pull request
7 tasks
@0x2b3bfa0 0x2b3bfa0 temporarily deployed to internal June 14, 2022 00:46 Inactive
@0x2b3bfa0 0x2b3bfa0 force-pushed the comment-publish-report branch from 763b4ff to 484143d Compare June 14, 2022 00:50
@0x2b3bfa0 0x2b3bfa0 temporarily deployed to internal June 14, 2022 00:50 Inactive
@0x2b3bfa0 0x2b3bfa0 temporarily deployed to internal June 14, 2022 21:38 Inactive
@0x2b3bfa0
Copy link
Member Author

@iterative/cml, shall we merge this?

@DavidGOrtega DavidGOrtega temporarily deployed to internal June 15, 2022 18:21 Inactive
@0x2b3bfa0 0x2b3bfa0 mentioned this pull request Jun 15, 2022
1 task
},
triggerFile: {
type: 'string',
description: 'File used to trigger the watcher',
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Trigger how?

Suggested change
description: 'File used to trigger the watcher',
description: 'If specified, --watch will trigger only when the lockfile is present, and will delete the lockfile',

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Exactly as stated in the “using a trigger file” section on #1026 (comment)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's hard to describe in a help text. 😅

  • Takes a path to a regular file that doesn't exist yet
  • Creating a regular file in that path triggers an --update
  • Once the --update finishes, the file gets automatically deleted

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
description: 'File used to trigger the watcher',
description: 'Path to the watcher trigger; create a file on that path to triger an update, then wait until the file disappears',

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need to do #762 before rewording this again :)

Copy link
Contributor

@casperdcl casperdcl Jun 23, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

update: If you'd prefer to do #762 immediately after this PR please do feel free to leave this unresolved here @0x2b3bfa0

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you're happy to see #1073 merged immediately after this pull request, fine by me. I would push a new commit to #1073 with unified periods and (perhaps) some better descriptions.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given that this is a hidden option and we don't know if DVCLive is going to use it, I'd rather not care too much about rewording (?)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

resolving unresolved

src/cml.js Show resolved Hide resolved
Copy link
Contributor

@casperdcl casperdcl left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

only had a super quick glance, just checking - is this "solution to avoid clutter and preserve desirable properties of storage" or equivalent implemented?

@0x2b3bfa0
Copy link
Member Author

only had a super quick glance, just checking - is this "solution to avoid clutter and preserve desirable properties of storage" or equivalent implemented?

yes, yes, YES

@0x2b3bfa0 0x2b3bfa0 temporarily deployed to internal June 23, 2022 14:03 Inactive
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
cml-comment Subcommand cml-publish Subcommand enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

epic: CML <> DVCLive
6 participants