Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: Support remote scripts with uv run #6375

Merged
merged 32 commits into from
Oct 10, 2024

Conversation

manzt
Copy link
Contributor

@manzt manzt commented Aug 21, 2024

First off, congratulations on the 0.3 release! The PEP 723 standalone scripts support is awesome, and I can already imagine a long tail of little scripts of my own that would benefit from this functionality.

Background

I really like the Deno CLI's support for running and installing remote scripts.

deno run <url>
deno install --name foo <url>

I can see parallels with uv run and uvx. After mentioning this on Discord, @zanieb suggested I could take a stab at a PR to implement similar functionality for uv.

Summary

This PR attempts to add support for executing remote standalone scripts directly with uv run. While this is already possible by downloading the script (i.e., via curl/wget) and then using uv run, having direct support would be convenient.

The proposed functionality is:

uv run <url>

Another addition/alternative could be to support running scripts via stdin:

curl -sL <url> | uv run -

But that is not implemented in this PR.

Test Plan

I noticed that GitHub and files.pythonhosted.org URLs are used in some of the tests. I've created a personal GitHub Gist with the example from PEP 723 for now to test this functionality.

However, I couldn't figure out how to get the with_snapshot config filter to filter out the tempfile path, so the test is currently failing. Any assistance with this would be appreciated.

Notes

I'm not totally pleased with the implementation of this PR. I think it would be better to handle the case earlier (and probably reuse the cache), and avoid mutation, but since run command requires a local path this was the simplest implementation I could come up with.

I know that performance is paramount with uv so I totally understand if this requires a different approach or something more explicit to avoid "inferring" the path. I'm just taking this as an opportunity to learn a little more Rust and acquaint myself with the code base. cheers!

@zanieb
Copy link
Member

zanieb commented Aug 21, 2024

Awesome! Thanks for putting this up.

@zanieb zanieb self-assigned this Aug 21, 2024
@zanieb zanieb added the enhancement New feature or improvement to existing functionality label Aug 21, 2024
crates/uv/tests/run.rs Outdated Show resolved Hide resolved
@mgaitan
Copy link

mgaitan commented Aug 23, 2024

Amazing feature. Imagine this with #6283

@zanieb
Copy link
Member

zanieb commented Sep 17, 2024

Sorry for the delay, I've been stretched thin on reviews.

@BurntSushi would you be able to take over review of this?

Copy link
Member

@BurntSushi BurntSushi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Before getting into the weeds of how this is implemented, I wanted to pop-up a level and think about this feature more holistically. And I think I do have a significant concern here.

My main concern is that it is technically possible for a valid URL to also be a valid file path. For example:

$ mkdir 'https:'
$ echo wat 'https:/example.com'
$ cat https://example.com
wat

If uv run https://example.com were used, then I believe this PR would result in uv doing an HTTP request instead of using the file path on disk.

Now, I'm not especially worried about this happening in normal usage. The double slash for example is clearly a bit of subterfuge to provoke the ambiguity. So I think the thing where I'm worried is if someone somehow causes a uv run command to run where the end user thinks it's hitting a local file but it's actually hitting a URL. I had a bug similarish to this in ripgrep a while back where end users thought the system gzip executable was being used, but if they cloned a repository containing a gzip.exe on Windows and ran rg from that same directory, then a quirk of Windows would result in executing the gzip.exe in the clone instead.

There are a lot of different possible mitigations to this. Or perhaps this isn't as much of a problem as I think. For example:

  • This could be a feature that requires some opt-in flag to be enabled in our config.
  • The default mode could be to show the Python script that will be executed on stdout first and then provide a prompt confirming it. And then add a --no-confirm (or whatever) flag to make it non-interactive.
  • We could look for a file path matching the URL, and if it exists, either use the local file or report an error.

Maybe there's another idea I haven't thought of.

crates/uv/src/lib.rs Outdated Show resolved Hide resolved
crates/uv/src/lib.rs Outdated Show resolved Hide resolved
@zanieb zanieb assigned BurntSushi and unassigned zanieb Sep 17, 2024
@zanieb
Copy link
Member

zanieb commented Sep 17, 2024

I generally have a preference for the "We could look for a file path matching the URL, and if it exists, either use the local file or report an error." option rather than prompting. Opt-in might make sense, but via a --remote flag rather than a persistent configuration option (maybe that's what you meant though).

@zanieb
Copy link
Member

zanieb commented Sep 17, 2024

See also #7396 (comment)

@BurntSushi
Copy link
Member

If we go the route of prioritizing a local file when given a URL when a file exists, then I think that probably satisfies me. I can't think of a way for that to go wrong from a "accidentally executed a remote script instead of the intended local file" perspective.

There may still be an argument in favor of a --remote flag, but more in a vague "otherwise it's too easy to run remote scripts" sense. I don't feel as strongly about that personally.

@manzt What do you think?

@manzt
Copy link
Contributor Author

manzt commented Sep 18, 2024

Thank you for raising this concern. I agree it's a valid issue I hadn't considered.

My preference aligns with @zanieb's suggestion to prioritize checking for a local file path matching the URL first, erroring if a match is found. For context, I checked your example with deno, which always makes the HTTP request (though it operates in a more sandboxed environment than Python, not sure why they decided on this behavior).

Regarding a --remote flag, while I don't have strong arguments for or against it, I appreciate its explicitness. The common use case I envision is sharing links to Gists or GitHub, where adding --remote before execution seems a reasonable safeguard (especially if the readme just has a copy link).

It's worth noting that with #6467, I believe scripts can already be executed via curl <url> | uv run - (though last time I checked the script metadata wasn't parsed from stdin). However, handling this directly in uv does seem really nice UX.

Apologies for the delayed response!

@BurntSushi
Copy link
Member

All righty, let's go with checking for whether a local file exists or not.

@charliermarsh Do you have any opinions here? Do you think uv run https://example.com/foo.py should "just work," or do you think there should be an extra step needed, e.g., uv run --remote https://example.com/foo.py? (Above we discussed mitigating the case where https://example.com/foo.py actually refers to a valid file path, in which case, we would just read the local file instead.)

@zanieb
Copy link
Member

zanieb commented Sep 19, 2024

I sort of prefer "just works"

@BurntSushi
Copy link
Member

Same.

@charliermarsh
Copy link
Member

Same. Whatever we do, it would be nice to align the behavior of requirements.txt files, which can also be either remote or local.

@manzt
Copy link
Contributor Author

manzt commented Sep 19, 2024

Seems like agreement, I’ll start addressing the prior comments and handle the file behavior!

@manzt
Copy link
Contributor Author

manzt commented Sep 25, 2024

Thanks for your patience. I think I was able to achieve the desired behavior and added another test.

I'm not sure how we should plan on testing the remote behavior (right now still depends on a gist with a fixture of my own).

Copy link
Member

@BurntSushi BurntSushi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nice work! This is coming along. :-) I've requested another round of changes and suggested a way forward with respect to testing.

crates/uv/src/lib.rs Show resolved Hide resolved
crates/uv/tests/run.rs Outdated Show resolved Hide resolved
crates/uv/src/lib.rs Outdated Show resolved Hide resolved
crates/uv/src/lib.rs Outdated Show resolved Hide resolved
crates/uv/src/lib.rs Show resolved Hide resolved
crates/uv/src/lib.rs Outdated Show resolved Hide resolved
Copy link
Contributor Author

@manzt manzt left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the review! Learning a lot.

Just a clarifying question about Path::try_exists.

crates/uv/src/lib.rs Outdated Show resolved Hide resolved
crates/uv/src/lib.rs Outdated Show resolved Hide resolved
crates/uv/src/lib.rs Outdated Show resolved Hide resolved
crates/uv/src/lib.rs Show resolved Hide resolved
crates/uv/src/lib.rs Outdated Show resolved Hide resolved
crates/uv/src/lib.rs Outdated Show resolved Hide resolved
@manzt
Copy link
Contributor Author

manzt commented Oct 1, 2024

I wonder if the issue on Windows is related to some behavior behind how NamedTempFile works on Windows. Are we "Opening a named temporary file in a world-writable directory."?

.suffix(".py")
.tempfile()?;

let response = reqwest::get(script_url).await?;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this needs to go through the uv client. Otherwise, it won't respect --offline, SSL certs, etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, using the client builder now. I went with re-resolving the global args locally since the contents of file downloaded by this script updates the resolution later.

@BurntSushi
Copy link
Member

I wonder if the issue on Windows is related to some behavior behind how NamedTempFile works on Windows. Are we "Opening a named temporary file in a world-writable directory."?

I'm not sure. It seems like it's NamedTempFile::persist that's failing.

Oh, I think I see the problem. You're using write_atomic and passing the path of the temporary file you just created to it. And write_atomic tries to create another temporary file at that same path and persists it. Specifically, I think this is where the error is coming from:

uv/crates/uv-fs/src/lib.rs

Lines 147 to 156 in c070007

temp_file.persist(&path).map_err(|err| {
std::io::Error::new(
std::io::ErrorKind::Other,
format!(
"Failed to persist temporary file to {}: {}",
path.user_display(),
err.error
),
)
})?;

I don't think you need write_atomic here. Can you say more about why you used it? I think just a write_all on the NamedTempFile itself is sufficient.

@manzt
Copy link
Contributor Author

manzt commented Oct 10, 2024

Oh, I think I see the problem.

Thanks for explanation. I've learned a lot from this PR.

I don't think you need write_atomic here. Can you say more about why you used it? I think just a write_all on the NamedTempFile itself is sufficient.

It was naive of me. I was looking for how files were written (async) in other parts of the project, but for a tempfile I realize we really don't need. Replaced with write_all!

@charliermarsh
Copy link
Member

Do we need a file for this? Could we just read into memory and pass the contents over stdin?

@charliermarsh
Copy link
Member

(I could also look into that separately, but I'm wondering if it would remove a lot of the complexity around tempfiles, drop, etc.)

@manzt
Copy link
Contributor Author

manzt commented Oct 10, 2024

Do we need a file for this? Could we just read into memory and pass the contents over stdin

Definitely! However, I went this direction because writing to a tempfile was the easiest way to support PEP 723 metadata and passing down arguments to the Python script. IIRC PEP 723 is only parsed (currently) from RunCommand::PythonScript | RunCommand::PythonGuiScript (not stdin):

uv/crates/uv/src/lib.rs

Lines 150 to 172 in 7bac708

// If the target is a PEP 723 script, parse it.
let script = if let Commands::Project(command) = &*cli.command {
if let ProjectCommand::Run(uv_cli::RunArgs { .. }) = &**command {
if let Some(
RunCommand::PythonScript(script, _) | RunCommand::PythonGuiScript(script, _),
) = run_command.as_ref()
{
Pep723Script::read(&script).await?
} else {
None
}
} else if let ProjectCommand::Remove(uv_cli::RemoveArgs {
script: Some(script),
..
}) = &**command
{
Pep723Script::read(&script).await?
} else {
None
}
} else {
None
};

and using stdin resolves to python -c <contents> and doesn't capture other addtional arguments/flags for the python script.

@charliermarsh
Copy link
Member

Okay, makes sense. I think we should follow-up by changing that -- it seems like an oversight that we don't detect scripts there!

@charliermarsh
Copy link
Member

Funnily enough: #8097

@charliermarsh
Copy link
Member

I defer to @BurntSushi to merge.

Copy link
Member

@BurntSushi BurntSushi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this LGTM now! Amazing work @manzt. Thank you for sticking with this. I know there was a lot of back-and-forth to get this merged, but I think it's in a good state now!

@BurntSushi
Copy link
Member

BurntSushi commented Oct 10, 2024

I think you just need to rebase this on main.

@BurntSushi BurntSushi merged commit 585456a into astral-sh:main Oct 10, 2024
61 checks passed
@BurntSushi
Copy link
Member

I think you just need to rebase this on main.

Nope that was wrong! I usually do rebase & merge for my PRs, so that was selected (even though I was always planning to squash this PR) and that way greyed out. But squash & merge was fine. Interesting.

@manzt
Copy link
Contributor Author

manzt commented Oct 10, 2024

Ok good! I tried rebasing and it was scary :)

@manzt manzt deleted the manzt/remote-scripts branch October 10, 2024 21:54
@hauntsaninja
Copy link
Contributor

Downloading has some semantic advantages, __file__ is meaningful, tracebacks are better, etc

@charliermarsh
Copy link
Member

Yeah I looked into it and agree that downloading is actually better.

tmeijn pushed a commit to tmeijn/dotfiles that referenced this pull request Oct 15, 2024
This MR contains the following updates:

| Package | Update | Change |
|---|---|---|
| [astral-sh/uv](https://github.com/astral-sh/uv) | patch | `0.4.20` -> `0.4.21` |

MR created with the help of [el-capitano/tools/renovate-bot](https://gitlab.com/el-capitano/tools/renovate-bot).

**Proposed changes to behavior should be submitted there as MRs.**

---

### Release Notes

<details>
<summary>astral-sh/uv (astral-sh/uv)</summary>

### [`v0.4.21`](https://github.com/astral-sh/uv/blob/HEAD/CHANGELOG.md#0421)

[Compare Source](astral-sh/uv@0.4.20...0.4.21)

##### Enhancements

-   Add support for managed installations of free-threaded Python ([#&#8203;8100](astral-sh/uv#8100))
-   Add note about `uvx` to `uv tool run` short help ([#&#8203;7695](astral-sh/uv#7695))
-   Enable HTTP/2 requests ([#&#8203;8049](astral-sh/uv#8049))
-   Support `uv tree --no-dev` ([#&#8203;8109](astral-sh/uv#8109))
-   Support PEP 723 metadata with `uv run -` ([#&#8203;8111](astral-sh/uv#8111))
-   Support `pip install --exact` ([#&#8203;8044](astral-sh/uv#8044))
-   Support `uv export --no-header` ([#&#8203;8096](astral-sh/uv#8096))
-   ADd Python 3.13 images to Docker publish ([#&#8203;8105](astral-sh/uv#8105))
-   Support remote (`https://`) scripts in `uv run` ([#&#8203;6375](astral-sh/uv#6375))
-   Allow comma value-delimited arguments in `uv run --with` ([#&#8203;7909](astral-sh/uv#7909))

##### Configuration

-   Support wildcards in `UV_INSECURE_HOST` ([#&#8203;8052](astral-sh/uv#8052))

##### Performance

-   Use shared index when fetching metadata in lock satisfaction routine ([#&#8203;8147](astral-sh/uv#8147))

##### Bug fixes

-   Add prerelease compatibility check to `uv python` CLI ([#&#8203;8020](astral-sh/uv#8020))
-   Avoid deleting a project environment directory if we cannot tell if a `pyvenv.cfg` file exists ([#&#8203;8012](astral-sh/uv#8012))
-   Avoid excluding valid wheels for exact `requires-python` bounds ([#&#8203;8140](astral-sh/uv#8140))
-   Bump `netrc` crate to latest commit ([#&#8203;8021](astral-sh/uv#8021))
-   Fix `uv python pin 3.13t` failure when parsing version for project requires check ([#&#8203;8056](astral-sh/uv#8056))
-   Fix handling of != intersections in `requires-python` ([#&#8203;7897](astral-sh/uv#7897))
-   Remove the newly created tool environment if sync failed ([#&#8203;8038](astral-sh/uv#8038))
-   Respect dynamic extras in `uv lock` and `uv sync` ([#&#8203;8091](astral-sh/uv#8091))
-   Treat resolver failures as fatal in lockfile validation ([#&#8203;8083](astral-sh/uv#8083))
-   Use `git config --get` for author information for improved backwards compatibility ([#&#8203;8101](astral-sh/uv#8101))
-   Use comma-separated values for `UV_FIND_LINKS` ([#&#8203;8061](astral-sh/uv#8061))
-   Use shared resolver state between add and lock to avoid double Git update ([#&#8203;8146](astral-sh/uv#8146))
-   Make `--relocatable` entrypoints robust to symlinking ([#&#8203;8079](astral-sh/uv#8079))
-   Improve compatibility with VSCode PS1 prompt ([#&#8203;8006](astral-sh/uv#8006))
-   Fix "Stream did not contain valid UTF-8" failures in Windows ([#&#8203;8120](astral-sh/uv#8120))
-   Use `--with-requirements` in `uvx` error hint ([#&#8203;8112](astral-sh/uv#8112))

##### Documentation

-   Include `uvx` installation in Docker examples ([#&#8203;8179](astral-sh/uv#8179))
-   Make the instructions for the Windows standalone installer consistent across README and documentation ([#&#8203;8125](astral-sh/uv#8125))
-   Update pip compatibility guide to note transitive URL dependency support ([#&#8203;8081](astral-sh/uv#8081))
-   Document `--reinstall` with `--exclude-newer` to ensure downgrades ([#&#8203;6721](astral-sh/uv#6721))

</details>

---

### Configuration

📅 **Schedule**: Branch creation - At any time (no schedule defined), Automerge - At any time (no schedule defined).

🚦 **Automerge**: Disabled by config. Please merge this manually once you are satisfied.

♻ **Rebasing**: Whenever MR becomes conflicted, or you tick the rebase/retry checkbox.

🔕 **Ignore**: Close this MR and you won't be reminded about this update again.

---

 - [ ] <!-- rebase-check -->If you want to rebase/retry this MR, check this box

---

This MR has been generated by [Renovate Bot](https://github.com/renovatebot/renovate).
<!--renovate-debug:eyJjcmVhdGVkSW5WZXIiOiIzNy40NDAuNyIsInVwZGF0ZWRJblZlciI6IjM3LjQ0MC43IiwidGFyZ2V0QnJhbmNoIjoibWFpbiIsImxhYmVscyI6WyJSZW5vdmF0ZSBCb3QiXX0=-->
@afeblot
Copy link

afeblot commented Oct 16, 2024

Was coming here to suggest/ask for this feature, and found it just released already. I love you :-)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or improvement to existing functionality
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants