Lazy load subapps/subcommands #276

mrjsj · 2024-12-29T11:38:51Z

Hi - I love the ergonomics of this. It's really a breeze to work with.

I've built a CLI with a lot of nested subapps/subcommands, and since it goes through and registers everything on every invocation, it is a bit slow.

Is there any way to lazy load subapps and its subcommands? Here's an example.

When I run mycli --help only data-pipeline and environment subapps should be loaded, but not their subapps/commands.
When I run mycli environment then only children of the environment subapp should be loaded.

├── data-pipeline [app]
│   ├── create [command]
│   ├── delete [command]
│   ├── get [command]
│   ├── list [command]
│   └── update [command]
├── environment [app]
│   ├── create [command]
│   ├── delete [command]
│   ├── get [command]
│   ├── list [command]
│   ├── update [command]
│   ├── spark-compute [app]
│   │   ├── get-published-settings [command]
│   │   ├── get-staging-settings [command]
│   │   └── update-staging-settings [command]
│   └── spark-libraries [app]
│       ├── cancel-publish [command]
│       ├── delete-staging-library [command]
│       ├── get-published-libraries [command]
│       ├── get-staging-libraries [command]
│       ├── publish-environment [command]
│       └── upload-staging-library [command]

And this is how I do it currently:

from cyclopts import App

from .create import create_command
from .delete import delete_command
from .get import get_command
from .list import list_command
from .spark_compute import spark_compute_app
from .spark_libraries import spark_libraries_app
from .update import update_command

environment_app = App(name="environment", help="[yellow]Commands for environment[/yellow]")
environment_app.command(create_command, name="create")
environment_app.command(delete_command, name="delete")
environment_app.command(get_command, name="get")
environment_app.command(list_command, name="list")
environment_app.command(update_command, name="update")
environment_app.command(spark_compute_app)
environment_app.command(spark_libraries_app)

The text was updated successfully, but these errors were encountered:

BrianPugh · 2024-12-29T13:12:02Z

Is your project publicly available? I'd like to run a profiler on it. Lazy loading is one solution, but I wonder if something else is slow (and lazy loading would be a premature optimization that wouldn't help too much).

mrjsj · 2024-12-29T14:31:40Z

Yes.
https://github.com/mrjsj/msfabricutils/tree/feat/new-cli/cli

Would be lovely. Thanks!

mrjsj · 2024-12-30T10:21:55Z

@BrianPugh Thanks for profiling! It seems cyclopts itself is very efficient with loading subapps/commands, so the issue is mostly loading the 3rd party libs in those.

I can move some of those imports into the commands themselves which helps.

Maybe lazy loading was the wrong term. I looked into your documentation regarding Meta Apps - maybe that's what I'm looking for? I.e. launching the admin_app only when the admin command is run. I couldn't figure out how to implement it in my solution.
Or maybe something similar typers callback?

BrianPugh · 2024-12-30T12:53:03Z

Hmmm sorry I still don't quite follow 😅. Can you explain in more detail/example about what you are trying to achieve?

And I'll be merging in these optimizations and cutting a release later today 👍.

mrjsj · 2024-12-30T13:32:53Z

I apologize. I might just be oblivious because I'm not that experienced with CLIs.

The entrypoint application can have commands registered, and it can have applications registered as commands.

If we look at the example from the post, I have a data-pipeline app, and an environment app. These could in theory be run as individual apps, i.e. python cli/commands/core/data-pipeline/__init__.py.

So I'm thinking that the main app only need to know about the two commands: data-pipeline and environment. The main app doesn't need to know the commands within those sub-apps.
But when I then run python cli/main.py data-pipeline then it essentially runs python cli/commands/core/data-pipeline/__init__.py. So at that time we have only registered the commands: data-pipeline, environment, and the subcommands of data-pipeline.

If we instead run python cli/main.py environment, the we would run the environment application.

So if I for example run python cli/main.py environment spark-compute --help I would expect the following to happen:

The main entry point is run
The commands data-pipeline and environment are registered
The environment command is executed which starts the environment app.
The create, delete, list update, spark-compute and spark-libraries are registered as commands
The spark-compute command is executed which starts the spark-compute app.
The --help command is executed showing the help-page for the spark-compute app.

In this example, we skipped over registering the following commands (marked with asterisk). If we could skip registing those when the main app is launched, I would imagine a great improvement which only increases, the more complex/nested command structure is. In the project I linked, I have over 200 commands, but they're spread across sub-applications, so if we could only register a subset of nested commands, I imagine it a great performance boost.

data-pipeline
- create*
- delete*
- get*
- list*
- update*
environment
- spark-libraries
  - cancel-publish*
  - delete-staging-library*
  - get-published-libraries*
  - get-staging-libraries*
  - publish-environment*
  - upload-staging-library*

BrianPugh · 2024-12-30T20:22:17Z

I see what you mean; currently I think Cyclopts is fast enough that it probably doesn't matter to much; I think we could only save a few dozens of milliseconds in very large applications from lazily importing/parsing subapplications. A potentially attractive option would be to register subapplications as a reference string (sort of how scripts work in python-packaging), where Cyclopts would dynamically import it if necessary:

from cyclopts import App
app = App()
app.command("msfabricutils.cli.core.kql_database:create", name="create")  # something like this

The real benefit here would be if specific subapp's import modules that are relatively expensive to import, that it wouldn't be come additive over the entire app. I'll need to think about this a little more.

Some minor clarification of how Cyclopts works internally: ideally, it would work as you describe (subcommands directly invoking subapps who subsequently invoke their own subapps; it would be beautifully recursive). Unfortunately, due to implementation details of how "help" works (e.g. you need the entire list of commands that lead up to the current subcommand), the root app actually directly invokes a subapp, no matter how far down the invocation chain it is.

Second aside; I see you reaching in to the private app._commands attribute; perhaps I should add a public App.update method (similar to dict.update) that takes in an App and copies over all it's commands.

mrjsj · 2024-12-30T20:51:55Z

I see what you mean; currently I think Cyclopts is fast enough that it probably doesn't matter to much;

Absolutely. That is evident from the profling you did.

The real benefit here would be if specific subapp's import modules that are relatively expensive to import, that it wouldn't be come additive over the entire app. I'll need to think about this a little more.

Exactly. One specific subapp could have a really expensive import which would slow the entire app, so being able to optionally lazy load it would be helpful (I think). Either directly exposed like app.command(my_expensive_app, lazy=True) or maybe just restructuring the code somehow

import time

from cyclopts import App

app = App()

def expensive_command():
    time.sleep(2)

@app.command
def admin_command():
    admin_app = App()
    admin_app.command(expensive_command, name="expensive")

    admin_app()


if __name__ == "__main__":
    app()

Second aside; I see you reaching in to the private app._commands attribute; perhaps I should add a public App.update method (similar to dict.update) that takes in an App and copies over all it's commands.

Yes, a little hack to skip over the "core" command while maintaining the code-generated folder/file structure. Eventually, I would just not create the "core" app in the first place and register its commands directly on the main app.

mrjsj changed the title ~~Lazy load subcommands~~ Lazy load subapps/subcommands Dec 29, 2024

BrianPugh mentioned this issue Dec 29, 2024

Various optimizations. #279

Merged

BrianPugh mentioned this issue Dec 31, 2024

Add App.update method. #281

Merged

BrianPugh added the enhancement New feature or request label Jan 3, 2025

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Lazy load subapps/subcommands #276

Lazy load subapps/subcommands #276

mrjsj commented Dec 29, 2024

BrianPugh commented Dec 29, 2024

mrjsj commented Dec 29, 2024

mrjsj commented Dec 30, 2024

BrianPugh commented Dec 30, 2024

mrjsj commented Dec 30, 2024

BrianPugh commented Dec 30, 2024

mrjsj commented Dec 30, 2024

Lazy load subapps/subcommands #276

Lazy load subapps/subcommands #276

Comments

mrjsj commented Dec 29, 2024

BrianPugh commented Dec 29, 2024

mrjsj commented Dec 29, 2024

mrjsj commented Dec 30, 2024

BrianPugh commented Dec 30, 2024

mrjsj commented Dec 30, 2024

BrianPugh commented Dec 30, 2024

mrjsj commented Dec 30, 2024