
[Bug]: Local Proxy Charm in LXD fails to be deployed #839

Closed · gcalvinos opened this issue Apr 20, 2023 · 6 comments
Labels: hint/2.9 (going on 2.9 branch), kind/bug (indicates a bug in the project)

@gcalvinos
Description

When trying to deploy a local charm in LXD, python-libjuju returns the following error:

2023-04-20T07:44:50 ERROR lcm.ns ns.py:4829 Task ns=8eb21604-9e55-4e08-ae17-c2d260750f32 instantiate=0beca26a-4c7b-44da-8dd2-eb22b13d9747 Deploying VCA vnf2.: Install configuration Software Error deploying charm into ee=8eb21604-9e55-4e08-ae17-c2d260750f32.app-vnf-ea396e3481-z0-uce1s.1 : ('Unknown version : %s', '')

I checked that this error happens in the `deploy` method in model.py, when it calls the following line:

base = utils.get_local_charm_base(charm_series, channel, metadata, charm_dir, client.Base)

Urgency

Blocker for our release

Python-libjuju version

2.9.42.1

Juju version

2.9.42

Reproduce / Test

To reproduce this error I used the following Charm:
https://osm.etsi.org/gitlab/vnf-onboarding/osm-packages/-/tree/master/charm-packages/nopasswd_proxy_charm_vnf/charms/simple

I already discussed this issue with @juanmanuel-tirado.

gcalvinos added the kind/bug label Apr 20, 2023
juanmanuel-tirado added the high and hint/2.9 labels May 8, 2023
@juanmanuel-tirado (Contributor)

This issue seems to be fixed by the latest changes on the 2.9 branch. @gcalvinos has to confirm it.

@gcalvinos (Author)

After testing the 2.9 branch, I still see the same issue:

+-------------------------+--------------------------------------+---------------------+----------+-------------------+------------------------------------------+
| ns instance name        | id                                   | date                | ns state | current operation | error details                            |
+-------------------------+--------------------------------------+---------------------+----------+-------------------+------------------------------------------+
| nopasswd_proxy_charm-ns | e8d239e0-2638-43fd-bae8-8a121c090180 | 2023-05-09T08:32:20 | BROKEN   | IDLE (None)       | Operation: INSTANTIATING.aa5f365f-ac45-4 |
|                         |                                      |                     |          |                   | 2a0-a5b3-9d138b83424d, Stage 2/5:        |
|                         |                                      |                     |          |                   | deployment of KDUs, VMs and execution    |
|                         |                                      |                     |          |                   | environments.                            |
|                         |                                      |                     |          |                   | Detail: Deploying VCA vnf2.: Install     |
|                         |                                      |                     |          |                   | configuration Software Error deploying   |
|                         |                                      |                     |          |                   | charm into ee=e8d239e0-2638-43fd-        |
|                         |                                      |                     |          |                   | bae8-8a121c090180.app-                   |
|                         |                                      |                     |          |                   | vnf-28344fc1eb-z0-bc5cy.0 : ('Unknown    |
|                         |                                      |                     |          |                   | version : %s', ''). Deploying VCA vnf1.: |
|                         |                                      |                     |          |                   | Install configuration Software Error     |
|                         |                                      |                     |          |                   | deploying charm into                     |
|                         |                                      |                     |          |                   | ee=e8d239e0-2638-43fd-                   |
|                         |                                      |                     |          |                   | bae8-8a121c090180.app-                   |
|                         |                                      |                     |          |                   | vnf-38c5da05a0-z0-mtpdy.1 : ('Unknown    |
|                         |                                      |                     |          |                   | version : %s', '')                       |
+-------------------------+--------------------------------------+---------------------+----------+-------------------+------------------------------------------+

@juanmanuel-tirado (Contributor)

@cderici I think we need some cherry-picking to fix this issue

@cderici (Contributor) commented May 9, 2023

Yeah, looks like one of the base fixes needs to be cherry-picked from #830, I'm on it 👍 Just to make sure, @gcalvinos, are you deploying this as a local charm? Nevermind, I just saw the title 👍

@cderici (Contributor) commented May 9, 2023

Interesting. I'm having trouble reproducing this with the 2.9 branch, i.e. I'm able to deploy that charm just fine:

>>> import asyncio
>>> from juju import model
>>> m = model.Model()
>>> await m.connect()
>>> await m.deploy("./tmp/osm-packages-master/charm-packages/nopasswd_proxy_charm_vnf/charms/simple")
<Application entity_id="simple-ha-proxy">
>>> 

Note that I'm using the tip of the 2.9 branch, so this might already have been fixed (maybe we just need a new 2.9 release to get it upstream). @gcalvinos can you try this with the 2.9 branch of libjuju to confirm?

jujubot pushed a commit that referenced this issue May 9, 2023
`utils.get_local_charm_base()` was incorrectly using the `--channel`
argument (the charm's channel) for discovering the channel part of the
base. (we should stop using the word 'channel' for two different
things).

This fixes that by taking out the incorrect part of the code.

Should fix #839
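
To make the two meanings of "channel" concrete, here is a minimal sketch of the distinction (the lookup table and function are illustrative assumptions, not the actual libjuju code):

```
# Illustrative sketch only -- not the actual libjuju implementation.
series_to_version = {"focal": "20.04", "jammy": "22.04"}  # small subset

def base_channel_from_series(series):
    """Derive a base channel (an OS version track) from a charm series."""
    version = series_to_version.get(series, "")
    if not version:
        # Matches the reported failure: ('Unknown version : %s', '')
        raise ValueError("Unknown version : %s" % version)
    return version + "/stable"

print(base_channel_from_series("focal"))  # -> '20.04/stable'
# The bug, roughly: the charm's --channel (a risk track like 'stable') was
# used in place of the series, so the lookup found no version and raised.
```
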
jujubot added a commit that referenced this issue May 10, 2023
#846

#### Description

`utils.get_local_charm_base()` was incorrectly using the `--channel` argument (the charm's channel) for discovering the channel part of the base. (we should stop using the word 'channel' for two different things).

This fixes that by taking out the incorrect part of the code.

Should fix #839


#### QA Steps

So a trivial way to reproduce #839 is to deploy a local charm with a `--channel='stable'` argument. You may try to use one of the examples to validate this. Additionally, an integration test is added (see below), so passing that should be enough. I also suggest getting a confirmation from @gcalvinos, just in case.

```
tox -e integration -- tests/integration/test_model.py::test_deploy_local_charm_channel
```
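
For reference, the same reproduction expressed directly with python-libjuju (a sketch; the charm path is illustrative):

```
import asyncio
from juju import model

async def main():
    m = model.Model()
    await m.connect()
    # Before the fix, passing a channel while deploying a local charm
    # tripped the base-channel lookup: ('Unknown version : %s', '')
    await m.deploy(
        "./charm-packages/nopasswd_proxy_charm_vnf/charms/simple",
        channel="stable",
    )
    await m.disconnect()

asyncio.run(main())
```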

All CI tests need to pass. We might have a couple of time-outs from previously known CI test failures.
@juanmanuel-tirado (Contributor)

A fix was already committed in #846. I will close this issue.

jujubot pushed a commit to cderici/python-libjuju that referenced this issue Jun 1, 2023
juanmanuel-tirado added a commit that referenced this issue Jun 6, 2023
* Add kubernetes as supported series as per juju/core/series/supported.go

Fixes #865

* Add example local charm to test and reproduce #865

Unfortunately we can't write an integration test that uses this because we run our tests on lxd, so the charm will be deployed but will never actually go 'active'. We could test the deployment itself (that it doesn't error), but if we did that, 3.0 would actually fail to deploy on lxd because the series is kubernetes, so the test would be invalid anyway.

* Fix bug in Type.from_json() parsing simple entries

Should fix #850 & #851

* Add integration test for simple assumes expression

This is not the ideal test because it depends on the upstream charm having the simple

assumes:
  - juju

expression. However, simulating the bug in the facade.py for parsing
such expressions is non-trivial as we need a test to call something
like the CharmsFacade.CharmInfo() to trigger parsing the metadata
(which is actually where it fails in the reported bug). Creating a
local charm wouldn't work because we locally handle the metadata
(without going through anything in the facade.py where the bug is
located). Maybe we can manually call AddCharm in the test for a local
charm and then manually call the CharmInfo with the returned url.

* Fix wait_for_units flag to not block when enough units ready

wait_for_idle will keep waiting if there are fewer units available than requested (via the wait_for_units flag). However, if there is already a number of units in the desired status ready to go, more than (or equal to) wait_for_units, then it shouldn't block waiting for other not-yet-available units to reach that desired state as well (a usage sketch follows this commit list).

fixes #837

* Add integration test for wait_for_units in wait_for_idle

* Fix failing wait_for_idle test

As per discussion in
#841 (comment)

Should fix #837

* Remove accidental printf for debugging

* Small patch for wait_for_idle

* Fix wait_for_exact_units=0 case

* Fix logical typo

* Fix merge resolve error for parsing assumes

* Fix base channel discovery for local charms

`utils.get_local_charm_base()` was incorrectly using the `--channel`
argument (the charm's channel) for discovering the channel part of the
base. (we should stop using the word 'channel' for two different
things).

This fixes that by taking out the incorrect part of the code.

Should fix #839

* Fixes to pass the CI problems regarding missing postgresql charm. (#847)

* Add test for deploying local charm with channel argument

* Add application.get_status to get the most up to date status from API

Introduces an internal self._status which is initially set to 'unknown', the status with the lowest severity. The regular property self.status uses both self._status and the unit statuses to derive the most severe status as the application status (a small sketch of this derivation follows this commit list).

* Use application.get_status in wait_for_idle to use the most up-to-date application status

* Fix unit test TestModelWaitForIdle::test_wait_for_active_status

* Fix linter

* Expect and handle exceptions from the AllWatcher task

fixes #829

The `_all_watcher` task is a coroutine for the AllWatcher to run in
the background all the time forever, and it involves a while loop
that's being controlled manually through some flags (asyncio events),
e.g. things like `_watch_stopping` and `_watch_stopped`.

The problem is that when the `_all_watcher` raises an exception (or receives one from things like `get_config()`, as in the reported case), it's lost in the ether in the event loop, not handled or re-raised. This is because this coroutine is not `await`ed (for good reason); it can't be `await`ed because there won't ever be any results, as this method is supposed to be working in the background forever, getting the deltas for us. As a result, if `_all_watcher` fails, then external flags like `_watch_received` are never set, and whoever's calling `await self._watch_received.wait()` will block forever (in this case `_after_connect()`). Similarly, `disconnect()` waits for the `_watch_stopped` flag, which won't be set either, so if we call disconnect after the all_watcher failed then it'll hang forever.

This change fixes the problem by waiting (at the wait-for-flag spots) for two things: 1) whichever flag we're waiting for, and 2) the `_all_watcher` task being `done()`. In the latter case, we should expect to see an exception, because that task is never supposed to finish. More importantly, if we do see that `_watcher_task.done()`, then we don't sit and wait forever for the _all_watcher event flags to be set, so we won't hang (a plain-asyncio sketch of this pattern follows this commit list).

A nice side effect is that we should get fewer of the extra exception outputs saying "Task exception was never handled", since we do call `.exception()` on the `_all_watcher` task. We'll probably continue to get those from tasks like `_pinger` and `_debug_log`, but this is a good first example of how to handle them as well.

* Assume ubuntu focal base for legacy k8s charms

Updates the get_local_charm_base with Base(20.04/stable, ubuntu) for legacy k8s charms, as per juju/cmd/juju/application/utils.DeduceOrigin() (a sketch follows this commit list).

* Fix get_local_charm_base call.

---------

Co-authored-by: Juan M. Tirado <[email protected]>
Co-authored-by: Juan Tirado <[email protected]>
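
A usage sketch for the wait_for_idle fix above (the application name and unit count are assumptions for illustration):

```
import asyncio
from juju import model

async def main():
    m = model.Model()
    await m.connect()
    # With the fix, this returns once at least 2 'myapp' units are active,
    # instead of blocking on units that haven't come up yet.
    await m.wait_for_idle(apps=["myapp"], status="active", wait_for_units=2)
    await m.disconnect()

asyncio.run(main())
```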
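
The "most severe status" derivation behind application.get_status can be pictured with a small standalone sketch (the severity ordering is an assumption for illustration, not the actual libjuju code):

```
# Hypothetical severity ordering, least to most severe.
SEVERITY = ["unknown", "active", "maintenance", "waiting", "blocked", "error"]

def derive_status(statuses):
    """Return the most severe status among an application's statuses."""
    known = [s for s in statuses if s in SEVERITY]
    return max(known, key=SEVERITY.index) if known else "unknown"

# self._status starts at 'unknown' (lowest severity) and is combined
# with the unit statuses:
print(derive_status(["unknown", "active", "blocked"]))  # -> 'blocked'
```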
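
The wait-for-flag-or-dead-watcher pattern from the AllWatcher fix can be sketched in plain asyncio (names like `flag` and `watcher_task` are illustrative, not the actual libjuju internals):

```
import asyncio

async def wait_flag_or_watcher(flag, watcher_task):
    # Wait for either the flag to be set or the watcher task to finish.
    flag_wait = asyncio.ensure_future(flag.wait())
    done, _pending = await asyncio.wait(
        {flag_wait, watcher_task}, return_when=asyncio.FIRST_COMPLETED)
    if watcher_task in done:
        flag_wait.cancel()
        # The watcher should run forever; if it's done, surface its
        # exception instead of hanging on a flag it can never set.
        exc = watcher_task.exception()
        if exc is not None:
            raise exc
    # Otherwise the flag was set and we proceed normally.
```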
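
And the legacy-k8s default from the last fix amounts to roughly the following (a sketch; the keyword arguments to `client.Base` are assumed):

```
from juju.client import client

# Legacy k8s charms (series 'kubernetes') are assumed to get this base:
legacy_k8s_base = client.Base(channel="20.04/stable", name="ubuntu")
```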