Fix #556 #557

Merged
merged 1 commit into from
Jan 25, 2018

Conversation

liangchenye
Member

@liangchenye liangchenye commented Jan 22, 2018

  1. should rebase after validation/util/container: Use --bundle (and stop requiring BundleDir) #551 -- done
  2. should add more test cases to cover 'create'
  3. should add a test case to cover 'kill' -- will do that in another PR

Signed-off-by: Liang Chenye [email protected]

@liangchenye
Member Author

This PR does two things:

  1. add LifecycleFuncs
type LifecycleFuncs struct {
	Setup     func(runtime *Runtime) error
	PreStart  func(runtime *Runtime) error
	PostStart func(runtime *Runtime) error
	PostStop  func(runtime *Runtime) error
}

It is similar to RuntimeOutsideValidate(g *generate.Generator, f AfterFunc).
But I don't think we need to define PreFunc or AfterFunc; we can use func(runtime *Runtime) and get any information we need from the Runtime structure.

So if this approach is useful, we can merge 'RuntimeInsideValidate', 'RuntimeOutsideValidate' and 'RuntimeLifecycleValidate' into one.

  2. add PIDFile test
    I use this as an example of how to call 'RuntimeLifecycleValidate'.

t.Header(0)

var lifecycle util.LifecycleFuncs
tempFile, err := ioutil.TempFile("", "oci-pidfile")
Contributor

This approach (using an empty but already existing PID file) seems strange to me. We may want to update the CLI spec to specifically address this case and keep this test, but I think a more traditional test would be to use a temp directory with a --pid-file path that is a direct child of that temp directory (so the target path does not exist when create is called).

Member Author

It is not an existing PID file, it is also a new temp file, similar to using TempDir + 'pid'.

Contributor

> It is not an existing PID file, it is also a new temp file, similar to using TempDir + 'pid'.

The TempFile docs say it creates the file. TempDir + "pid" would not.
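The difference can be checked directly. The sketch below uses the same io/ioutil calls as the test code in this PR; the helper names (exists, viaTempFile, viaTempDir) are made up for illustration.

```go
package main

import (
	"fmt"
	"io/ioutil"
	"os"
	"path/filepath"
)

// exists reports whether something is present at path.
func exists(path string) bool {
	_, err := os.Stat(path)
	return err == nil
}

// viaTempFile returns a path produced by ioutil.TempFile, which creates the file.
func viaTempFile() (string, func(), error) {
	f, err := ioutil.TempFile("", "oci-pidfile")
	if err != nil {
		return "", nil, err
	}
	f.Close()
	name := f.Name()
	return name, func() { os.Remove(name) }, nil
}

// viaTempDir returns a path inside a fresh temporary directory; nothing is
// created at the returned path itself.
func viaTempDir() (string, func(), error) {
	dir, err := ioutil.TempDir("", "oci-pidfile")
	if err != nil {
		return "", nil, err
	}
	return filepath.Join(dir, "pid"), func() { os.RemoveAll(dir) }, nil
}

func main() {
	p1, clean1, err := viaTempFile()
	if err != nil {
		panic(err)
	}
	defer clean1()

	p2, clean2, err := viaTempDir()
	if err != nil {
		panic(err)
	}
	defer clean2()

	fmt.Println("TempFile path exists:", exists(p1))
	fmt.Println("TempDir child exists:", exists(p2))
}
```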

Member Author

Got it.

util.Fatal(err)
}
tempPidFile := tempFile.Name()
lifecycle.Setup = func(r *util.Runtime) error {
Contributor

Instead of initializing lifecycle and then filling in these properties, it might be more idiomatic to use:

lifecycle := util.LifecycleFuncs{
  Setup: func(…) error {
    …
  },
  PreStart: …,
  PostStart: …,
}

defer os.Remove(tempPidFile)

lifecycle.PreStart = func(r *util.Runtime) error {
pidData, err := ioutil.ReadFile(tempPidFile)
Contributor

I think we want a PostCreate slot (maybe instead of the PreStart slot?). We can run the --pid-file test with create / kill / delete without involving start.

Member Author

+1, 'PostCreate' is better; 'PreStart' might also be confusing.

}

lifecycle.PostStart = func(r *util.Runtime) error {
time.Sleep(1 * time.Second)
Contributor

What's this about?

Member Author

It waits for the container to stop; otherwise we cannot delete it.
There should be a better function for waiting on the status change.

return nil
}

lifecycle.PostStop = nil
Contributor

Isn't this the default already? I don't think we need this line.

lifecycle.PostStop = nil

g := util.GetDefaultGenerator()
g.SetProcessArgs([]string{"ls"})
Contributor

I'd rather use:

g.Spec().Process = nil

and skip start.

Member Author

'runc' has a bug: if the 'Process' is nil, runc crashes.

Besides this, 'runc' does not follow the CLI spec: it applies the 'process args' when creating a container.
I add a test to change the 'process args' in PostCreate, in theory, runc should execute the new args, but it does not.

Contributor

> 'runc' has a bug: if the 'Process' is nil, runc crashes

create crashes? That would be a bug.

> I add a test to change the 'process args' in PostCreate, in theory, runc should execute the new args, but it does not.

No, the runtime must ignore any post-create config changes, so runc is fine on that point.

@@ -56,14 +57,21 @@ func (r *Runtime) SetID(id string) {
r.ID = id
}

// SetPidFile sets the pid file
func (r *Runtime) SetPidFile(pidFile string) {
Contributor

PidFile is already public. I don't think we need a setter.

@@ -146,23 +154,32 @@ func (r *Runtime) State() (rspecs.State, error) {
}

// Delete a container
-func (r *Runtime) Delete() error {
+func (r *Runtime) Delete() ([]byte, error) {
Contributor

Why are we returning stderr? You don't seem to consume this new value anywhere.

Member Author

I used this to debug why 'Delete' fails.
That is where the 'time.Sleep' line came from.

}
}

stderr, err = r.Start()
Contributor

I think we'll want a flag to toggle whether we call Start or not.

Member Author

So something like this:

func RuntimeLifecycleValidate(g *generate.Generator, lifecycle *LifecycleFuncs, actions LifecycleAction) {
	...
	// A bitmask test must be compared against zero; Go does not treat integers as booleans.
	if actions&LifeCycleCreate != 0 {
		stderr, err := r.Create()
	}
	...
	if actions&LifeCycleStart != 0 {
		stderr, err = r.Start()
	}
	...
}
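For reference, a self-contained sketch of how the action flags might be declared and tested. LifeCycleCreate and LifeCycleStart are the names used in the comment above; LifeCycleDelete and plannedSteps are assumptions added for illustration.

```go
package main

import "fmt"

// LifecycleAction is a bitmask of lifecycle operations to perform.
type LifecycleAction uint

const (
	LifeCycleCreate LifecycleAction = 1 << iota
	LifeCycleStart
	LifeCycleDelete // hypothetical, not named in the thread
)

// plannedSteps lists, in order, the operations selected by actions.
func plannedSteps(actions LifecycleAction) []string {
	var steps []string
	if actions&LifeCycleCreate != 0 {
		steps = append(steps, "create")
	}
	if actions&LifeCycleStart != 0 {
		steps = append(steps, "start")
	}
	if actions&LifeCycleDelete != 0 {
		steps = append(steps, "delete")
	}
	return steps
}

func main() {
	// Create and delete without start, as suggested for the --pid-file test.
	fmt.Println(plannedSteps(LifeCycleCreate | LifeCycleDelete))
}
```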

@liangchenye
Member Author

liangchenye commented Jan 23, 2018

  1. add LifecycleFuncs 'Setup/PostCreate/PreStop/PostStop'
    'PostCreate' and 'PreStop' are slightly different from 'PreStart' and 'PostStart', so these new names are used.
  2. add WaitingForStatus
    It can be called from the test code wherever it is needed.
  3. add pidfile.go
    It is used to test whether '--pid-file' works as expected.
    I added a WaitingForStatus(Created) call in 'PreStop', just in case 'creating' takes a long time.

Two issues were found, but I think they can be solved in other PRs.

  1. ProcessArgs issue of 'runc'
    'runc' validates ProcessArgs in the 'create' phase, which is not what the spec expects.
    For now I use the 'true' command in the pidfile test.
  2. Delete a container with the 'Created' status
    I think we should be able to delete a container in the 'created' status; we should update the spec.

PTAL @wking @q384566678 @Mashimiao

@liangchenye liangchenye changed the title WIP: Fix #556 Fix #556 Jan 23, 2018
@liangchenye liangchenye added this to the v0.5.0 milestone Jan 24, 2018
@zhouhao3

zhouhao3 commented Jan 24, 2018

LGTM

Approved with PullApprove

util.Fatal(err)
}
tempPidFile := filepath.Join(tempDir, "pidfile")
defer os.RemoveAll(tempDir)
Contributor

nit: I'd rather swap the order of these two lines to get the cleanup defer right after the TempDir error check. filepath.Join cannot error, but having the defer set as early as possible would give me warm fuzzy feelings ;).

Member Author

OK, I'll update it.

)

var lifecycleStatusMap = map[string]LifecycleStatus{
"creating": LifecycleStatusCreating,
Contributor

Or these could just be strings:

type LifecycleStatus string

const (
  LifecycleStatusCreating = "creating"
  …
)

and then you cast with LifecycleStatus(state.Status). I don't see a need for indirection through an integer.

Member Author

An 'integer' is useful when we want to accept any of several statuses. For example, before deleting a container we expect it to be in either the 'created' or the 'stopped' status, so we can call 'WaitingForStatus(Created|Stopped)'.
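A minimal sketch of that bitmask approach follows. LifecycleStatusCreating, LifecycleStatusCreated and lifecycleStatusMap appear in the diff; the remaining constants and the matches helper are assumptions for illustration.

```go
package main

import "fmt"

// LifecycleStatus is a bitmask so several acceptable statuses can be OR-ed.
type LifecycleStatus int

const (
	LifecycleStatusCreating LifecycleStatus = 1 << iota
	LifecycleStatusCreated
	LifecycleStatusRunning // assumed name
	LifecycleStatusStopped // assumed name
)

// lifecycleStatusMap converts the status string reported by the runtime.
var lifecycleStatusMap = map[string]LifecycleStatus{
	"creating": LifecycleStatusCreating,
	"created":  LifecycleStatusCreated,
	"running":  LifecycleStatusRunning,
	"stopped":  LifecycleStatusStopped,
}

// matches reports whether the status string is one of the wanted statuses.
// Unknown status strings never match.
func matches(status string, want LifecycleStatus) bool {
	got, ok := lifecycleStatusMap[status]
	return ok && got&want != 0
}

func main() {
	deletable := LifecycleStatusCreated | LifecycleStatusStopped
	for _, s := range []string{"creating", "created", "running", "stopped"} {
		fmt.Println(s, matches(s, deletable))
	}
}
```

The string-typed alternative suggested by the reviewer would need a slice or set of statuses to express the same "any of these" condition.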

@@ -184,3 +248,74 @@ func RuntimeOutsideValidate(g *generate.Generator, f AfterFunc) error {
}
return nil
}

// RuntimeLifecycleValidate validates runtime lifecycle.
func RuntimeLifecycleValidate(g *generate.Generator, lifecycle *LifecycleFuncs, actions LifecycleAction) error {
Contributor

I don't see a need to split funcs from actions. Can we rename LifecycleFuncs to LifecycleConfig (or just Lifecycle?) and add an Actions property?

Member Author

+1 for this

}
}

if lifecycle != nil && lifecycle.PostStop != nil {
Contributor

I think anything running after Delete should be PostDelete. This PR currently seems to handle stop and delete in one step (or maybe it is never intended to call stop and instead waits for the container to stop on its own?), but calling this setting PostDelete makes sense regardless of an explicit stop or not.

Member Author

So all the embedded functions should get different names; I'll update that too.

I prefer calling 'stop' in the related test cases, for example when we want to test 'kill'.
We can add test code in 'PreDelete'. (PreDelete is better than PreStop.)

@liangchenye
Member Author

Updated to use lifecycleConfig (merging funcs and actions); the funcs are now symmetrical: PreCreate/PostCreate, PreDelete/PostDelete.

@Mashimiao

Mashimiao commented Jan 24, 2018

LGTM

Approved with PullApprove

@zhouhao3

zhouhao3 commented Jan 25, 2018

LGTM

Approved with PullApprove

@zhouhao3 zhouhao3 merged commit 7ac117a into opencontainers:master Jan 25, 2018
return nil
},
PreDelete: func(r *util.Runtime) error {
return util.WaitingForStatus(*r, util.LifecycleStatusCreated, time.Second*10, time.Second*1)
Contributor

We shouldn't need this, because a container must be created before create returns. Maybe a runc bug?

Member Author

runc does not implement 'creating' lifecycle,
the result is always 'created' or a failure.
According to the spec, it should work like this:

  1. the user runs runX create x
  2. the user runs runX status x
    it should report 'creating' (assuming the creation takes a long time).

Contributor

> runc does not implement 'creating' lifecycle...

That's definitely a bug. And there's also some CLI spec wording about create's exit timing. I think we can drop this, and I'll file a runc patch.

wking added a commit to wking/ocitools-v2 that referenced this pull request Jan 25, 2018
The runtime-spec defines a 'creating' status [1] and requires the
'create' operation to finish creating the container [2,3].  Our command
line API also requires the 'create' command to block until creation
completes:

  Callers MAY block on this command's successful exit to trigger
  post-create activity.

runc does not support 'creating' yet [4], and it seems to return from
'create' before having quite finished (or we wouldn't have needed the
code I'm removing in this commit).  However, both of those are runc
problems.  These tests are about validating spec compliance, not about
working around runc's issues, so remove the crutches.

[1]: https://github.com/opencontainers/runtime-spec/blame/v1.0.1/runtime.md#L19
[2]: https://github.com/opencontainers/runtime-spec/blame/v1.0.1/runtime.md#L54
[3]: https://github.com/opencontainers/runtime-spec/blame/v1.0.1/runtime.md#L101
[4]: opencontainers#557 (comment)

Signed-off-by: W. Trevor King <[email protected]>
wking added a commit to wking/ocitools-v2 that referenced this pull request Jan 26, 2018
wking added a commit to wking/ocitools-v2 that referenced this pull request Jan 29, 2018
4 participants