Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Internal module instances per VU #1858

Closed
mstoykov opened this issue Feb 15, 2021 · 6 comments
Closed

Internal module instances per VU #1858

mstoykov opened this issue Feb 15, 2021 · 6 comments
Labels
breaking change for PRs that need to be mentioned in the breaking changes section of the release notes evaluation needed proposal needs to be validated or tested before fully implementing it in k6 feature
Milestone

Comments

@mstoykov
Copy link
Contributor

Currently, internal modules such as http have a single instance of them for the whole k6 instance. This is not how it works for not internal modules - imported js files, which get an instance per VU.

This will make it possible for per VU settings (of a single module) such as the callback from #1856 to actually not be in the lib.State. This is also probably true for the RPSLimit and Transport especially if in the future we have multiple transports.

Having both per VU and global store would've also let stuff like #1739 to be implemented only by a single module without other changes in other parts of the code.

This is not to say that lib.State should be removed as there are definitely things that need to be in a more shared location between modules such as the tls.Config.

Currently all, but HTTP and gRPC use empty structs (both the exceptions use it for constants) and one of the easiest ways to add a way to have this will be to instead of taking an instance to take the New method most of those call to get an instance.

This though is a breaking change especially with xk6 in mind. Given that, and the fact that it will also be nice if there is a way to have a "shared" global state per/in module without the usage of global variables such as in xk6-counter maybe it will be better to keep the current behaviour, but being able to add a method New (name up for debate, maybe Instantiate) that will then return a new copy of another object that will be the actual module instance:

So for HTTP it will be something like:

type HTTP struct{
   // global fields for the whole module that can be accessed by each instance + some syncing
}
type HTTPinstance {
  h  *HTTP // so we can  
  // current fields of HTTP
}

func (h *HTTP) New() interface{} { // for interface reasons ;) 
   return &HTTPInstance {
      h: h,
      // initialize fields per instance 
      // also the constants that are currently in HTTP
   }
}

And the code in

https://github.com/loadimpact/k6/blob/11811171a3c416daf17f7f0ad182ca30aa9990bb/js/internal/modules/modules.go#L35-L39
or
https://github.com/loadimpact/k6/blob/11811171a3c416daf17f7f0ad182ca30aa9990bb/js/initcontext.go#L142-L148

will have

mod, ok := modules[name]
if ok {
  if m, ok := mod.(interface{New() interface{}}); ok {
     return m.New()
  }
}
return mod

or similar (code was written in GitHub, so likely has problems ;) )

@mstoykov mstoykov added feature evaluation needed proposal needs to be validated or tested before fully implementing it in k6 breaking change for PRs that need to be mentioned in the breaking changes section of the release notes labels Feb 15, 2021
@mstoykov mstoykov changed the title Internal modules per VU instance Internal module instances per VU instance Feb 16, 2021
@mstoykov mstoykov changed the title Internal module instances per VU instance Internal module instances per VU Feb 16, 2021
@na--
Copy link
Member

na-- commented Feb 16, 2021

Hmm yeah, 👍 this definitely deserves some investigation... I wonder if we even have to make it optional? If every VU has its own copy of the module struct, that wouldn't increase memory usage perceptibly, considering how hollow the current module structs are... And if a specific module requires global VU storage, they can simply have a properly synchronized global variable. The biggest reason not to do it probably is xk6 backwards compatibility... 🤷‍♂️

@mstoykov
Copy link
Contributor Author

they can simply have a properly synchronized global variable.

I am somewhat against this ... as this makes it kind of impossible to have 1 instance of k6 running 2+ tests in parallel. This is technically impossible now as well, but having to use global variables for this functionality will definitely make it harder in the future. It also makes testing ... harder, and we already are trying to drop global variables so ... let's not design them in something ;).

The good news is that technically nothing forbids this even if we make it a requirement. But the code above (where the currently registered struct is checked for a New method) won't be enough if we don't want to have the per module per instance shared state to be easily testable. Instead what is registered should be something that can instantiate the global per k6 instance module and then that to have a method to make a per VU module instance. (this sentence is hard to read, but I couldn't simplify it more 😭 )

I would argue that regardless we can have it work with the old version (through checking for an interface as above) and just print a message such as "your module extension uses old API - please change to do the thing described here if you are the developer of the extension otherwise open an issue with them or check if a newer version doesn't exist" and then breaking it at a later point by instead of taking an empty interface taking an interface with the methods we want.

I doubt there are all that many xk6 extensions that aren't made by k6 employees, so updating them will probably be a non-issue., but still the above seems like it will just be a few more lines and a link to some documentation -possibly just the changelog or a post in the community forum, although a doc page might not be a bad idea as well

mstoykov added a commit that referenced this issue Feb 19, 2021
This is dones so we can now have per VU settign for the HTTP module such
as the ResponseCallback

This is part of #1858
mstoykov added a commit that referenced this issue Feb 26, 2021
@mstoykov
Copy link
Contributor Author

mstoykov commented Mar 1, 2021

More on the actual changes that I currently envision for this(hopefully done in v0.32.0):

  1. All internal module will be able to initialize a per test module instance which initializes a per VU instance (more or less the code in d36ce43)
  2. the module registry will be a bit more advance and will take a w/e actually will initialize the per test instance(I have called it RootModule previously). Whether that will be another interface with a single method or just a function is up for debate.
  3. the module registry will have an instance of it's own, which:
    3.1. will initialize the per test module instances of each imported module and "cache" it and then use that to return a per VU instance for each import.
    3.2. will be used for each test run - so basically the Runner will get an instance of the registry module
    3.3. will be able to be used in tests to have separate per test module instances or even separate registry modules (with more or fewer modules)
  4. All external modules will be tried for this new interface and if they implement it just act as the interface intended
    4.1 if not they will be wrapped in a wrapper that implements the interface but just returns the one module (which is technically very global a kin to the per-instance one) and act the same way it acts for now. Additionally, a warning will be printed that this is happening and the author of the module should be contacted so it is fixed.
    4.2. break those and expect the module interface in ... some time - for example 3 release 🤷
    4.3. possibly make PRs to some extensions to fix it over that time. - the code changes will be 10 lines or so.

Why require the interface instead of just checking for it?

Consistency and ease of use/expansion. If we just check for it without enforcing it(or even worse check for multiple interfaces), it means that down the line a possible rewrite will be required on the module author part to add per module shared state (both for VU or per test) as was the case in d36ce43, if they don't want to go for global variables 🤢.
This also means that they have the chance to not need to read some hackish way to do that as proposed in this comment but just figure out that they have a per VU instance and a per Test instance - which seems pretty self-explanatory - we will still document it either way ;).
All of the above also will help with tests not stepping on each other toes when some of the implementations for those have not been thought through from the fact that there will be parallel tests accessing those states.
Also if the 10 lines are too much and the author truly thinks they don't need it we can leave the wrapper from 4.1 and it can be used - effectively removing the downsides of having to have 2 (or 3) separate structs in the cases where they are totally not needed.

edit: It is very likely that both the per-vu instance and the per-test instance will be taking some struct/interface as an argument so they can get some parameters that matter for them. What those will include exactly and will one of them be context or include context is up for debate and will probably need a further reading of the current modules' code and figuring out if some of it would be better of using such parameters.

edit 2: maybe the per-test instance should be per-script ? The one can be confused as a per test in the codebase and the other can be confused as per a script file (that is for example imported) ...

mstoykov added a commit that referenced this issue Mar 1, 2021
@na--
Copy link
Member

na-- commented Mar 2, 2021

I am unconvinced by the necessity of all of this extra complexity, breaking changes and multiple interfaces... Very, very far from convinced, in fact...

So, instead, here's a counter suggestion, based on the discussion in #1856 (comment). We only have this code you've already written and a simpler version of the module registry you propose ([a]): https://github.com/loadimpact/k6/blob/fde4130165079de64b417dbad31a6fbf72b163ec/js/internal/modules/modules.go#L34-L49

Though, again, I think the interfaces should be named like this instead of PerTestModule, since it's much more descriptive of the actual reality:

// HasModuleInstancePerVU should be implemented by all native Golang modules that
// would require per-VU state. k6 will call their NewModuleInstance() methods
// every time a VU imports the module and use its result as the returned object.
type HasModuleInstancePerVU interface {
    NewModuleInstance() interface{}
}

With this, we can have everything we want:

  1. Per-k6-instance/global module instances, i.e. objects shared between all VUs in the same k6 instance. This will happen exactly how all k6/xk6 JS modules are currently registered - very simply, and with absolutely no breaking changes: https://github.com/loadimpact/k6/blob/82cab8b693963659d5f0cfb4cbf785c792577a0f/js/modules/k6/k6.go#L40
    Simple, and has worked very well so far - both for all k6 built-in modules and for all current xk6 extensions... And, thinking about it, the SharedArray logic probably didn't need changes in common.InitEnvironment, it could have been written like this:

    // Module contains global properties for the k6/data module that is shared
    // between all VUs that import it.
    type Module struct {
        sharedObjects *SharedObjects
    }
    
    // NewModule returns a newly initialized data module.
    func NewModule() *Module {
        return &Module{
            sharedObjects: NewSharedObjects(),
        }
    }
    
    func init() {
        modules.Register("k6/data", NewModule())
    }
    
    // XSharedArray is a constructor returning a shareable read-only array
    // indentified by the name and having their contents be whatever the call returns
    func (d *Module) XSharedArray(ctx context.Context, name string, call goja.Callable) (goja.Value, error) {
        name = sharedArrayNamePrefix + name
        value := d.sharedObjects.GetOrCreateShare(name, func() interface{} {
            return getShareArrayFromCall(common.GetRuntime(ctx), call)
        })
        // ...
    }
    // ...

    See, super simple, no global variables, no changes to lib.State or common.InitEnvironment, tesable ([a]) and clean, with no extra interfaces. And, as a bonus, we no longer need to restrict SharedArray usage in the init context, it can be used in VU context as well! 🎉

  2. Per-VU modules, i.e. every VU will have a unique object that is returned by the NewModuleInstance() method with distinct state, like how you've refactored the k6/http module to behave in Add a way to mark http requests as failed or not(passed) #1856

  3. A mix of the two - unique module objects for every VU, but with shared global state that is not saved in global varaibles, so it's perfectly testable ([a]). For example, the SharedArray logic can be rewritten like this, if we ever want to have per-VU data in the same k6/data module:

    // Module contains global properties for the k6/data module that is shared
    // between all VUs that import it.
    type Module struct {
        sharedObjects *SharedObjects
    }
    
    // NewModule returns a newly initialized data module.
    func NewModule() *Module {
        return &Module{
            sharedObjects: NewSharedObjects(),
        }
    }
    
    type PerVUModuleObject struct {
        globalSharedObjects *SharedObjects
        uniqueDataPerVU     int
    }
    
    func (m *Module) NewModuleInstance() interface{} {
        return &PerVUModuleObject{
            globalSharedObjects: m.sharedObjects,
            uniqueDataPerVU:     42,
        }
    }
    
    func init() {
        modules.Register("k6/data", NewModule())
    }

[a] The only problem with testing here are actually the init() functions, they create de-facto global variables. With what I suggest, we would be able to test single modules in isolation by just creating a unique object with the NewModule() function in its own package for every test, without any conflicts. However, integration-level tests (i.e. multiple modules combined) will be problematic because of the shared "registry" caused by these init() calls.

However, I think the solution isn't to introduce more levels of abstraction, it's to fix the init() mess we currently have. Nobody is forcing us to do it this way for built-in modules. Instead, we can have a GetJSModules() function that returns a registry copy, similar to how I did it with the outputs:

func GetJSModules() map[string]interface{}{
    return map[string]interface{}{
        "k6": k6.New(),
        "k6/data": data.NewModule(),
        "k6/http": http.NewModule(),
        // ...
    }
}

Not perfect, but actually better than this...Using this will remove the intrinsic global variables we are currently struggling with. It will also be perfectly adequate for integration tests with all built-in JS modules, but without any conflicts or shared registries, and it will remove the init() magic-at-a-distance as a bonus.

So, for now, I'm firmly against most of the complexity you want to introduce, with the exception of this simplified module registry, for which 👍 from me for k6 v0.32.0. The approach above seems to be good enough for all use cases we currently have and would want to handle in the future, without any breaking changes required. And if it isn't, it can be easily extended in a much more Go-idiomatic way (simple optional interfaces), if we ever want to handle more complicated things in the future... To change my mind, I'd need some very good arguments why I'm wrong, as well as concrete use cases where the above approach fails to suffice.

@simskij
Copy link
Contributor

simskij commented Mar 16, 2021

As stated in #1909, I just want to pitch in that we at some point, when we've agreed on the implementation, should make this accessible to external consumers as well. 😅

mstoykov added a commit that referenced this issue Mar 17, 2021
This allows integration tests to have separate global module instance of
all internal js modules.

For this to be supported for extension an additional type/interface will
be required a bit more added code
mstoykov added a commit that referenced this issue Apr 6, 2021
This allows integration tests to have separate global module instance of
all internal js modules.

For this to be supported for extension an additional type/interface will
be required a bit more added code
mstoykov added a commit that referenced this issue Apr 7, 2021
This allows integration tests to have separate global module instance of
all internal js modules.

For this to be supported for extension an additional type/interface will
be required a bit more added code
mstoykov added a commit that referenced this issue Apr 9, 2021
Add GetJSModules support from #1858
@na--
Copy link
Member

na-- commented Apr 9, 2021

I think we can close this, since pretty much everything got implemented in #1911?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
breaking change for PRs that need to be mentioned in the breaking changes section of the release notes evaluation needed proposal needs to be validated or tested before fully implementing it in k6 feature
Projects
None yet
Development

No branches or pull requests

3 participants