SharedArray improvements #2043
Being able to make a And I agree returning a SharedArray from |
This forum topic is another example where the first part of this issue would have been useful. With the new JS module APIs, implementing that basically boils down to deleting these 3 lines, right? Lines 89 to 91 in c4b88f5
If we remove this (now artificial) restriction, everything else should "just work", I think... 🤔 It should allow users to create a There are two potential pitfalls that I can think of:
In any case, I think it's worth it to remove the artificial restriction around |
Does it mean a k6-wrapped object that supports safe concurrency?
The SharedArray is a DynamicArray under the hood. Is there a way to implement the interface on it directly, or would it require a wrapper? k6/js/modules/k6/data/share.go Lines 43 to 54 in c4b88f5
Will the
const data = new SharedArray('some name', function () {
});
export function setup() {
  console.log(JSON.stringify(data)) // I guess it shouldn't emit the warning
}
It seems like a more distributed-friendly solution. If it has no critical problems, why not encourage users to use it until we have a complete working solution for the |
Yes, something like a
Yes, why not? |
Allowing the initialization of a SharedArray within the setup context would solve an important use case for my organization. Alternatively, we could solve it if we were allowed to make HTTP requests within the SharedArray function that runs in the init context.
Our use case is that we have a large file (~100MB) that we would like to share across many distributed k8s pods running our test. We were hoping to put it on S3 and download it in the test. However, we find that we are not allowed to fetch the S3 object within the SharedArray initialization, because HTTP requests are not allowed in the init context. Furthermore, if we download the S3 object in the setup context, we are unable to initialize a SharedArray, because this can only be done in the init context. If we put the file contents in the data block returned by the setup function, our understanding is that memory usage will grow quickly as we increase the number of VUs. |
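To make the memory concern above concrete, here is a small plain JavaScript simulation. It is not k6 code, and it assumes (as the comment does) that every VU ends up with its own deserialized copy of the setup data. The VU count and the shape of the rows are made up for illustration.

```javascript
// Plain JavaScript simulation, not k6 code. Assumption: every VU gets its
// own deserialized copy of whatever setup() returned.
const VUS = 4; // hypothetical VU count

// Stand-in for the large data set that setup() would return.
const setupData = { rows: Array.from({ length: 1000 }, (_, i) => ({ id: i })) };

// Per-VU copies: one JSON round trip per VU, so memory grows with VUS.
const json = JSON.stringify(setupData);
const perVuCopies = Array.from({ length: VUS }, () => JSON.parse(json));

// Shared data: every VU holds a reference to the same frozen array.
const shared = Object.freeze(setupData.rows);
const perVuViews = Array.from({ length: VUS }, () => shared);

// Count how many distinct array objects exist in each scenario.
const distinctCopies = new Set(perVuCopies.map((c) => c.rows)).size;
const distinctViews = new Set(perVuViews).size;
console.log(`copied arrays: ${distinctCopies}, shared arrays: ${distinctViews}`);
```

With copies, the number of distinct arrays equals the number of VUs; with a shared array there is only one, which is the behaviour the requested feature is after.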
Hi @Perzach! We had the same problem a few months ago and found a workaround that may fit your case as well. Let me explain it a bit: We have a single repository where each team in my company can write its test scripts. Those scripts and the large files with real data to trigger load tests are generated in a GitHub Action and copied to an EFS volume by running a pod. It's a reusable job, so you could use it directly from our repo if you have self-hosted runners. Here is the manifest for the pod that copies the test scripts and large files to the volume:
apiVersion: v1
kind: Pod
metadata:
  name: k6-pvc-copy
spec:
  volumes:
    - name: k6-pvc-copy-storage
      persistentVolumeClaim:
        claimName: k6-runner-tests
  containers:
    - name: k6-pvc-copy-container
      image: nginx
      ports:
        - containerPort: 80
          name: "http-server"
      volumeMounts:
        - mountPath: "/test"
          name: k6-pvc-copy-storage
      resources:
        requests:
          cpu: "100m"
          memory: "100Mi"
        limits:
          cpu: "100m"
          memory: "100Mi"
Then the pods running the test can mount the EFS volume and launch the tests using the large files already in the volume. Here is an example of what we have (in this case it is a template for Argo Workflows, but the K6 manifest is the same, you just need to change the
apiVersion: k6.io/v1alpha1
kind: K6
metadata:
  name: k6-{{`{{inputs.parameters.team}}`}}
spec:
  parallelism: {{`{{inputs.parameters.parallelism}}`}}
  script:
    volumeClaim:
      name: k6-runner-tests
      file: {{`{{inputs.parameters.team}}`}}/test/{{`{{inputs.parameters.scriptName}}`}}
  arguments: --out prometheus=namespace=k6
  ports:
    - containerPort: 5656
      name: metrics
  runner:
    image: ACCOUNT.dkr.ecr.REGION.amazonaws.com/REPO/IMAGE
    resources:
      requests:
        cpu: {{`{{inputs.parameters.cpuRequests}}`}}
        memory: {{`{{inputs.parameters.memoryRequests}}`}}
      limits:
        cpu: {{`{{inputs.parameters.cpuLimits}}`}}
        memory: {{`{{inputs.parameters.memoryLimits}}`}}
Don't hesitate to ask for more details. Hope it helps! |
Thanks @guillermotti for the feedback. Are you saying that if we use the If so, this was greatly helpful and we'll give it a shot. For reference, we will be reading the file in our k6 script somewhat like this:
|
Yes, if you are using the k6-operator, the
Here is a piece of code from our scripts that works:
import { SharedArray } from 'k6/data';

// ROOT_PATH and LOCAL_PATH are assumed to be defined elsewhere in the script.
const QUERIES = new SharedArray("queries", function () {
  console.info("Getting queries from path: " + ROOT_PATH + LOCAL_PATH + '/queries.json');
  // open() is only available in the init context; the callback must return an array.
  return JSON.parse(open(ROOT_PATH + LOCAL_PATH + '/queries.json'));
})[0];
We open several files of more than 2 MB each this way, so I think it will work for bigger files. |
Thank you, this was very helpful! |
Issues like #2911 and #2962 have been making me think recently if we can't make a
import http from 'k6/http';
import { SharedArray } from 'k6/data';
// no network code allowed here
const data = new SharedArray('some name', function () {
  // we allow networked code here
  let resp = http.get("https://some.url/that-returns-a-ton-of-data")
  return resp.json();
});
// no network code allowed here
Implementing this will probably be quite tricky and might require some refactoring of how we initialize the JS runtimes, but it seems doable 🤔 For example, it's probably fine to drop any metrics generated from these network calls, it might even be desirable to do so... On the other hand, the script
There is also the problem of distributed execution and .tar archives. If we allow networking code in the
On a somewhat related note, the proof-of-concept architecture I did in #2816 for distributed execution (#140) and test suites (#1342) would also allow us to relatively easily have a
It will do that by basically making Lines 5 to 7 in 49a2e27
That is used for the Lines 443 to 457 in 49a2e27
Still, while I think that approach is more internally consistent and easier to reason about, both approaches are not necessarily mutually exclusive, with a bit of work and thought 🤔 If allowing network requests in the init context So, yeah, to conclude this somewhat rambling stream of thoughts, more proofs-of-concept are needed here 😅 I am probably missing quite a lot of critical details, e.g. it might be completely impossible to allow init network calls only in |
initialize SharedArray in non init context
There is nothing preventing us from just letting SharedArray be created outside of the init context. I think it was previously not technically possible, as it was using a different mechanism to share the data, but this is no longer a problem.
The only caveat is that it should probably disallow creating SharedArray from setup, teardown and handleSummary as ... well, the last 2 aren't all that useful and the first one will not work in a distributed setup (unless we decided that SharedArray should be shared between distributed instances, but I think that will be technically ... very hard ™️ )
SharedArray should be returnable from setup
This will need some special handling to first detect that whatever was returned is a SharedArray, and then a way to save this in the JSON and reproduce the array on the other side.
This gets even more complicated because you might want to return an array of SharedArrays ... which means this might need to be supported at many levels of nesting 😭. We might decide not to support this at first.
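One way the detection and JSON round trip described above could work is to replace each SharedArray with a small marker object on the way out and resolve the marker on the way back in. The sketch below is plain JavaScript, not k6 internals: SharedArrayStub, registry, the $sharedArray marker key, and both helper functions are hypothetical names invented for illustration. Because JSON.stringify replacers and JSON.parse revivers visit nested values, SharedArrays inside arrays or objects are covered as well.

```javascript
// Hypothetical stand-in for k6's SharedArray; NOT the real implementation.
// The registry plays the role of the memory shared between VUs.
const registry = new Map(); // name -> frozen array

class SharedArrayStub {
  constructor(name, init) {
    // Like SharedArray, run the init function only once per name.
    if (!registry.has(name)) {
      registry.set(name, Object.freeze(init()));
    }
    this.name = name;
  }
  get(i) { return registry.get(this.name)[i]; }
  get length() { return registry.get(this.name).length; }
}

const MARKER = '$sharedArray'; // assumed marker key, chosen for this sketch

// Replace every SharedArrayStub, at any depth, with { "$sharedArray": name }.
function serializeSetupData(data) {
  return JSON.stringify(data, (key, value) =>
    value instanceof SharedArrayStub ? { [MARKER]: value.name } : value);
}

// Turn every marker object back into a handle on the already shared data.
function deserializeSetupData(json) {
  return JSON.parse(json, (key, value) => {
    if (value !== null && typeof value === 'object' && typeof value[MARKER] === 'string') {
      return new SharedArrayStub(value[MARKER], () => {
        throw new Error('unknown SharedArray: ' + value[MARKER]);
      });
    }
    return value;
  });
}

// Usage: a SharedArray nested inside the data returned by setup().
const users = new SharedArrayStub('users', () => [{ id: 1 }, { id: 2 }]);
const json = serializeSetupData({ token: 'abc', nested: [users] });
const restored = deserializeSetupData(json);
console.log(json);
console.log(restored.nested[0].get(1).id);
```

This only works when both sides can reach the same underlying data by name, which is exactly the distributed-execution caveat mentioned earlier.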
Because of the caveat mentioned above, it probably will need some ... specific code ?!?
Alternative: Maybe as suggested in this comment we should just make another breaking change and make the returned setup data always a "SharedObject" or something like that. I would expect that depending on how deep it is "shared" it will either not save memory, or be slower at least to some degree
Use cases:
When a lot of data is generated in setup (for example, because HTTP requests are needed to get the data), it would be nice not to have this data copied for each VU but instead kept as a single copy.
Making the SharedArray creatable in the default function can also help with that, as only a single VU will call the initialization code ... once. The problem is that setup might still need to do some setup work, and there is no way to return this data to teardown if it's needed there.
edit: This is likely going to wait for #140 in order to be doable.