Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Testing the modification introduced from #64 seems the speed of LINQ query is very poor #65

Closed
masesdevelopers opened this issue Oct 4, 2023 · 1 comment · Fixed by #67
Assignees
Labels
bug Something isn't working enhancement New feature or request

Comments

@masesdevelopers
Copy link
Collaborator

masesdevelopers commented Oct 4, 2023

Testing the modification introduced from this PR seems the speed of LINQ query is very poor; the following code

context.Posts.Single(b => b.BlogId == 2)

takes more or less 16 seconds to be executed as reported from the following output:
Elapsed context.Posts.Single(b => b.BlogId == 2) 16299 ms. Result is PostId: 2 Title: title Content: 2 BlogId: 2

Originally posted by @mariomastrodicasa in #64 (comment)

@masesdevelopers masesdevelopers self-assigned this Oct 4, 2023
@masesdevelopers masesdevelopers added bug Something isn't working enhancement New feature or request labels Oct 4, 2023
@masesdevelopers
Copy link
Collaborator Author

Adding some measurement objects we found this:

KafkaEnumerator _moveNextSw: 00:00:00.2211977 _currentSw: 00:00:13.7689249 _valueSerdesSw: 00:00:00.1202403 _valueBufferSw: 00:00:13.4891879
Elapsed context.Posts.Single(b => b.BlogId == 2) 16586 ms. Result is PostId: 2 Title: title Content: 2 BlogId: 2

The highest time lost is measured from _valueBufferSw which is associated to:

var data = new ValueBuffer(entityTypeData.GetData(_entityType));

Since the enumerator iterates over 1000 elements, the mean time is 13 milliseconds in this test.

So the code wasting time is:

    public object[] GetData(IEntityType tName)
    {
        if (Data == null) return null;

        var array = Data.Select((o) => o.Value.Value).ToArray();

        return array;
    }

Updating the code with the following:

    public object[] GetData(IEntityType tName)
    {
#if DEBUG
        Stopwatch fullSw = new Stopwatch();
        Stopwatch newSw = new Stopwatch();
        Stopwatch iterationSw = new Stopwatch();
        try
        {
            fullSw.Start();
#endif
            if (Data == null) return null;
#if DEBUG
            newSw.Start();
#endif
            object[] array = new object[Data.Count];
#if DEBUG
            newSw.Stop();
            iterationSw.Start();
#endif
            for (int i = 0; i < Data.Count; i++)
            {
                array[i] = Data[i].Value;
            }
#if DEBUG
            iterationSw.Stop();
#endif
            return array;
#if DEBUG
        }
        finally
        {
            fullSw.Stop();
            Trace.WriteLine($"Time to GetData with length {Data.Count}: {fullSw.Elapsed} - new array took: {newSw.Elapsed} - Iteration took: {iterationSw.Elapsed}");
        }
#endif
    }

The previous code reports multiple occurrences like:

Time to GetData with length 4: 00:00:00.0115625 - new array took: 00:00:00.0000003 - Iteration took: 00:00:00.0000006

The time used from each invocation of GetData is variable between 10 and 22 milliseconds, however only the iteration needs between 5 and 10 microseconds and from the tests seems the invocation of GetData waste most of the time returning the array. Changing the signature of the GetData with following:

void GetData(IEntityType tName, ref object[] array);

a new test reports:

KafkaEnumerator _moveNextSw: 00:00:00.0862662 _currentSw: 00:00:00.1839915 _valueSerdesSw: 00:00:00.0620605 _valueBufferSw: 00:00:00.0006946
Elapsed context.Posts.Single(b => b.BlogId == 2) 1990 ms. Result is PostId: 2 Title: title Content: 2 BlogId: 2

Continuing the tests the output is:

KafkaEnumerator _moveNextSw: 00:00:00.1052086 _currentSw: 00:00:00.1974484 _valueSerdesSw: 00:00:00.0492837 _valueBufferSw: 00:00:00.0012247
Elapsed context.Posts.Single(b => b.BlogId == 1) 340 ms. Result is PostId: 1 Title: title Content: 1 BlogId: 1

More or less the KafkaEnumerator takes the same time, but the test is executed with an highest speed. The first test has a lower speed due to data retrieve from the Kafka cluster, while the second test uses the data stored in the KafkaEnumerator.

masesdevelopers added a commit to masesdevelopers/EntityFramework4Kafka that referenced this issue Oct 4, 2023
masesdevelopers added a commit that referenced this issue Oct 4, 2023
* Code update based on #65 (comment)

* Removed IKafkaSerdesFactory and other code clean-up
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working enhancement New feature or request
Projects
None yet
1 participant