Improve filter speed #2

Open
sandorkertesz opened this issue Nov 5, 2019 · 2 comments
Comments

@sandorkertesz
Collaborator

The performance of the BUFR filter should be improved. It is currently 4-5 times slower than the BUFR filter in Metview Python, which is based on a C++ wrapper around ecCodes and is itself slower than the bufr_filter ecCodes command line tool. The following test case illustrates the problem:

The file test.bufr contains 3927 SYNOP messages, and we want to extract the 2m temperature values from it. This is the test code in Metview Python:

import metview as mv

f = mv.read('test.bufr')
gpt = mv.obsfilter(
    data=f,
    output="csv",
    parameter='airTemperatureAt2M',
)
res = gpt.to_dataframe()
print(len(res))

and this is the code with pdbufr:

import pdbufr
f = 'test.bufr'
res = pdbufr.read_bufr(f, columns=('latitude', 'longitude', 'airTemperatureAt2M'))
print(len(res))

The execution time is as follows:

  • Metview Python: 2.788s
  • pdbufr: 11.861s
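
For anyone wanting to reproduce the comparison, the following is a minimal timing sketch. The helper `time_call` is a hypothetical name introduced here for illustration and is not part of pdbufr or Metview; the two timings are the ones reported above, and the ratio is computed from them.

```python
import time


def time_call(fn, *args, repeat=3, **kwargs):
    """Run fn(*args, **kwargs) `repeat` times; return (last result, best wall time in seconds).

    Hypothetical helper for illustration only.
    """
    best = float("inf")
    result = None
    for _ in range(repeat):
        start = time.perf_counter()
        result = fn(*args, **kwargs)
        best = min(best, time.perf_counter() - start)
    return result, best


# Timings reported in this issue, in seconds.
metview_seconds = 2.788
pdbufr_seconds = 11.861

# Ratio behind the "4-5 times slower" statement.
slowdown = pdbufr_seconds / metview_seconds
print(f"pdbufr is about {slowdown:.2f}x slower than Metview Python")
```

With pdbufr installed, the same helper could wrap the call shown earlier, for example `time_call(pdbufr.read_bufr, 'test.bufr', columns=('latitude', 'longitude', 'airTemperatureAt2M'))`. Taking the best of several runs reduces noise from disk caching and other processes.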
@shahramn
shahramn commented Nov 5, 2019 via email

@alexamici
Contributor

The main bottleneck is going from Python to C, by far.

A major breakthrough was reached here:

479c254

by caching the message keys for similar messages.

Several benchmark cases gain a 30-35% speed-up.
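
The general idea of caching keys for similar messages can be sketched as below. This is not the code from the commit referenced above; it is an illustration under stated assumptions: messages are modelled as plain dicts, `slow_list_keys` stands in for an expensive key enumeration that crosses from Python into C, and the `layout` field is a made-up signature that identifies messages sharing the same structure. All names are hypothetical.

```python
class KeyCache:
    """Cache the key list per message layout so enumeration runs once per layout.

    Hypothetical sketch; not pdbufr's implementation.
    """

    def __init__(self, list_keys):
        self._list_keys = list_keys  # expensive callable: message -> iterable of keys
        self._cache = {}             # layout signature -> list of keys
        self.misses = 0              # number of times the expensive call was made

    def keys_for(self, message, signature):
        if signature not in self._cache:
            self.misses += 1
            self._cache[signature] = list(self._list_keys(message))
        return self._cache[signature]


def slow_list_keys(message):
    # Stand-in for an expensive per-message key enumeration (assumption).
    return list(message["data"].keys())


# Three simulated messages, two distinct layouts (made-up data).
messages = [
    {"layout": ("A",), "data": {"latitude": 47.5, "airTemperatureAt2M": 280.1}},
    {"layout": ("A",), "data": {"latitude": 48.0, "airTemperatureAt2M": 279.3}},
    {"layout": ("B",), "data": {"latitude": 46.2, "windSpeed": 3.4}},
]

cache = KeyCache(slow_list_keys)
rows = []
for msg in messages:
    keys = cache.keys_for(msg, msg["layout"])
    rows.append({key: msg["data"][key] for key in keys})

print(f"{len(messages)} messages, {cache.misses} key enumerations")
```

The saving depends on how many messages share a layout: a file of several thousand structurally identical messages would need only one enumeration in this sketch, while the per-value reads remain.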
