Improve filter speed #2
Comments
Re: BUFR Decoding performance
Please note the trick of excluding some keys documented here:
https://confluence.ecmwf.int/display/UDOC/Performance+improvement+by+skipping+some+keys+-+ecCodes+BUFR+FAQ
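As far as I understand the linked FAQ, the trick is to set the ecCodes key `skipExtraKeyAttributes` to 1 before unpacking a message, so that the extra per-key attributes (units, scale, reference, width and so on) are not created. A minimal sketch with the ecCodes Python bindings follows. The function name `read_values` and the default key are chosen only for illustration, and the import is kept inside the function so the snippet loads even where ecCodes is not installed.

```python
def read_values(path, key="airTemperatureAt2M"):
    """Collect `key` from every BUFR message in `path`, skipping extra key attributes."""
    import eccodes  # imported lazily; requires the ecCodes Python bindings

    values = []
    with open(path, "rb") as f:
        while True:
            msg = eccodes.codes_bufr_new_from_file(f)
            if msg is None:  # end of file
                break
            try:
                # Has to be set before unpacking to have any effect.
                eccodes.codes_set(msg, "skipExtraKeyAttributes", 1)
                # Expand the data section so that the data keys become available.
                eccodes.codes_set(msg, "unpack", 1)
                try:
                    values.append(eccodes.codes_get_array(msg, key))
                except eccodes.KeyValueNotFoundError:
                    pass  # this message does not contain the key
            finally:
                eccodes.codes_release(msg)
    return values
```

Whether this helps depends on the use case: it is only safe when the skipped attributes are not needed downstream.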
…________________________________
From: Sandor Kertesz <[email protected]>
Sent: 05 November 2019 10:20
To: ecmwf/pdbufr <[email protected]>
Cc: Subscribed <[email protected]>
Subject: [ecmwf/pdbufr] Improve filter speed (#2)
The performance of the bufr filter should be improved. It is currently 4-5 times slower than the BUFR filter in Metview Python (it is based on a C++ wrapper around ecCodes), which is already slower than the bufr_filter ecCodes command line tool. The following test case illustrates the problem:
The file test.bufr contains 3927 SYNOP messages and we want to extract the 2 m temperature values from it. This is the test code in Metview Python:
import metview as mv

f = mv.read('test.bufr')
gpt = mv.obsfilter(
    data=f,
    output="csv",
    parameter='airTemperatureAt2M',
)
res = gpt.to_dataframe()
print(len(res))
and this is the code with pdbufr:
import pdbufr
f = 'test.bufr'
res = pdbufr.read_bufr(f, columns=('latitude', 'longitude', 'airTemperatureAt2M'))
print(len(res))
The execution time is as follows:
* Metview Python: 2.788s
* pdbufr: 11.861s
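For reference, these two figures work out to a slowdown of roughly 4.3x, consistent with the "4-5 times slower" estimate. A small sketch of how such wall-clock numbers could be collected and compared; `best_time` and `slowdown` are hypothetical helper names, and the two timings are simply the ones reported above.

```python
import time


def best_time(fn, repeat=3):
    """Return the shortest wall-clock time (seconds) over `repeat` calls of fn()."""
    best = float("inf")
    for _ in range(repeat):
        start = time.perf_counter()
        fn()
        best = min(best, time.perf_counter() - start)
    return best


def slowdown(slow_seconds, fast_seconds):
    """How many times slower the first timing is than the second."""
    return slow_seconds / fast_seconds


# Timings reported in this issue (seconds).
metview_seconds = 2.788
pdbufr_seconds = 11.861
print(f"pdbufr / Metview: {slowdown(pdbufr_seconds, metview_seconds):.2f}x")
```

To time the two test scripts themselves, each could be wrapped in a function and passed to `best_time`.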
The main bottleneck, by far, is crossing from Python to C. A major breakthrough was reached by caching the message keys for similar messages: several benchmark cases gain a 30-35% speed-up.
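The idea of caching keys for similar messages can be sketched in plain Python. This is not the pdbufr implementation: `message_signature`, `expensive_key_listing`, and the dictionary-based messages are hypothetical stand-ins, and the sketch assumes that messages sharing the same descriptor sequence also share the same key list, so the costly Python-to-C key enumeration only needs to happen once per distinct layout.

```python
calls_into_c = 0  # counts simulated Python-to-C round trips


def expensive_key_listing(message):
    """Stand-in for enumerating a message's keys through the C library."""
    global calls_into_c
    calls_into_c += 1
    return tuple(message["values"].keys())


def message_signature(message):
    """Hashable fingerprint; equal signatures are assumed to imply equal key lists."""
    return tuple(message["descriptors"])


def iter_keys_cached(messages):
    """Yield the key list for each message, enumerating keys once per signature."""
    cache = {}
    for message in messages:
        sig = message_signature(message)
        if sig not in cache:
            cache[sig] = expensive_key_listing(message)
        yield cache[sig]


# Illustrative messages only; descriptor numbers and values are made up for the demo.
synop = {"descriptors": [307080], "values": {"latitude": 47.5, "airTemperatureAt2M": 280.1}}
temp = {"descriptors": [309052], "values": {"latitude": 47.5, "pressure": 85000.0}}
messages = [synop, dict(synop), temp, dict(synop)]

key_lists = list(iter_keys_cached(messages))
print(len(key_lists), "messages,", calls_into_c, "key enumerations")
```

With many near-identical messages in one file, as in the SYNOP test case, most messages would hit the cache and skip the enumeration step.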