Issue 361: ignore data conversion errors #368
The general request is to tolerate bad data in the database, so this adds try/catch blocks around the document conversions so that processing continues. Catch all exceptions, log them, and press forward.
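As a rough illustration of the catch, log, and continue pattern described above, here is a minimal sketch. It is not the code changed in this PR: the class `TolerantConversion`, the method `convertAll`, and the sample documents are hypothetical, and `java.util.logging` stands in for whatever logger the service actually uses.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.function.Function;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper for illustration only; not a class from this repository.
final class TolerantConversion {
  private static final Logger log = Logger.getLogger(TolerantConversion.class.getName());

  // Converts each document; a document whose conversion throws is logged and skipped.
  static <T> List<T> convertAll(List<Map<String, Object>> docs,
                                Function<Map<String, Object>, T> converter) {
    List<T> converted = new ArrayList<>();
    for (Map<String, Object> doc : docs) {
      try {
        converted.add(converter.apply(doc));
      } catch (Exception e) {
        // Catch everything so that one bad document cannot abort the whole request.
        log.log(Level.SEVERE,
            "Could not convert document " + doc.get("lidvid") + "; skipping it", e);
      }
    }
    return converted;
  }

  public static void main(String[] args) {
    Map<String, Object> good = Map.of("lidvid", "urn:example:good::1.0", "title", "ok");
    // The title has the wrong type on purpose, so the cast below fails for this document.
    Map<String, Object> bad = Map.of("lidvid", "urn:example:bad::1.0", "title", 42);
    List<Map<String, Object>> docs = List.of(good, bad);

    List<String> titles = convertAll(docs, d -> (String) d.get("title"));
    System.out.println(docs.size() + " hits, " + titles.size() + " converted");
  }
}
```

The trade-off is the one raised in the review: the caller can end up with fewer converted objects than reported hits, so the log message is the only trace of the skipped document.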
See suggestions re log levels/messages, and required fix of erroneously-committed change.
Re testing, are you 100% certain, or should it be tested by spinning up an empty docker registry and manually inserting a bad document?
service/src/main/java/gov/nasa/pds/api/registry/model/Pds4ProductBusinessObject.java
service/src/main/java/gov/nasa/pds/api/registry/model/Pds4ProductBusinessObject.java
service/src/main/java/gov/nasa/pds/api/registry/model/PdsProductBusinessObject.java
Happy to use a word other than critical. To me, bad data in the database is not a warning, but hey. What is going to happen is that the user is going to see there are 10 hits but only get 9. Then a flurry of tickets will fly. We need a word in the log message that is not used elsewhere, so that a quick search can stop the flurry before it begins. I chose critical because, well, it is a critical error. Any rather unique word that we can use for searching is fine, including tinkerbell, though I do use that one occasionally. Yeah, I used to spin up my own and add data willy nilly but got in trouble for it. So now I do not. Also, using registry to do it all makes it way, way harder.
@al-niessner can we standardize on Re testing, sounds like maybe I'm missing some context, but are we talking about the same thing? I can't see why anyone else should care what you do with your own ephemeral dockerized registry, and it should be a matter of a |
Done.
I hardly know the PDS rules, which makes it hard to synthesize a test case, but anything that throws an error will work. Maybe one of the databases that has strings where arrays should be?
@al-niessner @tloubrieu-jpl I've tried messing with the document structure of an existing document, but haven't managed to do it in a way that breaks the API. Is it possible to grab one of the problematic documents from #361 for me to use? Are there specific endpoints which break, or should I be able to observe the error from |
Change your output type to json+pds4 (or the other way round) because it is much stricter about names, structures, etc.
@al-niessner no dice with Nowhere to go, without a known-problematic doc and curl to reproduce the initial error. @tloubrieu-jpl if you have any relevant info? |
Thanks for giving it a try, and sorry my suggestion did not help.
Hi @al-niessner, @alexdunnjpl, I found this document in one of the linked tickets: {
"_id" : "urn:nasa:pds:gbo-kpno:nirimage-9p:reduced_n6_n6137final_h_fit::1.0",
"_index" : "registry",
"_score" : 1,
"_source" : {
"_package_id" : "97051a77-b0f9-43a8-83b8-4b818929c26e",
"description" : [
"J, H, and K band observations of Comet 9P/Tempel 1. Airmass: 1.84",
"Migration from PDS3 (B. Hirsch)"
],
"disp:Display_Direction/disp:horizontal_display_axis" : "Sample",
"disp:Display_Direction/disp:horizontal_display_direction" : "Left to Right",
"disp:Display_Direction/disp:vertical_display_axis" : "Line",
"disp:Display_Direction/disp:vertical_display_direction" : "Bottom to Top",
"geom:Display_Direction/geom:horizontal_display_axis" : "Sample",
"geom:Display_Direction/geom:horizontal_display_direction" : "Left to Right",
"geom:Display_Direction/geom:vertical_display_axis" : "Line",
"geom:Display_Direction/geom:vertical_display_direction" : "Bottom to Top",
"geom:Object_Orientation_RA_Dec/geom:celestial_north_clock_angle" : "0",
"geom:Object_Orientation_RA_Dec/geom:declination_angle" : "-10.77363888888889",
"geom:Object_Orientation_RA_Dec/geom:right_ascension_angle" : "205.93125",
"geom:Reference_Frame_Identification/geom:name" : "J2005.51001",
"img:Exposure_Parameters/img:exposure_duration" : "8",
"img:Filter/img:filter_name" : "H",
"lid" : "urn:nasa:pds:gbo-kpno:nirimage-9p:reduced_n6_n6137final_h_fit",
"lidvid" : "urn:nasa:pds:gbo-kpno:nirimage-9p:reduced_n6_n6137final_h_fit::1.0",
"ops:Data_File_Info/ops:creation_date_time" : "2007-01-19T15:31:41Z",
"ops:Data_File_Info/ops:file_name" : "n6137final_h.fit",
"ops:Data_File_Info/ops:file_ref" : "/bx/sbnarch04/PDS4/gbo-kpno/nirimage-9p/data/reduced/n6/n6137final_h.fit",
"ops:Data_File_Info/ops:file_size" : "1056960",
"ops:Data_File_Info/ops:md5_checksum" : "09e395078e01b92b6ed99af6f90a08dd",
"ops:Data_File_Info/ops:mime_type" : "application/fits",
"ops:Harvest_Info/ops:harvest_date_time" : "2022-02-18T19:07:58.862147Z",
"ops:Harvest_Info/ops:node_name" : "PDS_SBN",
"ops:Label_File_Info/ops:blob" : "eJzNWVlv4zgSft4G+j8QfpoB2pJl5zTUGqQ7l6eTTjbOw2JfCEaiHW5Looakknh//Rapi5LlI7u9gwmC2FJ9rPrqYPGI/9tbEqMXKiTj6eeB54wGiKYhj1i6/DzI1WJ4Mvgt+Pjh4wdfI4cJj2iMngVdfB48K5VNXTeLpJMSSZwlf9EPB/qP++K59+fzAwx/sPdlNHJk+Dz4+OFv8EETolYZTWWjIhexw8M4dLhYupGMYrfECZ4a+++wHjGZ1ebPZ/P7P9k+S5a1+dntFfbOvD/R+pLypDZ/dXF3a7zH3tH/TAJo3Ase5aHCd0+SiheioGZIjICbrWhDOQwQmDbQqc7Q7gRaA7RTu322BkAOdmbIgr9JVsNfX1+d14mJwng08tx/3N7MTSCGLJWKpCEtBko2LQJ0w0MTiZ0RQO+bMG8yQpuHlEHaCehOg+1ai8jskndqe7vKMjk7Ab0VC6oHUHoI+bOIpootWBFrfCYoMQIQxXwJr2PMSggVQS7SqbYyBQvT5RMf/shSPk2ZYAlZ0uFpNhUUKplGOD2CX29yvGBQyvgZL5jy3R6Npa2yU4IkgF7pu9ZziVBMxTR4KNSj75SI4SxdCAIG0UxbR3yBvvKEKnR67z7SJINJ7fluMa5UwtIFF0nhqpn2uDQEVj0wrE1vxpRKsnK2hjGRMuidu77bBpUjvzJV6J01NkoZSEmunrnAMZMq+Jay5bP6hG6dW8d3bUkNz/KnuErbCsIRjEfeKRjuvq4HRFSGgmXG6O+f0PUnRNIIfUNP+oM39GVvJB10xoCylFP4fnLgu7a60j93i4P+LSx+dZ1dgytcrBpyLek5VYTFtRDEiS2OiKLG2+HocDgGKutSa+jO0loLzi1bCqMLLQRPEEygCfrli4OumYDW9Guf69r5zS50ZLbzvrtpBvpWRbXm5SNLKP7KuYDNBPgqGwrQR4UyAcAKQBAjCNDoGH4fRwfTg5Pp5PCfvttFWcN5tsfoFqhycAMpWNqgNYgVfqAyjxWe54l+tKtYZFzSYB4y2B9RXb/FiwYheEilhI0TjukLjYN7oM9IHK/QfSGikZlvbVQ9vtSML0lIlbRz/koASNOlesaCpEsa6K6Cqq7iu2tyu16YDFkWs5TilEAcdAsC477bFVhjFpqBF1wJspLQBsHZ8k1TQ71kfXdrFP1Z+kKlYsu1UgGZ4fCdp2CsRcfX+5TgTj1T7bE1HjqmltS4Waqo0D38gS6o0ORsl2IWYVELWstDyGHkm5oyWzs8ReyFRTmJndSwamuwVNcvsSEEJUew4pi1yXZQTSQ38taiDfEqp5wuovlKKpo0+roSqPUkAwdSZXM2If7GlEL3lPxA36utXDWVYdZ38lBnwka0XNmVg/2yAJXGYqZWU94YcvTqvTkD6zlgEld6NkZ+a+y1cI9IvifaY8ebDBNYr0QKePQVVigK7Zul7leeRxQBzZiG4O6GwD/CFJchz+j/Jeyq0j7VsXbCguU48SbvjLtqaP5FAv/97uwOzVkCDYmklOcS/T0nEWptynRGzukLC7vtp4r+DPb+Ik9Af7fv7Ka7DqlXSCKWVOH2ytrpibC18U6OjtGVh36p9ji/9jZJsxX66W1RGY7wCMqd0wwSrDlg7/0dsdD037XCbZHyz5ulrL2o6DPQFKRZTFZ4TpWCFLRWVn12i/HO+cPbhwHLa3OewETAUqmPDRtxPfq6USppypKmjlhL75bptIcb7VicMwG9xi62BgM7ePZvSD4oqjiRNyaDOUkyvRfYCtpLXVRbv6ELhRRHD/oUsVl1tIUubJOVOay1eNywtKLaD9hDUWP1C1cKNthA9JFnm9T2kSyxm8PeAazVqA8n7mnZof6ihcsKdjgjAnoSEDKlW+jnT/8Ch39SBVehoLg6y/YdT2voxRvsz3MB6JpYC1TCaAWL8vIwladMfR4kchCc+G4vpG3M3ceasXXJYkVFH4uFkR
Q78etCo/1q3d6aquL17vg0OLukfH0LM72iustbx55KUKitqrQHtm9J/vyi3NVPoRbNArRnKe5ZjFVgtjfUCrWrpW4F7alwY1PdPaDPwra2ugWwl6qNjXUXvJ2mPRJQELgzqcd3As6s5b3Pwxk+p2EvXaHDhokMaWruX+BEHdOyKUR0OQjGo0PndOKND0sKvQN6VUc0jPW1Q6/aoTdyjo8nR5OT8ue01L82qld3qDfe+r4Bp1yoZxxCof9YtzIqlW6B96qvJwG+1P1t0561Pcj0rt/1/Yxz6I1GXml7raXV2XyHlXLEPrktoTuaWInqvPfdvt1ltaXvXoVBVy5Q7WvWSpcWN/YWGmxiYd9AO+YGupFVNKyx/jUlkdX8fb5YwLaxTPPTSlGT5+KthSpCVdwUtcCHx0cab8utey0izHWV/udHRESkLyQvZ49zNNG3lH3SinKLpn+m2zoen2OTh0Z/t89vXwW2Ol360fGbvFEZjH3XfFpv4bTK0oi+YS40zRsiFZwK4QW6hK8wO/SQDqgefhFTfRrExqvWPtucdsyZ8eLi4vjw4Hb+Zc70vIK9Xi1riq5fkX+mLa8pN3xMWRSduHm2QLTQKINDD7yunyyEpH/kZpalefIEXsHE7L5qCPYw2cmuWth+Er/xe/mVXWGuOwJ0ubClPQFWSZ5gc7cZ4RcS5zQYH3jHx84BWOoV28NZui4fwhH9UM+HXql9WdBPDNxYnx7FpO/vJ/qutecfOsF/AK/2Qjg=",
"ops:Label_File_Info/ops:creation_date_time" : "2019-06-04T16:14:18Z",
"ops:Label_File_Info/ops:file_name" : "n6137final_h.xml",
"ops:Label_File_Info/ops:file_ref" : "/bx/sbnarch04/PDS4/gbo-kpno/nirimage-9p/data/reduced/n6/n6137final_h.xml",
"ops:Label_File_Info/ops:file_size" : "8145",
"ops:Label_File_Info/ops:json_blob" : "eJy9V1tv2zYU/iuEnlbAli1f4kRvucdtXHtxHoYVA8FItM1VEjWScuIF+e87h5RkxUnsDsVWBGlEnvOd+4XP3kzJuIgMnT5ortbMCJmxxAufvXHMMyMWIrJn9FRxhsd5SR8lTGsv/IC/5U1kvGW+EdpItUH+V+cX3DBhpaXN45gZDtC9bnDS7g7bvQHgxVxHSuR4D1cTsVSWliyUTMnsYt4nv5z55EYoHa0+Af2aK41YIvbC4OWl5YlsIVXqBIA0ntCSBOACPwj8rt8FvkQuQY0E+Jz5XMF9obIwY5qFeazD5YNsf88zGWZCiZQtefskDxUHL/CYZkfwE/RHCwFuoCu6EGZXmZZ3LozTY7zVyfq2eEgqH2w4A8nogl3bP7fITYuwLCZfyAP+J7eu10QuyLlMuSEns849T3OekMAnpwLEaB3C38foTVaYlVQ0gbgA4pdMLFemRSb+xPfAVUaYBANw54wiX0GXNuiqGJhJxmjzu3KQ9ylNMvARlykArIzJw04HvOaj+/ylXOPHoIP3nXXglfT7SOFXSalFCMHlKbuVUemzQ2zkwH0HMmdA4RcNzrpd/0nH5GOWWOh8L2ZJ4EAvxvPZj6CKdLkX1N07zPHkmganwSHI0rsHCRzo9eV0YhWlwZGDrsISokH7nFwaXNNDiLbkj4+P/mPfl2rZ6XW7Qee3ye3chq8tMm1YFnHguxIJt73lbQO64SzG6nv25GKhucG/ikxgwj5sDDJHMjNQpV7YhcTLGdRYtqQIHTMV22Lzrsb3c9K3lS0f/uTQqBKeLc1qD9hwdNTFhnGqFNvQ3gW1CY8MiXzTGWwDoAxJUcQhRS02lN+T0FRkMX+iUlkjvVumDRnjEbmCP7nGvjF1Ks+xXWgjIm1bpchEWqRUgzLQcdYsKUC79vHRaNjyUoB+e9kbBKORP+hZ0RxQei3vMuEp6EStmYgLXZdRs8mx8seXl5ej4WAyP5uDT6EZoD9Q6ZL627On+V8FhyDSrEgf0ARobNxhgoBh0CvNzFiKiLciQ5R3+Hp7+eYszVH+Hy8uWVDTBSZNed/stj5225e6CUHl7EteV1jWy3Xm1WNunK0hBGK5e2q4QlF3fMEVWmHTQsRUbQ9eTwsb+icTiiYgfMViLeKCJX4mM0yRGqAKgQuHpK8YMQ6l4V8dX0k9NSuuyHiX9kLoSOQJ+L62AWs/vObYu91Etgc2xymQ5wkkffP61ib9B5bvFMQrN7wuDUf61spSIhhqDXZFiqpbtcoCmCoBIlws7k5hbYhqxRWOLsp0xDM7YplN120FxnzZLMBed+if9IPesJIQ8wjc46AP8LahO45G/aP+cfnvpAKpnUKvFESHvt6dal3LyH2Gdjj0hwE0xdrQCEoAIgcuyqQyK1ivZPT9kEK2TVn2KnAXQoG/mjJhzIu/gQGQK19jeW0rq0SAFcXYxaeiirdQ3pk0BtYsI8m9zCuOd5CbPLd8YZDjDgP0oZRSF9cd0By8qM2Zc2Og/+j/Lw91KRET0rHs6vTKxfbmX7nufY6mG0qaHwjcR3Qfh8EuwunSljuY+d/7VThBFKYzJD/IsJ51rNtar1TitHpO7GzGSHD5lEtdKKCpsaorXl3Fhap5yppJdbNkjisPwCwxbrvAr4X9qmbKDabiSzUYUP35BkZyitS7ZxS24BxaMYJ/+6kJsWCRSITZhNU+Dy8mHx8a700HGJAVfWMmfBHGkBln38nXcpEi0y1YY1o0DnEm/4TWBhtXJHMeoqp+JFUGzumlQf8DtWuGht49P+i3U3hQWGZyDk8VDg88kXXOZRFzAjgJpIpUWxPuGzDP9Uycnk7JHBagBJZALgtNfi0YvFiaTxcUcMHXImpMzzEspKrAFcQtGjN82KkNuEMDFJ0XKX66J5qCVLOLSSS422HhSRxxbZfPhK95gq9iprCZJxsyc5ccl+qShV6xiBtd9o9qPpcmVJXZ8
hZIFcDRNVSY3efg8JGBALvBUgXTwRoNr7PaRFvh9wJm0LmE1RInG7eS4AGe24c1NcL5HIZQuzuCn/vuIBwch/3h7yAA1mdlDhOiFKaW3Lwz634mmywmfEJ9+yc5JAu+LGmwb0FyPI1sghdpANswuQ7IL9XT9NM22PbZaiv8HzlN95g=",
"ops:Label_File_Info/ops:md5_checksum" : "e7c521a71baf0da4646987707f427f8b",
"ops:Tracking_Meta/ops:archive_status" : "archived",
"pds:Array_2D_Image/pds:axes" : "2",
"pds:Array_2D_Image/pds:axis_index_order" : "Last Index Fastest",
"pds:Array_2D_Image/pds:local_identifier" : "image_array",
"pds:Array_2D_Image/pds:offset" : "5760",
"pds:Axis_Array/pds:axis_name" : [
"Line",
"Sample"
],
"pds:Axis_Array/pds:elements" : [
"512",
"512"
],
"pds:Axis_Array/pds:sequence_number" : [
"1",
"2"
],
"pds:Citation_Information/pds:author_list" : "Knight, M.M.",
"pds:Citation_Information/pds:description" : "J, H, and K band observations of Comet 9P/Tempel 1. Airmass: 1.84",
"pds:Citation_Information/pds:publication_year" : "2019",
"pds:Element_Array/pds:data_type" : "IEEE754MSBSingle",
"pds:File/pds:file_name" : "n6137final_h.fit",
"pds:Header/pds:object_length" : "5760",
"pds:Header/pds:offset" : "0",
"pds:Header/pds:parsing_standard_id" : "FITS 3.0",
"pds:Identification_Area/pds:information_model_version" : "1.11.0.0",
"pds:Identification_Area/pds:logical_identifier" : "urn:nasa:pds:gbo-kpno:nirimage-9p:reduced_n6_n6137final_h_fit",
"pds:Identification_Area/pds:product_class" : "Product_Observational",
"pds:Identification_Area/pds:title" : "Reduced Near-Infrared Image of Comet 9P/Tempel 1",
"pds:Identification_Area/pds:version_id" : "1.0",
"pds:Internal_Reference/pds:lid_reference" : [
"urn:nasa:pds:context:investigation:individual.none",
"urn:nasa:pds:context:facility:observatory.kpno",
"urn:nasa:pds:context:telescope:kpno.corning2m13",
"urn:nasa:pds:context:target:comet.9p_tempel_1"
],
"pds:Internal_Reference/pds:reference_type" : [
"data_to_investigation",
"is_facility",
"is_telescope",
"data_to_target"
],
"pds:Investigation_Area/pds:name" : "None",
"pds:Investigation_Area/pds:type" : "Other Investigation",
"pds:Local_Internal_Reference/pds:local_identifier_reference" : [
"image_array",
"image_array",
"image_array"
],
"pds:Local_Internal_Reference/pds:local_reference_type" : [
"display_settings_to_array",
"imaging_parameters_to_image_object",
"display_to_data_object"
],
"pds:Modification_Detail/pds:description" : "Migration from PDS3 (B. Hirsch)",
"pds:Modification_Detail/pds:modification_date" : "2019-05-24T00:00:00Z",
"pds:Modification_Detail/pds:version_id" : "1.0",
"pds:Object_Statistics/pds:maximum_scaled_value" : "24177.42",
"pds:Object_Statistics/pds:minimum_scaled_value" : "-8675.0",
"pds:Observing_System_Component/pds:name" : [
"Kitt Peak National Observatory",
"2.13-m Corning Cassegrain/Coude reflector",
"NOAO Simultaneous Quad Infrared Imaging Device"
],
"pds:Observing_System_Component/pds:type" : [
"Observatory",
"Telescope",
"Instrument"
],
"pds:Primary_Result_Summary/pds:processing_level" : "Partially Processed",
"pds:Primary_Result_Summary/pds:purpose" : "Science",
"pds:Science_Facets/pds:discipline_name" : "Imaging",
"pds:Science_Facets/pds:facet1" : "Grayscale",
"pds:Science_Facets/pds:wavelength_range" : "Near Infrared",
"pds:Target_Identification/pds:name" : "9P/1867 G1 (Tempel 1)",
"pds:Target_Identification/pds:type" : "Comet",
"pds:Time_Coordinates/pds:start_date_time" : "2005-07-07T04:48:35Z",
"pds:Time_Coordinates/pds:stop_date_time" : "2005-07-07T04:48:35Z",
"product_class" : "Product_Observational",
"ref_lid_facility" : "urn:nasa:pds:context:facility:observatory.kpno",
"ref_lid_investigation" : "urn:nasa:pds:context:investigation:individual.none",
"ref_lid_target" : "urn:nasa:pds:context:target:comet.9p_tempel_1",
"ref_lid_telescope" : "urn:nasa:pds:context:telescope:kpno.corning2m13",
"title" : "Reduced Near-Infrared Image of Comet 9P/Tempel 1",
"vid" : "1.0"
},
"_type" : "_doc"
} That should break the |
@al-niessner are you good to test this with the above document while you're "there" testing that registry-sweepers PR? |
That looks good to me.
@alexdunnjpl to me it is good to merge. We unfortunately don't have the right reference test for that case, but I don't want to block the pull request from being merged because of that. The reference datasets are now managed by the I&T team (@miguelp1986), who might be blocked on that for other reasons. Can you check that your requested changes were otherwise implemented? Thanks
@al-niessner @tloubrieu-jpl yep, only the testing is outstanding. Good to merge on that basis.
Sorry it took so long, but Eclipse and I were having it out in a closed-cage, no-holds-barred match. Anyway, I changed all ops:Label_File_Info and ops:Data_File_Info fields to strings for urn:nasa:pds:mars2020.spice:spice_kernels:sclk_m2020_168_sclkscet_refit_v01.tsc::1.0. Fetching the individual product fails, as is hopefully desired, but it does show that the document is erroneous:
However, when looking at the products for a full list:
The error that should be in the log when it passes over them was:
Point is, it has 19 hits and returned 18 objects. All good. |
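For anyone wanting to reproduce this kind of test without a known-bad document, the corruption step can be sketched roughly as below. This assumes, as the thread suggests, that the converter expects list-valued fields and chokes on plain strings. `BadDocumentFixture`, `flattenLists`, and `firstFileName` are hypothetical names for illustration, not part of the registry API.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical test helper; the field prefix and the strict reader are illustrative only.
final class BadDocumentFixture {
  // Returns a copy of source in which every list-valued field under prefix becomes a string.
  static Map<String, Object> flattenLists(Map<String, Object> source, String prefix) {
    Map<String, Object> copy = new HashMap<>(source);
    for (Map.Entry<String, Object> entry : copy.entrySet()) {
      if (entry.getKey().startsWith(prefix) && entry.getValue() instanceof List) {
        entry.setValue(String.valueOf(entry.getValue()));
      }
    }
    return copy;
  }

  // Stand-in for a strict converter that assumes the field is a list.
  static String firstFileName(Map<String, Object> doc) {
    List<?> names = (List<?>) doc.get("ops:Label_File_Info/ops:file_name");
    return String.valueOf(names.get(0));
  }

  public static void main(String[] args) {
    Map<String, Object> good = new HashMap<>();
    good.put("lidvid", "urn:example:product::1.0");
    good.put("ops:Label_File_Info/ops:file_name", List.of("label.xml"));

    Map<String, Object> bad = flattenLists(good, "ops:Label_File_Info/");

    System.out.println(firstFileName(good));
    try {
      firstFileName(bad);
    } catch (ClassCastException e) {
      // The corrupted copy triggers the kind of conversion error the PR is meant to survive.
      System.out.println("bad document rejected: " + e.getClass().getSimpleName());
    }
  }
}
```

Feeding one such corrupted copy alongside well-formed documents through the conversion is what should produce the "N hits, N-1 objects" outcome reported above.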
🗒️ Summary
Ignore any errors while converting from the OpenSearch document to PDS-defined data types/structures. This allows processing to continue, with the errors recorded only in the log.
⚙️ Test Data and/or Report
None - do not have access to a db with bad data
♻️ Related Issues
Closes #361