Add initial layout for report diffing #758

DavidKorczynski · 2023-01-10T22:24:29Z

Signed-off-by: David Korczynski <[email protected]>

DavidKorczynski · 2023-01-10T22:26:36Z

An example of the current output when diffing two runs of htslib, one lasting 10 seconds and the other 300 seconds:

$ python3 ./main.py diff --report1 ../tmp2/summary1.json --report2 ../tmp2/summary2.json 
INFO:__main__:Running fuzz introspector post-processing
Report 2 has similar Total complexity to report 1 - {report 1: 16612 / report 2: 16612})

## Code coverge comparison
The following functions report 2 has decreased code coverage:
Report 2 has less coverage {  60.0 vs    0.0} for sam_hrecs_find_key
Report 2 has less coverage { 100.0 vs   80.0} for sam_hrecs_global_list_add
Report 2 has less coverage { 15.86 vs  14.42} for sam_hrecs_update_hashes

The following functions report 2 has increased code coverage:
Report 2 has more coverage { 68.29 vs  78.04} for kh_put_sam_hrecs_t
Report 2 has more coverage { 82.52 vs  90.29} for sam_hrecs_parse_lines
Report 2 has more coverage { 30.55 vs  52.77} for hseek
Report 2 has more coverage {   0.0 vs  100.0} for hgetc2
Report 2 has more coverage { 73.07 vs  81.86} for hts_detect_format2
Report 2 has more coverage {   0.0 vs  100.0} for decompress_peek_gz
Report 2 has more coverage {   0.0 vs   65.0} for parse_version
Report 2 has more coverage {   0.0 vs  68.29} for hts_resize_array_
Report 2 has more coverage { 47.05 vs  100.0} for parse_sam_flag
Report 2 has more coverage {   0.0 vs  78.57} for hts_str2uint
Report 2 has more coverage {   0.0 vs  100.0} for known_stderr
Report 2 has more coverage {  62.5 vs  100.0} for warn_if_known_stderr
Report 2 has more coverage { 63.93 vs  67.21} for fastq_parse1
Report 2 has more coverage {  95.0 vs  100.0} for sam_hdr_destroy
Report 2 has more coverage {   0.0 vs  45.45} for sam_read1_bam
Report 2 has more coverage {   0.0 vs  64.81} for bam_read1
Report 2 has more coverage {   0.0 vs  57.14} for fixup_missing_qname_nul
Report 2 has more coverage { 63.88 vs  72.22} for sam_read1
Report 2 has more coverage { 42.79 vs  44.65} for sam_hdr_create
Report 2 has more coverage { 73.46 vs  75.51} for sam_hdr_sanitise
Report 2 has more coverage { 65.11 vs  90.69} for bam_hdr_read
Report 2 has more coverage {   0.0 vs  100.0} for hgetc
Report 2 has more coverage {   0.0 vs  54.83} for kh_resize_vdict
Report 2 has more coverage {   0.0 vs   55.0} for bcf_hdr_set_idx
Report 2 has more coverage {   0.0 vs  100.0} for bcf_hrec_set_type
Report 2 has more coverage {   0.0 vs  30.76} for kh_get_vdict
Report 2 has more coverage {   0.0 vs  48.78} for kh_put_vdict
Report 2 has more coverage {   0.0 vs  48.33} for bcf_hdr_parse
Report 2 has more coverage { 90.47 vs  100.0} for bcf_hdr_destroy
Report 2 has more coverage {   0.0 vs  100.0} for bcf_hrec_destroy
Report 2 has more coverage {   0.0 vs   68.8} for bcf_hdr_parse_line
Report 2 has more coverage {   0.0 vs  82.85} for bcf_hdr_add_hrec
Report 2 has more coverage {   0.0 vs  17.77} for bcf_hrec_check
Report 2 has more coverage {   0.0 vs  24.26} for bcf_hdr_register_hrec
Report 2 has more coverage {   0.0 vs  84.21} for hrec_add_idx
Report 2 has more coverage {   0.0 vs  100.0} for bcf_hrec_add_key
Report 2 has more coverage {   0.0 vs  100.0} for is_escaped
Report 2 has more coverage {   0.0 vs  71.87} for bcf_hrec_set_val
Report 2 has more coverage {  44.0 vs   46.0} for bcf_hdr_read
Report 2 has more coverage {   0.0 vs  37.03} for vcf_hdr_read
Report 2 has more coverage {   0.0 vs  100.0} for uint7_decode_crc64
Report 2 has more coverage {   0.0 vs  100.0} for sint7_decode_crc32
Report 2 has more coverage {   0.0 vs  100.0} for uint7_decode_crc32
Report 2 has more coverage {   0.0 vs  54.05} for cram_init_varint
Report 2 has more coverage {   0.0 vs  62.31} for cram_init_tables
Report 2 has more coverage {   0.0 vs  100.0} for cram_free_file_def
Report 2 has more coverage {   0.0 vs  36.95} for cram_free_container
Report 2 has more coverage {  19.8 vs  27.72} for cram_dopen
Report 2 has more coverage { 30.43 vs  60.86} for cram_read_file_def
Report 2 has more coverage {   0.0 vs  12.12} for cram_read_SAM_hdr
Report 2 has more coverage {   0.0 vs  100.0} for int32_decode
Report 2 has more coverage {   0.0 vs  57.25} for cram_read_container
Report 2 has more coverage { 53.84 vs  84.61} for bgzf_check_EOF_common
Report 2 has more coverage { 35.71 vs  47.61} for bgzf_close

## Reachability comparison
The reachability in the reports is similar
INFO:__main__:Ending fuzz introspector post-processing
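The coverage comparison above can be sketched roughly as follows. This is a minimal illustration, not the code in this PR: it assumes each report has already been reduced to a mapping from function name to covered percentage, and the names `diff_coverage` and `format_change` are hypothetical.

```python
from typing import Dict, List, Tuple

# Assumed input shape (not the real summary.json layout):
# function name -> covered percentage in the range 0.0-100.0.
Coverage = Dict[str, float]
Change = Tuple[str, float, float]


def diff_coverage(report1: Coverage,
                  report2: Coverage) -> Tuple[List[Change], List[Change]]:
    """Split functions present in both reports into decreased/increased."""
    decreased: List[Change] = []
    increased: List[Change] = []
    for name, pct1 in report1.items():
        if name not in report2:
            continue  # Only compare functions that appear in both reports.
        pct2 = report2[name]
        if pct2 < pct1:
            decreased.append((name, pct1, pct2))
        elif pct2 > pct1:
            increased.append((name, pct1, pct2))
    return decreased, increased


def format_change(change: Change, direction: str) -> str:
    """Render one line in a style similar to the output shown above."""
    name, pct1, pct2 = change
    return (f"Report 2 has {direction} coverage "
            f"{{{pct1:6} vs {pct2:6}}} for {name}")


if __name__ == "__main__":
    run1 = {"sam_hrecs_find_key": 60.0, "hseek": 30.55}
    run2 = {"sam_hrecs_find_key": 0.0, "hseek": 52.77}
    less, more = diff_coverage(run1, run2)
    for change in less:
        print(format_change(change, "less"))
    for change in more:
        print(format_change(change, "more"))
```

Functions whose coverage is unchanged, or that appear in only one report, are skipped in this sketch; whether to report them is a design choice.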

The following is the result of running libarchive twice: first with the following lines commented out, https://github.com/google/oss-fuzz/blob/a8cb9370f0dddf33111b1a7ce6d715633d5400df/projects/libarchive/libarchive_fuzzer.cc#L39-L73, and then with the complete fuzzer.

$ python3 ./main.py diff --report1 ../tmp4/summary1.json --report2 ../tmp4/summary2.json 
INFO:__main__:Running fuzz introspector post-processing
Report 2 has a larger Total complexity than report 1 - {report 1: 9763 / report 2: 9787})


## Code coverage comparison
...
...

## Reachability comparison
The following functions are only reachable in report 1:
- All functions reachable in report 1 are reachable in report 2

The following functions are only reachable in report 2:
archive_read_data
mbrtowc
get_current_oemcp
default_iconv_charset
nl_langinfo
get_current_codepage
archive_string_conversion_from_charset
archive_strncpy_l
free_sconv_object
archive_wstring_append_from_mbs
iconv_close
archive_strncat_l
utf16nbytes
mbsnbytes
get_current_charset
archive_mstring_get_mbs
archive_mstring_get_wcs
archive_mstring_get_utf8
archive_string_conversion_to_charset
archive_read_data_block
archive_read_next_header
archive_entry_digest
archive_entry_is_encrypted
archive_entry_is_metadata_encrypted
archive_entry_is_data_encrypted
archive_entry_uid
archive_entry_size
gnu_dev_makedev
archive_entry_pathname_w
archive_entry_pathname_utf8
archive_entry_pathname
archive_entry_mtime
archive_entry_gid
archive_entry_filetype
archive_entry_dev
archive_entry_ctime
archive_entry_birthtime
archive_entry_atime
INFO:__main__:Ending fuzz introspector post-processing
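The reachability comparison above amounts to two set differences. A minimal sketch, assuming each report can be reduced to a collection of reachable function names (the helper name `diff_reachability` is hypothetical and not taken from this PR):

```python
from typing import Iterable, List, Tuple


def diff_reachability(reach1: Iterable[str],
                      reach2: Iterable[str]) -> Tuple[List[str], List[str]]:
    """Return (only reachable in report 1, only reachable in report 2)."""
    set1, set2 = set(reach1), set(reach2)
    # Sorting keeps the output stable between runs.
    return sorted(set1 - set2), sorted(set2 - set1)


if __name__ == "__main__":
    # Illustrative inputs only; the names are borrowed from the output above.
    report1 = ["archive_read_new", "archive_read_free"]
    report2 = ["archive_read_new", "archive_read_free",
               "archive_read_data", "archive_entry_size"]
    only1, only2 = diff_reachability(report1, report2)
    print("Only reachable in report 1:", only1 or "none")
    print("Only reachable in report 2:", only2)
```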

DavidKorczynski · 2023-01-10T22:29:25Z

There is a lot that can be diffed, and we should enable some form of granularity selection.

I also think it would be good to start making the code more object-oriented, which will make things such as serializing and comparing data more intuitive at the code level.
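One way the two ideas above could fit together is sketched below. Everything here is an assumption for illustration: the `FunctionSummary` class, its fields, and the `sections` parameter do not exist in the code base as described in this PR.

```python
from dataclasses import asdict, dataclass
from typing import Dict, Iterable, List


@dataclass(frozen=True)
class FunctionSummary:
    """Hypothetical per-function record; field names are assumptions."""
    name: str
    coverage_pct: float
    reachable: bool

    def to_dict(self) -> dict:
        # Serialization comes almost for free with a dataclass.
        return asdict(self)

    @classmethod
    def from_dict(cls, data: dict) -> "FunctionSummary":
        return cls(**data)


SECTIONS = ("coverage", "reachability")


def diff_functions(old: Dict[str, FunctionSummary],
                   new: Dict[str, FunctionSummary],
                   sections: Iterable[str] = SECTIONS) -> List[str]:
    """Diff only the requested sections (the granularity selection)."""
    selected = set(sections)
    lines: List[str] = []
    for name in sorted(old.keys() & new.keys()):
        a, b = old[name], new[name]
        if "coverage" in selected and a.coverage_pct != b.coverage_pct:
            lines.append(
                f"{name}: coverage {a.coverage_pct} -> {b.coverage_pct}")
        if "reachability" in selected and a.reachable != b.reachable:
            lines.append(f"{name}: reachable {a.reachable} -> {b.reachable}")
    return lines
```

Because the dataclass generates `__eq__`, two summaries can also be compared directly with `==`, and a caller interested only in coverage could pass `sections=["coverage"]`.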

Ref: #734

Signed-off-by: David Korczynski <[email protected]>

DavidKorczynski · 2023-01-10T23:44:05Z

The ClusterFuzzLite issue is a false positive

Signed-off-by: David Korczynski <[email protected]>

Add initial layout for report diffing

08f3414

Ref: #734

Signed-off-by: David Korczynski <[email protected]>

DavidKorczynski marked this pull request as draft January 10, 2023 22:40

DavidKorczynski added 4 commits January 10, 2023 14:50

fix nits

58eda6a

Signed-off-by: David Korczynski <[email protected]>

Add reachability diff of all functions

7217b18

Signed-off-by: David Korczynski <[email protected]>

change name of function

bbd3102

Signed-off-by: David Korczynski <[email protected]>

fix soem typing

bb266c1

Signed-off-by: David Korczynski <[email protected]>

DavidKorczynski marked this pull request as ready for review January 10, 2023 23:39

DavidKorczynski requested review from Navidem and oliverchang January 10, 2023 23:40

nit

0df5729

Signed-off-by: David Korczynski <[email protected]>

AdamKorcz approved these changes Jan 11, 2023


AdamKorcz merged commit 272191f into main Jan 11, 2023

AdamKorcz deleted the 2023-26 branch January 11, 2023 10:15
