Add initial layout for report diffing #758

Merged
merged 6 commits into from
Jan 11, 2023


DavidKorczynski
Contributor

Ref: #734

Signed-off-by: David Korczynski [email protected]

@DavidKorczynski
Contributor Author

DavidKorczynski commented Jan 10, 2023

Example of the current output when diffing two htslib runs, one lasting 10 seconds and the other 300 seconds:

$ python3 ./main.py diff --report1 ../tmp2/summary1.json --report2 ../tmp2/summary2.json 
INFO:__main__:Running fuzz introspector post-processing
Report 2 has similar Total complexity to report 1 - {report 1: 16612 / report 2: 16612})

## Code coverge comparison
The following functions report 2 has decreased code coverage:
Report 2 has less coverage {  60.0 vs    0.0} for sam_hrecs_find_key
Report 2 has less coverage { 100.0 vs   80.0} for sam_hrecs_global_list_add
Report 2 has less coverage { 15.86 vs  14.42} for sam_hrecs_update_hashes

The following functions report 2 has increased code coverage:
Report 2 has more coverage { 68.29 vs  78.04} for kh_put_sam_hrecs_t
Report 2 has more coverage { 82.52 vs  90.29} for sam_hrecs_parse_lines
Report 2 has more coverage { 30.55 vs  52.77} for hseek
Report 2 has more coverage {   0.0 vs  100.0} for hgetc2
Report 2 has more coverage { 73.07 vs  81.86} for hts_detect_format2
Report 2 has more coverage {   0.0 vs  100.0} for decompress_peek_gz
Report 2 has more coverage {   0.0 vs   65.0} for parse_version
Report 2 has more coverage {   0.0 vs  68.29} for hts_resize_array_
Report 2 has more coverage { 47.05 vs  100.0} for parse_sam_flag
Report 2 has more coverage {   0.0 vs  78.57} for hts_str2uint
Report 2 has more coverage {   0.0 vs  100.0} for known_stderr
Report 2 has more coverage {  62.5 vs  100.0} for warn_if_known_stderr
Report 2 has more coverage { 63.93 vs  67.21} for fastq_parse1
Report 2 has more coverage {  95.0 vs  100.0} for sam_hdr_destroy
Report 2 has more coverage {   0.0 vs  45.45} for sam_read1_bam
Report 2 has more coverage {   0.0 vs  64.81} for bam_read1
Report 2 has more coverage {   0.0 vs  57.14} for fixup_missing_qname_nul
Report 2 has more coverage { 63.88 vs  72.22} for sam_read1
Report 2 has more coverage { 42.79 vs  44.65} for sam_hdr_create
Report 2 has more coverage { 73.46 vs  75.51} for sam_hdr_sanitise
Report 2 has more coverage { 65.11 vs  90.69} for bam_hdr_read
Report 2 has more coverage {   0.0 vs  100.0} for hgetc
Report 2 has more coverage {   0.0 vs  54.83} for kh_resize_vdict
Report 2 has more coverage {   0.0 vs   55.0} for bcf_hdr_set_idx
Report 2 has more coverage {   0.0 vs  100.0} for bcf_hrec_set_type
Report 2 has more coverage {   0.0 vs  30.76} for kh_get_vdict
Report 2 has more coverage {   0.0 vs  48.78} for kh_put_vdict
Report 2 has more coverage {   0.0 vs  48.33} for bcf_hdr_parse
Report 2 has more coverage { 90.47 vs  100.0} for bcf_hdr_destroy
Report 2 has more coverage {   0.0 vs  100.0} for bcf_hrec_destroy
Report 2 has more coverage {   0.0 vs   68.8} for bcf_hdr_parse_line
Report 2 has more coverage {   0.0 vs  82.85} for bcf_hdr_add_hrec
Report 2 has more coverage {   0.0 vs  17.77} for bcf_hrec_check
Report 2 has more coverage {   0.0 vs  24.26} for bcf_hdr_register_hrec
Report 2 has more coverage {   0.0 vs  84.21} for hrec_add_idx
Report 2 has more coverage {   0.0 vs  100.0} for bcf_hrec_add_key
Report 2 has more coverage {   0.0 vs  100.0} for is_escaped
Report 2 has more coverage {   0.0 vs  71.87} for bcf_hrec_set_val
Report 2 has more coverage {  44.0 vs   46.0} for bcf_hdr_read
Report 2 has more coverage {   0.0 vs  37.03} for vcf_hdr_read
Report 2 has more coverage {   0.0 vs  100.0} for uint7_decode_crc64
Report 2 has more coverage {   0.0 vs  100.0} for sint7_decode_crc32
Report 2 has more coverage {   0.0 vs  100.0} for uint7_decode_crc32
Report 2 has more coverage {   0.0 vs  54.05} for cram_init_varint
Report 2 has more coverage {   0.0 vs  62.31} for cram_init_tables
Report 2 has more coverage {   0.0 vs  100.0} for cram_free_file_def
Report 2 has more coverage {   0.0 vs  36.95} for cram_free_container
Report 2 has more coverage {  19.8 vs  27.72} for cram_dopen
Report 2 has more coverage { 30.43 vs  60.86} for cram_read_file_def
Report 2 has more coverage {   0.0 vs  12.12} for cram_read_SAM_hdr
Report 2 has more coverage {   0.0 vs  100.0} for int32_decode
Report 2 has more coverage {   0.0 vs  57.25} for cram_read_container
Report 2 has more coverage { 53.84 vs  84.61} for bgzf_check_EOF_common
Report 2 has more coverage { 35.71 vs  47.61} for bgzf_close

## Reachability comparison
The reachability in the reports is similar
INFO:__main__:Ending fuzz introspector post-processing
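The coverage comparison above can be sketched roughly as follows. This is a minimal illustration and not the code in this PR: `compare_coverage` and the two input dicts (function name mapped to coverage percentage) are hypothetical, since the layout of `summary.json` is not shown in this thread.

```python
from typing import Dict, List, Tuple

# One entry per function whose coverage changed: (name, report 1 %, report 2 %).
CoverageChange = Tuple[str, float, float]


def compare_coverage(
    cov1: Dict[str, float], cov2: Dict[str, float]
) -> Tuple[List[CoverageChange], List[CoverageChange]]:
    """Split functions present in both reports into (decreased, increased).

    cov1 and cov2 are assumed to map function names to coverage percentages.
    Functions missing from either report, or with unchanged coverage, are skipped.
    """
    decreased: List[CoverageChange] = []
    increased: List[CoverageChange] = []
    for name, before in cov1.items():
        if name not in cov2:
            continue
        after = cov2[name]
        if after < before:
            decreased.append((name, before, after))
        elif after > before:
            increased.append((name, before, after))
    return decreased, increased


def print_coverage_diff(cov1: Dict[str, float], cov2: Dict[str, float]) -> None:
    """Print one line per changed function, loosely following the output above."""
    decreased, increased = compare_coverage(cov1, cov2)
    for name, before, after in decreased:
        print(f"Report 2 has less coverage {{{before:6} vs {after:6}}} for {name}")
    for name, before, after in increased:
        print(f"Report 2 has more coverage {{{before:6} vs {after:6}}} for {name}")
```

Returning tuples rather than formatted strings keeps the comparison separate from the presentation, which would make it easier to filter or sort the result later.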

The following is the result of running libarchive twice: first with these lines commented out https://github.com/google/oss-fuzz/blob/a8cb9370f0dddf33111b1a7ce6d715633d5400df/projects/libarchive/libarchive_fuzzer.cc#L39-L73 and then with the complete fuzzer.

$ python3 ./main.py diff --report1 ../tmp4/summary1.json --report2 ../tmp4/summary2.json 
INFO:__main__:Running fuzz introspector post-processing
Report 2 has a larger Total complexity than report 1 - {report 1: 9763 / report 2: 9787})


## Code coverage comparison
...
...

## Reachability comparison
The following functions are only reachable in report 1:
- All functions reachable in report 1 are reachable in report 2

The following functions are only reachable in report 2:
archive_read_data
mbrtowc
get_current_oemcp
default_iconv_charset
nl_langinfo
get_current_codepage
archive_string_conversion_from_charset
archive_strncpy_l
free_sconv_object
archive_wstring_append_from_mbs
iconv_close
archive_strncat_l
utf16nbytes
mbsnbytes
get_current_charset
archive_mstring_get_mbs
archive_mstring_get_wcs
archive_mstring_get_utf8
archive_string_conversion_to_charset
archive_read_data_block
archive_read_next_header
archive_entry_digest
archive_entry_is_encrypted
archive_entry_is_metadata_encrypted
archive_entry_is_data_encrypted
archive_entry_uid
archive_entry_size
gnu_dev_makedev
archive_entry_pathname_w
archive_entry_pathname_utf8
archive_entry_pathname
archive_entry_mtime
archive_entry_gid
archive_entry_filetype
archive_entry_dev
archive_entry_ctime
archive_entry_birthtime
archive_entry_atime
INFO:__main__:Ending fuzz introspector post-processing
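The reachability comparison amounts to a set difference in each direction. A minimal sketch, assuming each report can be reduced to a collection of reachable function names (`compare_reachability` is a hypothetical helper, not a function from this PR):

```python
from typing import Iterable, List, Tuple


def compare_reachability(
    reach1: Iterable[str], reach2: Iterable[str]
) -> Tuple[List[str], List[str]]:
    """Return (only reachable in report 1, only reachable in report 2).

    Results are sorted here for deterministic output; the output shown
    above is not sorted.
    """
    set1, set2 = set(reach1), set(reach2)
    return sorted(set1 - set2), sorted(set2 - set1)
```

An empty first list corresponds to the "All functions reachable in report 1 are reachable in report 2" case shown above.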

@DavidKorczynski
Contributor Author

There is a lot that can be diffed and we should enable some form of granularity selection.

I also think it would be good to start making the code more object-oriented, which will make things such as serializing and comparing data more intuitive at the code level.
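As a rough illustration of both points, here is one possible shape for an object-oriented report type with a selectable diff granularity. Everything in it is an assumption made for the sketch: the class names, field names, JSON layout and the `sections` parameter are hypothetical and do not reflect the existing code base.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List, Sequence


@dataclass
class FunctionProfile:
    """Hypothetical per-function record."""
    name: str
    coverage_percent: float
    reachable: bool


@dataclass
class ReportSummary:
    """Hypothetical report object that can serialize and diff itself."""
    total_complexity: int
    functions: Dict[str, FunctionProfile] = field(default_factory=dict)

    def to_json(self) -> str:
        # asdict() recurses into the nested dataclasses.
        return json.dumps(asdict(self))

    @classmethod
    def from_json(cls, raw: str) -> "ReportSummary":
        data = json.loads(raw)
        funcs = {n: FunctionProfile(**f) for n, f in data["functions"].items()}
        return cls(total_complexity=data["total_complexity"], functions=funcs)

    def diff(
        self,
        other: "ReportSummary",
        sections: Sequence[str] = ("complexity", "coverage", "reachability"),
    ) -> List[str]:
        """Compare against another report, restricted to the chosen sections."""
        lines: List[str] = []
        if "complexity" in sections and self.total_complexity != other.total_complexity:
            lines.append(f"complexity: {self.total_complexity} -> {other.total_complexity}")
        if "coverage" in sections:
            for name, f1 in self.functions.items():
                f2 = other.functions.get(name)
                if f2 is not None and f1.coverage_percent != f2.coverage_percent:
                    lines.append(
                        f"coverage: {name} {f1.coverage_percent} -> {f2.coverage_percent}"
                    )
        if "reachability" in sections:
            r1 = {n for n, f in self.functions.items() if f.reachable}
            r2 = {n for n, f in other.functions.items() if f.reachable}
            lines.extend(f"reachability: {n} only in report 1" for n in sorted(r1 - r2))
            lines.extend(f"reachability: {n} only in report 2" for n in sorted(r2 - r1))
        return lines
```

With something like this, a caller interested only in reachability could pass `sections=("reachability",)` and skip the long per-function coverage listing.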

Ref: #734

Signed-off-by: David Korczynski <[email protected]>
@DavidKorczynski DavidKorczynski marked this pull request as draft January 10, 2023 22:40
Signed-off-by: David Korczynski <[email protected]>
Signed-off-by: David Korczynski <[email protected]>
Signed-off-by: David Korczynski <[email protected]>
@DavidKorczynski DavidKorczynski marked this pull request as ready for review January 10, 2023 23:39
@DavidKorczynski
Contributor Author

The ClusterFuzzLite issue is a false positive.

Signed-off-by: David Korczynski <[email protected]>
@AdamKorcz AdamKorcz merged commit 272191f into main Jan 11, 2023
@AdamKorcz AdamKorcz deleted the 2023-26 branch January 11, 2023 10:15