Ruxanne: A Study of Common Bug Fix Patterns in Rust

This is the supplementary material for Ruxanne, a bug fix pattern miner for Rust, as described in the paper "A Study of Common Bug Fix Patterns in Rust".

It contains the modules for parsing, code embedding, mining and clustering, as well as the final results to help others understand, replicate, and extend our work. Open an issue or submit a PR to help us improve Ruxanne.

Installation

  • Install cargo 1.64.0-nightly (a5e08c470 2022-06-23): https://doc.rust-lang.org/cargo/getting-started/installation.html
  • Install the Python packages in requirements.txt:
    git clone https://github.com/mohrobati/ruxanne.git
    cd ruxanne/
    pip install --upgrade pip
    pip install -r requirements.txt
    

    Versions in our environment:
    PyDriller==2.2
    dictdiffer==0.9.0
    ply==3.11
    numpy==1.23.5
    sklearn==0.0.post1
    pandas==1.5.2
    matplotlib==3.6.2
    scipy==1.9.3
    packcircles==0.14

Test

cd implementation/1-mining
python3 test.py

You should get this output after a few seconds:

...
--alacritty/alacritty/commit/90552e3e7f8f085919a39435a8a68b3a2f633e54 mined successfully--
======


Mining Finished!

Now you can check out the mined datapoints:

vim datapoints.csv

Or the logs:

cd __logs__
ls
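
The mined datapoints can also be inspected from Python instead of an editor. The following is a minimal sketch using only the standard library. It assumes that test.py writes datapoints.csv into the current directory (implementation/1-mining), as the commands above suggest. The helper name summarize_csv is hypothetical and not part of the repository, and no assumption is made about the column layout.

```python
import csv
from pathlib import Path


def summarize_csv(path):
    """Return (first_row, number_of_remaining_rows) for a CSV file."""
    with open(path, newline="") as handle:
        reader = csv.reader(handle)
        first_row = next(reader, [])        # empty list if the file is empty
        remaining = sum(1 for _ in reader)  # count the rows after the first
    return first_row, remaining


# Assumed location: the directory in which test.py was run.
csv_path = Path("datapoints.csv")
if csv_path.exists():
    first_row, remaining = summarize_csv(csv_path)
    print(f"{len(first_row)} columns in first row, {remaining} further rows")
else:
    print("datapoints.csv not found; run test.py first")
```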

Files/Directory Structure

  • implementation
    • 1-mining
      • parser: Includes the programs we used to parse Rust files (rust-parser) and transform them to JSON (syn_compiler). We wrote our own transformer from the Syn AST to JSON. We could have used syn_serde for this serialization, but we figured that our own transformer would let us store more information. The path extraction logic is also implemented here (ddiff.py).
      • embedding: Includes code regarding our embedding approach (RQ1 contains the environment we used to carry out the experiment for evaluating our embedding effectiveness)
      • scheme: Contains our weighting scheme (weight_adjustments.json) and how we computed it (weight.py).
      • circle_pack.py: Visualizing the essence of each change (read the paper for more info)
      • freq.py: Recording the number of occurrences of all the non-terminals from the 20 recent commits within the target repositories
      • miner.py: Mining all the target repositories and converting them to clusterable embeddings
      • test.py: A test mining run for a single commit; all the results are written to implementation/1-mining/__logs__
      • projects.txt: List of all target repositories
    • 2-clustering: All the scripts we used to run DBSCAN on our datasets.
      • borrow_only.csv.zip: After unzipping, the csv file contains Db as discussed in the paper
      • total.csv.zip: After unzipping, the csv file contains Dg as discussed in the paper
      • borrow_only.ipynb: Jupyter Notebook file used to cluster Db
      • total.ipynb: Jupyter Notebook file used to cluster Dg
      • dbscan_stats.py: Visualizing different clusterings with different parameters
    • 3-results: Contains all the information regarding the final clusters.
      • Overview.csv: A general overview of all the clusters
      • BC.ref.add.csv: Information about all the datapoints within the Adding Borrowing cluster (read the paper for more info about this cluster)
      • Similarly for the other clusters...
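
To illustrate the idea behind path extraction from two JSON ASTs (the role the parser entry assigns to ddiff.py), here is a minimal sketch using only the Python standard library. It is not the repository's implementation: the function diff_paths, the node names, and the two AST fragments are hypothetical and hand-written for illustration, and the real Syn-derived JSON will look different. requirements.txt lists dictdiffer, which offers a comparable diff over nested structures.

```python
from typing import Any, Iterator, Tuple

Path = Tuple[Any, ...]


def diff_paths(old: Any, new: Any, path: Path = ()) -> Iterator[Tuple[str, Path]]:
    """Yield (kind, path) for every difference between two JSON-like trees."""
    if isinstance(old, dict) and isinstance(new, dict):
        # JSON object keys are strings, so sorting them is safe.
        for key in sorted(old.keys() | new.keys()):
            if key not in new:
                yield ("remove", path + (key,))
            elif key not in old:
                yield ("add", path + (key,))
            else:
                yield from diff_paths(old[key], new[key], path + (key,))
    elif isinstance(old, list) and isinstance(new, list):
        # Lists are compared position by position.
        for i in range(max(len(old), len(new))):
            if i >= len(new):
                yield ("remove", path + (i,))
            elif i >= len(old):
                yield ("add", path + (i,))
            else:
                yield from diff_paths(old[i], new[i], path + (i,))
    elif old != new:
        yield ("change", path)


# Hypothetical AST fragments for a fix that turns `foo(x)` into `foo(&x)`.
before = {"call": {"func": "foo", "args": [{"path": "x"}]}}
after = {"call": {"func": "foo", "args": [{"reference": {"expr": {"path": "x"}}}]}}

changes = list(diff_paths(before, after))
for kind, p in changes:
    print(kind, "/".join(map(str, p)))
```

Each yielded path locates one edit in the tree, here a removed path node and an added reference node under the first call argument.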
