This repository has been archived by the owner on Sep 27, 2023. It is now read-only.

Feature/assoc demo: New algorithm to perform association analysis using 2 way marginal information and supporting changes. #48

Closed
wants to merge 368 commits into from

Conversation

ananthr
Contributor

@ananthr ananthr commented Jul 27, 2015

  • analysis/R/decode2way.R: new algorithm and helper functions are here
  • analysis/tools/sum_bits_assoc.py: Python client that computes the marginal inputs required by the decode2way algorithms, given RAPPOR inputs from stdin
    Usage: sum_bits_assoc.py
    The corresponding test, analysis/tools/sum_bits_assoc_test.py, tests the Python file and documents the output format in detail
  • assoctest.sh: a test suite to run end-to-end simulations and compare EM and new algorithms.
    Usage: ./assoctest.sh run-seq '^a-' 5 T
    This runs all association tests 5 times in sequential processes, comparing both algorithms (T = true flag). Other options are documented in assoctest.sh
  • assoctest.sh requires parameters from tests/assoctest_spec.py and HTML plumbing in tests/make_assoc_summary.py and tests/assoctest.html
  • compare_assoc.R: the main R code that is used by assoctest
  • experimental/assoc contains some old files used in experimenting with the new assoc analysis code
  • quick_assoc.sh: a simple wrapper around R functions to run both new and old assoc algorithms on input map files, rappor reports, and params.
    Usage: ./quick_assoc.sh [<EM also? T/F>].
    Note: the directory is assumed to have a particular structure (see the documentation in quick_assoc.sh)
  • setup.sh: updated to include new jsonlite library
  • tests/gen_true_values_assoc{_test}.R: R files that respectively implement and test the generation of distributions (correlated Zipfians), analogous to tests/gen_true_values.R for RAPPOR histograms
  • tests/rappor_assoc_sim.py: tests/rappor_sim.py modified to process two variables at a time from a true values file for the purposes of assoctest.sh end-to-end simulations.
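
To make "2 way marginal information" concrete, here is a minimal sketch of one way to tally joint bit counts across a pair of reports. It is an illustration only: the function name `two_way_bit_counts`, the integer encoding of reports, and the matrix layout are all assumptions, and the actual output format of sum_bits_assoc.py (documented in sum_bits_assoc_test.py) may well differ, for example by splitting counts per cohort.

```python
from typing import Iterable, List, Tuple


def two_way_bit_counts(
    reports: Iterable[Tuple[int, int]], num_bits: int
) -> List[List[int]]:
    """Tally joint bit occurrences across paired reports (hypothetical helper).

    Each report pair holds one integer bit vector per variable.
    counts[i][j] is the number of pairs in which bit i is set in the
    first report and bit j is set in the second.
    """
    counts = [[0] * num_bits for _ in range(num_bits)]
    for first, second in reports:
        bits_first = [i for i in range(num_bits) if (first >> i) & 1]
        bits_second = [j for j in range(num_bits) if (second >> j) & 1]
        for i in bits_first:
            for j in bits_second:
                counts[i][j] += 1
    return counts
```

For instance, calling `two_way_bit_counts([(0b01, 0b10), (0b11, 0b10)], 2)` counts bit 0 of the first variable together with bit 1 of the second variable in both pairs.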

andychu and others added 30 commits March 5, 2015 18:47
positional and named arguments.

Add some logging.

Add parameter to shell function to remove actual strings from
candidates.
- Use Rscript in PATH when running tests
The metrics currently calculated are the l1 and l2 norms and a heuristic for
whether a false positive was detected (a spurious candidate string reported in
the RAPPOR estimate that doesn't exist in the real distribution).
ProcessAll in analyze.R now returns metrics.
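
The metrics described above can be sketched roughly as follows. This is not the code in analyze.R; the function name `compare_distributions` and the dictionary representation of the distributions are assumptions made for illustration.

```python
import math
from typing import Dict, List, Tuple


def compare_distributions(
    actual: Dict[str, float], estimated: Dict[str, float]
) -> Tuple[float, float, List[str]]:
    """Return (l1, l2, false_positives) for two string -> proportion maps.

    Hypothetical helper: a string missing from either map counts as 0.
    A false positive is a string with a positive estimate that is absent
    (or zero) in the actual distribution.
    """
    keys = set(actual) | set(estimated)
    diffs = [estimated.get(k, 0.0) - actual.get(k, 0.0) for k in keys]
    l1 = sum(abs(d) for d in diffs)
    l2 = math.sqrt(sum(d * d for d in diffs))
    false_positives = sorted(
        k for k, v in estimated.items() if v > 0 and actual.get(k, 0.0) == 0
    )
    return l1, l2, false_positives
```

With `actual = {"a": 0.5, "b": 0.5}` and `estimated = {"a": 0.5, "b": 0.25, "c": 0.25}`, the string `"c"` is reported as a false positive and the l1 distance is 0.5.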
The test parameters are defined in tests/regtest_spec.py.

Basic usage is:

$ ./regtest.sh run-all

This runs all tests in parallel, and results in an HTML table with
results.

- Calculate both false positives and false negatives in analyze.R.
  Refactor the function to be more symmetric.
- Refactor demo.sh a bit
- In gen_sim_input.py, get rid of hard-coded 7 values per client, and
  make it a parameter
- Change rappor_sim.py to use a -d <dist> flag, rather than separate
  flags
- Add test cases based on Chrome params
- Factor out util.sh script
- Rename to 'uniform' (probability 1/2) and 'f_mask' (probability f).
  Rewrite the comments.
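
For readers unfamiliar with the two names, the sketch below shows how a 'uniform' bit vector (each bit set with probability 1/2) and an 'f_mask' bit vector (each bit set with probability f) combine in RAPPOR's permanent randomized response: wherever f_mask is set, the true bit is replaced by the uniform bit. The helper names and the use of Python's `random` module are assumptions for illustration; this sketch draws fresh randomness on each call and does not reproduce how the real client derives or memoizes these bits.

```python
import random


def random_bits(num_bits: int, prob_one: float, rng: random.Random) -> int:
    """Return an integer whose low num_bits bits are each 1 with prob_one."""
    bits = 0
    for i in range(num_bits):
        if rng.random() < prob_one:
            bits |= 1 << i
    return bits


def randomize(bits: int, num_bits: int, f: float, rng: random.Random) -> int:
    """Replace each bit with a fair coin flip with probability f (sketch)."""
    uniform = random_bits(num_bits, 0.5, rng)  # 'uniform': probability 1/2
    f_mask = random_bits(num_bits, f, rng)     # 'f_mask': probability f
    # Keep the true bit where f_mask is 0; take the uniform bit where it is 1.
    return (bits & ~f_mask) | (uniform & f_mask)
```

With f = 0 the input is returned unchanged, and with f = 1 the output is entirely the uniform vector, which makes the two roles easy to check.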
andychu and others added 19 commits July 16, 2015 14:10
Make the Python simulation into a Python client library.
- resolved conflicts
- modified code to use new Encode interface
- modified rappor_assoc_sim.py to use same interface as rappor_sim.py
Also, some minor refactoring.
- uncommented experimental code in decode2way and documented it
- renamed function that processes assoc maps
- deleted params.csv
- inverted noise matrix outside loop
- renamed gen_assoc_reports
- added its test to test.sh
- make-summary now shows original dimensions for variables
- threw fitdistribution experimental code into separate function that is now
  only called by a flag passed to FitDistribution
- flag added to assoctest.sh to run comparisons to EM
- added package jsonlite to setup
- further documentation added in sum_bits_assoc
@ananthr ananthr assigned andychu and ilyamironov and unassigned andychu and ilyamironov Jul 27, 2015
5 participants