Privacy

Doing more of the data processing locally makes it possible to store or transmit privacy-sensitive data less often.

Ref

  • Scalable Machine Learning with Fully Anonymized Data. Uses feature hashing on the client/sensor side, before sending data to the server that performs training. The hashing trick is an established way of processing data as part of training a machine learning model. The typical motivation for using the technique is a reduction in memory requirements or the ability to perform stateless feature extraction. While feature hashing is ideally suited to categorical features, it also empirically works well on continuous features.
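
A minimal sketch of what client-side feature hashing could look like. This is not the method from the referenced work: the function name `hash_features`, the `n_buckets` and `salt` parameters, and the choice of SHA-256 are all assumptions made for illustration.

```python
import hashlib


def hash_features(features, n_buckets=16, salt=b""):
    """Map a dict of feature name -> value to a fixed-length vector.

    Hypothetical helper: string values are treated as categorical
    (hashed as "name=value" with weight 1), numeric values are treated
    as continuous (hashed by name, weighted by the value).
    """
    vec = [0.0] * n_buckets
    for name, value in features.items():
        if isinstance(value, str):
            key, weight = f"{name}={value}", 1.0
        else:
            key, weight = name, float(value)
        digest = hashlib.sha256(salt + key.encode("utf-8")).digest()
        h = int.from_bytes(digest[:8], "big")
        index = h % n_buckets                    # bucket the feature lands in
        sign = 1.0 if (h >> 63) == 0 else -1.0   # signed hashing; colliding features may partly cancel
        vec[index] += sign * weight
    return vec


# Stateless: no vocabulary is stored, the same input always gives the same vector.
vector = hash_features({"device": "sensor-a", "temperature": 21.5}, n_buckets=8)
```

In this sketch only the fixed-length `vector` would be sent to the server; the raw feature names and values stay on the client.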

Ideas

  • In audio processing, could we use a speech detection algorithm to avoid storing samples that contain speech? The remaining data could then be stored/transmitted for quality assurance and/or further data analysis.
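
The idea above could be sketched as a filter placed in front of storage. `drop_speech` and `contains_speech` are hypothetical names, and the detector is passed in as a function because no particular speech detection algorithm is specified here. The `energy_placeholder` below is only a stand-in so that the example runs; it thresholds signal energy and does not actually detect speech.

```python
def drop_speech(clips, contains_speech):
    """Return only the clips that the detector does not flag as speech."""
    return [clip for clip in clips if not contains_speech(clip)]


def energy_placeholder(clip, threshold=0.1):
    """Placeholder detector (assumption): flags clips with high mean energy.

    A real deployment would substitute an actual speech detection model.
    """
    if not clip:
        return False
    energy = sum(s * s for s in clip) / len(clip)
    return energy > threshold


clips = [
    [0.0, 0.01, -0.02],  # quiet clip, kept by the placeholder
    [0.9, -0.8, 0.7],    # loud clip, flagged and dropped
]
kept = drop_speech(clips, energy_placeholder)
```

Only `kept` would be stored or transmitted; flagged clips are discarded on the device.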