Skip to content

Run-length encoding for data analysis in Python

License

Notifications You must be signed in to change notification settings

tnwei/python-rle

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

12 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

python-rle

Run-length encoding (wikipedia link) for data analysis in Python. Install with pip install python-rle or conda install -c tnwei python-rle. No dependencies required other than tqdm for visualizing a progress bar.

Usage

Encode any iterable (tuples, lists, pd.Series etc.) with rle.encode.

# rle.encode(iterable) returns (values, counts)
>>> import rle 
>>> rle.encode((10, 10, 10, 20, 20, 20, 30, 30, 30))
([10, 20, 30], [3, 3, 3])

Decode (values, counts) back into a sequence with rle.decode.

>>> rle.decode([10, 20, 30], [3, 3, 3])
[10, 10, 10, 20, 20, 20, 30, 30, 30]

Set progress_bar == True for long sequences :

progress_bar_anim

Motivation

Base R contains a simple rle function that "computes the lengths and values of runs of equal values in a vector", as described by its docstring. I found it useful for calculating streaks in collected data, and is especially wonderful for compiling and summarizing categorical data that describes status over time. Hence this little utility.

About

Run-length encoding for data analysis in Python

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Packages

No packages published

Languages