All notable changes to this project will be documented in this file.
The format is based on Keep a Changelog, and this project adheres to Semantic Versioning.
0.3.2 - 2024-10-23
- SQL UDF days_since_epoch to parse a string representing a date to the number of days since 1970-01-01 #39.
- Custom Clickhouse ColumnExpression with additional transform parse_date_to_int to parse string to days since epoch #39.
- Custom date comparison and comparison levels working with integer type representing days since epoch #39.
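
For illustration only, the value that the entries above describe (a date string converted to an integer count of days since 1970-01-01, then compared as an integer) can be sketched in plain Python. This is not the package's SQL implementation: the Python helper below merely reuses the name days_since_epoch, dates_within_days is a hypothetical name, and ISO-formatted input (YYYY-MM-DD) is an assumption.

```python
from datetime import date

# Reference date from the changelog entry: 1970-01-01.
EPOCH = date(1970, 1, 1)


def days_since_epoch(date_string: str) -> int:
    """Illustrative stand-in: parse an ISO date string (assumed format
    YYYY-MM-DD) and return the number of days since 1970-01-01."""
    return (date.fromisoformat(date_string) - EPOCH).days


def dates_within_days(left: str, right: str, threshold: int) -> bool:
    """Hypothetical comparison on the integer representation: True when
    the two dates are at most `threshold` days apart."""
    return abs(days_since_epoch(left) - days_since_epoch(right)) <= threshold
```

Dates before the epoch come out as negative integers in this sketch, so the absolute difference still behaves as expected for a threshold comparison.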
0.3.1 - 2024-10-14
- ClickhouseAPI now has a function .set_union_default_mode() to allow manually setting client state necessary for clustering, if session has timed out e.g. when running interactively #36.
- Added support for Splink 4.0.4 #37.
- estimate_probability_two_random_records_match now works correctly when debug_mode is switched on #34.
0.3.0 - 2024-09-26
- chdb is now an optional dependency, requiring opt-in installation for use of ChDBAPI #28.
0.2.5 - 2024-09-23
- Added support for Splink >= 4.0.2, dropped support for 4.0.0, 4.0.1 #26.
0.2.4 - 2024-09-19
- Extended ClickhouseAPI pandas table registration to support float columns #24.
- Added Clickhouse-specific library comparisons/levels - cll_ch.DistanceInKMLevel, cl_ch.DistanceInKMAtThresholds, and cl_ch.ExactMatchAtSubstringSizes #24.
0.2.3 - 2024-09-16
0.2.2 - 2024-09-12
- ClickhouseAPI now allows for registering tables directly from pandas DataFrames, if they contain only integer and string columns #18.
- Create an alias for rand, random so that Linker.visualisations.comparison_viewer_dashboard runs without error #14.
- Workaround for Clickhouse count(*) filter ... parsing issue so that linker.clustering.compute_graph_metrics(...) now runs #18.
0.2.1 - 2024-09-12
- Updated numpy dependency requirements to allow compatible versions for all supported Python versions #9.
0.2.0 - 2024-09-11
- ClickhouseAPI and dataframe added to support running calculations in a Clickhouse instance #4.
0.1.1 - 2024-09-10
- Fix random_sample_sql so that u-training works when we don't sample the entire dataset #1.
- try_parse_date and try_parse_timestamp now use DateTime64 to extend the range to more useful values, and no longer support custom format strings #2.
0.1.0 - 2024-09-09
- Basic working version of package with API for chdb.