Description

CleanData Updates

Previously, when reading .csv data files, all NA-type strings were automatically removed. As a result, columns of object datatype could be converted to float, which might conflict with the user's definition in the data dictionary. This is still the default behaviour, but a new option now allows such values to be loaded as they are, unless a data field/column is explicitly defined as numeric in the dictionary.
The Generate Data Report feature now includes additional fields. Readers can now identify fields that are out-of-range (numerical types) or not-defined (categorical types), based on what is defined in the data dictionary.
Previously, the attribute var_list was not available when discrepancies were detected between the data fields listed in the dictionary and those in the data files. It is now available, and defaults to the fields found in the data file.
Previously, the CleanData module generated the required output data folders only after it had tried to record its actions in a log file inside the not-yet-existing output data folder. It now ensures that the data folders exist first.
An additional option is now available to modify the dataframe "index" by concatenating existing "Index"-type data fields defined in the data dictionary, so that rows are uniquely identified when the existing "Index"-type columns do not already do so. This is useful when generating reports and pinpointing the exact rows that are problematic. To activate this option, set CREATE_UNIQUE_INDEX to True in definitions.py. Related settings include UNIQUE_INDEX_COMPOSITION_LIST and UNIQUE_INDEX_DELIMITER.
Previously, if OUTPUT_TYPE_DATA was set to xlsx in the definitions.py file, converting_ascii crashed when the data contained <NA> type values. This is now fixed: ASCII conversion is skipped for <NA> type entries.
A new function, add_dictionary_row, is now available to add entries to the data dictionary. This is useful when creating secondary variables, as it keeps the data dictionary in sync with the newly created variables.
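
Two of the changes above, loading NA-like strings as they are and building a unique index from existing index columns, can be illustrated with plain pandas. This is only a sketch of the general behaviour, not bdarpack's implementation: the sample data, the `build_unique_index` helper, and its parameters are hypothetical, and the package's own option name for NA handling is not shown here.

```python
import io

import pandas as pd

# Hypothetical sample file: "NA" and "n/a" are NA-like strings.
CSV_TEXT = "id,visit,score\nA,1,NA\nA,2,3.5\nB,1,n/a\n"

# Default pandas behaviour: NA-like strings become NaN, so "score" turns into float.
default_df = pd.read_csv(io.StringIO(CSV_TEXT))

# Disabling the default NA markers keeps the values exactly as written in the file.
raw_df = pd.read_csv(io.StringIO(CSV_TEXT), keep_default_na=False, na_values=[])


def build_unique_index(df, columns, delimiter="_"):
    """Return a copy of df indexed by the delimiter-joined values of `columns`.

    Hypothetical helper, loosely mirroring the idea behind
    UNIQUE_INDEX_COMPOSITION_LIST and UNIQUE_INDEX_DELIMITER.
    """
    keys = df[columns].astype(str).apply(lambda row: delimiter.join(row), axis=1)
    out = df.copy()
    out.index = keys
    return out


indexed = build_unique_index(raw_df, ["id", "visit"], delimiter="_")
print(default_df["score"].dtype)
print(raw_df["score"].tolist())
print(list(indexed.index))
```

In the default load the `score` column becomes float because "NA" and "n/a" are treated as missing; with the default markers disabled the column keeps its original strings. Neither `id` nor `visit` identifies a row on its own, but their concatenation does.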

TabulaCopula Updates

Bug fix for data paths on non-Windows systems.

Constraints Updates

Updated functions "multiparent_conditions" and "evaluate_df_column" with new options. They can now create secondary columns whose names carry appended suffixes, instead of replacing the original variables. They also generate more comprehensive logs of the rows that have been replaced.
Updated function "convertBlankstoValue" to also convert empty strings, in addition to null values.
New function "find_mismatch" to find mismatches between any two columns in a dataframe.
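
The blank-conversion and mismatch checks described above can be sketched in plain pandas. The helper names, signatures, and sample columns below are hypothetical and are not the signatures of "convertBlankstoValue" or "find_mismatch"; the sketch only illustrates the kind of logic involved, assuming that two null values count as a match.

```python
import pandas as pd


def convert_blanks_to_value(series, value):
    """Replace null entries and empty strings with `value` (hypothetical helper)."""
    is_blank = series.isna() | series.eq("")
    return series.mask(is_blank, value)


def find_mismatched_rows(df, col_a, col_b):
    """Return the rows where two columns disagree (hypothetical helper).

    Rows where both values are null are treated as matching.
    """
    both_null = df[col_a].isna() & df[col_b].isna()
    differs = df[col_a].ne(df[col_b]) & ~both_null
    return df.loc[differs]


# Hypothetical sample data.
df = pd.DataFrame(
    {
        "sex_reported": ["M", "F", "", None],
        "sex_recorded": ["M", "M", "F", None],
    }
)

cleaned = convert_blanks_to_value(df["sex_reported"], "unknown")
mismatches = find_mismatched_rows(df, "sex_reported", "sex_recorded")
print(cleaned.tolist())
print(list(mismatches.index))
```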

Utils Updates

New function extract_year_month_day is available to extract the year, month, and day from a given string-type date using a specified format.
Minor bug fixes in "mapping_dictDateFormatConversion".
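
As a rough illustration of extracting the year, month, and day from a string-type date with a specified format, the standard library alone is enough. The `split_date` helper and its default format are hypothetical and do not reflect the signature of extract_year_month_day.

```python
from datetime import datetime


def split_date(date_string, date_format="%d/%m/%Y"):
    """Parse `date_string` with `date_format` and return (year, month, day)."""
    parsed = datetime.strptime(date_string, date_format)
    return parsed.year, parsed.month, parsed.day


print(split_date("31/12/2021"))
print(split_date("2021-03-05", "%Y-%m-%d"))
```

A string that does not match the given format raises a ValueError, which makes format mistakes visible early.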

VisualPlot Updates

Added "bins" option to histogram plots.

Package URL

https://pypi.org/project/bdarpack/0.1.5/

Description

Utilities update

New function gen_interpolation for creating new data points via interpolation.

New function conversionFromTIMSTxtToCSV for reading oddly delimited .txt files and converting them to .csv format.
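
For readers unfamiliar with the idea of creating new data points via interpolation, here is a minimal sketch using NumPy's linear interpolation. It assumes linear interpolation over made-up sample values; the method and signature actually used by gen_interpolation are not described here.

```python
import numpy as np

# Hypothetical known data points.
x_known = np.array([0.0, 1.0, 2.0])
y_known = np.array([0.0, 10.0, 40.0])

# New x positions at which to create data points.
x_new = np.array([0.5, 1.5])

# Linear interpolation between neighbouring known points.
y_new = np.interp(x_new, x_known, y_known)
print(y_new)
```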

CleanData bug fixes

gen_data_report no longer ignores TYPE categories in the data dictionary when they have trailing spaces.

gen_data_report now accepts a wider variety of TYPE categories in the data dictionary, in addition to the standard numeric, string, date, and bool.

CleanData now allows users to define the sheet name for Excel outputs, using the RAWDICTXLSX_SHEETNAME attribute in definitions.
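
The trailing-space issue with TYPE categories can be reproduced and worked around in plain pandas. The sketch below is not how gen_data_report handles it internally; the dictionary layout and the NAME column are hypothetical, and only the TYPE column name comes from the notes above.

```python
import pandas as pd

# Hypothetical data dictionary with stray whitespace in the TYPE column.
dictionary = pd.DataFrame(
    {
        "NAME": ["age", "sex", "dob"],
        "TYPE": ["numeric ", " string", "date"],
    }
)

# Strip surrounding whitespace so "numeric " is recognised as "numeric".
dictionary["TYPE"] = dictionary["TYPE"].str.strip()
print(dictionary["TYPE"].tolist())
```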

Releases: BiomedDAR/copula-tabular

v0.1.6

Description

Constraints update

CleanData bug fixes

Package URL

v0.1.5

Description

CleanData Updates

TabulaCopula Updates

Constraints Updates

Utils Updates

VisualPlot Updates

Package URL

v0.1.4

Description

Utilities update

CleanData bug fixes

Package URL

v0.1.3

First release to PyPI

Description

Package URL