Releases: storebrand/tap-sharepointsites
Unstructured files
This release adds the "text_files" stream type, which uses textract to read the contents of almost any type of file. It is intended for RAG-type ingest pipelines, typically together with the Meltano mapper map-gpt-embeddings.
Better column name handling for files
This release changes how columns (properties) are named when syncing files, and introduces a clean_colnames config.
Breaking change
The previous automatic column renaming logic had a bug that inadvertently removed underscores. This release introduces a clean_colnames config for files. It defaults to false, which leaves column names unaltered; when set to true, column names are converted to snake_case (on a best-effort basis). The metadata column "LastModifiedDate" is not converted to snake_case.
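To make the renaming behaviour concrete, here is a minimal sketch of a best-effort snake_case conversion that keeps existing underscores and leaves "LastModifiedDate" alone. It is not the tap's actual implementation: the function names to_snake_case and rename_columns, the PRESERVED set, and the exact regular expressions are assumptions made for illustration.

```python
import re

# Assumption: only the metadata column named in the release notes is exempt.
PRESERVED = {"LastModifiedDate"}


def to_snake_case(name: str) -> str:
    """Best-effort snake_case conversion (hypothetical helper)."""
    # Insert an underscore at camelCase boundaries: "customerName" -> "customer_Name".
    s = re.sub(r"([a-z0-9])([A-Z])", r"\1_\2", name.strip())
    # Insert an underscore after an acronym: "HTTPResponse" -> "HTTP_Response".
    s = re.sub(r"([A-Z]+)([A-Z][a-z])", r"\1_\2", s)
    # Replace each run of non-alphanumeric characters (spaces, dashes,
    # dots, repeated underscores) with a single underscore.
    s = re.sub(r"[^0-9a-zA-Z]+", "_", s)
    return s.strip("_").lower()


def rename_columns(columns, clean_colnames=False):
    """Return column names, converted only when clean_colnames is true."""
    if not clean_colnames:
        # Default behaviour described above: names are left unaltered.
        return list(columns)
    return [c if c in PRESERVED else to_snake_case(c) for c in columns]
```

With these assumptions, rename_columns(["First Name", "order_id", "LastModifiedDate"], clean_colnames=True) keeps "order_id" and "LastModifiedDate" as they are and turns "First Name" into "first_name"; with the default clean_colnames=False, all three names pass through unchanged.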
Sharepoint files
This release adds functionality to read CSV and Excel files from SharePoint sites.
Initial
Just a fork in the road before we add new capabilities (and possibly introduce errors) in the tap.