Support ignoreRows for TabularResource #344

roll · 2016-12-20T11:36:43Z

Overview

The Resource specification describes a concrete data source together with its metadata. Real-world data sources often have corner cases, such as commented rows or blank rows at the top of the file. A publisher needs a way to share this information with implementations.

Example

https://github.com/frictionlessdata/ADB-User-Study/blob/master/metadata.tsv

It's a valid resource (checked by goodtables) except for rows 2 and 3, which are comments and can't be removed because they carry metadata that is vital for this publisher's tools.

Proposal

Introduce an ignoreRows (or skipRows, or informationalRows, or ?) attribute for the TabularResource specification. This attribute MUST be an array of integers and strings, where:

an integer is the number of a row to ignore
a string is a prefix: any row whose first characters match it is ignored

Example

ignoreRows = [1, 2, "#","//"]
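As an illustration only, here is a minimal Python sketch of how an implementation might apply such a value while reading a file. The function name `filter_rows` is hypothetical, and two details are assumptions the proposal does not settle: row numbers are counted from 1, and string entries are matched against the start of the raw text line rather than the first parsed cell.

```python
from typing import Iterable, Iterator, List, Union


def filter_rows(lines: Iterable[str], ignore_rows: List[Union[int, str]]) -> Iterator[str]:
    """Yield only the lines that ignore_rows does not exclude.

    Assumptions (not fixed by the proposal): row numbers are 1-based,
    and string entries are prefixes of the raw text line.
    """
    # Split the mixed array into row numbers and string prefixes.
    numbers = {item for item in ignore_rows if isinstance(item, int)}
    prefixes = tuple(item for item in ignore_rows if isinstance(item, str))

    for number, line in enumerate(lines, start=1):
        if number in numbers:
            continue  # ignored by row number
        if prefixes and line.startswith(prefixes):
            continue  # ignored because it starts with one of the prefixes
        yield line


raw = ["title line", "# a comment", "id,name", "// another comment", "1,a"]
kept = list(filter_rows(raw, [1, 2, "#", "//"]))
```

With the example value above, `kept` contains only `"id,name"` and `"1,a"`: rows 1 and 2 are dropped by number, and the `//` line is dropped by prefix.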

References

initial discussion - Support skip rows in goodtables.yml goodtables.io#75


pwalsh · 2016-12-20T11:42:37Z

closely related to #326

rufuspollock · 2016-12-21T13:20:01Z

@roll I'm super cautious about this kind of stuff, as it is a place where "ETL" logic starts to bleed into the spec, and that's a slippery slope. If you delete rows, what about columns, what about transforms, etc.?

Thus, my sense is that ETL stuff like this should not go into the spec for now. At most it should be in patterns, and even there I'm cautious.

PS: I am willing to consider #326 because it is so common and it is about the presence of a header row.

roll · 2016-12-22T07:50:55Z

@rufuspollock
@pwalsh has said the same, but this is a very common real-world problem and it needs some help from the specs (maybe patterns?). It's a datapackage-only problem: at other levels implementations can use their own options, but a datapackage encapsulates all knowledge about its data sources (that's the point: data containerization), so we need some way to inject this information (cc @danfowler)

rufuspollock · 2017-02-05T10:45:40Z

AGREED with @pwalsh: this should go to "Best Practice" rather than spec for now.

roll changed the title ~~Support ignore rows for tabular resource~~ Support ignoreRows for tabular resource Dec 20, 2016

roll added the spec-tabular-dataresource label Dec 20, 2016

roll changed the title ~~Support ignoreRows for tabular resource~~ Support ignoreRows for TabularResource Dec 20, 2016

roll added the Status: Proposed-For-Version1 label Dec 20, 2016

pwalsh added this to the v1.0 milestone Feb 5, 2017

pwalsh removed the Status: Proposed-For-Version1 label Feb 5, 2017

rufuspollock added the Pattern / Best Practice label Feb 5, 2017

rufuspollock modified the milestones: Backlog, v1.0 Feb 5, 2017

rufuspollock mentioned this issue May 26, 2020

Multi-line header rows #681

Closed

roll added this to Open Knowledge Apr 14, 2023

roll removed this from the Backlog milestone Apr 14, 2023

roll added Table Dialect and removed Tabular Data Resource labels Jan 3, 2024

roll added this to the v2 milestone Jan 3, 2024

roll self-assigned this Feb 21, 2024

roll mentioned this issue Feb 22, 2024

Table Dialect specification frictionlessdata/datapackage-v2-draft#41

Merged

roll added the proposal label Feb 22, 2024

roll closed this as completed in frictionlessdata/datapackage-v2-draft#41 Apr 3, 2024

github-project-automation bot moved this to Done in Open Knowledge Apr 3, 2024

roll modified the milestones: v2.0-draft, v2.0 Sep 25, 2024
