Use toml_edit's to_string_in_original_order() #312

alsuren · 2019-06-15T17:32:00Z

uses toml_edit from this PR: toml-rs/toml#63

solves #218 pretty convincingly

previously it was:

1457 (2%) also did something else (mostly moving sections of the file around, but ~10 also made whitespace changes, like removing spaces from section headers)

Now it is:

13 (0.02%) also did something else (mostly removing whitespace from section headers but a handful also fixed unix/windows line endings that I failed to clean up properly before starting).

63: Avoid reordering tables r=ordian a=alsuren Adds Document::to_string_in_original_order() This is intended to help with killercup/cargo-edit#218 We save .position = Some(i) on each Table, starting with i = 0 for the root of the Document, and then incrementing each time we see a table header. When serialising, ~we start looking for a Table with position 0, which we should find at the root of the document. We then continue scanning, looking for position 1 and so on. If the document doesn't need re-ordering then we will on need to do a single scan.~ we start by doing a walk over the tree and working out which positions exist (storing them in a Vec and then sorting them). We then do subsequent walks to find each of the known positions in turn. Each walk of the tree is guaranteed to yield at least one table, but will probably yield more. If the document doesn't need re-ordering then we will on need to do a single scan. If a Table has been added using the API rather than parsed then it will have position = None. We only want to serialise each Table once, so we only print tables with position = None if the table we visited immediately before it had the right position. If we didn't manage to serialise all Tables after a single loop then we keep looping until we are done. ~There is code to make sure we don't get stuck when someone causes a gap in the numbers by deleting a Table.~ It is impossible for the loop to get stuck, because it only searches for position values that it already knows are in the tree. Known issues: - [x] ~If you have created your Document by parsing two .toml files and merging the results together, this method may silently skip some tables. Please use Document.to_string() instead, which doesn't have this problem.~ fixed - [ ] The best case performance of this function is a bit worse than Document.to_string(). It will be significantly worse if you have lots of tables that are in strange orders. Also adds `pretty_assertions` in some tests and color-backtrace (as separate commits) because I found debugging test failures more tricky without them. Things to do before merging: - [x] make a version of cargo-edit which uses this patch, and make sure that it actually works on my test samples at https://github.com/alsuren/rust-repos/tree/cargo.toml-samples - see killercup/cargo-edit#312 - went from 1457 funky files down to 13 (and I think that some of them are because I didn't clean up the unix/windows line endings correctly) - [x] Write a few more integration tests, describing interesting edge cases from the repo. For example: - [x] An example with trailing comments after the end of the document. - [x] An example where `[package.metadata.deb]` is at the bottom of the file and it stays there. * The table_reordering test already does this. - [x] A test with the following document, where you add `[dependencies.somethingelse]` and it is added after `[dependencies.opencl]` rather than before `[[example]]`. ``` [package] [dependencies] [[example]] [dependencies.opencl] [dev-dependencies] ``` - [x] Is the API okay? Display.to_string() seems to swallow the error case (possibly by panicking). Should we do the same? (answer: yes) - [x] (edit: added a commit to use this new algorithm) Is it enough to document the issue where it can silently skip tables that are parsed from multiple files, or should I check for it? If I want to fix it properly then the best way to do it is to: * walk the tree once, gathering all of the known `position` numbers unto a vec * Sort the positions vec * Walk the tree and pop off items from the positions vec as I go (rather than incrementing an integer) * Stop looping when the vec is empty. * This is probably simpler than my current implementation, but forces me to allocate another vec, and then do at least two passes of the tree, even if the tables are in the correct order already. - [x] update the limitations section of the readme to explain that you should use Document::to_string_in_original_order() if you care about not forcing nested tables to be grouped next to their parents. - [ ] Once this PR is merged, make another PR to point `https://github.com/ordian/toml_edit/blob/f09bd5d075fdb7d2ef8d9bb3270a34506c276753/tests/test_valid.rs#L84` at the updated test. Co-authored-by: David Laban <[email protected]>

Cargo.toml

alsuren · 2019-07-20T08:55:27Z

I have rebased on top of master, and double-checked that it transforms my rust-repos dataset in the same way as the original PoC (everything is identical). I think it's ready for merging as-is:

cargo add and friends probably want to default to using to_string_in_original_order() (if a tool mangles your file on your first attempt to use it then you're more likely to drop it on the floor than go hunting in the docs for a --dont-mangle-my-shit flag).

I thought about adding a --group-nested-toml-sections flag to restore the old behaviour, but I think that this would add more complexity than it's worth. Anyone who relies on this behaviour should probably be using a dedicated tidying tool like toml-fmt instead.

ordian · 2019-07-20T11:26:41Z

Awesome, thank you!

bors r+

312: Use toml_edit's to_string_in_original_order() r=ordian a=alsuren uses toml_edit from this PR: toml-rs/toml#63 solves #218 pretty convincingly previously it was: > 1457 (2%) also did something else (mostly moving sections of the file around, but ~10 also made whitespace changes, like removing spaces from section headers) Now it is: > 13 (0.02%) also did something else (mostly removing whitespace from section headers but a handful also fixed unix/windows line endings that I failed to clean up properly before starting). Co-authored-by: David Laban <[email protected]>

bors · 2019-07-20T11:48:52Z

Build succeeded

alsuren mentioned this pull request Jun 15, 2019

Avoid reordering tables toml-rs/toml#63

Merged

11 tasks

alsuren changed the title ~~PoC to use toml_edit's to_string_in_original_order()~~ WIP: PoC to use toml_edit's to_string_in_original_order() Jun 15, 2019

ordian reviewed Jul 11, 2019

View reviewed changes

Cargo.toml Outdated Show resolved Hide resolved

alsuren force-pushed the toml_edit-with-to_string_in_original_order branch from c091a9f to 0b89f82 Compare July 19, 2019 21:17

Use toml_edit's to_string_in_original_order()

e25f246

alsuren force-pushed the toml_edit-with-to_string_in_original_order branch from 0b89f82 to e25f246 Compare July 19, 2019 21:22

alsuren changed the title ~~WIP: PoC to use toml_edit's to_string_in_original_order()~~ Use toml_edit's to_string_in_original_order() Jul 19, 2019

bors bot merged commit e25f246 into killercup:master Jul 20, 2019

This was referenced Oct 27, 2019

Some cargo.toml gets scrambled when runing cargo upgrade #218

Closed

cargo add: Nicer TOML Files #15

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Use toml_edit's to_string_in_original_order() #312

Use toml_edit's to_string_in_original_order() #312

alsuren commented Jun 15, 2019 •

edited

Loading

alsuren commented Jul 20, 2019

ordian commented Jul 20, 2019

bors bot commented Jul 20, 2019

Use toml_edit's to_string_in_original_order() #312

Use toml_edit's to_string_in_original_order() #312

Conversation

alsuren commented Jun 15, 2019 • edited Loading

alsuren commented Jul 20, 2019

ordian commented Jul 20, 2019

bors bot commented Jul 20, 2019

Build succeeded

alsuren commented Jun 15, 2019 •

edited

Loading