Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

v2 dcd format and tests #1295

Closed
wants to merge 13 commits into from
Closed

v2 dcd format and tests #1295

wants to merge 13 commits into from

Conversation

m-mcgowan
Copy link
Contributor

@m-mcgowan m-mcgowan commented Apr 10, 2017

Do not merge as-is, will require a rebase

Problem

Hypothesised write errors in flash that appear only after power cycle can corrupt the configuration data.
Flash access is not thread-safe.

Solution

Adds CRC checking to the Electron DCD implementation so that write errors are detected.
Add a critical section around flash operations and around DCD operations.

Steps to Test

Run the firmware unit tests.

Manual testing:

  • flash system firmware to an electron that was connecting to the cloud
  • verify that the device continues to connect to the cloud
  • make a change in DCD via DFU
  • verify the device still connects to the cloud (the DCD contents will have been upgraded to the V2 format.)

Example App

This is an internal system change. No example application.


Completeness

  • User is totes amazing for contributing!
  • Contributor has signed CLA (Info here)
  • Problem and Solution clearly stated
  • Run unit/integration/application tests on device
  • Added documentation
  • Added to CHANGELOG.md after merging (add links to docs and issues)

@m-mcgowan m-mcgowan requested a review from sergeuz April 11, 2017 17:49
@@ -1,37 +1,37 @@
const unsigned char bootloader_platform_10_bin[] = {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If a new bootloader binary is bundled here, it should probably have a bumped module version, otherwise it is not automatically upgraded, which causes a device to enter an endless loop, as well as DCT corruption.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Totes, I probably tested against an older version so got the upgrade anyway. Although I did also see the endless loop when developing this initially. That's a bit worrying - anything we can to do help folks out of that if things do go wrong?

@technobly technobly added this to the 0.7.0 milestone May 5, 2017
*/
Sector _currentSector(uint8_t count0, uint8_t count1, Sector sector0, Sector sector1)
{
if (((count0+1) & 3)==count1) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I can't think of a reason why the distance between counter/sequence numbers might be > 1, but having run into multiple issues with dct and flash operations in general already it will most likely happen than not. Having that in mind these checks look a bit unsafe.

I propose that 0 is treated as an initial value (only assigned when dcd is initialized for the first time) and a valid range is clamped to 1 - 3, with 3 wrapping to 1 instead of 0.

In that case the check could be implemented like this:

static const int dcd_counter_bits = 2;
static const int dcd_max_counter = (1 << dcd_counter_bits) - 1;

const int distance = (int)count0 - (int)count1;
if ((counter0 != 0) && (counter1 == 0 || (distance < 0 && distance > -dcd_max_counter)) {
  // prefer sector1
  return sector1;
}

// In all other cases, prefer sector0
return sector0;

The results table:

count0 count1 current proposed
0 0 sector0 sector0
0 1 sector1 sector0
0 2 sector0 sector0
0 3 sector0 sector0
1 0 sector0 sector1
1 1 sector0 sector0
1 2 sector1 sector1
1 3 sector0 sector1
2 0 sector0 sector1
2 1 sector0 sector0
2 2 sector0 sector0
2 3 sector1 sector1
3 0 sector1 sector1
3 1 sector0 sector0
3 2 sector0 sector0
3 3 sector0 sector0

Although I might be overthinking it.

Sector newSector = alternateSectorTo(current);
const uint8_t* existing = store.dataAt(addressOf(current));
Result error = this->writeSector(offset, data, length, existing, newSector);
Result error = this->_writeSector(offset, data, length, existing, newSector);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I suggest we don't mix tabs and spaces within the same file. This PR is full of such cases.

}
error = _write_v2_footer(newSector, existing ? &existingFooter : nullptr, counter);
if (error) return error;
typename Footer::crc_type crc = computeSectorCRC(newSector);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why not CRC the header as well?

Copy link
Contributor Author

@m-mcgowan m-mcgowan May 9, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The values in the header is checked fully so no need to CRC it. Also the header is invalidated when a new sector written, but we may want to also still validate the data, so the header is excluded from the CRC.

@sergeuz sergeuz removed their request for review May 10, 2017 15:07
technobly added a commit that referenced this pull request May 10, 2017
avtolstoy pushed a commit to avtolstoy/firmware that referenced this pull request May 10, 2017
@technobly
Copy link
Member

Closing due to changes being in released/v0.7.0-rc.1

@technobly technobly closed this Jul 12, 2017
@technobly technobly deleted the feature/dcd_crc branch July 12, 2017 15:00
@avtolstoy avtolstoy mentioned this pull request Dec 28, 2017
6 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants