Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix/spiffs format #1313

Closed
wants to merge 2 commits into from
Closed

Conversation

jonnykl
Copy link

@jonnykl jonnykl commented Jan 14, 2018

The code erased one sector of the memory too many

@slaff
Copy link
Contributor

slaff commented Jan 15, 2018

@jonnykl How the issue that you are fixing can be reproduced?

@jonnykl
Copy link
Author

jonnykl commented Jan 15, 2018

  1. Copy Basic_Serial example and remove code for UART1
  2. If you want to, enable spiffs, but do not add any file (then you would flash in step 5 a formatted FS and spiffs_format_internal will not be called)
  3. Add spiffs_mount(); at the end of init()
  4. Run make flashinit to make sure there is no SPIF filesystem, thus spiffs_format_internal gets called
  5. Run make flash

As you can see the "available" flash memory will be formatted and everything works. But when you power off the board and reconnect it nothing works anymore. It sends random bytes and also other code do not execute properly.

  1. Now run esptool.py --port YOUR_PORT read_flash 0xfc000 4096 corrupt-data
    0xfc000 is the address where you would flash esp_init_data_default.bin on a 8 Mbit flash. According to the datasheet this is the address of the fourth-to-last sector. This is the first sector which should not be erased in spiffs_format_internal.
  2. Run hexdump corrupt-data | less and you can see it's filled with 0xff

I'm using a ESP-01 with 8Mbit flash memory.

@jonnykl
Copy link
Author

jonnykl commented Mar 25, 2019

@slaff Why did you close this PR? The bug is still not fixed yet ..

@slaff
Copy link
Contributor

slaff commented Mar 25, 2019

The bug is still not fixed yet ..

Sorry, I will re-open it and try to test it till the end of the week.

@slaff slaff reopened this Mar 25, 2019
@@ -64,10 +64,10 @@ bool spiffs_format_internal(spiffs_config *cfg)
sect_first = cfg->phys_addr;
sect_first = flashmem_get_sector_of_address(sect_first);
sect_last = cfg->phys_addr + cfg->phys_size;
sect_last = flashmem_get_sector_of_address(sect_last);
sect_last = flashmem_get_sector_of_address(sect_last) - 1;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the correct code would be:

  sect_last = cfg->phys_addr + cfg->phys_size - 1;
  sect_last = flashmem_get_sector_of_address(sect_last);

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The first value assigned to sect_last is the physical address of the first byte which does not belong the the SpifFS. If cfg->phys_addr + cfg->phys_size is a multiple of INTERNAL_FLASH_SECTOR_SIZE (which is 4096), then it would make no difference but it would be logically wrong.
Assume phys_addr+phys_size would not be a multiple of INTERNAL_FLASH_SECTOR_SIZE (e.g. INTERNAL_FLASH_SECTOR_SIZE/2). Then you subtract one byte from sect_last - now sect_last points to the last byte which belongs to SpifFS. Now flashmem_get_sector_of_address() calculates the sector as follows: address / INTERNAL_FLASH_SECTOR_SIZE. The resulting sector is the same as if you did not subtract 1 before (because of the division of integers). In this case you would override/delete the data from the address cfg->phys_addr + cfg->phys_size to the end of the sector.
If you subtract one from sect_last after the "second step" (calculating the sector) you will not override other data.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I like good discussions especially when they lead to results. I still think that Mike is right here. I will post below my reasoning and you can follow up with your arguments.

Let's have the following setup:
Sector size: 5
Start address: 4
Size: 2

image

Mike's calculation:

sect_last = cfg->phys_addr + cfg->phys_size - 1;  // 4 + 2 - 1 = 5 <-- last byte inclusive is the 5th one
sect_last = flashmem_get_sector_of_address(sect_last); // 5/5 = 1  <-- the sector for the 5th byte is 1.

Which means that sector 0 and sector 1 have to be erased. Which IMHO is correct.

Now your solution suggest the following

sect_last = cfg->phys_addr + cfg->phys_size; // 4 + 2 = 6
sect_last = flashmem_get_sector_of_address(sect_last) - 1; // (6/5) - 1 = 1 -1 =  0  <-- the sector that has to be erased is 0

Which means that only sector 0 has to be erased. But that cannot be right because byte 5 has to be written to sector 1, which will not be erased here. Am I missing something here?

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No you are not missing something. But I think it's better to not use a block instead of deleting other data (here bytes 6,7,8,9). Another thing I didn't consider is the same aspect but for cfg->phys_addr (so also bytes 0,1,2,3 are deleted). In your example should no block be erased, because both blocks may contain other data which may not be erased. I think it would be reasonable to check this in overrides.c:

#ifdef RBOOT_SPIFFS_0
    #if (((RBOOT_SPIFFS_0) % (INTERNAL_FLASH_SECTOR_SIZE)) != 0)
        #error "RBOOT_SPIFFS_0 has to be a multiple of INTERNAL_FLASH_SECTOR_SIZE"  // or warning
    #endif
#endif

#ifdef RBOOT_SPIFFS_1
    #if (((RBOOT_SPIFFS_1) % (INTERNAL_FLASH_SECTOR_SIZE)) != 0)
        #error "RBOOT_SPIFFS_0 has to be a multiple of INTERNAL_FLASH_SECTOR_SIZE"  // or warning
    #endif
#endif

#if (((SPIFF_SIZE) % (INTERNAL_FLASH_SECTOR_SIZE)) != 0)
    #error "SPIFF_SIZE has to be a multiple of INTERNAL_FLASH_SECTOR_SIZE"  // or warning
#endif

When the developer tries using values which will delete other data it prevents him from doing this or just get warned and does not have to search for the problem.

@mikee47
Copy link
Contributor

mikee47 commented Mar 25, 2019

I'm using a ESP-01 with 8Mbit flash memory.

@jonnykl Have you added RBOOT_SPIFFS_0 to user makefile? (I guess 0xFE000 would work for 64M partition.)

@slaff
Copy link
Contributor

slaff commented Mar 26, 2019

Ok, I analyzed the problem a bit and here is what I can say: There is a problem with the automatic formatting, but this PR does not suggest proper fix.

The problem originates in the overrides.c and the function below.

/*
 * rBoot uses different spiffs organization and we need to override this method
 * during application compile time  in order to make automatic
 * mounting with `spiffs_mount()` work as expected.
 */
spiffs_config spiffs_get_storage_config()
{
  spiffs_config cfg = {0};
#ifdef RBOOT_SPIFFS_0
  cfg.phys_addr = RBOOT_SPIFFS_0;
#elif RBOOT_SPIFFS_1
  cfg.phys_addr = RBOOT_SPIFFS_1;
#else
#error "Define either RBOOT_SPIFFS_0 or RBOOT_SPIFFS_1"
#endif
  cfg.phys_size = SPIFF_SIZE;
  cfg.phys_erase_block = INTERNAL_FLASH_SECTOR_SIZE; // according to datasheet
  cfg.log_block_size = INTERNAL_FLASH_SECTOR_SIZE * 2; // Important to make large
  cfg.log_page_size = LOG_PAGE_SIZE; // as we said
  return cfg;
}

In it the cfg.phys_size = SPIFF_SIZE; is not correct. Because it does not account for the offset, defined in cfg.phys_addr, and also the fact that the last 4 sectors must be left intact.

A better solution would be to change the cfg.phys_size to be calculated as:

cfg.phys_size = INTERNAL_FLASH_SIZE - ( ( u32_t )cfg.phys_addr);

In that case the SPIFFS will start from the desired location and will use all available FLASH memory minus the system four sectors and minus the start offset.

That code can be improved further. Instead of auto-creating a SPIFF that takes all available and allowed FLASH the SPIFF_SIZE can be used to provide a sane default for the SpifFS length. So we can have something like this:

u32_t maxAllowedEndAddress = INTERNAL_FLASH_SIZE;
u32_t requestedEndAddress =  cfg.phys_addr + SPIFF_SIZE;

if(requestedEndAddress > maxAllowedEndAddress) {
    debug_w("The requested SPIFFS size is too big.");
    cfg.phys_size = maxAllowedEndAddress -  ( ( u32_t )cfg.phys_addr);
}
else {
    cfg.phys_size = SPIFF_SIZE;
}

Any comments?

@jonnykl
Copy link
Author

jonnykl commented Mar 26, 2019

Without checking your fix (i will do it later and comment it) there is still a logic error in spiffs_format_internal().
sect_first points to the first sector of SpifFS. sect_last points to the sector after the last sector of SpifFS. total is the number of sectors to be erased (which is calculated correctly). In the following while-loop all sectors including the "last" sector (which is the sector after the real last sector of SpifFS) will be erased (because of <= instead of <).
Let's assume cfg->phys_size would equal 0. Then sect_first equals sect_last. So the body of the while-loop would be executed exactly once and erase the sector sect_first. But in this case no sector should be erased (because the size is zero).

@slaff
Copy link
Contributor

slaff commented Mar 26, 2019

I might be wrong but for me the formatting itself is correct. The total and the percentage calculations are wrong if they have to represent the percentage after an erase and not before that.

Zero size

But in this case no sector should be erased (because the size is zero).

I agree, zero sizes are special case that should be handled better. The following needs to be added:

bool spiffs_format_internal(spiffs_config *cfg)
{
  if (cfg->phys_addr == 0)
  {
	SYSTEM_ERROR("Can't format file system, wrong address given.");
	return false;
  }

  // new check...
  if (cfg->phys_size == 0)
  {
	SYSTEM_ERROR("Can't format file system, wrong size given.");
	return false;
  }

For the total and percentage fixes:

Further the calculation of the total is not exactly correct. It should be:

int total = sect_last - sect_first  + 1;

And the percentage calculation needs updating:

int percent = ++cur * 100 / total;

We can test if the changes will work using the following test setup:

Pre: Sector size is 5 bytes (for easier calculation)
Use-case 1: Start address 0, Size: 0
Use-case 2: Start address 5, Size 0
Use-case 3: Start address 5, Size 13

Use-case 1

Will return false due to invalid start address

Use-case 2

Will return false due to invalid size

Use-case 3

Start address 5, Size 13
first sector: start_address / sector_size = 5/5 = 1 (1 is the second sector. The first sector is with index 0).
last sector: (13 + 5) / 5 = 3

The total calculation:

int total = sect_last - sect_first  + 1;

Gives us 3 - 1 + 1 = 3. Total of 3 sectors have to be erased. Sectors with indexes 1, 2 and 3. Which is correct.

For the first iteration the percentage will be calculated using:

int percent = ++cur * 100 / total;

1 * 100/ 3 = 33 %. Which is correct.

@mikee47
Copy link
Contributor

mikee47 commented Mar 26, 2019

What about size=1 ?

@jonnykl
Copy link
Author

jonnykl commented Mar 26, 2019

@jonnykl Have you added RBOOT_SPIFFS_0 to user makefile? (I guess 0xFE000 would work for 64M partition.)

At the time of creation of this PR I did not use rBoot. So I also did not add RBOOT_SPIFFS_0 to user makefile. Because projects without rBoot are now deprecated I tested it now with rBoot.
To reproduce the bug you have to set RBOOT_SPIFFS_0 to an address after the end of the code/data. I tried 0x0e0000 for my 8Mbit flash. SPIFF_SIZE has to be set to the maximum allowed size. This is INTERNAL_FLASH_SIZE - RBOOT_SPIFFS_0. For my 8Mbit flash this is 0x01c000.
Make sure you don't add any files in the user makefile to the SpifFS which are flashed with the SpifFS image to the board. Only if an empty SpifFS image is flashed, spiffs_mount will call spiffs_format_internal (at the first boot). You can just disable SpifFS in the user makefile to make sure an empty SpifFS image will be flashed.
Now you should see that the fourth-last sector has been erased (check this with esptool.py as described in my post from 2018-01-15).

# For the total and percentage fixes:
Further the calculation of the total is not exactly correct. It should be:

int total = sect_last - sect_first + 1;

No the calculation of the total was correct. I think the name of the variable sect_last is misleading because it suggests the sector it points to is inclusive but (without my fix) it's exclusive, it's pointing to the sector after the "real" last sector. Just substitute the variables with their previous definitions:

int total = sect_last - sect_first;
int total = flashmem_get_sector_of_address(cfg->phys_addr + cfg->phys_size) - flashmem_get_sector_of_address(cfg->phys_addr);
int total = (cfg->phys_addr + cfg->phys_size)/INTERNAL_FLASH_SECTOR_SIZE - (cfg->phys_addr)/INTERNAL_FLASH_SECTOR_SIZE;
// and assuming cfg->phys_addr and cfg->phys_size are multiples of INTERNAL_FLASH_SECTOR_SIZE ...
int total = (cfg->phys_size)/INTERNAL_FLASH_SECTOR_SIZE;

This makes sense -> you would only then have to add 1 if sect_last would be inclusive.

And the percentage calculation needs updating:

int percent = ++cur * 100 / total;

That's correct. This also indicates that there is bug at the formatting. Now (with cur++ instead of ++cur) 100% should never be printed, but it is. And that is because one sector too many is erased.

What about size=1 ?

For size=1:
first sector: 1
last sector: (1 + 5) / 5 = 1
total = 1 - 1 + 1 = 1

This seems correct but then you would erase 5-1=4 (block_size-size) which does not belong to SpifFS. So this sector should not be excluded. Maybe it's a good idea to replace SPIFF_SIZE with SPIFFS_NUM_SECTORS (or something similar) from which the size in bytes can be calculated (if needed at all).

And what about size=n*sector_size (n is a natural number including zero)?
E.g.:

size=1*sector_size=5
first sector: 1
last sector (5 + 5) / 5 = 2
total = 2 - 1 + 1 = 2

That's not correct because we set size to 1*sector_size, so a total of one sector would be the expected result.

@slaff slaff closed this in #1653 Mar 28, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants