Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[cc26x2x7] Implement Key Value Store #7472

Merged
merged 5 commits into from
Jun 21, 2021

Conversation

srickardti
Copy link
Contributor

Problem

Change overview

Adds basic key value storage implementation for the TI platform.

Testing

Manual testing. Rendezvous was performed and the OpCSR was observed in NV with the CCS debugger.

* an NV sub ID is 10 bits long. Consult
* `<simplelink_sdk>/source/ti/common/nv/nvocmp.c:NVOCMP_MAXSUBID`.
*/
static uint16_t calc_key_10(const char * key)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is clever and no doubt expedient for this platform. I think we should all consider though that this type of implementation precludes ever implementing KVS iterators (get all key=value pairs) or filters (get all key=value pairs, where key matches PATTERN), unless we are willing to break compatibility with implementations like this to achieve such things.

It also seems like with only 10 bits to work with, users and stakeholders for this platform would benefit from some sort of unit test to check for likely key collisions.

As to the CRC algorithm itself, it might be worth specifying this more completely (polynomial, xor in and out, etc.). Once this is deployed for this platform, it can't really be changed. It would give some confidence that this is the CRC algorithm it claims to be if there was a pointer to some reputable snippet of source code, or if it could be shown that this produces the same output as e.g. source code from this code generator:

git clone https://github.com/tpircher/pycrc
python3 pycrc/pycrc.py --generate c -o other_crc_ccitt_impl.c --algorithm=bbf --model=crc-16-ccitt

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll definitely add that python reference, I didn't know it existed. This was the result of some googling for a CRC.

The underlying driver for this implementation can support iterators, we made it to support those goals when OpenThread asked for it. But those implementations knew what they were storing and knew if there were multiple instances of an item. That is information that can be known at compile time, but is obfuscated here by a string concatenation. It may be possible to do something like scanf to reconstruct that information at runtime.

This implementation is natural on a platform like Linux with a file system that can handle arbitrary paths to files. However, choices like this are going to freeze out existing embedded SoCs That do not have the space for a filesystem.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I see. Yes, you're right; if you know the possible key space, you can still iterate. I hope that's sufficient, and can see why it's important to support this type of implementation. And anyway, this implementation is conformant to the current kvstore API, so can't really be called on that account.

Glad you may potentially find pycrc useful.

However, choices like this are going to freeze out existing embedded SoCs That do not have the space for a filesystem.

Things like fat12 are pretty micro friendly, but then you still only have 8.3 file names. So still no solution for our key lengths.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The FS we supply for internal storage in our SDK is spiffs. This increases the number of Flash pages for storage from 2 (16Kbytes) to 3 (24Kbytes) and adds the ~20 Kbytes of code increase for the software itself. Not to mention our BLE stack would have to be modified to use the file system, or some shim code would have to be added to support the existing storage API.

I will look into a way to store the keys in a parallel item ID, but that implementation will be pretty inefficient.

VerifyOrReturnError(key, CHIP_ERROR_INVALID_ARGUMENT);
VerifyOrReturnError(value, CHIP_ERROR_INVALID_ARGUMENT);

return CC13X2_26X2Config::ReadKVS(calc_key_10(key), value, value_size, read_bytes_size, offset_bytes);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

How much storage space do we have?
I wonder if we could store both key and value so that on 'Get' (and maybe also put) we can validate for colisions.

For 10 bits I believe we have about 50% of collisions when using around 38 different keys.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There should be a 40% chance of collision with 511 keys, (2^9). you're correct, the underlying driver was not really made for this, I can increase the keysize by about 8 bits by using the major key bits.

The key size of the key is not a problem for storage, but concatenating them would have to be done in memory because appending to an NV item is not supported. Alloc'ing a certificate plus the key would not be fun.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Being able to have a 16 or 18-bit key seems like that would make this a much more workable solution, at least in the near term. And the more elaborate solutions (introducing the available filesystem, incurring the resulting increased flash space and refactoring other code to use this, OR reallocing buffers to embed the key) definitely present tough tradeoffs.

Is it difficult to add use of the major key bits? That seems like a good compromise to get something working immediately.

Copy link
Contributor

@andy31415 andy31415 Jun 10, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am actually not sufficiently up to date with birthday collisoion formulas (https://en.wikipedia.org/wiki/Birthday_problem) so I got my stats for 50% collision via a python script.

My point is: I believe there is a high chance of collision, much faster than number of available keys will indicate (I believe the 40% collision at 511 key is way too optimistic and stand by my 50% chance on only 38 keys). I assume keys behave like rand(1, 1024) for our estimation purposes.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if the platform could provide some 'preallocated numbers map' instead, that requires manually adding keys and would fail otherwise. This would ensure no colision.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is my test:

import random
from collections import Counter

def test(l):
   data = random.choices(range(1024), k=l)
   return len(data) == len(set(data))

print(Counter([test(38)  for x in range(10000)]))
print(Counter([test(511)  for x in range(10000)]))

511 random keys seem to have near 100% colision chance.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To be specific:

x = 1
for i in range(39):
     x = x * (1024-i)/1024.0
print(x) # 0.4805249249401561

x = 1
for i in range(512):
     x = x * (1024-i)/1024.0
print(x) # 8.299705841330013e-69

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if the platform could provide some 'preallocated numbers map' instead, that requires manually adding keys and would fail otherwise. This would ensure no colision.

That was the original design before the KeyValueStorage was added.

I must have been confused on the birthday problem.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

In this case we would need a map of string -> integer if we keep KVS "char *" support.

Not sure if this is practical though: I believe for node data, we format things like "prefix_" as key names currently and given nodeid is random it cannot be preallocated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @andy31415 , I removed the hashing algorithm. It didn't seem like a good solution when everyone pointed out the problems. Now the implementation is a pair of parallel arrays that store the keys and values.

@srickardti
Copy link
Contributor Author

I implemented a commit with the full 16 bit CRC by using the itemID as part of the subID. Currently this is HEAD^.

I also implemented a parallel list version. This can store 2^10 key/value pairs without any hashing. But it does maintain two NV items per key/value pair. I think this is more brittle, but it does remove the dependency on a hashing function.

Copy link
Contributor

@woody-apple woody-apple left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like good stuff, but I don't see any unit tests added for pretty key functionality here. Can we put something in place?

@srickardti
Copy link
Contributor Author

This looks like good stuff, but I don't see any unit tests added for pretty key functionality here. Can we put something in place?

I don't think there is a unit test framework for this platform. But if you can point me to the documentation I would be happy to add one.

Is this the test driver in src/test_driver or the platform tests src/platform/tests/TestKeyValueStoreMgr.cpp or would this be the test application in examples/persistent-storage.

@andy31415
Copy link
Contributor

With 16-bit keys, 50% collision chance is at 300 keys. Less than 1% collision chance on 35 keys or less.

I would still feel a lot better if we could either detect colison or ask humans to create a curated list.

for (uint16_t i = 0; i < 0x3FF; i++)
{
key_item.subID = i;
if (sNvoctpFps.getItemLen(key_item) == 0U)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we need to worry about valid zero-length values being stored and then treated as available slots?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That would be a key of zero length. I can't store a key/value pair with a NULL key.

But yes for the underlying driver, writing a value of length zero is invalid. Do we have key/value pairs that would only be a key?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh, good point about empty key, ok!

As far as values go: so far I think we don't have any empty values, based on a brief audit, but I don't see anything in the API that prevents them. Maybe we should document something about values needing to be nonempty.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, OK, great. :)

@srickardti
Copy link
Contributor Author

Added support for the persistent-storage test application.

@srickardti srickardti requested a review from woody-apple June 11, 2021 18:28
@srickardti srickardti requested a review from andy31415 June 15, 2021 13:36
@woody-apple
Copy link
Contributor

/rebase

@woody-apple
Copy link
Contributor

/rebase

Create a key itemID and a value itemID within the config module. The subIDs of
each of these are kept in synch to allow arbitrary key creation.
@srickardti
Copy link
Contributor Author

@woody-apple have my changes addressed your request?

@woody-apple woody-apple merged commit 82e6290 into project-chip:master Jun 21, 2021
nikita-s-wrk pushed a commit to nikita-s-wrk/connectedhomeip that referenced this pull request Sep 23, 2021
* Implement Key Value Store

* Widened KVS crc to 16 bits

* Implement parallel key value storage

Create a key itemID and a value itemID within the config module. The subIDs of
each of these are kept in synch to allow arbitrary key creation.

* Add persistent storage example for cc26x2x7

* fix print formatting for persistent-storage app
@srickardti srickardti deleted the master branch February 25, 2022 18:12
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[cc26x2x7] OpCSR is not saved
5 participants