config.json and log.json can get out of sync, preventing startup of the credential manager #14
At the moment, I am leaning toward adding a credential validation check early in `allocateStatus`. With this solution, there is still a valid argument to be made regarding consistency and fault tolerance in the face of network failure. Many of the operations in this library update files both within the same repo and across separate repos (e.g., the status credential in the main repo and the config or log file in the metadata repo). Considering that this library will likely be used in large-scale course distributions, such as Harvard's CS50, we may want to consider a mechanism that performs transaction rollback when one operation of a logical transaction is interrupted. @dmitrizagidulin your thoughts are welcome here.
I have done a bit more thinking on this over the last couple of days, and I think I have a solution that extends the locking/serialization work I have already done in this library. The main update would be to check core invariants in the relationship between the three status documents (status credential, log file, config file) prior to releasing the lock in transactional functions. If any of the invariants fail, restore the values of these three documents, release the lock to give other processes a chance to execute, and try again. The main code delta would involve storing the current values of these documents in memory at the beginning of all transactional functions and restoring/retrying as needed. @jchartrand @dmitrizagidulin thoughts on this approach?
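For concreteness, here is a minimal sketch of that snapshot/check/restore/retry flow, assuming hypothetical `readDocuments`/`writeDocuments` helpers and a per-function invariant check (none of these names are the library's actual API):

```typescript
// Hypothetical snapshot/check/restore/retry wrapper for transactional
// functions. All names here are illustrative, not the library's API.
interface StatusDocuments {
  statusCredential: string;
  logFile: string;
  configFile: string;
}

const MAX_ATTEMPTS = 3;

async function runWithInvariants(
  readDocuments: () => Promise<StatusDocuments>,
  writeDocuments: (docs: StatusDocuments) => Promise<void>,
  operation: () => Promise<void>,
  invariantsHold: () => Promise<boolean>
): Promise<void> {
  for (let attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
    // Store the current values of the three documents in memory.
    const snapshot = await readDocuments();
    try {
      await operation();
      // Check core invariants before releasing the lock.
      if (await invariantsHold()) {
        return;
      }
      // Invariants failed: restore the snapshot, then (after the caller
      // releases the lock so other processes get a turn) try again.
      await writeDocuments(snapshot);
    } catch {
      // Operation interrupted mid-flight: restore and retry as well.
      await writeDocuments(snapshot);
    }
  }
  throw new Error(`invariants still violated after ${MAX_ATTEMPTS} attempts`);
}
```

One natural invariant, given this issue, would be that the config's `credentialsIssued` matches the number of issuance entries in the log.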
One potential problem might still be that if a write is made to one of the files (e.g., config) and then the network cuts out (e.g., GitHub goes down) before you can also write to the other files (e.g., log), and the network stays down for a while, then you might not be able to revert the write to the config.
Worse might be that you write to the config file and then the machine on which the code is running crashes before you can write to the log, so again you can't revert the write to the config.
Or is your lock actually in the GitHub repo itself and so you can fix things up on restart?
Would another option be to combine the config file and log file into a single file? So you’d have a single atomic write?
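If the files were combined, one hypothetical shape for the merged document would be the following (field names are illustrative, not the actual schema):

```typescript
// Hypothetical combined metadata document: because config and log live
// in one file, they are serialized and committed as a single unit.
interface CombinedMetadata {
  config: {
    credentialsIssued: number;
    latestStatusListId: string;
  };
  log: Array<{
    credentialId: string;
    statusListIndex: number;
    timestamp: string;
  }>;
}

const combined: CombinedMetadata = {
  config: { credentialsIssued: 1, latestStatusListId: 'EXAMPLE-LIST' },
  log: [
    {
      credentialId: 'urn:uuid:example',
      statusListIndex: 0,
      timestamp: new Date().toISOString(),
    },
  ],
};

// One JSON.stringify, one file write, one commit: the two pieces of
// state can no longer drift apart.
const serialized = JSON.stringify(combined, null, 2);
```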
If the revocation list itself (the bit vector) gets out of whack, that seems a little less important because at least it won't prevent the whole system from continuing to run. At worst it might mean that a position didn't get revoked when it should have.
Resolution: we will apply my proposal from above, plus a modification that addresses @jchartrand's concerns: instead of temporarily saving the repo data locally, we will temporarily save it in the metadata repo, such that a disrupted client service that finds this data on restart understands that it needs to restore the repo data to this previous state. Additionally, we will combine the config and log files to prevent them from getting out of sync, per @jchartrand's recommendation.
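A rough sketch of the restart-time recovery this implies, with an assumed snapshot file name and repo interface (not the actual implementation):

```typescript
// Hypothetical crash recovery: a snapshot left behind in the metadata
// repo means a previous transaction was interrupted, so restore the
// documents to that prior state before accepting new operations.
const SNAPSHOT_PATH = 'snapshot.json'; // assumed file name

interface MetadataRepo {
  readFileIfExists(path: string): Promise<string | null>;
  writeFile(path: string, content: string): Promise<void>;
  deleteFile(path: string): Promise<void>;
}

async function recoverOnStartup(repo: MetadataRepo): Promise<void> {
  const raw = await repo.readFileIfExists(SNAPSHOT_PATH);
  if (raw === null) {
    return; // no snapshot: last run completed cleanly, nothing to do
  }
  // Restore the documents captured before the interrupted transaction.
  const snapshot: Record<string, string> = JSON.parse(raw);
  for (const [path, content] of Object.entries(snapshot)) {
    await repo.writeFile(path, content);
  }
  // Delete the snapshot only after the restore itself is committed, so
  // a crash during recovery just repeats the recovery on next start.
  await repo.deleteFile(SNAPSHOT_PATH);
}
```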
Tested.
Keeping open until after deployment.
Deployed to both Google Play and App Store (release 2.1.0-build80), closing ticket. |
If I pass a badly formed VC to the `allocateStatus` method, it fails but leaves the repos in a state such that when I try to restart, it gives me an error. I think the problem is that when allocating a status, the manager first updates the config, and in particular increments the number of credentials issued and saves this back to GitHub:
`credential-status-manager-git/src/credential-status-manager-base.ts`, lines 215 to 218 in `076921b`
Then, when it goes on to issue the credential, that fails, so the credential never gets issued and the log doesn't get updated, which leaves the number of credentials issued in the config (`credentialsIssued`) one greater than the number in the log. I think that is what later causes the error when restarting, specifically because of this check:
`credential-status-manager-git/src/credential-status-manager-base.ts`, line 530 in `076921b`
Fundamentally, I think the problem is that the two operations (`updateConfig` and `updateLog`) aren't atomic, so the log and config can get out of whack. I could imagine this causing other problems too.
If `credentialsIssued` in the config is only used to know when the current list is full and a new list needs to be created, then maybe it would be better to just calculate the number of credentials issued from the log?
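As an illustration of that suggestion, assuming the log holds one entry per issued credential (the real log format may differ):

```typescript
// Hypothetical derivation: compute the issued-credential count from
// the log instead of persisting it separately in the config, so there
// is nothing to drift out of sync.
interface LogEntry {
  credentialId: string;
  statusListIndex: number;
}

function credentialsIssued(log: LogEntry[]): number {
  return log.length;
}

function needsNewStatusList(log: LogEntry[], listCapacity: number): boolean {
  // e.g., a StatusList2021 bitstring of 131,072 positions (16 KB)
  return credentialsIssued(log) >= listCapacity;
}
```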