Decentralized discovery for object store instances #105

Closed
14 tasks done
iramiller opened this issue Feb 24, 2021 · 7 comments · Fixed by #150

@iramiller
Member

iramiller commented Feb 24, 2021

Summary

A decentralized discovery method that lets object stores connect to each other using the blockchain.

Problem Definition

Existing P8e object stores use a centralized system for connection and discovery. With the transition to a decentralized public network, this central relay point will no longer exist. What is required is a method for an address to publish a list of endpoints that partner object stores can authoritatively discover and connect to.

Proposal

Create a record on the blockchain that is controlled by an address and allows it to publish records indicating endpoints available for partners to connect to.

  1. Add a new structure and store on chain under the owner address
  2. Add appropriate read/write methods that allow the owner address to maintain a given entry
  3. Provide a method that returns the list of records for a scope by joining against the scope's "data_access" list, the addresses that control which accounts the off-chain information should be provisioned for.
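
The lookup in step 3 can be sketched as follows. This is only an illustration under stated assumptions: the map stands in for the on-chain state store, and the names ObjectStoreLocator and LocatorsForScope are hypothetical, not taken from the eventual implementation.

```go
package main

// ObjectStoreLocator mirrors the proposed on-chain record: an owner address
// and the URI where that owner's endpoint can be reached.
type ObjectStoreLocator struct {
	Owner      string
	LocatorURI string
}

// LocatorsForScope returns the published record for every address on a
// scope's data_access list that has one. Addresses without a published
// record are skipped and duplicates are returned only once. The map is an
// assumed stand-in for the keeper's state store.
func LocatorsForScope(published map[string]ObjectStoreLocator, dataAccess []string) []ObjectStoreLocator {
	seen := make(map[string]bool, len(dataAccess))
	out := make([]ObjectStoreLocator, 0, len(dataAccess))
	for _, addr := range dataAccess {
		if seen[addr] {
			continue
		}
		seen[addr] = true
		if loc, ok := published[addr]; ok {
			out = append(out, loc)
		}
	}
	return out
}
```

Results come back in data_access order, so a caller sees the same ordering the scope owner chose.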

Implementation

The following simple structure will be used to hold a reference for a locator endpoint for a given account. The identified account will be the one that owns/controls the record and must sign requests to modify it.

message ObjectStoreEndpoint {
  // account address the endpoint is owned by
  string owner = 1;
  // locator endpoint uri
  string locator_uri = 2;
}
  • Create a new proto message for ObjectStoreEndpoint
  • Implement ValidateBasic for ObjectStoreEndpoint
  • Add NewObjectStoreEndpoint functions for creating an instance of ObjectStoreEndpoint
  • Create test suite to cover basic validation logic
  • Implement appropriate Stringer interface method, add test case
  • Create associated keeper file for ObjectStoreEndpoint.
    • Create a byte code in the keys.go file for the new type, and add methods to create a kvstore key from the address of the user appended to this new type byte.
    • Get/Set/Remove methods for object in state store (include events and instrumentation publishing).
    • Special query method that returns all of the registered endpoints for a scope / data_access list
  • Wire up msg_server and query_server endpoints to new keeper methods.
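
A rough sketch of the first few tasks, using only the standard library. Everything here is an assumption for illustration: the validation rules, the type byte value 0x21, and the EndpointKey helper name are hypothetical, and a real keeper would presumably validate the owner as a bech32 account address rather than just checking that it is non-empty.

```go
package main

import (
	"errors"
	"fmt"
	"net/url"
	"strings"
)

// objectStoreEndpointKeyPrefix is a hypothetical type byte; the real value
// would be whatever keys.go reserves for this record type.
const objectStoreEndpointKeyPrefix byte = 0x21

// ObjectStoreEndpoint is a plain Go stand-in for the generated proto type.
type ObjectStoreEndpoint struct {
	Owner      string
	LocatorURI string
}

// NewObjectStoreEndpoint creates an endpoint record for the given owner.
func NewObjectStoreEndpoint(owner, locatorURI string) ObjectStoreEndpoint {
	return ObjectStoreEndpoint{Owner: owner, LocatorURI: locatorURI}
}

// ValidateBasic performs stateless checks: the owner must be set and the
// locator must parse as a URI with both a scheme and a host.
func (e ObjectStoreEndpoint) ValidateBasic() error {
	if strings.TrimSpace(e.Owner) == "" {
		return errors.New("owner address cannot be empty")
	}
	u, err := url.Parse(e.LocatorURI)
	if err != nil {
		return fmt.Errorf("invalid locator uri: %w", err)
	}
	if u.Scheme == "" || u.Host == "" {
		return errors.New("locator uri must include a scheme and a host")
	}
	return nil
}

// String implements fmt.Stringer.
func (e ObjectStoreEndpoint) String() string {
	return fmt.Sprintf("ObjectStoreEndpoint{owner: %s, locator_uri: %s}", e.Owner, e.LocatorURI)
}

// EndpointKey builds a store key as the type byte followed by the owner's
// address bytes. A fresh slice is allocated so the caller's slice is never
// modified.
func EndpointKey(ownerAddr []byte) []byte {
	key := make([]byte, 0, 1+len(ownerAddr))
	key = append(key, objectStoreEndpointKeyPrefix)
	return append(key, ownerAddr...)
}
```

Keying by type byte plus owner address means each address has at most one record, and all records of this type share a prefix that can be iterated.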

For Admin Use

  • Not duplicate issue
  • Appropriate labels applied
  • Appropriate contributors tagged
  • Contributor assigned/self-assigned
@iramiller iramiller added security Security related request/issue metadata Metadata Module labels Feb 24, 2021
@iramiller iramiller added this to the 0.1.5 milestone Feb 24, 2021
@iramiller
Member Author

This issue/proposal should be reviewed by @scirner22 for object store implementation requirements as well as @rjmarkel for security sign-off.

@scirner22
Contributor

scirner22 commented Feb 24, 2021

@mlatimer-figure can you look at this as well? The background here is that object-store mailbox can go away if public third party object stores can communicate directly with each other. We're planning on using the blockchain to broadcast a public_key (address) that is owned by a given object-store instance.

A summary of my comments from slack over to this issue:

In my mind there are two things that will be reaching out to fetch items from a given object-store:

  • another object-store
  • any machine that's owned by the same org as another object-store

As the owner of an object-store, there's ample security in place to have the instance be completely public and let application-level security and encryption handle the data. That being said, it's better practice not to have these instances open to the public and to leverage a firewall instead. As an example, let's say object-store A is sharing data with object-store B. Object-store B would be privy to this fact when it reads scope events from chain and notices that key A attached key B to the data_access list, or that key B is a member of the party list. B knows object-store A's address because it has seen a block like the one Ira outlined above. Based on the two bullet points above, imo B will reach out to A either from object-store B directly or from a machine inside B's public or private network. A can correctly share with B by reading B's object-store block and whitelisting object-store B's IP along with any NATs that are listed.

So in my mind

repeated string outbound_cidr = 4;

should become

repeated string addr = 4; // where addr is the object-store's IP address and possibly all NAT addresses coming out of this object-store's infrastructure.

Because of this I don't see any value in allowing CIDR blocks, only more complications around validating that a bad actor isn't trying to get you to whitelist a range that is far too large, or even whitelist everything (is 0.0.0.0/1 the largest valid range?).

@iramiller
Member Author

iramiller commented Feb 24, 2021

addr = 4;

We want to avoid this abbreviation if possible due to existing use within other areas of the blockchain.

bad actor isn't trying to get you to whitelist a range

All ranges are suggestions; no object store should accept them without verification and compliance with local constraints and configuration.

The use of the CIDR suffix really only makes sense with IPv6 addressing where its use as a /64 is strongly encouraged.
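
As a sketch of what "verification against local constraints" might look like, the check below parses an advertised range and rejects anything broader than a locally configured limit. The limits (/24 for IPv4, /64 for IPv6) and the function name are assumptions chosen for illustration. For reference, 0.0.0.0/0 is the range that covers the entire IPv4 space, while 0.0.0.0/1 covers half of it.

```go
package main

import (
	"fmt"
	"net/netip"
)

// Hypothetical local policy: the shortest prefix (largest range) that this
// object store is willing to consider for its allow list.
const (
	minIPv4PrefixBits = 24
	minIPv6PrefixBits = 64
)

// CheckAdvertisedRange parses a CIDR string published by a partner and
// rejects it if it is malformed or broader than local policy allows.
// On success it returns the range in canonical (masked) form.
func CheckAdvertisedRange(cidr string) (netip.Prefix, error) {
	p, err := netip.ParsePrefix(cidr)
	if err != nil {
		return netip.Prefix{}, fmt.Errorf("invalid CIDR %q: %w", cidr, err)
	}
	p = p.Masked()

	minBits := minIPv6PrefixBits
	if p.Addr().Is4() {
		minBits = minIPv4PrefixBits
	}
	if p.Bits() < minBits {
		return netip.Prefix{}, fmt.Errorf("range %s is broader than the allowed /%d", p, minBits)
	}
	return p, nil
}
```

With these limits a single host such as 203.0.113.7/32 or an IPv6 /64 passes, while 0.0.0.0/0 and 0.0.0.0/1 are both refused.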

Getting a read from @rjmarkel on these aspects specifically is why he was tagged.

@scirner22
Contributor

@iramiller after discussing this with Latimer, he doesn't think it's good practice to ever broadcast a range, or a NAT address for that matter. Should we keep this strictly to object-store's public ip for now and think more about how to expand that in the future?

@iramiller
Member Author

Should we keep this strictly to object-store's public ip for now and think more about how to expand that in the future?

That might be the most prudent approach ... another one to consider is that the endpoint we publish here maybe shouldn't even be the object store itself ... it could be just a service to ask for the connection details, which would return results signed by the key associated with the address.
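
A minimal sketch of that idea: the locator service signs the connection details and the caller verifies them against the owner's public key before trusting them. Ed25519 from the Go standard library is used here purely as a stand-in so the example is self-contained; the actual key type tied to a chain address would differ, and all type and function names below are hypothetical.

```go
package main

import (
	"crypto/ed25519"
	"encoding/json"
	"errors"
)

// ConnectionDetails is a hypothetical payload a locator service might return.
type ConnectionDetails struct {
	Owner    string `json:"owner"`
	Endpoint string `json:"endpoint"`
}

// SignedDetails carries the serialized payload together with its signature.
type SignedDetails struct {
	Payload   []byte
	Signature []byte
}

// GenerateOwnerKey creates a key pair for the example. Passing nil makes
// the standard library use crypto/rand as the entropy source.
func GenerateOwnerKey() (ed25519.PublicKey, ed25519.PrivateKey, error) {
	return ed25519.GenerateKey(nil)
}

// SignDetails serializes the details and signs them with the owner's key.
func SignDetails(priv ed25519.PrivateKey, d ConnectionDetails) (SignedDetails, error) {
	if len(priv) != ed25519.PrivateKeySize {
		return SignedDetails{}, errors.New("invalid private key size")
	}
	payload, err := json.Marshal(d)
	if err != nil {
		return SignedDetails{}, err
	}
	return SignedDetails{Payload: payload, Signature: ed25519.Sign(priv, payload)}, nil
}

// VerifyDetails checks the signature against the owner's public key and
// only then decodes the payload.
func VerifyDetails(pub ed25519.PublicKey, s SignedDetails) (ConnectionDetails, error) {
	if len(pub) != ed25519.PublicKeySize {
		return ConnectionDetails{}, errors.New("invalid public key size")
	}
	if !ed25519.Verify(pub, s.Payload, s.Signature) {
		return ConnectionDetails{}, errors.New("signature does not match the owner's key")
	}
	var d ConnectionDetails
	if err := json.Unmarshal(s.Payload, &d); err != nil {
		return ConnectionDetails{}, err
	}
	return d, nil
}
```

The point of the design is that the chain only needs to hold the locator URI, while the details that change more often stay off chain and remain attributable to the owner.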

@iramiller
Member Author

iramiller commented Feb 25, 2021

Based on the service locator endpoint idea, the on-chain record could be streamlined extensively:

message ObjectStoreLocator {
  // account address the endpoint is owned by
  string owner = 1;
  // locator endpoint uri
  string locator_uri = 2;
}

@arnabmitra arnabmitra self-assigned this Mar 1, 2021
@arnabmitra
Contributor

I don't think we will be making this in time for 0.2.0, moving to 0.3.0 :(

@arnabmitra arnabmitra modified the milestones: 0.2.0, 0.3.0 Mar 4, 2021
@iramiller iramiller linked a pull request Mar 17, 2021 that will close this issue
8 tasks
arnabmitra added a commit that referenced this issue Mar 19, 2021
* This PR adds "A decentralized discovery method for object stores to connect to each other using the blockchain"
See #105 for more details.

Co-authored-by: Ira Miller <[email protected]>
@iramiller iramiller moved this from Todo to Done in Provenance Core Protocol Team Jun 1, 2023