This repository has been archived by the owner on Dec 27, 2022. It is now read-only.

Questions about the intro #16

Open
rauschma opened this issue Jun 6, 2019 · 4 comments

Comments


rauschma commented Jun 6, 2019

There were a few things I didn’t fully understand. I’m mentioning them here, in case you are interested in feedback (others may have questions similar to mine):

  • Why are schema IDs stored in metadata fields? Why not inline in the files? Why do directories have schemas?
  • Do only my followers see my comments on a post?
  • How do the predefined paths work – don’t they limit the extensibility? Why are both schemas and predefined paths needed?
Member

pfrazee commented Jun 6, 2019

> Why are schema IDs stored in metadata fields? Why not inline in the files?

I went back and forth on that question and could end up revising it later. We think the hyperdrive data structure will eventually allow arbitrary queries on the metadata, which would let you quickly fetch all "posts" from a dat, for example.

The downside of using metadata is that the type doesn't get included in the file, so a copy-paste of a .json would fail to carry the information. I'm open to some discussion on this.
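
To make the trade-off concrete, here is a minimal Python sketch. It does not use the real hyperdrive API: `Entry`, `files_of_type`, the file paths, and the way the type strings are stored are all hypothetical stand-ins. It models an archive as a list of entries whose type lives in a metadata field, so selecting by type never parses a file body, and it shows that copying only the body leaves the type behind.

```python
import json
from dataclasses import dataclass, field


@dataclass
class Entry:
    """Hypothetical stand-in for one file in an archive."""
    path: str
    body: str                                      # raw file content
    metadata: dict = field(default_factory=dict)   # out-of-band fields


def files_of_type(entries, type_id):
    """Select entries by metadata alone; no file body is parsed."""
    return [e for e in entries if e.metadata.get("type") == type_id]


# Assumed example data; the type IDs are illustrative only.
archive = [
    Entry("/a.json", json.dumps({"body": "Hello"}), {"type": "unwalled.garden/post"}),
    Entry("/b.json", json.dumps({"topic": "x"}), {"type": "unwalled.garden/comment"}),
]

posts = files_of_type(archive, "unwalled.garden/post")

# Copy-pasting a file carries only its body, so the type is lost.
pasted = json.loads(posts[0].body)
type_survived = "type" in pasted
```

The query touches only `metadata`, which is the speed argument; `type_survived` ends up false, which is the copy-paste downside.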

> Why do directories have schemas?

It's mostly important for the root directory, where the schema identifies the type of the entire dat. There will eventually be other types, like "application" and "module," which will drive some advanced behaviors.

> Do only my followers see my comments on a post?

See https://twitter.com/pfrazee/status/1136683879950180352

> How do the predefined paths work – don’t they limit the extensibility?

Most of the predefined paths include the namespace in the path:

/.data/unwalled.garden/*

That's how other schemas will get integrated. The other directories (e.g. /.refs) have utility outside of Unwalled Garden and may eventually need type IDs which are not UG-specific.
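
As a rough illustration of how a namespace embedded in the path keeps things extensible, here is a small Python sketch. The helper `namespace_of` and the sub-paths under each namespace (such as `posts/hello.json`) are assumptions made up for this example; only the `/.data/<namespace>/` prefix comes from the path shown above.

```python
from pathlib import PurePosixPath

DATA_ROOT = PurePosixPath("/.data")


def namespace_of(path):
    """Return the namespace directory under /.data, or None if the path lies elsewhere."""
    try:
        rel = PurePosixPath(path).relative_to(DATA_ROOT)
    except ValueError:
        # The path is not inside /.data at all (for example /.refs/...).
        return None
    # The first component below /.data is treated as the schema namespace.
    return rel.parts[0] if rel.parts else None


# Hypothetical paths: one from Unwalled Garden, one from some other schema set.
ug = namespace_of("/.data/unwalled.garden/posts/hello.json")
other = namespace_of("/.data/example.org/notes/1.json")
outside = namespace_of("/.refs/something")
```

Under this reading, a different schema set would simply claim its own directory next to `unwalled.garden` without colliding with it.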

> Why are both schemas and predefined paths needed?

You're right that they're redundant. I'm not sure yet whether we should take advantage of the redundancy and drop the individual file types.

Author

rauschma commented Jun 7, 2019

> The downside of using metadata is that the type doesn't get included in the file, so a copy-paste of a .json would fail to carry the information. I'm open to some discussion on this.

I think it’d be useful if it were easy to back up the data to different media, create ZIP files, etc.
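
A quick Python sketch of the backup case, assuming the type were stored inline as a `type` field in the JSON (that field name and the type string are my assumptions, not something the spec defines this way): a plain ZIP round trip keeps the type, because it travels with the file content.

```python
import io
import json
import zipfile


def backup(files):
    """Write {path: body} into an in-memory ZIP and return its bytes."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, "w") as zf:
        for path, body in files.items():
            zf.writestr(path, body)
    return buf.getvalue()


def restore(blob):
    """Read every member of the ZIP back into {path: parsed JSON}."""
    with zipfile.ZipFile(io.BytesIO(blob)) as zf:
        return {name: json.loads(zf.read(name)) for name in zf.namelist()}


# Hypothetical file with the type stored inline.
inline = {"post.json": json.dumps({"type": "unwalled.garden/post", "body": "Hi"})}
restored = restore(backup(inline))
```

A generic archiver only sees paths and bytes, so anything kept in a separate metadata field would need extra tooling to survive the same round trip.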

> You're right that they're redundant. I'm not sure yet whether we should take advantage of the redundancy and drop the individual file types.

I’d reduce the redundancy (but my understanding is limited). The best approach may depend on what the file structure should be optimized for. Some aspects remind me of stream processing. If that’s the best way of thinking about it, then the files are only input for a database. In that case: do individual file names matter? Should they be named so that the most recent data can be displayed first?

Author

rauschma commented Jun 7, 2019

(Oh, I forgot to mention: I’m really excited about this project. Lots of possibilities.)

Member

pfrazee commented Jun 7, 2019

> > The downside of using metadata is that the type doesn't get included in the file, so a copy-paste of a .json would fail to carry the information. I'm open to some discussion on this.

> I think it’d be useful if it were easy to back up the data to different media, create ZIP files, etc.

I'll keep thinking about this, because I generally agree -- I'm looking for the most direct & simple solutions and this has some alarms going off in my head.

> > You're right that they're redundant. I'm not sure yet whether we should take advantage of the redundancy and drop the individual file types.

> I’d reduce the redundancy (but my understanding is limited). The best approach may depend on what the file structure should be optimized for. Some aspects remind me of stream processing. If that’s the best way of thinking about it, then the files are only input for a database. In that case: do individual file names matter? Should they be named so that the most recent data can be displayed first?

The specs originally required you to use a timestamp for the file names, but that's a major pain for people who might want to manually add files with an editor. That said, the technical needs might outweigh the UX convenience on this one.
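
For what the technical side buys you, here is a minimal Python sketch. The naming format and the helper `timestamp_name` are assumptions for illustration, not what the specs prescribed. The point is that fixed-width UTC timestamps sort as strings in chronological order, so listing the newest files first needs nothing beyond the names.

```python
from datetime import datetime, timezone


def timestamp_name(when, ext=".json"):
    """Build a file name from a UTC timestamp (hypothetical naming scheme)."""
    return when.astimezone(timezone.utc).strftime("%Y-%m-%dT%H-%M-%SZ") + ext


names = [
    timestamp_name(datetime(2019, 6, 7, 9, 30, tzinfo=timezone.utc)),
    timestamp_name(datetime(2019, 6, 6, 18, 0, tzinfo=timezone.utc)),
    timestamp_name(datetime(2019, 6, 7, 8, 15, tzinfo=timezone.utc)),
]

# Fixed-width timestamps sort lexicographically in chronological order,
# so "newest first" is a plain reverse sort of the names.
newest_first = sorted(names, reverse=True)
```

Hand-written names like `my-first-post.json` give up this property, which is the UX-versus-technical tension here.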

> (Oh, I forgot to mention: I’m really excited about this project. Lots of possibilities.)

🚀
