Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Remove temporary piece file generation from go-fil-markets for OFFLINE deals #4901

Open
hannahhoward opened this issue Nov 18, 2020 · 7 comments
Assignees
Labels
area/data-transfer P2 P2: Should be resolved

Comments

@hannahhoward
Copy link
Contributor

This is just a place holder for figuring out, once we complete #4898, a way to not use temporary files for offline deals as well as online deals. I'm not sure the best approach to do this.

@hannahhoward hannahhoward added P2 P2: Should be resolved need/analysis Hint: Needs Analysis area/markets Area: Markets labels Nov 18, 2020
@dineshshenoy dineshshenoy added this to the 💹Storage Deal Success milestone Nov 18, 2020
@dineshshenoy dineshshenoy removed this from the 💹Storage Deal Success milestone Nov 25, 2020
@dirkmc
Copy link
Contributor

dirkmc commented Dec 2, 2020

Currently ImportDataForDeal

  • copies from a reader to a temporary file
  • generates a commP using the temp file
  • verifies the commP it matches the proposal piece CID
  • fires ProviderEventVerifiedData with the path to the temp file

Then in the HandoffDeal a Reader is created from the file.

Instead we could change ImportDataForDeal to

  • import the deal data directly from the Reader into the deal store
  • verify commP
  • delete the deal data if verification fails

From then on handle the deal the same way as if the data had come across the network.

@hannahhoward is it easy to delete the data for a deal if verification fails?

@hannahhoward
Copy link
Contributor Author

@dirkmc what does "import the deal data directly from the Reader into the deal store" -- do you mean a block store here? the problem here is we have a reader, which is literally just bytes. It's likely a CAR file, but NOT gauranteed to be generated in a way we can replicate in Lotus. We cannot attempt to reproduce a more semantic version of the data here.

My believe is there is no way to avoid a temporary file as long as we use a reader (we can certainly use an OS pipe + TeeReader, to unlock two seperate reads, but then you're just delegating the temp file to the OS). We could certainly change it to a file path. That has the unfortunate down side of leaving us to trust that the person running import does not delete the file in between running the command and getting to the HandoffDeal state.

It is worth mentioning @Mikael who can provide more context on why we can't attempt to regenerate the CAR file, and @whyrusleeping who may have insights as the person who wrote the code originally.

Also though, at the end of the day this just may not be a high priority item given it's difficulty.

BTW, we use a thing called a multistore to isolate blocks for a deal from other deals, which means that yes, we can and do delete deal data when verification fails.

@ribasushi
Copy link
Collaborator

ribasushi commented Dec 2, 2020

@hannahhoward you highlighted the wrong @mikeal

On commp calculation - you do not need to create the entire .car on disk, which I believe @magik6k did point out earlier on slack. As long as you have the entire dag in a blockstore already, you simply stream out the car in the usual left-first depth-first seen-first order, and directly calculate commp over the raw .car bytes that stream out of that walk.

@jennijuju jennijuju removed the need/analysis Hint: Needs Analysis label Jan 19, 2021
@jennijuju jennijuju added area/data-transfer and removed area/markets Area: Markets labels Feb 11, 2021
@jennijuju
Copy link
Member

jennijuju commented Feb 11, 2021

Seems to be related to #5291

@ribasushi
Copy link
Collaborator

@raulk @dirkmc is this addressed by the current deal-data flow rework?

@frrist
Copy link
Member

frrist commented Jul 23, 2021

Did some exploration around this issue as a part of related work. As Hannah has stated above, temporary files are required unless we are comfortable trusting that a user will not modify/delete the file they imported until it has been sealed into a sector. This is due to file importing and sector sealing happening asynchronously -- importing the file is near instance, sealing into a sector takes significantly longer.

Until a solution is decided on the workaround I intended to implement is:

@dirkmc
Copy link
Contributor

dirkmc commented Jul 23, 2021

@raulk @aarshkshah1992 does the solution outlined by @frrist make sense to you guys or is this something that's going to be covered by the DAG store work?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/data-transfer P2 P2: Should be resolved
Projects
None yet
Development

No branches or pull requests

7 participants