The trace datasets are unavailable #12

Closed
PureWhiteWu opened this issue Aug 16, 2023 · 4 comments

Comments

@PureWhiteWu

Dropbox shows that all three files are unavailable.
Could you please upload those files to the repo (maybe using git-lfs)?

https://www.dropbox.com/sh/9ii9sc7spcgzrth/z9qaItlfnw/Papers/ARCTraces/DS1.lis

@PureWhiteWu
Author

Or is there any other way I can get those datasets?
Thanks!

@tatsuya6502
Member

tatsuya6502 commented Aug 17, 2023

Hi. For now, I have put all the ARC datasets I have in an Amazon S3 bucket. Please download them from here: https://arc-ds-2023.s3.ap-northeast-1.amazonaws.com/arc-dataset.txz (49.5 MB) (Deleted. See the later comment in this thread for the new location.)

Then expand the txz:

## NOTE: You might need to install `xz` (xz-utils) in addition to 
## `tar` to expand a `txz` file.

$ tar xf arc-dataset.txz
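
If installing `xz` is inconvenient, the archive can usually be expanded with Python's standard `tarfile` module instead, provided your Python build includes the `lzma` module (most do). This is only a minimal sketch; the helper name `extract_txz` and the destination directory are illustrative and not part of this project.

```python
import tarfile
from pathlib import Path


def extract_txz(archive_path, dest_dir="."):
    """Expand an xz-compressed tar archive and return the member names.

    Hypothetical helper for illustration; only extract archives you trust.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    # "r:xz" asks tarfile to read the archive through xz (LZMA) decompression.
    with tarfile.open(archive_path, mode="r:xz") as archive:
        names = archive.getnames()
        archive.extractall(dest)
    return names
```

For example, `extract_txz("arc-dataset.txz", "arc-dataset")` would unpack the downloaded file into an `arc-dataset` directory and return the list of extracted paths.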

Later, I will put them in a (new?) Git repository with git-lfs.

I am redistributing them with permission from Dr. Dharmendra S. Modha, a coauthor of the ARC papers. I contacted Dr. Modha after seeing your message last night, and he kindly allowed me to publish them with a reference to the ARC papers.

A reference to an ARC paper:

Nimrod Megiddo and Dharmendra S. Modha, "ARC: A Self-Tuning, Low Overhead Replacement Cache," USENIX Conference on File and Storage Technologies (FAST 03), San Francisco, CA, pp. 115-130, March 31-April 2, 2003.

I do not have some of the datasets, such as ConCat.lis. If I remember correctly, their download links were already broken when I tried to download them a few years ago. Anyway, the above arc-dataset.txz should have enough to get started.

Also note that the current version of moka has a large hard-coded buffer (diagram), so you should avoid testing moka with smaller datasets such as OLTP.lis (11 MB). Otherwise you will get a skewed result with a very high cache hit rate, because the entire dataset may fit into the buffer.
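
One rough way to judge whether a trace is large enough is to count how many distinct blocks it touches and compare that with the cache and buffer sizes you plan to test. The sketch below assumes each trace line starts with two whitespace-separated integers, a starting block number and a block count; that layout is an assumption here and should be checked against the files you download. The function name `count_distinct_blocks` is illustrative.

```python
def count_distinct_blocks(trace_path):
    """Return the number of distinct block IDs referenced by a trace file.

    ASSUMPTION: each line looks like "<start_block> <block_count> ...",
    so a line covers the blocks start_block .. start_block + block_count - 1.
    Lines with fewer than two fields are skipped.
    """
    seen = set()
    with open(trace_path, "r", encoding="ascii", errors="replace") as trace:
        for line in trace:
            fields = line.split()
            if len(fields) < 2:
                continue
            start, count = int(fields[0]), int(fields[1])
            # Expand the range so overlapping requests are counted once.
            seen.update(range(start, start + count))
    return len(seen)
```

If the count is small relative to the buffer, most accesses may be served without ever exercising the eviction policy, which is the skew described above. Note that the set holds every distinct block in memory, so this is only practical for traces of moderate size.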

@PureWhiteWu
Author

Thank you very much for this!

@tatsuya6502
Member

FYI, I moved the trace files here: https://github.com/moka-rs/cache-trace/tree/main/arc
