I/O with page cache simulation model #199

dohoangdzung · 2020-09-16T01:22:36Z

Block and MemoryManager classes are added.
A list of MemoryManager objects is added to Simulation.
readFromHost and writeToHost are added to Simulation.

henricasanova · 2020-09-16T21:11:00Z

include/wrench/simulation/Simulation.h

+        void readFromHost(WorkflowFile *file, double n_bytes, std::string mountpoint, std::string hostname);
+        void writeToHost(WorkflowFile *file, std::string mountpoint, std::string hostname);
+        void writeToHost(WorkflowFile *file, double amount, std::string mountpoint, std::string hostname);
+        MemoryManager* getMemoryManagerByHost(std::string hostname);


From what I understand, these read/write methods are not doing anything over the network, correct? They are supposed to be local reads from the local memory manager, which maintains a cache for local files? If this is the case, I think that the name is a bit confusing. How about renaming it to readFromMemoryCache() or something like that, without passing a hostname argument? Then also rename getMemoryManagerByHost() as simply getMemoryManager() (without a hostname argument). A service can always call Service::getHostname() to find out the name of the host it's on. But perhaps I am wrong. It's just that "readFromHost(..., std::string hostname)" really looks like it's for a remote read...

Yeah, I agree that readFromHost/writeToHost is somewhat confusing, but readFromMemoryCache sounds like it simply reads from memory, so how about readWithMemoryCache/writeWithMemoryCache?

The reason I need to pass the hostname to these functions is to know on which host we're reading/writing data (they're functions in Simulation). I then need to call getMemoryManagerByHost(hostname) to get the MemoryManager on that host (from the list of memory managers).
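To make the lookup described above concrete, here is a minimal sketch of finding a memory manager by host name in a list. It is not the PR's actual code: the `MemoryManager` stub and the `ManagerRegistry` class (standing in for the relevant part of Simulation) are hypothetical, and only the method name `getMemoryManagerByHost` comes from the diff.

```cpp
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for the PR's MemoryManager; only the host name matters here.
class MemoryManager {
public:
    explicit MemoryManager(std::string hostname) : hostname_(std::move(hostname)) {}
    const std::string &getHostname() const { return hostname_; }

private:
    std::string hostname_;
};

// Hypothetical stand-in for the part of Simulation that owns the list of managers.
class ManagerRegistry {
public:
    // Register one memory manager for the given host.
    void add(const std::string &hostname) {
        managers_.push_back(std::make_unique<MemoryManager>(hostname));
    }

    // Linear search over the list; returns nullptr when no manager exists for the host.
    MemoryManager *getMemoryManagerByHost(const std::string &hostname) const {
        for (const auto &manager : managers_) {
            if (manager->getHostname() == hostname) {
                return manager.get();
            }
        }
        return nullptr;
    }

private:
    std::vector<std::unique_ptr<MemoryManager>> managers_;
};
```

With one manager per host, a linear scan is likely fine for small platforms; a map keyed by host name would be the obvious alternative if the list grows.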

Another minor thing is that I should change filename in Block to fileId, so it will be consistent with WRENCH.

Thanks for your quick review, by the way! I hope this answers your concerns.

Yes, I think that readWithMemoryCache/writeWithMemoryCache is probably good for now.

About the hostname, I am confused. These methods would be called by a running process/actor/service, so you're reading from that same host. The only time you need to pass a host name is when you need to do something on a host that's not the one you're currently running on. I guess what I am asking is: would a process running on "host1" ever need to call getMemoryManager("host2") or readWithMemoryCache("host2")? If yes, then great. But if not, then there is no need for the host argument. Let me know what you think.

And sure, fileid is probably better.

Yeah, you're right that the methods should ideally be called from the same service running on the same host, so hostname is not necessary.
The reason I did that is that the current code in FileTransferThread calls read/write methods in Simulation, so I did the same thing and wrote the read/write methods in Simulation. I thought that it would have less impact on the current code.
So technically, you can call read/write from another host, but the network time is not simulated in our model, so it doesn't work.
I think a better place for the read/write methods is in a service that can read/write data (I'm not sure if it is StorageService or simply Service).

I understand, but then let's remove the host name argument, since every call should look like: f(...., Service::getHostname()). It will be too tempting to call these methods for a remote host, which as you say is not simulated. For instance, the "compute()" method doesn't take a hostname either, even though we could technically do a compute("other host") call (which is, as in your case, not simulated).

As for where to put these methods, it's a bit tricky. We tend to put methods that just simulate some use of the hardware in the Simulation class. But these are a bit different. For now, Simulation is ok. Another option is to make these methods part of the MemoryManager class?

I guess I am getting confused here, but in the code of these methods, what's the problem with looking up the hostname? I mean, what I am saying is, instead of:

your_method(...., std::string hostname) { ....}

we can have:

your_method(...) { std::string hostname = S4U_getHostname(); ... }

Is this wrong?

Yeah, I think my problem is that I don't know the context of the "calling process" and I'm not sure I can get the correct hostname with Simulation::getHostName(). So I did it that way to make sure it works.

For example, if the call is from FileTransferThread::sendLocalFileToNetwork -> Simulation::readWithCache(), what value will Simulation::getHostName() return when called inside Simulation::readWithCache()?

Yes, that makes sense (by the way, we're arguing about a small detail here, I mean overall everything looks awesome).

In the example you give, the host name you'll get will be the name of the host on which the calling file transfer thread runs, that is, the source of the transfer. By the way, if you run the simulation with --wrench-full-log, you'll see for each process what host it runs on (which is what getHostName() returns).

How about this? You run your code, but add a check in your methods (is hostname == getHostName()?), and see if you run into any problems. I am 99.99% sure that getHostName() returns exactly what you passed as the hostname argument in all situations. I am saying 99.99% because WRENCH has become complicated ;)
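A minimal sketch of such a temporary check could look like the following. Everything here is an assumption for illustration: `currentHostName()` is a hypothetical stub standing in for the getHostName() call mentioned above, the global `g_current_host` only exists so the sketch can run outside a simulation, and the two-argument `readWithMemoryCache` signature is simplified from the one in the diff.

```cpp
#include <stdexcept>
#include <string>

// Hypothetical stub: in a real simulation this value would come from the
// host-name lookup of the calling actor. Here it is a settable global.
static std::string g_current_host = "host1";

std::string currentHostName() {
    return g_current_host;
}

// Temporary sanity check: keep the hostname argument for now, but fail loudly
// if it ever differs from the host the caller is actually running on.
void readWithMemoryCache(double n_bytes, const std::string &hostname) {
    const std::string actual = currentHostName();
    if (hostname != actual) {
        throw std::logic_error("hostname argument '" + hostname +
                               "' differs from current host '" + actual + "'");
    }
    // ... the cached read would be simulated here (omitted in this sketch) ...
    (void) n_bytes;
}
```

If the check never fires across the test suite, the hostname parameter can be dropped and replaced by the lookup inside the method.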

Sounds great! I tried it and it works, although there is only 1 host in my experiment, but you know the framework.
So I think I'll fix it as we discussed and make another PR.

Sounds good. And if, unexpectedly, it doesn't work, then we'll revert and you'll have my apologies :) Can't wait to see this merged in.

henricasanova

I've entered a comment about the naming of the methods. See what you think. But overall it looks good to me. One thing we should do at some point, once this is merged in, is write wrench::S4U_xxx wrappers, so that there are no "raw" s4u calls at all. But there are other places in WRENCH where we do have direct s4u calls without a wrapper anyway, so this is totally fine for now.

Let me know what you think about my comment on renaming the methods.

Dung Do added 19 commits August 4, 2020 17:15

Add Block and MemoryManager class, pdflush runs in background

df3f254

Add argument to activate writeback simulation

aa60e2d

Memory Manager: add cache_read, cache_write, lru list update

853cd1d

Memory Manager: flush, pdlfush, eviction without disk activities

59f8c5f

add pdflush, refine code

a5d8c2e

Refine code

69d312d

Add memory managers to Simulation, add Simulation::readFromHost()

5af8c79

Add Simulation::writeToHost

872e448

Fix file write algo

65ae217

Add MemoryManager to SImulation

7081d09

Fix chunk read/write

629ed54

Add memory log functions

70d06a2

Fix mem log

927c88f

Fix used anonymous mem

949cca0

Fix anonymous memory used

edf25e4

Add dirty time to Block

4b4caf0

Fix periodical flushing

b430369

Fix pdflushing

f363f47

Fix eviction, string compare

037a345

rafaelfsilva requested review from henricasanova and rafaelfsilva September 16, 2020 01:25

rafaelfsilva assigned rafaelfsilva and henricasanova Sep 16, 2020

rafaelfsilva added the feature label Sep 16, 2020

rafaelfsilva added this to the v1.8 milestone Sep 16, 2020

henricasanova reviewed Sep 16, 2020

Dung Do added 3 commits September 17, 2020 21:37

rename functions, remove redundant params

277cd23

Fix use/release free memory

80799d8

Fix anonymous memory used amount

a522727

dohoangdzung closed this Sep 18, 2020
