Collecting statistics #77
Originally this work was done in iptb prior to the transition to using plugins. After the transition, we wanted to provide a generic way to handle the implementation. The solution was to add a generic way to capture stats. However, I think recording and reporting the elapsed execution time is probably useful enough on its own that we should just add it to everything that uses the generic reporting. If we expose the elapsed time as output, I think it provides enough information to calculate different statistics outside of iptb itself. There are two other pieces, though, that I think also need to be touched on.
Real-time metrics (CPU, RAM, etc.) are another thing, and I'm open to discussion around these. To summarize, I think a simple approach to supporting this use case at first is to add an elapsed time to all outputs alongside the exit code, and to add the ability to return output as JSON. Metrics can be collected independently as the user sees fit.
I think that designing features such as #75 #76 #77 moves IPTB in a direction that puts a lot of additional weight on IPTB.

**General Thoughts**

My thoughts on the subject, as a user of the project and as a developer in general: I think there is room for such features because there are a lot of projects (OpenBazaar, etc.) that want to measure the performance of IPFS and add it as a component of their system (#50, #26). I also got involved in the project to measure the performance of IPFS because it was a crucial component in my system. The question that remains is whether the core development team wants to take on this burden or leave it to the users. For me, both options have benefits, and it highly depends on the time the core devs have available. I understand that you may prefer to spend that time developing/improving IPFS/libp2p rather than IPTB. It's a decision the core devs should make, since they have a more holistic view of the IPFS milestones. Personally, I trust you to make a good decision.

**Output**

I agree as far as the elapsed time is concerned; the current implementation of elapsed time is robust. I would prefer having the output as a JSON or txt file after the individual results, as that makes it easier to parse. Something like this:
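For illustration only, a machine-readable report might look like the sketch below. The field names (`results`, `node`, `exit`, `elapsed`) are hypothetical, not an actual iptb schema; the point is just that elapsed time can sit alongside the exit code in each node's result and be recovered in one pass:

```python
import json

# Hypothetical per-run report; "results"/"node"/"exit"/"elapsed"
# are illustrative names, not the real iptb output format.
report = {
    "results": [
        {"node": 0, "exit": 0, "elapsed": 0.42},
        {"node": 1, "exit": 0, "elapsed": 0.57},
    ]
}

line = json.dumps(report)  # one JSON object per run, one line

# A consumer recovers the elapsed times with a single comprehension:
elapsed = [r["elapsed"] for r in json.loads(line)["results"]]
print(elapsed)  # → [0.42, 0.57]
```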
This will make parsing from other programs easier compared to having the individual results per node.

**Metrics**

**Stats**

Providing the basic stats based on elapsed time is basically a free primitive in terms of development and computational cost, from my perspective. The same does not hold for calculating stats on …

cc @dgrisham
@davinci26 thanks for writing all of this out! I want to respond to it all, but won't be able to for 12 hours. I did want to comment quickly, though, about the output.
I want to provide an easy way to parse, but I don't want to mix that with the human-readable text if we can avoid it. One way to solve this would be to support an output encoding (ex: …).

One of the things I did like about the original idea for a "stats" flag was that it provided an easy way to get just the stats out without interfering with the other output of the command. It actually provided a really interesting way to interact with iptb for stat-gathering purposes. I wrote a small Python script which reads from stdin (could be any file, I guess), parses each line, and calculates some basic stats. To connect it up to iptb, I made a named pipe. Every iptb command I ran would print the stats out to the named pipe; on the other end of the pipe was the Python script. So for every command I ran through iptb, it would print the stats in another window.

(Example)

```
$ mkfifo stats
$ iptb run --stats ./stats -- ipfs id
```

In another window:

```
$ tail -f ./stats | python stats.py
```

This provides a really easy way to collect some output and run whatever calculations you want over it. I'm just not sure exactly what we want to be in the output, or if this is exactly the way to do it.

One possibility is to have event logging around the Core interface, which would provide a much more detailed look into what is happening everywhere around the plugin. This would be a much more generic implementation, and I think it would give users almost everything they need, or at least be easy to extend: basically, which method on the plugin was invoked, and what it was called with.

Script:

```python
import sys
import statistics
import json

print("MEAN\tSTDEV\tVARIANCE\n")
for line in sys.stdin:
    try:
        jline = json.loads(line.rstrip())
    except ValueError:
        continue  # skip lines that aren't valid JSON
    nums = [o['elapsed'] for o in jline['results']]
    if len(nums) < 2:
        continue  # stdev/variance need at least two samples
    mean = statistics.mean(nums)
    stdev = statistics.stdev(nums)
    variance = statistics.variance(nums)
    print('{:.2f}\t{:.2f}\t{:.2f}'.format(mean, stdev, variance))
```
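To see what the script does without setting up the named pipe, the loop body can be exercised directly on in-memory lines. This is a sketch assuming the same hypothetical line format the script expects (a JSON object with a `results` list of `elapsed` values); non-JSON lines are skipped the same way:

```python
import json
import statistics

# Two simulated lines of pipe input; the "results"/"elapsed" field
# names are assumptions mirroring what stats.py above reads.
lines = [
    json.dumps({"results": [{"elapsed": 0.40}, {"elapsed": 0.60}]}),
    "not json",  # garbage is silently skipped, as in stats.py
]

for line in lines:
    try:
        jline = json.loads(line.rstrip())
    except ValueError:
        continue
    nums = [o["elapsed"] for o in jline["results"]]
    print("{:.2f}\t{:.2f}".format(statistics.mean(nums),
                                  statistics.stdev(nums)))
    # → 0.50    0.14
```

Piping real iptb output through the same loop is then just a matter of swapping `lines` for `sys.stdin`.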
Some thoughts (will respond with more as things percolate):
See #65 for complete history
Original comment by @davinci26