Finalize SIGMOD 2024 paper ~(if accepted)~ #8373
Comments
Here is what we submitted: |
I can check results and update them, please assign it to me. |
FWIW the notification deadline was yesterday but I have not heard anything one way or the other (and the CMT tool doesn't say one way or the other). I will email the chairs tomorrow if we haven't heard by then |
I emailed the chairs today and they said the notification will be delayed a few days. Will post updates here as I have them. |
Thanks for the update 🚀 |
Thank you @alamb |
The paper was accepted to SIGMOD! 🎉 I'll spend some time reviewing the comments later this week and we can organize action items for the final draft
|
Cool! Congrats to all! |
This is great news! congrats all! |
Congratulations everyone ! |
Great news! Congratulations to all involved! |
Congratulations everyone ! |
Here is the reviewer feedback Reviewer #2
Reviewer #5
Reviewer #7
|
It appears we have about 2 months to complete the final draft
Here is a summary of my suggested action items based on the reviewer feedback above
Here are some other notes I have:
The main criticism / weakness cited is that DataFusion doesn't demonstrate sufficient technical novelty other than integration of various existing ideas. I think this is a very valid point, and maybe we should emphasize more strongly that the point isn't the technical novelty of any single part, but the overall system.
This is a good point that would be good to work in
I agree this would be an interesting point, but given that we are already at the 12 page limit I am not sure how to do so in this particular paper. Maybe these would make good follow on papers or blog posts (@appletreeisyellow and I could potentially write one on how InfluxDB uses PruningPredicates 🤔 ) |
To update the draft I'm assuming we can just reuse the same overleaf project? We'd be happy to touch a bit more on the Comet side, and update the sentence 😂
|
Yea, as Comet now is open sourced, we can explicitly mention the project (with project link) and more details about it. |
Yes, please, let's use the same overleaf project
Yes please that would be great -- and it will also address some of the reviewer feedback suggesting more details on usecases |
I think this analogy is very useful -- let's keep it. In my experience it also resonates with technical folks very well. Since this feedback seems like an outlier in terms of reception, I suggest we improve other aspects of the paper. |
@JayjeetAtGithub is there any chance you can update your email address to one that works (rather than the influxdata one that does not)? Also, it would be great if someone could work on cleaning up the bibliography. |
Yeah, basically that's what we strive for, thanks! |
@alamb I updated my affiliation and email to that of UC Santa Cruz, my university. |
An update here: I plan to take a pass through the draft the week of March 4 and implement the bulk of any feedback that was not yet implemented. After that week I'll likely take a few proofreading passes, but I don't expect to do any major revisions. I also don't plan to rerun benchmarks again due to lack of time. While the benchmark runs themselves are nicely automated thanks to @JayjeetAtGithub, analyzing the results takes significant time and research. |
I also made the labels and series colors consistent in the scalability chart: JayjeetAtGithub/datafusion-duckdb-benchmark#26 |
Update: I plan to complete the final outstanding item from the reviewers (adding some technical details about memory pool and related APIs) tomorrow, and then I will move on to wordsmithing / honing for a few days. I am sure we (at least I) could obsess over the content indefinitely, but I think we need to just "ship it" eventually and we are getting very close |
Update: @viirya cleaned up some of the language, and I tweaked the images to make the internal margins even. I also received a bunch of instructions from the ACM, so I changed the title so the "A" wasn't capitalized
I reworked this section with more details / references: |
I also took a pass through the bibliography to make the style consistent |
TLDR -- please make any updates you would like / need by this weekend (Sunday March 24, 2024). Also, fellow authors, please ensure your names / affiliations are correct on the paper as I will submit that as part of the paper as well Update:
|
Here is an email about ACM formatting guidelines in case anyone following along is interested in how this process works:
Dear Author,
Please remember to submit "Apache Arrow DataFusion: A Fast, Embeddable, Modular Analytic Query Engine," for publication in the proceedings and ACM DL.
SIGMOD'24 Authors. You are receiving this initial email request to format and submit your final version (paper and promo video) per the information below (on or before EoD April 12th)
The following is a direct link to submit based on your unique submission ID: (REMOVED)
Thank you,
Sheridan Communications |
I took another pass through the paper. In addition to some wordsmithing and whitespace engineering, I increased the size of the abstract, both so the front page didn't look as empty and to summarize the content of the paper (in addition to its conclusion / main point) to help readers decide if the paper is interesting to them. Here is the current text
Here is what it looks like |
I made it through Section 6 today, and I plan to start at Section 7 tomorrow for a final read through / polish. Starting Monday I just plan to do whitespace engineering / proofreading |
Ok, I did a final read / wordsmith / whitespace engineering on the last few sections. I will plan to do a proofreading pass or two over the next few days, but don't plan to change anything unless there is some grammatical issue. The current copy of the paper is here: (getting very close) |
I am on the home stretch -- I plan to do a final proofreading of Sections 8, 9, 10, and 11 and then submit the draft tomorrow |
Thank you Andrew! Sorry I couldn't spend more time on the paper. |
No problem! I think we are all quite busy -- I also want to submit the paper so I can let it go (my OCD tendencies would be to edit it indefinitely, which is not good :) ) |
Here is the digital rights form that was submitted: 13731_1_1.pdf
Here is the final draft:
Here are the source files:
I am working to upload this to the CMT tool -- once I get confirmation they got it and it is accepted I'll (finally!) close this issue. Thanks all |
We needed to make a few tweaks on the final manuscript to conform to the publisher's rules
I made the requested edits (layouts of author names, remove $15, and update the short authors) and submitted a new draft. Here is the next draft: DataFusion Query Engine - SIGMOD 2024-SOURCE-mk2.zip (BTW I am sharing this not because I think anyone really cares, but because I thought others might be interested in how this process works) |
And apparently I still didn't get it entirely correct:
|
"third time's the charm" |
Needed to tweak the title to have a
|
I remember we have discussed this. Thanks for dealing with the tweak. |
No problem -- it seems like this process is still quite manual 😆 -- hopefully we have it all right now 🤞 |
I think it is finally accepted 🎉
|
UPDATE: Final paper: https://dl.acm.org/doi/10.1145/3626246.3653368 (alternate download)
Is your feature request related to a problem or challenge?
@JayjeetAtGithub @Dandandan @yjshen @ozankabak @sunchao @viirya and I submitted a paper to the SIGMOD 2024 conference, which was tracked by #6782
If our paper is accepted, this ticket tracks follow on work items to complete prior to the final copy
For the Industrial Track the dates are:
Describe the solution you'd like
Here are the items I know so far:
Nice to haves:
fetch_arrow_table instead of fetchall for DuckDB (JayjeetAtGithub/datafusion-duckdb-benchmark#25)
(datafusion 33 and duckdb 0.9.2) (Write DataFusion paper for (SIGMOD / VLDB / ICDE) #6782 (comment))
Results section with the new results, updating the query textual description if needed
Describe alternatives you've considered
No response
Additional context
No response