What are we actually proposing? #36

Closed
ctb opened this issue Feb 26, 2016 · 24 comments

@ctb
Member

ctb commented Feb 26, 2016

This thing is due in three days, folks :). I can do some of the writing on Saturday while traveling but we need to nail down what, exactly, we are proposing.

Based on our pitch (https://github.com/betatim/openscienceprize/blob/master/pitch.md), I would argue for proposing the following deliverables:

  1. a prototype that demonstrates a vertical spike through some good practice in this area.
  2. a detailed discussion & set of links around each feature of the prototype, explaining what we and others have done, an opinionated perspective on what approaches could be used to address each problem, and a brief on why we chose the approach we used in the prototype.
  3. an exploration of what "big features" are missing from the prototype

For the prototype, I think we've converged on establishing an initial directory structure; a declarative specification of dependencies, execution framework, and inputs/outputs for building a paper; support for CI; and integration with Zenodo for minting DOIs.
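
A minimal sketch of what such a declarative specification could look like, written as a plain Python dictionary with a small validator. Everything here is an assumption for illustration: the key names (`dependencies`, `execution`, `inputs`, `outputs`), the file names, and the choice of `make` are hypothetical and not something agreed in this thread.

```python
# Hypothetical paper spec: every key and value below is illustrative only.
PAPER_SPEC = {
    "dependencies": {"python": ["numpy", "matplotlib"], "r": ["ggplot2"]},
    "execution": "make",  # the execution framework that drives the build
    "inputs": ["data/raw.csv", "analysis.ipynb"],
    "outputs": ["paper.md"],
}

# Top-level sections this sketch assumes every spec must declare.
REQUIRED_KEYS = ("dependencies", "execution", "inputs", "outputs")


def missing_keys(spec):
    """Return the required top-level keys that are absent from the spec."""
    return [key for key in REQUIRED_KEYS if key not in spec]


def validate(spec):
    """Raise ValueError if a required section is missing or no output is declared."""
    missing = missing_keys(spec)
    if missing:
        raise ValueError("spec is missing: " + ", ".join(missing))
    if not spec["outputs"]:
        raise ValueError("spec must declare at least one output")
    return True
```

The point of keeping the spec declarative is that a CI service or a local build can read the same file without knowing anything else about the paper.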

I'd strongly push for supporting the R ecosystem, since many biologists use R and this is a biology prize :). Between R and Python I think we get most modern biomedical scientists.

I think we need a brief discussion of the goal of enabling composition, without focusing on how -- I haven't seen convergence in that discussion yet. Please correct me if I'm wrong!


For the discussion, we just need to make sure that we discuss and document our decisions and link in projects and demos. But I don't think we need to do much about this for this round of the proposal, just point out that there are a lot of people who have done things related to our project and that we will engage with their ideas and demos and connect them into our project. This could even make a nice publication... ;)


For the third part on missing features, we should pay attention to topics like editing, diffing, and merging that are important for specific ecosystem members. The point to make in the proposal here is that we will inevitably run across great ideas that will need substantial work, so while we may not integrate them into our demo, we will record them and brainstorm about them.


The last question is how we propose to do this, or, basically, what we'll do with the money. I don't think we need to do more than sketch this out, but at least from my perspective all I'd want to do is run hackathons and support travel.

@betatim
Member

betatim commented Feb 26, 2016

Have you had a look at the proposal.md that results from #33?

@ctb
Member Author

ctb commented Feb 26, 2016

I hadn't seen some of the latest content, no! I would like to suggest shrinking the proposal quite a bit using the outline above, though; we should aim for an essential pitch of no more than two pages, decorated with ancillary information. Sound ok?

@betatim
Member

betatim commented Feb 26, 2016

> I'd strongly push for supporting the R ecosystem, since many biologists use R and this is a biology prize :). Between R and Python I think we get most modern biomedical scientists.

For me this converged on a "yes" when we agreed that the spec file would specify how to create a markdown file from $whatever. This means we support Rmarkdown, so you can use RStudio to create the executable paper.

> The last question is how we propose to do this, or, basically, what we'll do with the money. I don't think we need to do more than sketch this out, but at least from my perspective all I'd want to do is run hackathons and support travel.

See #26 for my thoughts so far. I think we should run our own hackathons, join sprints/hackathons of projects we use, and travel, and I need to get paid.
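
The idea that the spec file says how to create a markdown file from any source could be sketched roughly as below. This is only an illustration under stated assumptions: the `RENDERERS` table and the `render_command` helper are hypothetical names, and the two commands shown (`rmarkdown::render` with the `md_document` format, and `jupyter nbconvert --to markdown`) are just one plausible choice per ecosystem. The function only builds the command; it does not run it.

```python
import os

# Hypothetical table: source extension -> command template that emits markdown.
RENDERERS = {
    ".rmd": ["Rscript", "-e",
             "rmarkdown::render('{src}', output_format = 'md_document')"],
    ".ipynb": ["jupyter", "nbconvert", "--to", "markdown", "{src}"],
}


def render_command(src):
    """Return the argument list that would turn `src` into markdown.

    Returns None when the source is already markdown, and raises
    ValueError for extensions this sketch does not know about.
    """
    ext = os.path.splitext(src)[1].lower()
    if ext == ".md":
        return None  # nothing to do, the paper is already markdown
    if ext not in RENDERERS:
        raise ValueError("no renderer known for " + repr(ext))
    # Fill the source path into each part of the command template.
    return [part.format(src=src) for part in RENDERERS[ext]]
```

A caller could pass the returned list to `subprocess.run`. Paths containing quote characters would need escaping in the R case, which this sketch ignores.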

@betatim
Member

betatim commented Feb 26, 2016

Yes please to shrinking, but I need help with that 😄

I actually think the proposal currently doesn't propose much more than the above outline; it is just very verbose.

@khinsen
Collaborator

khinsen commented Feb 26, 2016

I pretty much agree with @ctb's summary of the situation.

Some further random thoughts:

  • We need to show that we are aware of the complexity of the problems, and clearly state that our prototype will not solve all of them.
  • If length is a problem (I didn't check the rules), we can cut down on the motivation and background discussion by referring to existing discussions. I can try to find some, though unfortunately I won't have much time for this over the weekend.

@betatim
Member

betatim commented Feb 26, 2016

The absolute maximum is 15000 characters. Though I think less is more in this case.

@cranmer
Contributor

cranmer commented Feb 26, 2016

I think there is some decent but verbose prose that can be characterized as “preaching to the choir”. It’s an open science prize, so we don’t need to convince the judges of the importance of reproducibility etc.

Proposal:

I suggest that we start the intro with the newest technologies that are really shaping the current discussion (binder, everware) and how they have grown out of recent fundamental contributions (DOI, GitHub, docker, Jupyter). Then go directly to the proposal as a continuation of these ideas to achieve reusable workflows (or whatever our current title is). Specifically:

  • vertical spike to seed integration of tools for this purpose
  • composition as the next major step in the Open & Reproducible Science revolution

@m3gan0

m3gan0 commented Feb 26, 2016

If you're going to drag DOIs into this then you might as well mention ORCID iDs. Where DOIs track digital objects, ORCID iDs let you track WHO is doing what. Can you imagine what it would be like if our online publishing and research tools hooked up to ORCID?

@cranmer
Contributor

cranmer commented Feb 26, 2016

+1

Also, in addition to the Python/R symmetry (particularly with bio in mind), I think it wouldn’t be too hard to use OSF’s API to push results to OSF and have them mint the DOI (keeping parity with Zenodo).

However, those two choices (Python or R, Zenodo or OSF) are 2x2 forks in the vertical spike.

Kyle

@ctb
Member Author

ctb commented Feb 26, 2016

I like it. I can do first revisions on this tomorrow during travel, or work on whatever someone else does first!

@betatim
Member

betatim commented Feb 26, 2016

Conclusion: is #35 not worth merging, or do you want to use it as a starting point?

@betatim
Member

betatim commented Feb 26, 2016

The output of the vertical spike should be a rendered paper. Pushing results to other places sounds like a good idea that can happen as a side effect of executing the paper. We don't want to be yet another workflow management tool; there are already enough of those, and people want to use what they are used to.

@ctb
Member Author

ctb commented Feb 26, 2016

Looks good to me! Merge!

@anaderi
Contributor

anaderi commented Feb 27, 2016

I'm not sure this suggestion fits the project roadmap, but maybe mentioning crowdsourced science (or crowdscience, however horrible it may look) as a very long-term goal would help the jury understand the project idea better.

@betatim
Member

betatim commented Feb 27, 2016

You mean R^3 would help bring citizen science to the next level by allowing citizen scientists to participate in real research?

@anaderi
Contributor

anaderi commented Feb 27, 2016

Eventually, yes (it's just for the pitch, not for the proposal itself).

@betatim
Member

betatim commented Feb 27, 2016

Ok. It is a good point.

To make progress on a concrete proposal, I think the following points are good to keep in mind:

  • it is impossible to figure out all the details upfront
  • to get funded we need to show that the core proponents know what they are doing (funding the team not the idea) and have a "vision"
    • won't bring disrepute to the funders ...
  • no one will be upset if we end up building/doing something that is cool but different from (yet related to) what was originally proposed

On the one hand the proposal should be concrete but on the other hand it should be "flexible". As long as you can see that what is written doesn't exclude what you think should be done, that should be good enough. For me this means the first three and the last point in the pitch are key:

  • exciting new tech is enabling progress,
  • local and remote execution/development, making adoption/transition as easy as possible,
  • bringing the R community into the fold,
  • and building a community.

@anaderi
Contributor

anaderi commented Feb 27, 2016

One more thing that is relevant and might get us a better score is developing high-level guidelines or a methodology for conducting rrr.

@ctb
Member Author

ctb commented Feb 27, 2016

@anaderi there's a lot going on in that sphere and I am afraid of expanding the scope too much. The judges in this case are all well aware of the broader issues and I think saying that we will produce something concrete, with associated discussion, is going to be better received than another set of guidelines.

@khinsen
Collaborator

khinsen commented Feb 27, 2016

@ctb +1

@anaderi
Contributor

anaderi commented Feb 27, 2016

Sorry, just to clarify: I was not proposing to make these guidelines instead of doing something concrete. If we are aware of good examples of such guidelines that we could somehow exemplify in our prototype, maybe we could reference them in our proposal beforehand, just to show that we are aware of existing developments in this field?

@ctb
Member Author

ctb commented Feb 27, 2016

See #41.

@anaderi
Contributor

anaderi commented Feb 28, 2016

Ah, cool! Thanks!

@ctb
Member Author

ctb commented Feb 28, 2016

Sorry @anaderi, that wasn't in response to you, really ;). I don't have a strong idea of what to do re guidelines; if you'd start a new issue we can brainstorm on guidelines there.
