Create Google Summer of Code project #185

Closed
KirstieJane opened this issue Jan 18, 2019 · 13 comments

Comments

@KirstieJane
Member

It would be great to have someone working on tedana 🎉 as part of their Google Summer of Code.

This is an issue for building up a proposal and making sure it gets submitted ✨

@KirstieJane KirstieJane added the community issues related to building a healthy community label Jan 18, 2019
@KirstieJane KirstieJane added this to the healthy community milestone Jan 18, 2019
@tsalo
Member

tsalo commented Jan 18, 2019

Here are the tasks we would like this person to accomplish:

  1. Write unit tests for existing functions, replacing any smoke tests with more useful ones checking output values
  2. Increase coverage to 90% for 1.0.0 release (Reach 90% test coverage #69)
  3. Reincorporate regression against data into integration tests
  4. Implement checksums for CircleCI integration tests (Implement checksums for CircleCI integration tests #59)
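To illustrate the difference item 1 is getting at, here is a minimal sketch of a smoke test next to a test that checks output values. The function `optimal_weights` is a hypothetical stand-in written for this example, not part of tedana's API, and the echo times and T2* value are made-up inputs.

```python
import math


def optimal_weights(echo_times, t2star):
    """Hypothetical helper (not tedana's API): normalized T2*-based echo weights."""
    raw = [te * math.exp(-te / t2star) for te in echo_times]
    total = sum(raw)
    return [r / total for r in raw]


def test_optimal_weights_smoke():
    # Smoke test: only confirms the call runs and returns something.
    assert optimal_weights([15.0, 30.0, 45.0], 30.0) is not None


def test_optimal_weights_values():
    # Value-checking test: asserts properties of the actual output.
    weights = optimal_weights([15.0, 30.0, 45.0], 30.0)
    assert len(weights) == 3
    assert math.isclose(sum(weights), 1.0)
    # te * exp(-te / t2star) is largest where te equals t2star.
    assert max(weights) == weights[1]
```

Both functions follow the `test_*` naming that pytest collects by default, so the second could replace the first without changing how the suite is run.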

@handwerkerd
Member

Building on tsalo's list:
5. Smoother integration of multiple data sets and processing options for continuous integration & unit tests.
6. Turning a currently non-existent specifications list on ideal outputs for reproducibility, validity, and data quality comparisons into code that is part of https://github.com/ME-ICA/tedana-reliability-analysis
7. Improvements for static & interactive visualizations (Ideally building off existing useful visualizations rather than starting from scratch)
8. Modularization of code with a focus on creating user input options that are scalable without becoming overwhelming
9. Modularization in general so that different sections of processing aren't unnecessarily dependent on each other
10. Parallelization or increasing efficiency of slow parts of the code (I'd say this is a lower priority now, but if this is what interests someone, efficiency & modularization can complement each other).

@emdupre
Member

emdupre commented Jan 18, 2019

This is a great list! I think we'll need to pick the top 3 or 4 items from this "wish-list," just to keep the proposal focused. But it's great to brainstorm all of these out!

I wanted to share the application we put together for the BIDS Starter Kit last year, when we were going up for GSoC: https://docs.google.com/document/d/1xkZhCWv-y3QAigzq-vDYJzzMUkQygxBXy8f1s2UoBao/edit?usp=sharing

It should serve as a good template for this application.

@jbteves
Collaborator

jbteves commented Jan 18, 2019

I'll briefly comment that I think @dowdlelt and I are planning on hammering away really hard at the reliability analysis ahead of summer, so our (perhaps naive and optimistic) hope would be that the reliability aspect of the problem is solved by the time the student arrives. I'm also going to add my opinion that such a task is probably not a good one for a summer student because it's not a very well-defined problem, and in my (limited) experience with summer students, problem definition helps a lot.

@tsalo
Member

tsalo commented Jan 25, 2019

Here is a draft of our application. We decided to keep it fairly limited, although I think that testing, modularization, and implementing checksums for the integration tests will be a good use of a student's time. Please let us know what you think. I believe that it's due tomorrow (sorry for being so late). Is that right, @KirstieJane?

Title: Improving unit testing and test coverage for the TE-Dependence ANAlysis (tedana) toolbox

Mentor(s): Kirstie Whitaker and Taylor Salo

Context and motivation: Traditional fMRI denoising makes a priori assumptions about the shape of noise fluctuations across time. Multi-echo fMRI (ME-fMRI) enables data-driven denoising by collecting multiple echoes in a single fMRI volume, offering a significant improvement over standard approaches. Supporting this, previous ME-fMRI denoising methods such as ME-ICA (multi-echo independent component analysis) have been shown to improve data quality. However, existing implementations lack provenance for data inclusion criteria and are difficult to extend or improve.

The tedana Python package [1][2] is designed to serve as both a canonical multi-echo denoising pipeline with robust default settings and a toolbox into which researchers can integrate new methods for denoising. In creating tedana as a Python package, we have remained committed to best-practice principles in open-source development, including extensive documentation for both users and contributors as well as an open governance structure [3].

The proposed project aims to further develop tedana’s testing suite in order to improve test coverage and to make the codebase more robust to improvements and additions. Given the development team’s current emphasis on integrated workflows, appropriate unit tests have not been written for a large portion of the core functionality of the package. The GSoC project goal is to improve test coverage of tedana by modularizing existing workflows and writing unit tests for existing code.

The GSoC student will develop their skills working with Python, employing modern software testing suites to improve reproducibility and robustness of code and will have explicit mentorship in ways of open and collaborative working using git and GitHub.

Tool description: The tedana test suite is implemented as a collection of pytest-compatible testing functions. Tests will be evaluated on the continuous integration platforms CircleCI [4] and Travis [5], and will employ coverage profilers like CodeCov [6].

Improved modularization of existing workflows will be done in conjunction with members of the tedana developer community, leveraging the community’s distributed expertise both in Python programming knowledge and familiarity with tedana.

Project description and aims: This project is aimed towards students seeking to develop their coding skills and to gain familiarity with collaborative development. The successful candidate will gain 1) real world experience engaging with a wide range of researchers and developers and 2) experience with test-driven development.

Measurable outcomes include increased test coverage of the tedana package and implemented checksums [7] in regression testing.

Skills needed/desired: Interested students should be comfortable with Python and GitHub, with a desire to learn the continuous integration platforms CircleCI [4] and Travis [5], and coverage profilers like CodeCov [6]. A basic familiarity with neuroimaging data formats and preprocessing is also desirable. A commitment to open and collaborative working is essential. All contributors to tedana are expected to comply with the tedana code of conduct at all times [8].

Key words: Python; usability; brain imaging; reproducible research

Relevant external links:

  1. https://github.com/ME-ICA/tedana
  2. https://tedana.readthedocs.io
  3. https://tedana.readthedocs.io/en/latest/contributing.html
  4. https://circleci.com
  5. https://travis-ci.org
  6. https://codecov.io
  7. https://en.wikipedia.org/wiki/Checksum
  8. https://github.com/ME-ICA/tedana/blob/master/CODE_OF_CONDUCT.md
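As a rough illustration of the checksum idea in the draft's measurable outcomes, here is a minimal sketch of comparing output files against stored SHA-256 digests using only the standard library. The helper names `file_sha256` and `find_mismatches`, and the idea of keeping expected digests in a dictionary, are assumptions made for this example and not how tedana's integration tests are actually set up.

```python
import hashlib
from pathlib import Path


def file_sha256(path, chunk_size=1 << 20):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(output_dir, expected):
    """Return sorted file names that are missing or whose digest differs.

    `expected` maps file names (relative to output_dir) to hex digests.
    """
    mismatches = []
    for name, want in sorted(expected.items()):
        path = Path(output_dir) / name
        if not path.is_file() or file_sha256(path) != want:
            mismatches.append(name)
    return mismatches
```

An integration test could then assert that `find_mismatches(out_dir, expected)` returns an empty list. One caveat worth keeping in mind: byte-exact digests may be sensitive to small floating-point differences across platforms or library versions, so this kind of check probably works best alongside value-based comparisons rather than instead of them.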

@jbteves
Collaborator

jbteves commented Jan 25, 2019

This looks great! Did we decide not to do data visualization as part of the project?
Additionally, from the GSOC webpage it looks like the formal due date is February 6th.

@tsalo
Member

tsalo commented Jan 25, 2019

I think that data visualization will require additional skills (e.g., Javascript) that will make it harder to find someone qualified. @emdupre is already developing the relevant skills, so I think she's going to tackle that part. Thanks for tracking down that deadline. I honestly hadn't looked yet.

@KirstieJane
Member Author

Thank you @tsalo and @emdupre for writing this up! I'm really excited about it.

+1 on @tsalo's answer about visualisation taking too many specific skills - and in particular they're skills that I don't have so I would struggle to mentor the student!!

We're 1 week behind an internal deadline for INCF, so I asked for an extension to today (Fri 25). The sponsor organisations have to meet that Feb 6 deadline 😸

@jbteves
Collaborator

jbteves commented Jan 25, 2019

Ah, sorry, didn't realize there were a few layers in organization-- sorry, but what's INCF?

@KirstieJane
Member Author

@jbteves - it's the International Neuroinformatics Coordinating Facility (https://www.incf.org).

Here's last year's INCF GSOC page if you want to see what sort of projects they put forward last year: https://summerofcode.withgoogle.com/archive/2018/organizations/6206122851565568

The sponsorship model for GSOC is a bit of a strange one, I can't find a super clear blog post explaining why you have to have an organisation mentor the students but that's also because I'm rushing - I feel like I've read one!

@KirstieJane
Member Author

Sent it off to Malin today ✨✨✨

It’s the weekend in Sweden now, so hopefully it will be up on Monday or Tuesday and we can start promoting. If you know anyone though, feel free to send them to this issue! 🤩🚀💃🏼

@emdupre
Member

emdupre commented Jan 25, 2019

Amazing! Should we leave this issue open or go ahead and close it now that the proposal is submitted?

@KirstieJane
Member Author

Ah, yes, I did think that after I commented. I’m happy to close the issue and so I think if @tsalo is happy then we should do just that 🤖
