Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

adopt flake8 maccabe complexity metric #4380

Merged
merged 1 commit into from
Oct 18, 2021

Conversation

bjlittle
Copy link
Member

@bjlittle bjlittle commented Oct 18, 2021

🚀 Pull Request

Description

This PR introduces the flake8 mccabe algorithm to provide a metric on iris code complexity.

This empirical metric can then form part of the discussion between a PR author and reviewer/s with regards to the complexity of code being proposed.

According to McCabe, code with a score greater than 10 is deemed too complex. Currently iris has a mccabe score of 50.

Adopting the maccabe metric will allow us to target and reduce complexity within iris, but also be aware of any potential rise in complexity.

As per flake8, the complexity warnings/errors issued by mccabe can be explicitly ignored, if so deemed suitable and appropriate.


Consult Iris pull request check list

Copy link
Contributor

@trexfeathers trexfeathers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What a neat idea! It has my vote.

@bjlittle do you think this could do with a second review from someone else? Feels like it could play a significant role in our development going forward.

@bjlittle
Copy link
Member Author

Bumping the complexity down to 1 makes interesting reading...

@rcomer
Copy link
Member

rcomer commented Oct 18, 2021

I attempted to read the wikipedia article. It says there that each module shouldn’t have a score more than 10. Makes sense to me that this would be measured on a per-module basis, so is the Iris score of 50 just a maximum across modules or something?

@jamesp
Copy link
Member

jamesp commented Oct 18, 2021

Interesting! Is it possible to get some examples from the iris codebase along with their complexity scores? Reading the wikipedia link @rcomer provided:

Cyclomatic complexity may also be applied to individual functions, modules, methods or classes within a program.

Do we know what level this tool applies? From the output shown on the project page it certainly does per-function analysis

@trexfeathers
Copy link
Contributor

trexfeathers commented Oct 18, 2021

so is the Iris score of 50 just a maximum across modules or something?

Do we know what level this tool applies? From the output shown on the project page it certainly does per-function analysis

If you try flake8 with this enabled, it calls out any individual function/method that's over the threshold.

@bjlittle
Copy link
Member Author

The way in through flake8 only seems to offer a global maximum, but we could certainly refine as a next step to make things more granular (and that might mean using maccabe directly and not through flake8, but we can tease that one out).

At a minimum, a global maximum is the least that you want, if you buy into what it's offering.

@trexfeathers
Copy link
Contributor

Yeah a really cool version (not expecting this right now!) would be some kind of commit-to-commit regression (like coverage or ASV), but also specific to each object. Right now we have the latter, but not the former.

@bjlittle
Copy link
Member Author

bjlittle commented Oct 18, 2021

Note that, most projects that I've seen out there simply use the maximum-complexity entry point offered by flake8... otherwise you open yourself up to a whole other level of complexity management and maintenance... dare I say it, but making the use of maccabe itself more complex.

This PR is a minimum bar, not a wholistic granular approach. For me, that's a separate conversation and a separate follow-on pull-request (if we want to take that route).

I just want people to be aware that this kind of metric is easily available, and overall I think it's a good thing 👍

@jamesp
Copy link
Member

jamesp commented Oct 18, 2021

Seems good. Perhaps we enable it, and setup a reminder to review at a peloton in a month or so to discuss what we should do with this new info?

@bjlittle
Copy link
Member Author

bjlittle commented Oct 18, 2021

BTW here the woeful tale of iris complexity (unsorted - but above a score of 10 to reduce the noise)...
./lib/iris/_concatenate.py:796:5: C901 '_ProtoCube.register' is too complex (22)
./lib/iris/_constraints.py:264:5: C901 '_CoordConstraint.extract' is too complex (17)
./lib/iris/_merge.py:1397:5: C901 'ProtoCube._define_space' is too complex (16)
./lib/iris/aux_factory.py:971:5: C901 'OceanSigmaZFactory._check_dependencies' is too complex (16)
./lib/iris/aux_factory.py:1408:5: C901 'OceanSg1Factory._check_dependencies' is too complex (13)
./lib/iris/aux_factory.py:1609:5: C901 'OceanSFactory._check_dependencies' is too complex (11)
./lib/iris/aux_factory.py:1803:5: C901 'OceanSg2Factory._check_dependencies' is too complex (13)
./lib/iris/coords.py:571:5: C901 '_DimensionalMetadata.xml_element' is too complex (12)
./lib/iris/coords.py:1139:5: C901 'Cell.__common_cmp__' is too complex (19)
./lib/iris/coords.py:1936:5: C901 'Coord.collapsed' is too complex (11)
./lib/iris/cube.py:814:5: C901 'Cube.__init__' is too complex (14)
./lib/iris/cube.py:1614:5: C901 'Cube.coords' is too complex (12)
./lib/iris/cube.py:2420:5: C901 'Cube.__getitem__' is too complex (15)
./lib/iris/cube.py:2677:5: C901 'Cube._intersect' is too complex (15)
./lib/iris/cube.py:3053:5: C901 'Cube.slices' is too complex (12)
./lib/iris/cube.py:3213:5: C901 'Cube._xml_element' is too complex (29)
./lib/iris/cube.py:3551:5: C901 'Cube.collapsed' is too complex (18)
./lib/iris/cube.py:3790:5: C901 'Cube.aggregated_by' is too complex (14)
./lib/iris/cube.py:3999:5: C901 'Cube.rolling_window' is too complex (12)
./lib/iris/iterate.py:20:1: C901 'izip' is too complex (15)
./lib/iris/iterate.py:178:5: C901 '_ZipSlicesIterator.__init__' is too complex (11)
./lib/iris/palette.py:57:1: C901 '_default_cmap_norm' is too complex (11)
./lib/iris/palette.py:216:1: C901 '_load_palette' is too complex (15)
./lib/iris/plot.py:47:1: C901 '_get_plot_defn_custom_coords_picked' is too complex (18)
./lib/iris/plot.py:135:1: C901 '_get_plot_defn' is too complex (17)
./lib/iris/plot.py:291:1: C901 '_check_bounds_contiguity_and_mask' is too complex (12)
./lib/iris/plot.py:388:1: C901 '_draw_2d_from_bounds' is too complex (15)
./lib/iris/plot.py:485:1: C901 '_draw_2d_from_points' is too complex (16)
./lib/iris/plot.py:928:1: C901 '_map_common' is too complex (12)
./lib/iris/util.py:439:1: C901 'reverse' is too complex (11)
./lib/iris/util.py:1155:1: C901 'as_compatible_shape' is too complex (21)
./lib/iris/util.py:1434:1: C901 '_is_circular' is too complex (12)
./lib/iris/analysis/__init__.py:189:1: C901 '_dimensional_metadata_comparison' is too complex (24)
./lib/iris/analysis/__init__.py:1257:1: C901 '_weighted_percentile' is too complex (12)
./lib/iris/analysis/__init__.py:1422:1: C901 '_peak' is too complex (19)
./lib/iris/analysis/__init__.py:2170:5: C901 '_Groupby._compute_shared_coords' is too complex (18)
./lib/iris/analysis/_grid_angles.py:138:1: C901 'gridcell_angles' is too complex (21)
./lib/iris/analysis/_interpolation.py:598:5: C901 'RectilinearInterpolator.__call__' is too complex (17)
./lib/iris/analysis/_regrid.py:48:1: C901 '_regrid_weighted_curvilinear_to_rectilinear__prepare' is too complex (28)
./lib/iris/analysis/_regrid.py:612:5: C901 'RectilinearRegridder._regrid' is too complex (20)
./lib/iris/analysis/_regrid.py:826:5: C901 'RectilinearRegridder._create_cube' is too complex (13)
./lib/iris/analysis/_scipy_interpolate.py:97:5: C901 '_RegularGridInterpolator.__init__' is too complex (12)
./lib/iris/analysis/calculus.py:459:1: C901 'curl' is too complex (17)
./lib/iris/analysis/cartography.py:350:1: C901 'area_weights' is too complex (15)
./lib/iris/analysis/cartography.py:551:1: C901 'project' is too complex (24)
./lib/iris/analysis/cartography.py:1024:1: C901 'rotate_winds' is too complex (18)
./lib/iris/analysis/geometry.py:22:1: C901 '_extract_relevant_cube_slice' is too complex (17)
./lib/iris/analysis/stats.py:18:1: C901 'pearsonr' is too complex (13)
./lib/iris/analysis/trajectory.py:193:1: C901 'interpolate' is too complex (30)
./lib/iris/analysis/trajectory.py:523:1: C901 '_nearest_neighbour_indices_ndcoords' is too complex (19)
./lib/iris/common/metadata.py:1355:1: C901 'metadata_filter' is too complex (15)
./lib/iris/common/metadata.py:1499:1: C901 '_factory_cache' is too complex (17)
./lib/iris/common/resolve.py:358:5: C901 'Resolve._as_compatible_cubes' is too complex (11)
./lib/iris/common/resolve.py:819:5: C901 'Resolve._free_mapping' is too complex (18)
./lib/iris/common/resolve.py:1920:5: C901 'Resolve._prepare_points_and_bounds' is too complex (20)
./lib/iris/common/resolve.py:2208:5: C901 'Resolve.cube' is too complex (13)
./lib/iris/experimental/raster.py:95:1: C901 'export_geotiff' is too complex (14)
./lib/iris/experimental/regrid.py:160:1: C901 '_cropped_bounds' is too complex (18)
./lib/iris/experimental/regrid.py:406:1: C901 '_regrid_area_weighted_array' is too complex (21)
./lib/iris/experimental/regrid.py:646:1: C901 '_regrid_area_weighted_rectilinear_src_and_grid__prepare' is too complex (22)
./lib/iris/experimental/regrid.py:1225:5: C901 '_ProjectedUnstructuredRegridder._create_cube' is too complex (13)
./lib/iris/experimental/regrid_conservative.py:130:1: C901 'regrid_conservative_via_esmpy' is too complex (11)
./lib/iris/experimental/stratify.py:57:1: C901 'relevel' is too complex (13)
./lib/iris/experimental/ugrid/load.py:140:1: C901 'load_meshes' is too complex (12)
./lib/iris/experimental/ugrid/load.py:328:1: C901 '_build_mesh' is too complex (18)
./lib/iris/experimental/ugrid/mesh.py:392:5: C901 'Connectivity._validate_indices' is too complex (14)
./lib/iris/experimental/ugrid/mesh.py:631:5: C901 'Mesh.__init__' is too complex (15)
./lib/iris/experimental/ugrid/mesh.py:730:5: C901 'Mesh.from_coords' is too complex (12)
./lib/iris/experimental/ugrid/mesh.py:963:5: C901 'Mesh.__repr__' is too complex (13)
./lib/iris/fileformats/_ff.py:674:5: C901 'FF2PP._extract_field' is too complex (14)
./lib/iris/fileformats/netcdf.py:206:1: C901 'parse_cell_methods' is too complex (16)
./lib/iris/fileformats/netcdf.py:437:1: C901 '_assert_case_specific_facts' is too complex (11)
./lib/iris/fileformats/netcdf.py:660:1: C901 '_load_aux_factory' is too complex (20)
./lib/iris/fileformats/netcdf.py:826:1: C901 'load_cubes' is too complex (15)
./lib/iris/fileformats/netcdf.py:1380:5: C901 'Saver._add_mesh' is too complex (13)
./lib/iris/fileformats/netcdf.py:1734:5: C901 'Saver._get_dim_names' is too complex (26)
./lib/iris/fileformats/netcdf.py:2247:5: C901 'Saver._create_generic_cf_array_var' is too complex (16)
./lib/iris/fileformats/netcdf.py:2459:5: C901 'Saver._create_cf_grid_mapping' is too complex (27)
./lib/iris/fileformats/netcdf.py:2663:5: C901 'Saver._create_cf_data_variable' is too complex (34)
./lib/iris/fileformats/netcdf.py:2906:1: C901 'save' is too complex (25)
./lib/iris/fileformats/cf.py:1138:5: C901 'CFReader._build_cf_groups' is too complex (20)
./lib/iris/fileformats/dot.py:79:1: C901 'save_png' is too complex (12)
./lib/iris/fileformats/name_loaders.py:272:1: C901 '_cf_height_from_name' is too complex (11)
./lib/iris/fileformats/name_loaders.py:395:1: C901 '_generate_cubes' is too complex (22)
./lib/iris/fileformats/name_loaders.py:965:1: C901 'load_NAMEIII_version2' is too complex (35)
./lib/iris/fileformats/name_loaders.py:1189:1: C901 'load_NAMEIII_trajectory' is too complex (32)
./lib/iris/fileformats/pp_save_rules.py:100:1: C901 '_general_time_rules' is too complex (17)
./lib/iris/fileformats/pp_save_rules.py:408:1: C901 '_grid_and_pole_rules' is too complex (13)
./lib/iris/fileformats/pp_save_rules.py:654:1: C901 '_vertical_rules' is too complex (17)
./lib/iris/fileformats/nimrod_load_rules.py:96:1: C901 'units' is too complex (16)
./lib/iris/fileformats/nimrod_load_rules.py:613:1: C901 'attributes' is too complex (14)
./lib/iris/fileformats/nimrod_load_rules.py:716:1: C901 'probability_coord' is too complex (20)
./lib/iris/fileformats/pp.py:645:1: C901 '_data_bytes_to_shaped_array' is too complex (16)
./lib/iris/fileformats/pp.py:1125:5: C901 'PPField.save' is too complex (28)
./lib/iris/fileformats/pp.py:1967:1: C901 '_convert_constraints' is too complex (13)
./lib/iris/fileformats/pp_load_rules.py:48:1: C901 '_convert_vertical_coords' is too complex (14)
./lib/iris/fileformats/pp_load_rules.py:517:1: C901 '_epoch_date_hours' is too complex (12)
./lib/iris/fileformats/pp_load_rules.py:959:1: C901 '_all_other_rules' is too complex (50)
./lib/iris/fileformats/um/_fast_load.py:158:1: C901 '_convert_collation' is too complex (12)
./lib/iris/fileformats/_nc_load_rules/actions.py:251:1: C901 'action_build_dimension_coordinate' is too complex (16)
./lib/iris/io/__init__.py:337:1: C901 'save' is too complex (11)
./lib/iris/tests/__init__.py:864:5: C901 'IrisTest_nometa.check_graphic' is too complex (17)
./lib/iris/tests/__init__.py:1063:5: C901 'IrisTest_nometa.assertDictEqual' is too complex (12)
./lib/iris/tests/idiff.py:174:1: C901 'step_over_diffs' is too complex (19)
./lib/iris/tests/test_concatenate.py:23:1: C901 '_make_cube' is too complex (14)
./lib/iris/tests/test_netcdf.py:1007:5: C901 'TestNetCDFSave.test_attributes' is too complex (11)
./lib/iris/tests/test_imports.py:19:5: C901 'TestImports.itree' is too complex (14)
./lib/iris/tests/integration/fast_load/test_fast_load.py:77:5: C901 'Mixin_FieldTest.fields' is too complex (20)
./lib/iris/tests/runner/_runner.py:104:5: C901 'TestRunner.run' is too complex (11)
./lib/iris/tests/stock/_stock_2d_latlons.py:117:1: C901 'sample_2d_latlons' is too complex (11)
./lib/iris/tests/unit/fileformats/netcdf/test_Saver__ugrid.py:30:1: C901 'build_mesh' is too complex (12)
./lib/iris/tests/unit/fileformats/nc_load_rules/actions/test__grid_mappings.py:24:5: C901 'Mixin__grid_mapping._make_testcase_cdl' is too complex (21)
./lib/iris/tests/unit/fileformats/nc_load_rules/actions/test__grid_mappings.py:235:5: C901 'Mixin__grid_mapping.check_result' is too complex (17)
./lib/iris/tests/unit/fileformats/nc_load_rules/actions/test__time_coords.py:45:5: C901 'Mixin__timecoords__common._make_testcase_cdl' is too complex (19)
./lib/iris/tests/unit/fileformats/nc_load_rules/actions/test__time_coords.py:169:5: C901 'Mixin__timecoords__common.check_result' is too complex (11)
./lib/iris/_representation/cube_printout.py:168:5: C901 'CubePrinter._ingest_summary' is too complex (20)

Overall... it's not too bad. Interesting reading though 🤔

@trexfeathers
Copy link
Contributor

Going over the threshold results in a CI failure. But presumably we agree there might be one or two cases where we are OK with the complexity. In such cases, do we:

  • Merge with a CI fail?
    OR
  • Add the # noqa tag to the offending object?

@rcomer
Copy link
Member

rcomer commented Oct 18, 2021

From @bjlittle’s posted output, there is only one offender with a score over 30. So if something else goes over 50, that would indicate a genuine problem I think.

@bjlittle
Copy link
Member Author

bjlittle commented Oct 18, 2021

@trexfeathers Great question...

It's early days, and this is all new to us. I'd happily opt for just rolling with it on a PR by PR basis, making in-situ judgement calls, and watching out for patterns on how this affects our workflow.

If we take this all under the remit of the Peloton (as @jamesp suggests) and review on occasion how were fairing, that seems like a healthy and reasonable approach to take.

What say ye?

@trexfeathers
Copy link
Contributor

From @bjlittle’s posted output, there is only one offender with a score over 30. So if something else goes over 50, that would indicate a genuine problem I think.

@rcomer sure, for now. But I hope over time we are able to drop the threshold and then it will become more pertinent to think about "accepting failures". I'm happy to go with @bjlittle's approach of rolling with the punches for now; personally I'll be pushing for # noqa if it comes up.

@bjlittle
Copy link
Member Author

Yeah a really cool version (not expecting this right now!) would be some kind of commit-to-commit regression (like coverage or ASV), but also specific to each object. Right now we have the latter, but not the former.

@trexfeathers I can see the ASV cogs already turning overtime in your head 🤣

@trexfeathers
Copy link
Contributor

OK I'm calling @bjlittle, @trexfeathers, @rcomer, @jamesp a quorum. I'll merge this at 12:00 BST if no objections come up in the meantime.

@bjlittle
Copy link
Member Author

bjlittle commented Oct 18, 2021

Also, thinking about this on what it means to me personally...

As a developer, I'll be running flake8 locally as I'm in the develop-test-commit cycle. If I hit a mccabe failure due to introducing complexity I'd like to think that I'd then reiterate and attempt to simplify before pushing my branch as a pull-request.

If I can't reduce complexity, then I wouldn't expect a reviewer to merge with a mccabe CI failure, as that would undermine the whole of our CI, so I'd be forced to add a #noqa :Cxxx.

At that point I'm gonna have to dual with the reviewer to justify why I can't reduce complexity, and hope I can get it accepted as an exception to the general rule. The use of #noqa IMHO is a slippy slope, albeit occasionally necessary, so it'll be interesting to see how it plays out.

What's good about this is that there is now no hiding place for the introduction of complexity.

I'm also hoping that I can always play the "reduce complexity card" without jumping through hoops, otherwise then it becomes a author/reviewer discussion i.e., a strong signal that it might be harder to get the pull-request merged without very good reason.

@trexfeathers
Copy link
Contributor

trexfeathers commented Oct 18, 2021

have to dual with the reviewer

Sounds much more collaborative and less adversarial than "have a duel"!

I'm also hoping that I can always play the "reduce complexity card" without jumping through hoops, otherwise then it becomes a author/reviewer discussion i.e., a strong signal that it might be harder to get the pull-request merged without very good reason.

👍 There are so many things we all agree we want, but it's often hard to agree what that looks like. So having an objective vehicle for such things is a positive step, like black does for readable code style. As you say, there will still be discussions at times, but hopefully fewer.

@trexfeathers trexfeathers merged commit 7849d13 into SciTools:main Oct 18, 2021
@bjlittle
Copy link
Member Author

image

"Ding! Ding! Round one!"... "Jeez, I don't fancy my chances with this reviewer"

tkknight added a commit to tkknight/iris that referenced this pull request Nov 3, 2021
* main: (44 commits)
  [pre-commit.ci] pre-commit autoupdate (SciTools#4395)
  min pin for numpy (nep29) (SciTools#4386)
  Updated environment lockfiles (SciTools#4393)
  Extend stock.mesh api (SciTools#4389)
  Updated environment lockfiles (SciTools#4388)
  Integrate ASV with Nox (SciTools#4378)
  NetCDF save - stream ALL lazy arrays. (SciTools#4375)
  adopt flake8 maccabe complexity metric (SciTools#4380)
  Accept inverse_flattening = 0 for spherical ellipsoid (closes SciTools#4146) (SciTools#4368)
  Updated environment lockfiles (SciTools#4379)
  Prevent warning in `test_Saver` (SciTools#4376)
  drop pyugrid in site.cfg (SciTools#4373)
  `flake8` dependency (SciTools#4371)
  update latest whosnew (SciTools#4372)
  Allow `check_graphic` to be more flexible (SciTools#4370)
  [pre-commit.ci] pre-commit autoupdate (SciTools#4365)
  Updated environment lockfiles (SciTools#4364)
  Update latest.rst (SciTools#4362)
  More clarity on setting `iris-test-data` location. (SciTools#4359)
  update whatsnew (SciTools#4361)
  ...
@bjlittle bjlittle deleted the adopt-flake8-mccabe branch January 5, 2022 10:16
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants