Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improvements to Mapit::Wrapper and Mapit::Place #87

Merged
merged 6 commits into from
Mar 3, 2017

Conversation

mhl
Copy link
Contributor

@mhl mhl commented Mar 2, 2017

These changes make the Mapit::Wrapper and Mapit::Place classes
more generically useful (e.g. not hardcoding area types), easier
to reason about (e.g. parent-child relationships being set up after
creation of the Place objects and the parent attribute of a Place
now being another Place), and more efficient (e.g. no unnecessary
linear lookups for places, and using MapIt data from the
filesystem in preference to the network).

n.b. contrary to what I said in stand-up, I've stopped short of adding
the setting of child areas in this pull request, but that's easy to add.

@mhl mhl requested a review from octopusinvitro March 2, 2017 17:25
Copy link
Contributor

@octopusinvitro octopusinvitro left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Made some tiny comments on "the Ruby way" of doing things, other than that I only have one concern regarding downloading the files vs using them from disk directly, but I leave that up to you to decide.

Thank you for this Mark, this was a lot of work, I really like it much better now 👍 it's nice when things go from specifics to more general and useful
✨ 😃 ✨

'parent_name' => parent_name(area)
}
area.merge(parent)
def download_to_file(url, filename)
Copy link
Contributor

@octopusinvitro octopusinvitro Mar 3, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am not sure about this step, now that we are going to read the Mapit data from disk. So maybe we can just omit that step and always read from disk, like we do with the mapping files. They are needed for the application. It is not too crazy to me to suppose that the Mapit files are also needed for the application since it deals with places and areas.

This will simplify the code a little bit more, and also the tests 👍

What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think you're right 👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I thought when writing this that it'd be convenient if you didn't have to download files as a separate step when dealing with a new area type, that'd be useful, and that updating them by deleting the files and running the server again would be kind of neat. But on the whole, I think those (debatable!) conveniences don't justify the increased complexity of the code, as you point out.

mapit_mappings.child_to_parent
def cache_mapit_data
@type_to_places = area_types.map { |t| [t, places(t)] }.to_h
@id_to_place = @type_to_places.values.flatten.map { |a| [a.id.to_s, a] }.to_h
Copy link
Contributor

@octopusinvitro octopusinvitro Mar 3, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The @ of type_to_places is not needed here, since we are reading, not writing 🙂

def set_up_parent_child_relationships
# FIXME: this should also use MapIt's parent_area, if that's set.
mapit_mappings.child_to_parent.each do |child, parent|
if @id_to_place.key?(child) && @id_to_place.key?(parent)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Also here, the @s in id_to_place are not needed

def child_to_parent
mapit_mappings.child_to_parent
def cache_mapit_data
@type_to_places = area_types.map { |t| [t, places(t)] }.to_h
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since we are building these hashes, it may be a better idea to use symbols for the hash key. This is a common practice in Ruby as symbols are more performant than strings

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I know that's often done, but I don't think that's worth it in this case. Area type codes often come from MapIt calls, which will return them from JSON strings, and when you use them in queries to MapIt, you need them to be strings then as well. Transforming those strings to and from symbols for any input and output like that seems unnecessary, given that the performance impact from using strings instead of hash keys in going to be tiny in this app.

end

def all_areas
states + federal_constituencies + senatorial_districts
def area_from_pombola_slug(pombola_slug)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the method is already informing us of that, I think it would be enough to call the argument slug, to make it a bit less repetitive. Same for area_from_ep_id(ep_id), I think it would be clear that id refers to an ep_id.

Up to you though!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

:) I slightly prefer the name of the argument to be inherently descriptive, because it can make it easier to spot errors of usage in the body of the method. That's a better argument for ep_id / id than pombola_slug / slug, though, since the only slugs we handle here are Pombola slugs whereas we handle multiple types of ID. I'll change pombola_slug anyway.

@@ -1,89 +1,100 @@
# frozen_string_literal: true
require 'net/http'
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch! 👍

uri = URI(mapit_url + area_type)
JSON.parse(Net::HTTP.get(uri)).values
def mapit_area_cache_filename(area_type)
File.join('mapit', "#{URI.parse(mapit_url).host}-#{area_type}.json")
Copy link
Contributor

@octopusinvitro octopusinvitro Mar 3, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This mapit is site-specific knowledge... if we removed the downloading part of the code (see comment below) we could pass this knowledge in place of mapit_url which will not be needed anymore.

For example, like we do in mappings, we could pass an array of mapit data filenames, or as we do with the finder, just the name of the directory where they will be stored.

(this comment is related to the one below, so if this is not a good idea it can be ignored 🙂 )

Copy link
Contributor Author

@mhl mhl Mar 3, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, good point that this is an oddity if we're using the data in the repository. I've removed this in a slightly different way, passing in a data_directory: 'mapit' parameter to the initializer, which I think has the same upside and means we're still inferring the filename from the area types.

def initialize(place:, mapit_ids_to_pombola_slugs:, baseurl:)
@place = place
@mapit_ids_to_pombola_slugs = mapit_ids_to_pombola_slugs
attr_reader :parent
Copy link
Contributor

@octopusinvitro octopusinvitro Mar 3, 2017

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think you can make this an attr_accessor and then you can get rid of the parent=() method.

Unless you wanted to memoize the parent inside of @parent with
@parent ||= parent_place

Also I don't think it is needed to pass the parent as an argument, if you don't initialize it, it will be nil by default. Or we could just do @parent = nil in the constructor if we wanted to be that explicit.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good point about attr_accessor. I'll do that.

When you say:

Also I don't think it is needed to pass the parent as an argument

The idea of adding it as an option argument to initialize that in some situations you can be sure that the parent already exists when you create the object, so it's convenient to be able to pass in the parent if that's the case. (This is used in one of the tests, for example.)

@mhl mhl force-pushed the simplify-wrapper-step-by-step branch from 9aeac8f to 0171702 Compare March 3, 2017 16:41
mhl added 6 commits March 3, 2017 16:52
It was really confusing that this instance variable had the name of the
class, despite being a completely different data structure.
The Mapit::Wrapper class hardcoded particular types of area in
method names like 'federal_constituencies' or 'states'. This means you
had to change the code of the class to reuse it for other area types,
which is undesirable. This commit changes that so that there is a
'places_of_type' method which returns all the places for a
particular MapIt type code, e.g. '.places_of_type('SEN')'

Unforunately, because of the way that parent areas are set with this
version of the code, it's necessary to distinguish the area type that
parents may be drawn from and area types that child areas are drawn
from. This will be made more flexible in the next commit, however.
This class had a number of problems which this commit fixes:

 - It was very inefficient, iterating over all areas whenever it need to
   look one up.

 - It dealt with heterogenous representations of areas (MapIt-derived
   hashes and Mapit::Place objects) even after the wrapper object has
   been constructed.

 - The parent-child relationships had to be set very carefully, in that
   the parent places had to be created before the child places, so that
   the parents would exist before setting the parent attributes of the
   children.

This commit fixes those problems and simplifies the class both to use
and reason about considerably.

Setting the parent of an Place is now done either by passing in a parent
Place object when initializing the Place, or by setting it afterwards
with the place= setter.
We're going to use this to avoid having to download the MapIt data every
time the server starts up; I'm adding this in a separate commit to make
the other commits easier to read.
Since we're keeping the full MapIt data for area types FED, SEN and STA
in the repository anyway now, we can remove the old reduced fixture
data and use the real data instead. This will slow down the tests
slightly, since more MapIt data is being loaded and longer pages are
rendered in the tests, but has the advantage that we're dealing with
the real data so may catch additional problems caused by data in the
full dataset. Also some of the magic numbers in the tests are clearer:
e.g. the 37 in state tests is for the 36 states + the Federal Capital
Territory.
To avoid having to fetch data from MapIt every time the server is
started, we can use the area JSON in the repository. (The MapIt
data changes very occasionally, and usually as a result of manual
intervention to update it or make a fix, so not getting a live
copy every time shouldn't cause problems.)

This also enables us to simplify the code quite significantly.
@coveralls
Copy link

Coverage Status

Coverage decreased (-0.005%) to 99.761% when pulling fdeb31c on simplify-wrapper-step-by-step into 8f75fe2 on master.

@mhl mhl merged commit fdeb31c into master Mar 3, 2017
@octopusinvitro octopusinvitro deleted the simplify-wrapper-step-by-step branch March 5, 2017 23:57
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants