-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improvements to Mapit::Wrapper and Mapit::Place #87
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made some tiny comments on "the Ruby way" of doing things, other than that I only have one concern regarding downloading the files vs using them from disk directly, but I leave that up to you to decide.
Thank you for this Mark, this was a lot of work, I really like it much better now 👍 it's nice when things go from specifics to more general and useful
✨ 😃 ✨
lib/mapit/wrapper.rb
Outdated
'parent_name' => parent_name(area) | ||
} | ||
area.merge(parent) | ||
def download_to_file(url, filename) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am not sure about this step, now that we are going to read the Mapit data from disk. So maybe we can just omit that step and always read from disk, like we do with the mapping files. They are needed for the application. It is not too crazy to me to suppose that the Mapit files are also needed for the application since it deals with places and areas.
This will simplify the code a little bit more, and also the tests 👍
What do you think?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I think you're right 👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I thought when writing this that it'd be convenient if you didn't have to download files as a separate step when dealing with a new area type, that'd be useful, and that updating them by deleting the files and running the server again would be kind of neat. But on the whole, I think those (debatable!) conveniences don't justify the increased complexity of the code, as you point out.
lib/mapit/wrapper.rb
Outdated
mapit_mappings.child_to_parent | ||
def cache_mapit_data | ||
@type_to_places = area_types.map { |t| [t, places(t)] }.to_h | ||
@id_to_place = @type_to_places.values.flatten.map { |a| [a.id.to_s, a] }.to_h |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The @
of type_to_places
is not needed here, since we are reading, not writing 🙂
lib/mapit/wrapper.rb
Outdated
def set_up_parent_child_relationships | ||
# FIXME: this should also use MapIt's parent_area, if that's set. | ||
mapit_mappings.child_to_parent.each do |child, parent| | ||
if @id_to_place.key?(child) && @id_to_place.key?(parent) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Also here, the @
s in id_to_place
are not needed
def child_to_parent | ||
mapit_mappings.child_to_parent | ||
def cache_mapit_data | ||
@type_to_places = area_types.map { |t| [t, places(t)] }.to_h |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since we are building these hashes, it may be a better idea to use symbols for the hash key. This is a common practice in Ruby as symbols are more performant than strings
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I know that's often done, but I don't think that's worth it in this case. Area type codes often come from MapIt calls, which will return them from JSON strings, and when you use them in queries to MapIt, you need them to be strings then as well. Transforming those strings to and from symbols for any input and output like that seems unnecessary, given that the performance impact from using strings instead of hash keys in going to be tiny in this app.
lib/mapit/wrapper.rb
Outdated
end | ||
|
||
def all_areas | ||
states + federal_constituencies + senatorial_districts | ||
def area_from_pombola_slug(pombola_slug) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the method is already informing us of that, I think it would be enough to call the argument slug
, to make it a bit less repetitive. Same for area_from_ep_id(ep_id)
, I think it would be clear that id
refers to an ep_id
.
Up to you though!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
:) I slightly prefer the name of the argument to be inherently descriptive, because it can make it easier to spot errors of usage in the body of the method. That's a better argument for ep_id
/ id
than pombola_slug
/ slug
, though, since the only slugs we handle here are Pombola slugs whereas we handle multiple types of ID. I'll change pombola_slug
anyway.
lib/mapit/wrapper.rb
Outdated
@@ -1,89 +1,100 @@ | |||
# frozen_string_literal: true | |||
require 'net/http' |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! 👍
lib/mapit/wrapper.rb
Outdated
uri = URI(mapit_url + area_type) | ||
JSON.parse(Net::HTTP.get(uri)).values | ||
def mapit_area_cache_filename(area_type) | ||
File.join('mapit', "#{URI.parse(mapit_url).host}-#{area_type}.json") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This mapit
is site-specific knowledge... if we removed the downloading part of the code (see comment below) we could pass this knowledge in place of mapit_url
which will not be needed anymore.
For example, like we do in mappings, we could pass an array of mapit data filenames, or as we do with the finder, just the name of the directory where they will be stored.
(this comment is related to the one below, so if this is not a good idea it can be ignored 🙂 )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, good point that this is an oddity if we're using the data in the repository. I've removed this in a slightly different way, passing in a data_directory: 'mapit'
parameter to the initializer, which I think has the same upside and means we're still inferring the filename from the area types.
lib/mapit/place.rb
Outdated
def initialize(place:, mapit_ids_to_pombola_slugs:, baseurl:) | ||
@place = place | ||
@mapit_ids_to_pombola_slugs = mapit_ids_to_pombola_slugs | ||
attr_reader :parent |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think you can make this an attr_accessor
and then you can get rid of the parent=()
method.
Unless you wanted to memoize the parent inside of @parent
with
@parent ||= parent_place
Also I don't think it is needed to pass the parent
as an argument, if you don't initialize it, it will be nil
by default. Or we could just do @parent = nil
in the constructor if we wanted to be that explicit.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good point about attr_accessor
. I'll do that.
When you say:
Also I don't think it is needed to pass the parent as an argument
The idea of adding it as an option argument to initialize
that in some situations you can be sure that the parent already exists when you create the object, so it's convenient to be able to pass in the parent if that's the case. (This is used in one of the tests, for example.)
9aeac8f
to
0171702
Compare
It was really confusing that this instance variable had the name of the class, despite being a completely different data structure.
The Mapit::Wrapper class hardcoded particular types of area in method names like 'federal_constituencies' or 'states'. This means you had to change the code of the class to reuse it for other area types, which is undesirable. This commit changes that so that there is a 'places_of_type' method which returns all the places for a particular MapIt type code, e.g. '.places_of_type('SEN')' Unforunately, because of the way that parent areas are set with this version of the code, it's necessary to distinguish the area type that parents may be drawn from and area types that child areas are drawn from. This will be made more flexible in the next commit, however.
This class had a number of problems which this commit fixes: - It was very inefficient, iterating over all areas whenever it need to look one up. - It dealt with heterogenous representations of areas (MapIt-derived hashes and Mapit::Place objects) even after the wrapper object has been constructed. - The parent-child relationships had to be set very carefully, in that the parent places had to be created before the child places, so that the parents would exist before setting the parent attributes of the children. This commit fixes those problems and simplifies the class both to use and reason about considerably. Setting the parent of an Place is now done either by passing in a parent Place object when initializing the Place, or by setting it afterwards with the place= setter.
We're going to use this to avoid having to download the MapIt data every time the server starts up; I'm adding this in a separate commit to make the other commits easier to read.
Since we're keeping the full MapIt data for area types FED, SEN and STA in the repository anyway now, we can remove the old reduced fixture data and use the real data instead. This will slow down the tests slightly, since more MapIt data is being loaded and longer pages are rendered in the tests, but has the advantage that we're dealing with the real data so may catch additional problems caused by data in the full dataset. Also some of the magic numbers in the tests are clearer: e.g. the 37 in state tests is for the 36 states + the Federal Capital Territory.
To avoid having to fetch data from MapIt every time the server is started, we can use the area JSON in the repository. (The MapIt data changes very occasionally, and usually as a result of manual intervention to update it or make a fix, so not getting a live copy every time shouldn't cause problems.) This also enables us to simplify the code quite significantly.
0171702
to
fdeb31c
Compare
These changes make the Mapit::Wrapper and Mapit::Place classes
more generically useful (e.g. not hardcoding area types), easier
to reason about (e.g. parent-child relationships being set up after
creation of the Place objects and the parent attribute of a Place
now being another Place), and more efficient (e.g. no unnecessary
linear lookups for places, and using MapIt data from the
filesystem in preference to the network).
n.b. contrary to what I said in stand-up, I've stopped short of adding
the setting of child areas in this pull request, but that's easy to add.