Insert 2021 class profiles and points thresholds for Warsaw schools #43
Comments
@Anakin100100 bump |
I think that a class profile, together with the requirements for getting in, is general enough that we should think about how to approach it in a way that isn't Warsaw-specific. Can we implement filtering based on additive components here? Something like tick boxes where users select the extended subjects they are interested in. @micorix what do you think about this approach? |
I'm not sure if I follow, but yeah, we planned on having a multiple-select field with extended subjects as the only way to filter class profiles |
It would be something like the school type filter here: https://test.po8klasie.pl/warszawa/search |
@wojtodzio @micorix can we set up a meeting? Tomorrow would be tough because I'm attending a hacknight at Hackerspace Silesia, so I won't be available from 5 pm, and I have to get to Bielsko first. If you are up for it, I'm free on the weekend. On Monday I'm free before 8 pm. |
Monday and Saturday would be fine with me, but I would need to confirm. What do you want to discuss? |
1. Identifying the key tasks that are required to present anything to the public. @micorix @wojtodzio feel free to add your topics to the agenda |
Ok. I know that it is hard to estimate before the meeting but when approximately do you think the backend will be ready for the first official release? |
@micorix this depends on what we need for the first official release, which is not clearly defined for now. |
Ok, so to clarify. The most urgent 🔥 priorities are:
|
@Anakin100100 are you able to provide rough estimates considering the priorities listed above? |
@micorix The filtering that is available for data from Gdynia has to be changed. Extended subjects are going to be a separate model that is regenerated with the database. The extended subject combination will be a separate model belonging to an institution. After all the other filters run, we will filter the extended subjects in memory. I will work on it tomorrow and I may finish it then, but it's not a guarantee. |
Ok, great! How about the data from Warsaw? |
This data is hard to work with because of its low degree of standardisation, various irregularities, and the huge number of edge cases introduced by the vocational schools. I'll do my best, but using this data is challenging. |
@wojtodzio I need your help. I need to be able to filter on class profiles, which is an array of strings, something like ["Polski", "Matematyka"]. The model hierarchy looks as follows: Institution has one Subject Set, which has many Subjects whose names are the strings in the array above. We need to filter the data so that each of the specified subject names is present in the subject set, but more can be present as well. For now I've written something like this: if @class_profiles != nil which of course gives an error NoMethodError (undefined method `key?' for nil:NilClass)
How should I go about writing this? Is there something from Active Record that I'm missing, or is writing it with raw SQL the only way to do this? |
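For reference, the "every requested subject must be present, extras allowed" rule described above is a subset check. A minimal plain-Ruby sketch of the in-memory variant follows; `InstitutionRow`, `filter_by_extended_subjects` and the sample schools are hypothetical names made up for illustration, not code from the repo, and a nil or empty filter is assumed to mean "no filtering".

```ruby
require "set"

# Hypothetical in-memory stand-in for an institution and the names
# of the subjects in its subject set (not the real model).
InstitutionRow = Struct.new(:name, :subject_names)

# Keeps only institutions whose subject set contains every requested
# subject; additional subjects in the set are allowed.
def filter_by_extended_subjects(institutions, requested)
  # Assumption: a nil or empty filter means "do not filter at all".
  return institutions if requested.nil? || requested.empty?

  wanted = requested.to_set
  institutions.select { |inst| wanted.subset?(inst.subject_names.to_set) }
end

# Made-up sample data.
institutions = [
  InstitutionRow.new("School A", ["Polski", "Matematyka", "Fizyka"]),
  InstitutionRow.new("School B", ["Polski", "Historia"]),
  InstitutionRow.new("School C", ["Matematyka"])
]

matches = filter_by_extended_subjects(institutions, ["Polski", "Matematyka"])
puts matches.map(&:name).inspect # → ["School A"]
```

Guarding against nil before touching the filter value avoids calling methods on nil, and only "School A" matches because it is the only row containing both requested subjects.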
@wojtodzio before you can try debugging this issue, you have to populate the database with subjects and subject sets for Warsaw schools using CreateSubjectsJob.new.perform_now and then ProcessWarsawDataJob.new.perform_now (this one takes 15 minutes, so it's a good idea to start it beforehand). |
@Anakin100100 I'm not sure what I'm doing wrong, but I think something like that should work (I haven't tested it, though): Institution.joins(subject_sets: :subjects).where(subject_sets: { subjects: { name: ["Polski", "Matematyka"] } }).distinct |
@wojtodzio Thanks for the info on how to solve this issue. Assuming that all migrations have been run, the file not being loaded correctly could be responsible for that. If there are no records to iterate over, the job would finish instantly. Can you confirm that inside ProcessWarsawDataService the raw_school_data contains roughly 800 records after loading the file (it should be inside the data directory)? |
Yes, it does. It seems like institutions are missing. I guess I should also run something like |
@wojtodzio Not really, the whole process is described in docs/regenerating_the_database.md: first create the institution types and then enqueue the jobs for each type |
Using the job for that |
Right, I've just tried running the regeneration script from master, and then this job from your branch, and it seems to have helped (or at least the job is taking more than a few sec. to finish 😂) |
Oh, and I've just realized that you wanted ALL of the subjects to be there. You can do group + having, then. E.g., something like this (I haven't tested it): Institution.joins(subject_sets: :subjects).group(:id).having("ARRAY_AGG(subjects.name::text) @> ARRAY['Polski']") |
@Anakin100100 what's the status of this issue? |
We got data regarding past class profiles and points thresholds for schools in Warsaw.
@Anakin100100 can you add them to the db?
The file is not ideal (every second row needs to be ignored) but it was the best we could come up with when converting the PDF using some online tools.
If possible, please also add the original PDF to the repo for clarity and transparency.
PDF converted to Excel:
https://docs.google.com/spreadsheets/d/1a13O1QidSuWR4Xgf_FmDj4KDkbzVq2hz/edit?usp=sharing&ouid=115685832088064624071&rtpof=true&sd=true
Original PDF:
https://drive.google.com/file/d/12aRWJekAPh-rnqG7rJhk_8ujaf3Pzgdk/view?usp=sharing
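Skipping every second row of the converted sheet could look roughly like the sketch below. It is only an illustration: the sample rows and the `every_second_row` helper are made up, and since it is not stated here whether the odd or the even rows are the ones to drop, the offset is left as a parameter.

```ruby
# Made-up sample rows; the real spreadsheet columns and values may differ.
rows = [
  ["School A", "mat-fiz", "150.5"],
  [nil, nil, nil],
  ["School B", "biol-chem", "162.0"],
  [nil, nil, nil]
]

# Keeps every second row. keep_offset: 0 keeps the 1st, 3rd, 5th... row,
# keep_offset: 1 keeps the 2nd, 4th, 6th... row instead.
def every_second_row(rows, keep_offset: 0)
  rows.each_with_index
      .select { |_row, index| index % 2 == keep_offset }
      .map(&:first)
end

kept = every_second_row(rows)
puts kept.map(&:first).inspect # → ["School A", "School B"]
```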