Update course evals #1038

Closed
1 task done
JiaqiWang18 opened this issue Sep 30, 2023 · 2 comments · Fixed by #1057 or #1063
Labels
feature-request For new feature request

Comments

@JiaqiWang18
Member

Contact Details

No response

Is your feature request related to a problem? Please describe.

Many users have asked us to update the course evaluations on Semester.ly.

Describe the solution you'd like.

Take the evaluations published by the official source and add them to course modals in a format similar to: https://jhu.semester.ly/course/EN.660.332/Fall/2023

There does not seem to be an API for this data, so one solution is to write a web crawler and ingest the data ourselves.

Describe alternatives you've considered

Have users submit course evals through Semester.ly.
Not recommended unless we offer questions different from those on the official forms.

Additional Information

No response

Code of Conduct

  • I agree to follow Semester.ly's Code of Conduct
@JiaqiWang18 JiaqiWang18 added the feature-request For new feature request label Sep 30, 2023
@jchen324

jchen324 commented Oct 6, 2023

Some research into course evaluation:

  • How evaluations before 2015 were loaded into the database:
    • HTML files containing all courses and their evaluations were downloaded and stored in parsing/schools/jhu/HopkinsEvaluations
    • The parser parsing/schools/jhu/evals.py was run to parse the HTML and load the evaluations into the database
  • However, HTML files containing all courses and evaluations are no longer available, so this method doesn't work now
    • The current evaluation website mandates cookies and sessions and requires a JHED login
    • Parsing is difficult because the website returns only a portion of the evaluation results in HTML, with no clear class, id, or name attributes, and users have to click "show more results" to send an XHR for additional results
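
To make the "show more results" problem concrete, here is a minimal sketch of draining paged results. Everything in it is an assumption for illustration: `collect_all_results`, `fetch_page`, and the page-number scheme are hypothetical, and the real site may page its XHR responses differently. The network call is passed in as a function so the loop can be tried without a JHED session.

```python
from typing import Callable, List


def collect_all_results(
    fetch_page: Callable[[int], List[str]], max_pages: int = 100
) -> List[str]:
    """Request result pages in order until one comes back empty.

    fetch_page is a hypothetical callable that returns the result
    fragments for a 1-based page number. max_pages is a safety cap
    so a misbehaving endpoint cannot cause an endless loop.
    """
    results: List[str] = []
    for page in range(1, max_pages + 1):
        batch = fetch_page(page)
        if not batch:
            # An empty page is assumed to mean there is nothing more to load.
            break
        results.extend(batch)
    return results


# Stand-in for the authenticated XHR call, used only for this example.
_FAKE_PAGES = {1: ["eval A", "eval B"], 2: ["eval C"]}


def fake_fetch(page: int) -> List[str]:
    return _FAKE_PAGES.get(page, [])


all_results = collect_all_results(fake_fetch)
```

In a real crawler, `fake_fetch` would be replaced by whatever performs the authenticated request, which is the part that still needs discussion.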

We should discuss how to proceed with implementing this feature.

@JiaqiWang18
Member Author

JiaqiWang18 commented Nov 3, 2023

Current Evaluation Ingestion Steps

First, run the ingest command to convert the HTML to JSON.
Then, run the digest command to save the JSON to the database.

How to run current ingestor

python manage.py ingest jhu --types evals

Note: the --years and --terms flags don't work.
The command generates parsing/schools/jhu/data/evals.json

How to run current digestor

The digestor loads the JSON data and saves it to the database.
It has multiple digestion strategies to ensure data consistency.

python manage.py ingest jhu --types evals --years 2015

Evals.json format

A list of objects like the one below:

{
  "course": {
    "code": "AS.310.305"
  },
  "instructors": [
    {
      "name": "Marvin Ott"
    }
  ],
  "kind": "eval",
  "score": 4.32,
  "summary": "Students praised the course...",
  "term": "Fall",
  "year": "2013"
}
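
As a rough illustration of this format, the sketch below checks that a record has the same shape as the sample. The helper `check_eval_record` and its rules are hypothetical and inferred only from the one sample above; the digestor's actual validation may be stricter or different.

```python
import json

# Keys present in the sample record above.
REQUIRED_KEYS = {"course", "instructors", "kind", "score", "summary", "term", "year"}


def check_eval_record(record: dict) -> list:
    """Return a list of problems; an empty list means the record
    matches the shape of the sample (not the digestor's real rules)."""
    missing = REQUIRED_KEYS - record.keys()
    if missing:
        return ["missing keys: " + ", ".join(sorted(missing))]

    problems = []
    course = record["course"]
    if not isinstance(course, dict) or "code" not in course:
        problems.append("course must be an object with a 'code'")
    instructors = record["instructors"]
    if not isinstance(instructors, list) or not all(
        isinstance(i, dict) and "name" in i for i in instructors
    ):
        problems.append("instructors must be a list of objects with a 'name'")
    if record["kind"] != "eval":
        problems.append("kind must be 'eval'")
    score = record["score"]
    if isinstance(score, bool) or not isinstance(score, (int, float)):
        problems.append("score must be a number")
    if not isinstance(record["year"], str):
        problems.append("year must be a string, as in the sample")
    return problems


sample = json.loads("""
{
  "course": {"code": "AS.310.305"},
  "instructors": [{"name": "Marvin Ott"}],
  "kind": "eval",
  "score": 4.32,
  "summary": "Students praised the course...",
  "term": "Fall",
  "year": "2013"
}
""")

sample_problems = check_eval_record(sample)
```

Running a check like this over a locally generated file before handing it to the digestor would catch shape mistakes early, such as a numeric year.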

Next steps

Currently, the ingestor evals.py uses BeautifulSoup to parse HTML files. Since we now need Selenium and authentication, it is probably better to run this part locally for security reasons, generate a JSON file, and give that file directly to the digestor.

For the digestion step, we should aim to reuse the existing code in digestor.py because it already has robust logic to validate records, reconcile differences, and avoid duplicate records. So we should generate a JSON file in the same format as the one above so that we can give it directly to the digestor.
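
A minimal sketch of the local generation step, assuming the scraper already yields plain values for each evaluation. The helpers `to_eval_record` and `write_evals` are hypothetical names, not existing Semester.ly code; the output keys copy the evals.json sample above.

```python
import json
from pathlib import Path


def to_eval_record(code, instructor_names, score, summary, term, year) -> dict:
    """Shape scraped values like the evals.json sample above."""
    return {
        "course": {"code": code},
        "instructors": [{"name": name} for name in instructor_names],
        "kind": "eval",
        "score": score,
        "summary": summary,
        "term": term,
        "year": str(year),  # the sample stores the year as a string
    }


def write_evals(records, path) -> None:
    """Write the list of records as JSON, creating parent folders if needed."""
    path = Path(path)
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", encoding="utf-8") as f:
        json.dump(records, f, indent=2, ensure_ascii=False)


record = to_eval_record(
    "AS.310.305", ["Marvin Ott"], 4.32, "Students praised the course...", "Fall", 2013
)
```

Calling `write_evals([record], "parsing/schools/jhu/data/evals.json")` would then produce a file at the path the current ingestor writes to, so the digestor could presumably pick it up unchanged.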
