The assignments I completed in the USC Building Knowledge Graphs course.

oscargong/DSCI-558-Building-Knowledge-Graphs

DSCI-558

The assignments I finished and the materials used in the USC DSCI-558 Building Knowledge Graphs course, offered by the USC Knowledge Graph Center.

Summaries

| # | Subject | Library | Technique | Description |
|---|---------|---------|-----------|-------------|
| 1 | Web Scraping | Scrapy | | Using Scrapy, crawl 10k pages from IMDB, extract attributes from each page, and store the results in JSON Lines files. |
| 2 | Information Extraction | spaCy | NLP | Using spaCy, extract attributes from actors' biography text; for each attribute, build one lexical extractor and one syntactic extractor. |
| 3 | Entity Resolution, Blocking & Knowledge Representation | The Record Linkage ToolKit (RLTK), RDFLib | | Given two datasets, IMDB and AFI, plus a dev dataset, match records across the two datasets (record linkage). Use blocking to reduce the number of pairs that need to be compared. Design a model in RDF Schema and store the result in a Turtle file using the designed model. |
| 4 | RDF Query | Apache Jena | SPARQL, WikiData Query | Write SPARQL queries to answer several intricate requests. |
| 5 | IE Revisit: Weak Supervision and Distant Supervision | Snorkel | Weak Supervision, Distant Supervision | Hand-label a small set of dev data, write labeling functions using Snorkel, and combine them with distant supervision to label the training set. Output a generative model. |
| 6 | PSL and OWL | PSL, Protégé | Probabilistic Soft Logic, OWL | Write a PSL model to link records that refer to the same paper. Use Protégé to build an OWL ontology and try some reasoning. |
| 7 | Tabular Data & Knowledge Graph Embedding | AmpliGraph | RDF Data Cube, KG Embedding | |
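
The blocking step in assignment 3 is easiest to see on a small example. The sketch below is not the assignment's solution and does not use RLTK; it is plain Python over made-up records, and the blocking key (release year), the field names, and the `is_match` rule are all assumptions chosen for illustration. It groups one dataset by key so that each record from the other dataset is compared only against records in the same block.

```python
from collections import defaultdict

# Hypothetical toy records; the real assignment used IMDB and AFI datasets.
imdb = [
    {"id": "i1", "title": "Casablanca", "year": 1942},
    {"id": "i2", "title": "Citizen Kane", "year": 1941},
    {"id": "i3", "title": "Vertigo", "year": 1958},
]
afi = [
    {"id": "a1", "title": "CASABLANCA", "year": 1942},
    {"id": "a2", "title": "Citizen Kane", "year": 1941},
    {"id": "a3", "title": "Psycho", "year": 1960},
]


def block_key(record):
    """Blocking key: the release year (an assumed, illustrative choice)."""
    return record["year"]


def candidate_pairs(left, right, key=block_key):
    """Yield only the pairs whose records fall into the same block."""
    blocks = defaultdict(list)
    for r in right:
        blocks[key(r)].append(r)
    for l in left:
        for r in blocks.get(key(l), []):
            yield l, r


def is_match(a, b):
    """Toy matching rule: case-insensitive exact title match."""
    return a["title"].strip().lower() == b["title"].strip().lower()


pairs = list(candidate_pairs(imdb, afi))
matches = [(a["id"], b["id"]) for a, b in pairs if is_match(a, b)]
all_pairs = len(imdb) * len(afi)

print(f"compared {len(pairs)} of {all_pairs} possible pairs")
print(matches)
```

A key that is too strict (for example, a year that differs by one between sources) drops true matches before they are ever compared, so in practice the key is usually chosen to be tolerant and the stricter comparison is left to the matching step.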

Documents

The W3C documents are hard to read: the typography is poor and they are impossible to mark up. Knowledge graphs are a rapidly growing domain that still lacks well-written documentation. I put my organized W3C documents and other useful materials here.
