feat: Improve parser performance #318

Merged
merged 25 commits into from
Jan 17, 2024
Changes from 22 commits
Commits
589b6d9
refactor: mark private function with _
eric-nguyen-cs Dec 8, 2023
ae5af55
refactor(parser): add type annotations and clean up code
eric-nguyen-cs Dec 8, 2023
cef659a
chore: use context manager to close session in tests
eric-nguyen-cs Dec 8, 2023
802091d
chore: update neo4j and Makefile
eric-nguyen-cs Dec 15, 2023
30aa918
refactor: create parser specific directory
eric-nguyen-cs Dec 15, 2023
0c9c856
refactor: start taxonomy_parser by copying parser file
eric-nguyen-cs Dec 20, 2023
f26f256
refactor: move logger to separate file
eric-nguyen-cs Dec 20, 2023
13cfce2
refactor: remove unnecessary code for taxonomy parser
eric-nguyen-cs Dec 20, 2023
3410ad0
feat: update TaxonomyParser to return taxonomy class
eric-nguyen-cs Dec 20, 2023
6674876
feat: update parser to use taxonomy parser
eric-nguyen-cs Dec 20, 2023
cd1e288
chore: update tests for new taxonomy parser
eric-nguyen-cs Dec 21, 2023
1540015
Merge branch 'main' into ericn/decouple-parser-and-db-writer
eric-nguyen-cs Dec 21, 2023
1098bc7
fix: remove multi_label for single project_label
eric-nguyen-cs Dec 20, 2023
3e88470
feat: improve node creation performance
eric-nguyen-cs Dec 20, 2023
6d1eda1
feat: add node id index to improve search query performance
eric-nguyen-cs Dec 20, 2023
faea1a1
feat: improve previous link creation performance
eric-nguyen-cs Dec 20, 2023
2b4f879
feat: improve child link creation performance
eric-nguyen-cs Dec 20, 2023
095b09a
feat: group queries into transaction
eric-nguyen-cs Dec 20, 2023
c7e03e0
chore: update logging info and add timing info
eric-nguyen-cs Dec 20, 2023
c0f6f50
fix: add db name to sessions
eric-nguyen-cs Dec 20, 2023
bdfb252
refactor: move ellipsis func to logger class
eric-nguyen-cs Dec 20, 2023
94bfa7d
fix: stop id index creation if index exists
eric-nguyen-cs Dec 21, 2023
f1e936d
fix: resolve comments
eric-nguyen-cs Jan 12, 2024
cf3a993
fix: resolve comments
eric-nguyen-cs Jan 17, 2024
572fc78
Merge branch 'main' into ericn/improve-parser-performance
eric-nguyen-cs Jan 17, 2024
5 changes: 3 additions & 2 deletions backend/editor/graph_db.py
@@ -12,6 +12,7 @@
 from .exceptions import TransactionMissingError

 log = logging.getLogger(__name__)
+DEFAULT_DB = "neo4j"


 txn = contextvars.ContextVar("txn")
@@ -29,7 +30,7 @@ async def TransactionCtx():
     """
     global txn, session
     try:
-        async with driver.session() as _session:
+        async with driver.session(database=DEFAULT_DB) as _session:
             txn_manager = await _session.begin_transaction()
             async with txn_manager as _txn:
                 txn.set(_txn)
@@ -86,5 +87,5 @@ def SyncTransactionCtx():
     """
     uri = settings.uri
     driver = neo4j.GraphDatabase.driver(uri)
-    with driver.session() as _session:
+    with driver.session(database=DEFAULT_DB) as _session:
         yield _session
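
Reviewer note: the change above passes an explicit `database` to every `driver.session(...)` call instead of relying on the server's default. The following is a minimal sketch of that pattern, not the project's actual code. `FakeDriver`, `FakeSession`, and `sync_session` are hypothetical stand-ins so the example runs without a Neo4j server; only `DEFAULT_DB` and the `session(database=...)` keyword come from the diff.

```python
import contextlib

DEFAULT_DB = "neo4j"  # same constant the diff introduces


class FakeSession:
    """Hypothetical stand-in for a driver session; records the database it was opened on."""

    def __init__(self, database):
        self.database = database

    def __enter__(self):
        return self

    def __exit__(self, *exc_info):
        return False  # do not swallow exceptions


class FakeDriver:
    """Hypothetical stand-in exposing the same session(database=...) keyword as the real driver."""

    def session(self, database=None):
        return FakeSession(database)


@contextlib.contextmanager
def sync_session(driver, database=DEFAULT_DB):
    # Mirrors the shape of SyncTransactionCtx in the diff: open a session
    # on a named database and hand it to the caller.
    with driver.session(database=database) as _session:
        yield _session


with sync_session(FakeDriver()) as s:
    opened_on = s.database  # the session was opened on DEFAULT_DB
```

Keeping the name in one module-level constant means both the async and sync context managers stay in agreement if the target database ever changes.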
2 changes: 1 addition & 1 deletion docker-compose.yml
@@ -3,7 +3,7 @@ version: "3.9"
 services:
   neo4j:
     restart: ${RESTART_POLICY:-no}
-    image: neo4j:5.3.0-community
+    image: neo4j:5.14.0-community
     ports:
       # admin console
       - "${NEO4J_ADMIN_EXPOSE:-127.0.0.1:7474}:7474"
2 changes: 1 addition & 1 deletion parser/Makefile
@@ -13,7 +13,7 @@ quality:

 tests:
 	cd .. && docker compose up -d neo4j
-	pytest .
+	poetry run pytest .
 	# we do not shutdown neo4j

 checks: quality tests
2 changes: 1 addition & 1 deletion parser/openfoodfacts_taxonomy_parser/normalizer.py
@@ -7,7 +7,7 @@
 import unidecode


-def normalizing(line, lang="default", char="-"):
+def normalizing(line: str, lang="default", char="-"):
     """Normalize a string depending on the language code"""
     line = unicodedata.normalize("NFC", line)
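
Reviewer note: the first step of `normalizing` shown in this hunk is NFC normalization of the now-annotated `line: str` argument. The sketch below illustrates that step only, under the assumption that nothing else is applied; `normalize_nfc` is a hypothetical helper name and does not reproduce the rest of the function, which the diff view truncates.

```python
import unicodedata


def normalize_nfc(line: str) -> str:
    """Compose combining sequences into single code points (NFC)."""
    return unicodedata.normalize("NFC", line)


# "e" followed by U+0301 (combining acute accent): five code points in total.
decomposed = "cafe\u0301"
# After NFC the pair collapses into the single code point U+00E9.
composed = normalize_nfc(decomposed)
```

Normalizing up front means two visually identical strings compare equal before any language-specific handling runs.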
