feat(package): Enable replica set for the MongoDB results cache and configure it when starting the package. #632

Merged
merged 11 commits into from
Jan 20, 2025
Member Author

Shall we also consider renaming this script to initialize-results-cache.py to better reflect its updated purpose?

@@ -3,6 +3,7 @@
import sys

from pymongo import IndexModel, MongoClient
from pymongo.errors import OperationFailure

# Setup logging
# Create logger
@@ -15,6 +16,22 @@
logger.addHandler(logging_console_handler)


def initialize_replica_set(client, uri):
    try:
        result = client.admin.command("replSetGetStatus")
        logger.info("Replica set already initialized: %s", result)
    except OperationFailure as e:
        logger.info("Initializing replica set")

        # Explicit host specification is required; otherwise the Docker container's ID would be
        # used as the hostname.
        config = {
            "_id": "rs0",
            "members": [{"_id": 0, "host": "localhost:27017"}],
        }
        client.admin.command("replSetInitiate", config)
        logger.info("Replica set initialized successfully.")


def main(argv):
    args_parser = argparse.ArgumentParser(description="Creates results cache indices for CLP.")
    args_parser.add_argument("--uri", required=True, help="URI of the results cache.")
@@ -27,9 +44,11 @@ def main(argv):
    stream_collection_name = parsed_args.stream_collection

    try:
        with MongoClient(results_cache_uri, directConnection=True) as results_cache_client:
            initialize_replica_set(results_cache_client, results_cache_uri)

        with MongoClient(results_cache_uri) as results_cache_client:
            stream_collection = results_cache_client.get_default_database()[stream_collection_name]

            file_split_id_index = IndexModel(["file_split_id"])
            orig_file_id_index = IndexModel(["orig_file_id", "begin_msg_ix", "end_msg_ix"])
            stream_collection.create_indexes([file_split_id_index, orig_file_id_index])
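
Note on `initialize_replica_set`: as written, any `OperationFailure` from `replSetGetStatus` is treated as "not yet initialized", so an unrelated failure (for example, an authorization error) would also lead to a `replSetInitiate` call. Below is a minimal sketch of a narrower check. It assumes the server reports an uninitialized replica set with error code 94 (`NotYetInitialized`) and that the exception exposes that code as `.code`, as pymongo's `OperationFailure` does. The names `NOT_YET_INITIALIZED` and `build_replica_set_config`, and the `name`/`host` parameters, are illustrative and not part of this PR. The sketch is duck-typed on `client.admin.command`, so it can be exercised without a live MongoDB instance.

```python
# Assumption: MongoDB reports "no replica set config yet" as error code 94 (NotYetInitialized).
NOT_YET_INITIALIZED = 94


def build_replica_set_config(name, host):
    """Build a single-member replica set config (hypothetical helper, not in the PR)."""
    return {"_id": name, "members": [{"_id": 0, "host": host}]}


def initialize_replica_set(client, name="rs0", host="localhost:27017"):
    """Initiate a single-node replica set unless one is already configured.

    `client` is any object exposing `client.admin.command(...)`, such as a
    pymongo MongoClient opened with directConnection=True.
    `name` must match `replication.replSetName` in mongod.conf.
    Returns True if this call initiated the replica set, False otherwise.
    """
    try:
        client.admin.command("replSetGetStatus")
        return False  # Status was readable, so the replica set already exists.
    except Exception as e:
        # Only an "uninitialized" error should trigger initiation; re-raise anything else.
        if getattr(e, "code", None) != NOT_YET_INITIALIZED:
            raise

    client.admin.command("replSetInitiate", build_replica_set_config(name, host))
    return True
```

In the script itself, the `except` clause would more naturally stay `except OperationFailure as e:` with the same `e.code` comparison; the broad `except Exception` here only keeps the sketch free of a pymongo import.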
2 changes: 2 additions & 0 deletions components/package-template/src/etc/mongo/mongod.conf
@@ -1,5 +1,7 @@
net:
  bindIp: 0.0.0.0
replication:
  replSetName: "rs0"
systemLog:
  destination: file
  path: /var/log/mongodb/mongod.log