Commit

Merge pull request DIRACGrid#7945 from fstagni/cherry-pick-2-fdbff561…
…4-integration

[sweep:integration] docs: answers to the woke police
fstagni authored Dec 10, 2024
2 parents 9081eda + 3ac4c23 commit af2786a
Showing 24 changed files with 147 additions and 147 deletions.
@@ -14,7 +14,7 @@ Most of DIRAC services can be exposed using either the DIPs or the HTTPs protoco

As a general rule, services can be duplicated,
meaning you can have the same service running on multiple hosts, thus reducing the load.
There are only 2 cases of DIRAC services that have a "master/slave" concept, and these are the Configuration Service
There are only 2 cases of DIRAC services that have a "controller/worker" concept, and these are the Configuration Service
and the Accounting/DataStore service.

Same can be said for executors: you can have many residing on different hosts.
@@ -436,7 +436,7 @@ operation is the registration of the new host in the already functional Configur

#
# These options define the DIRAC components being installed on "this" DIRAC server.
# The simplest option is to install a slave of the Configuration Server and a
# The simplest option is to install a worker of the Configuration Server and a
# SystemAdministrator for remote management.
#
# The following options defined components to be installed
@@ -90,11 +90,11 @@ Services
+--------------------+---------------------------------------------------------------------------------------------------+-------------+---------------------------------------------------------------------------+-----------+
| **System** | **Component** |**Duplicate**| **Remarks** | **HTTPs** +
+--------------------+---------------------------------------------------------------------------------------------------+-------------+---------------------------------------------------------------------------+-----------+
| Accounting | :mod:`DataStore <DIRAC.AccountingSystem.Service.DataStoreHandler>` | PARTIAL | One master and helpers (See :ref:`datastorehelpers`) | +
| Accounting | :mod:`DataStore <DIRAC.AccountingSystem.Service.DataStoreHandler>` | PARTIAL | One controller and helpers (See :ref:`datastorehelpers`) | +
+ +---------------------------------------------------------------------------------------------------+-------------+---------------------------------------------------------------------------+-----------+
| | :mod:`ReportGenerator <DIRAC.AccountingSystem.Service.ReportGeneratorHandler>` | | | +
+--------------------+---------------------------------------------------------------------------------------------------+-------------+---------------------------------------------------------------------------+-----------+
| Configuration | :mod:`Configuration <DIRAC.ConfigurationSystem.Service.ConfigurationHandler>` | PARTIAL | One master (rw) and slaves (ro). It's advised to have several CS slaves | YES +
| Configuration | :mod:`Configuration <DIRAC.ConfigurationSystem.Service.ConfigurationHandler>` | PARTIAL | One controller (rw) and workers (ro). Should have several CS workers | YES +
+--------------------+---------------------------------------------------------------------------------------------------+-------------+---------------------------------------------------------------------------+-----------+
| DataManagement | :mod:`DataIntegrity <DIRAC.DataManagementSystem.Service.DataIntegrityHandler>` | YES | | YES +
+ +---------------------------------------------------------------------------------------------------+-------------+---------------------------------------------------------------------------+-----------+
24 changes: 12 additions & 12 deletions docs/source/AdministratorGuide/Systems/Configuration/index.rst
@@ -5,22 +5,22 @@ Configuration System
====================

The configuration system serves the configuration to any other client (be it another server or a standard client).
The infrastructure is master/slave based.
The infrastructure is controller/worker based.

******
Master
******
**********
Controller
**********

The master Server holds the central configuration in a local file. This file is then served to the clients, and synchronized with the slave servers.
The controller Server holds the central configuration in a local file. This file is then served to the clients, and synchronized with the worker servers.

the master server also regularly pings the slave servers to make sure they are still alive. If not, they are removed from the list of CS.
the controller server also regularly pings the worker servers to make sure they are still alive. If not, they are removed from the list of CS.

When changes are committed to the master, a backup of the existing configuration file is made in ``etc/csbackup``.
When changes are committed to the controller, a backup of the existing configuration file is made in ``etc/csbackup``.

******
Slaves
******
*******
Workers
*******

Slave server registers themselves to the master when starting.
worker server registers themselves to the controller when starting.
They synchronize their configuration on a regular bases (every 5 minutes by default).
Note that the slave CS do not hold the configuration in a local file, but only in memory.
Note that the worker CS do not hold the configuration in a local file, but only in memory.
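
The registration and liveness check described in this file can be illustrated with a minimal sketch. It is not the DIRAC implementation: `WorkerRegistry`, `publish`, `purge` and `alive` are hypothetical names, and the sketch assumes liveness is judged from the time of the last registration rather than from an actual ping.

```python
import time


class WorkerRegistry:
    """Hypothetical stand-in for the controller's bookkeeping of workers."""

    def __init__(self, grace_time, clock=time.time):
        self.grace_time = grace_time  # seconds a worker may stay silent
        self._clock = clock  # injectable clock, which makes testing easy
        self._last_seen = {}  # worker URL -> time of last registration

    def publish(self, url):
        """Record that a worker registered (or re-registered) itself."""
        self._last_seen[url] = self._clock()

    def purge(self):
        """Remove and return the workers silent for longer than grace_time."""
        now = self._clock()
        dead = [url for url, seen in self._last_seen.items() if now - seen > self.grace_time]
        for url in dead:
            del self._last_seen[url]
        return dead

    def alive(self):
        """Return the URLs of the workers still considered alive."""
        return sorted(self._last_seen)
```

A worker that re-registers on every synchronization stays in the list; one that stops doing so is dropped at the next `purge()` once the grace time has elapsed.
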
2 changes: 1 addition & 1 deletion docs/source/DeveloperGuide/Overview/index.rst
@@ -132,7 +132,7 @@ Configuration Service
The Configuration Service is built in the DISET framework to provide static configuration parameters to
all the distributed DIRAC components. This is the backbone of the whole system and necessitates excellent
reliability. Therefore, it is organized as a single master service where all the parameter
updates are done and multiple read-only slave services which are distributed geographically, on VO-boxes
updates are done and multiple read-only worker services which are distributed geographically, on VO-boxes
at Tier-1 LCG sites in the case of LHCb. All the servers are queried by clients in a load balancing way.
This arrangement ensures configuration data consistency together with very good scalability properties.
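
One way to picture clients querying the servers "in a load balancing way" is a random-order failover loop. The sketch below is an assumption about the general idea, not DIRAC client code: `query_any` and the `fetch` callable are hypothetical, and the URLs in any usage are placeholders.

```python
import random


def query_any(servers, fetch, rng=random):
    """Try the servers in random order and return the first successful answer."""
    candidates = list(servers)
    rng.shuffle(candidates)  # spreads the load across all servers
    errors = {}
    for url in candidates:
        try:
            return url, fetch(url)
        except Exception as exc:  # a failing server just means: try the next one
            errors[url] = exc
    raise RuntimeError(f"no configuration server answered: {sorted(errors)}")
```

Because every client shuffles independently, read traffic is spread over the read-only servers, and a single unreachable server only costs one failed attempt.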

@@ -201,7 +201,7 @@ The agent will try to execute request as a whole in one go.
:alt: Treating of Request in the RequestExecutionAgent.
:align: center

The `RequestExecutingAgent` is using the `ProcessPool` utility to create slave workers (subprocesses running `RequestTask`)
The `RequestExecutingAgent` is using the `ProcessPool` utility to create workers (subprocesses running `RequestTask`)
designated to execute requests read from `ReqDB`. Each worker is processing request execution using following steps:

* downloading and setting up request's owner proxy
4 changes: 2 additions & 2 deletions src/DIRAC/AccountingSystem/Service/DataStoreHandler.py
@@ -1,6 +1,6 @@
""" DataStore is the service for inserting accounting reports (rows) in the Accounting DB
This service CAN be duplicated iff the first is a "master" and all the others are slaves.
This service CAN be duplicated iff the first is a "controller" and all the others are workers.
See the information about :ref:`datastorehelpers`.
.. literalinclude:: ../ConfigTemplate.cfg
@@ -112,7 +112,7 @@ def export_compactDB(self):
"""
Compact the db by grouping buckets
"""
# if we are running slaves (not only one service) we can redirect the request to the master
# if we are running workers (not only one service) we can redirect the request to the master
# For more information please read the Administrative guide Accounting part!
# ADVICE: If you want to trigger the bucketing, please make sure the bucketing is not running!!!!
if self.runBucketing:
2 changes: 1 addition & 1 deletion src/DIRAC/ConfigurationSystem/Agent/Bdii2CSAgent.py
@@ -103,7 +103,7 @@ def execute(self):
if not result["OK"]:
self.log.warn("Could not download a fresh copy of the CS data", result["Message"])

# Refresh the configuration from the master server
# Refresh the configuration from the controller server
gConfig.forceRefresh(fromMaster=True)

if self.processCEs:
4 changes: 2 additions & 2 deletions src/DIRAC/ConfigurationSystem/Client/CSAPI.py
@@ -98,7 +98,7 @@ def initialize(self):
return self.__initialized
retVal = gConfig.getOption("/DIRAC/Configuration/MasterServer")
if not retVal["OK"]:
self.__initialized = S_ERROR("Master server is not known. Is everything initialized?")
self.__initialized = S_ERROR("Controller server is not known. Is everything initialized?")
return self.__initialized
self.__rpcClient = ConfigurationClient(url=gConfig.getValue("/DIRAC/Configuration/MasterServer", ""))
self.__csMod = Modificator(
@@ -915,7 +915,7 @@ def getCurrentCFG(self):
def showDiff(self):
"""Just shows the differences accumulated within the Modificator object"""
diffData = self.__csMod.showCurrentDiff()
gLogger.notice("Accumulated diff with master CS")
gLogger.notice("Accumulated diff with Controller CS")
for line in diffData:
if line[0] in ("+", "-"):
gLogger.notice(line)
26 changes: 13 additions & 13 deletions src/DIRAC/ConfigurationSystem/Client/CSCLI.py
@@ -43,7 +43,7 @@ class CSCLI(CLI):
def __init__(self):
CLI.__init__(self)
self.connected = False
self.masterURL = "unset"
self.controllerURL = "unset"
self.writeEnabled = False
self.modifiedData = False
self.rpcClient = None
@@ -83,11 +83,11 @@ def _setConnected(self, connected, writeEnabled):
self.writeEnabled = writeEnabled
if connected:
if writeEnabled:
self.prompt = f"({self.masterURL})-{colorize('Connected', 'green')}> "
self.prompt = f"({self.controllerURL})-{colorize('Connected', 'green')}> "
else:
self.prompt = f"({self.masterURL})-{colorize('Connected (RO)', 'yellow')}> "
self.prompt = f"({self.controllerURL})-{colorize('Connected (RO)', 'yellow')}> "
else:
self.prompt = f"({self.masterURL})-{colorize('Disconnected', 'red')}> "
self.prompt = f"({self.controllerURL})-{colorize('Disconnected', 'red')}> "

def do_quit(self, dummy):
"""
@@ -104,7 +104,7 @@ def do_quit(self, dummy):

def _setStatus(self, connected=True):
if not connected:
self.masterURL = "unset"
self.controllerURL = "unset"
self._setConnected(False, False)
else:
retVal = self.rpcClient.writeEnabled()
@@ -118,23 +118,23 @@ def _setStatus(self, connected=True):
self._setConnected(True, False)

def _tryConnection(self):
print(f"Trying connection to {self.masterURL}")
print(f"Trying connection to {self.controllerURL}")
try:
self.rpcClient = ConfigurationClient(url=self.masterURL)
self.rpcClient = ConfigurationClient(url=self.controllerURL)
self._setStatus()
except Exception as x:
gLogger.error("Couldn't connect to master CS server", f"{self.masterURL} ({str(x)})")
gLogger.error("Couldn't connect to controller CS server", f"{self.controllerURL} ({str(x)})")
self._setStatus(False)

def do_connect(self, args=""):
"""
Connects to configuration master server (in specified url if provided).
Connects to configuration controller server (in specified url if provided).
Usage: connect <url>
"""
if not args or not isinstance(args, str):
self.masterURL = gConfigurationData.getMasterServer()
if self.masterURL != "unknown" and self.masterURL:
self.controllerURL = gConfigurationData.getMasterServer()
if self.controllerURL != "unknown" and self.controllerURL:
self._tryConnection()
else:
self._setStatus(False)
@@ -144,7 +144,7 @@ def do_connect(self, args=""):
print("Must specify witch url to connect")
self._setStatus(False)
else:
self.masterURL = splitted[0].strip()
self.controllerURL = splitted[0].strip()
self._tryConnection()

def do_sections(self, args):
@@ -247,7 +247,7 @@ def do_writeToServer(self, dummy):
choice = input("Do you really want to send changes to server? yes/no [no]: ")
choice = choice.lower()
if choice in ("yes", "y"):
print(f"Uploading changes to {self.masterURL} (It may take some seconds)...")
print(f"Uploading changes to {self.controllerURL} (It may take some seconds)...")
response = self.modificator.commit()
if response["OK"]:
self.modifiedData = False
@@ -63,7 +63,7 @@ class ConfigurationClient(Client):

def __init__(self, **kwargs):
# By default we use Configuration/Server as url, client do the resolution
# In some case url has to be static (when a slave register to the master server for example)
# In some case url has to be static (when a worker register to the controller server for example)
# It's why we can use 'url' as keyword arguments
if "url" not in kwargs:
kwargs["url"] = "Configuration/Server"
@@ -59,9 +59,9 @@ def export_getCompressedDataIfNewer(self, sClientVersion):

def export_publishSlaveServer(self, sURL):
"""
Used by slave server to register as a slave server.
Used by worker server to register as a worker server.
:param sURL: The url of the slave server.
:param sURL: The url of the worker server.
"""
self.ServiceInterface.publishSlaveServer(sURL)
return S_OK()
2 changes: 1 addition & 1 deletion src/DIRAC/ConfigurationSystem/private/Refresher.py
@@ -72,7 +72,7 @@ def autoRefreshAndPublish(self, sURL):
"""
gLogger.debug("Setting configuration refresh as automatic")
if not gConfigurationData.getAutoPublish():
gLogger.debug("Slave server won't auto publish itself")
gLogger.debug("Worker server won't auto publish itself")
if not gConfigurationData.getName():
import DIRAC

30 changes: 15 additions & 15 deletions src/DIRAC/ConfigurationSystem/private/RefresherBase.py
@@ -1,13 +1,11 @@
import time
import random


from DIRAC.ConfigurationSystem.Client.ConfigurationData import gConfigurationData
from DIRAC.ConfigurationSystem.Client.PathFinder import getGatewayURLs
from DIRAC.FrameworkSystem.Client.Logger import gLogger
from DIRAC.Core.Utilities import List
from DIRAC.Core.Utilities.EventDispatcher import gEventDispatcher
from DIRAC.Core.Utilities.ReturnValues import S_OK, S_ERROR
from DIRAC.Core.Utilities.ReturnValues import S_ERROR, S_OK
from DIRAC.FrameworkSystem.Client.Logger import gLogger


def _updateFromRemoteLocation(serviceClient):
@@ -90,29 +88,31 @@ def _refreshAndPublish(self):
Refresh configuration and publish local updates
"""
self._lastUpdateTime = time.time()
gLogger.info("Refreshing from master server")
sMasterServer = gConfigurationData.getMasterServer()
if sMasterServer:
gLogger.info("Refreshing from controller server")
sControllerServer = gConfigurationData.getMasterServer()
if sControllerServer:
from DIRAC.ConfigurationSystem.Client.ConfigurationClient import ConfigurationClient

oClient = ConfigurationClient(
url=sMasterServer,
url=sControllerServer,
timeout=self._timeout,
useCertificates=gConfigurationData.useServerCertificate(),
skipCACheck=gConfigurationData.skipCACheck(),
)
dRetVal = _updateFromRemoteLocation(oClient)
if not dRetVal["OK"]:
gLogger.error("Can't update from master server", dRetVal["Message"])
gLogger.error("Can't update from controller server", dRetVal["Message"])
return False
if gConfigurationData.getAutoPublish():
gLogger.info("Publishing to master server...")
gLogger.info("Publishing to controller server...")
dRetVal = oClient.publishSlaveServer(self._url)
if not dRetVal["OK"]:
gLogger.error("Can't publish to master server", dRetVal["Message"])
gLogger.error("Can't publish to controller server", dRetVal["Message"])
return True
else:
gLogger.warn("No master server is specified in the configuration, trying to get data from other slaves")
gLogger.warn(
"No controller server is specified in the configuration, trying to get data from other Workers"
)
return self._refresh()["OK"]

def _refresh(self, fromMaster=False):
@@ -127,9 +127,9 @@ def _refresh(self, fromMaster=False):
initialServerList = gatewayList
gLogger.debug("Using configuration gateway", str(initialServerList[0]))
elif fromMaster:
masterServer = gConfigurationData.getMasterServer()
initialServerList = [masterServer]
gLogger.debug(f"Refreshing from master {masterServer}")
controllerServer = gConfigurationData.getMasterServer()
initialServerList = [controllerServer]
gLogger.debug(f"Refreshing from controller {controllerServer}")
else:
initialServerList = gConfigurationData.getServers()
gLogger.debug(f"Refreshing from list {str(initialServerList)}")
22 changes: 11 additions & 11 deletions src/DIRAC/ConfigurationSystem/private/ServiceInterface.py
@@ -11,44 +11,44 @@

class ServiceInterface(ServiceInterfaceBase, threading.Thread):
"""
Service interface, manage Slave/Master server for CS
Service interface, manage Worker/Controller server for CS
Thread components
"""

def __init__(self, sURL):
threading.Thread.__init__(self)
ServiceInterfaceBase.__init__(self, sURL)

def _launchCheckSlaves(self):
def _launchCheckWorkers(self):
"""
Start loop which check if slaves are alive
Start loop which check if Workers are alive
"""
gLogger.info("Starting purge slaves thread")
gLogger.info("Starting purge Workers thread")
self.daemon = True
self.start()

def run(self):
while True:
iWaitTime = gConfigurationData.getSlavesGraceTime()
time.sleep(iWaitTime)
self._checkSlavesStatus()
self._checkWorkersStatus()

def _updateServiceConfiguration(self, urlSet, fromMaster=False):
def _updateServiceConfiguration(self, urlSet, fromController=False):
"""
Update configuration of a set of slave services in parallel
Update configuration of a set of Worker services in parallel
:param set urlSet: a set of service URLs
:param fromMaster: flag to force updating from the master CS
:param fromController: flag to force updating from the master CS
:return: Nothing
"""
if not urlSet:
return
with ThreadPoolExecutor(max_workers=len(urlSet)) as executor:
futureUpdate = {executor.submit(self._forceServiceUpdate, url, fromMaster): url for url in urlSet}
futureUpdate = {executor.submit(self._forceServiceUpdate, url, fromController): url for url in urlSet}
for future in as_completed(futureUpdate):
url = futureUpdate[future]
result = future.result()
if result["OK"]:
gLogger.info("Successfully updated slave configuration", url)
gLogger.info("Successfully updated Worker configuration", url)
else:
gLogger.error("Failed to update slave configuration", url)
gLogger.error("Failed to update Worker configuration", url)
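
The parallel update pattern in this hunk can be tried outside DIRAC with a stub in place of `_forceServiceUpdate`. The following is only a sketch under that assumption: `update_all` and the `force_update` callable are hypothetical names, and the result dictionaries merely imitate the `{"OK": ...}` convention visible above.

```python
from concurrent.futures import ThreadPoolExecutor, as_completed


def update_all(url_set, force_update):
    """Call force_update(url) for every URL in parallel and collect the results."""
    results = {}
    if not url_set:
        # Guard needed because ThreadPoolExecutor rejects max_workers=0
        return results
    with ThreadPoolExecutor(max_workers=len(url_set)) as executor:
        # Map each future back to its URL so results can be attributed
        future_to_url = {executor.submit(force_update, url): url for url in url_set}
        for future in as_completed(future_to_url):
            results[future_to_url[future]] = future.result()
    return results
```

One thread per URL keeps a slow or unreachable service from delaying the others, and the early return mirrors the `if not urlSet` guard in the diff.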