Developer Q&A

Multi-threading or multi-processing?

Harvester has a multi-threading architecture. Each thread executes plugin functions synchronously. Plugins can spawn background processes and/or threads to process tasks asynchronously, but the plugins themselves must control the number of background processes or threads.
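
As a minimal sketch of this pattern (the class and method names below are hypothetical, not part of the actual Harvester API), a plugin might cap its own background concurrency with a bounded pool:

```python
from concurrent.futures import ThreadPoolExecutor

class MyStagerPlugin:  # hypothetical plugin class, not a real Harvester name
    def __init__(self):
        # the plugin itself bounds its background concurrency
        self.pool = ThreadPoolExecutor(max_workers=4)

    def trigger_stage_out(self, files):
        # called synchronously by a Harvester thread; work is handed
        # off to the plugin-managed pool and the call returns quickly
        for f in files:
            self.pool.submit(self._process_file, f)
        return True

    def _process_file(self, f):
        ...  # actual transfer logic would go here
```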

How does the grid plugin API work?

Each worker is identified by a unique identifier in the batch system, such as a batch job ID or an HTCondor job ID. Plugins take actions using this identifier.
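
For illustration, a monitoring plugin for HTCondor might look up a worker by its batch identifier. This is a sketch, not the actual Harvester plugin code, and the status mapping is simplified:

```python
import subprocess

def check_worker_status(batch_id: str) -> str:
    """Sketch: query HTCondor for a worker's status by batch ID."""
    out = subprocess.run(
        ["condor_q", batch_id, "-format", "%d", "JobStatus"],
        capture_output=True, text=True,
    )
    # map a few HTCondor JobStatus codes to simple strings
    return {"1": "idle", "2": "running", "4": "finished"}.get(
        out.stdout.strip(), "unknown")
```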

How does Harvester send heartbeats/status info back to PanDA?

Harvester's propagator agents send heartbeats to PanDA every 30 minutes for running jobs, and immediately for finished/failed jobs.
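
A simplified sketch of that reporting rule (the function and constant names are illustrative, not Harvester internals):

```python
import time

HEARTBEAT_INTERVAL = 30 * 60  # seconds; running jobs report every 30 min

def needs_heartbeat(job_status: str, last_report: float) -> bool:
    """Illustrative rule: finished/failed jobs are reported
    immediately, running jobs every HEARTBEAT_INTERVAL seconds."""
    if job_status in ("finished", "failed"):
        return True
    return time.time() - last_report >= HEARTBEAT_INTERVAL
```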

How does Harvester use external components like the pilot, APF, and aCT?

It uses external components as libraries, i.e. they run in the same process and memory space.

How are pilots killed?

Normally, pilots kill themselves once they receive the kill command from PanDA through heartbeats. However, even if a pilot stops sending heartbeats, Harvester can get the list of stuck pilots from PanDA and kill them directly using condor_rm, etc.
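
A minimal sketch of the fallback path, assuming the list of stuck batch IDs has already been obtained from PanDA:

```python
import subprocess

def kill_stuck_workers(batch_ids):
    """Sketch: given batch IDs of stuck pilots obtained from PanDA,
    remove them from HTCondor with condor_rm."""
    for batch_id in batch_ids:
        subprocess.run(["condor_rm", batch_id], check=False)
```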

Which harvesters handle which PQs?

Each harvester instance has a unique identifier. Config files for harvester instances are stored on PanDA. When an instance starts up, it downloads its config file using the identifier. The config file contains the list of PQs for which the instance works.
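
A hypothetical sketch of the startup step; the URL, parameter names, and response layout are invented for illustration:

```python
import requests

def fetch_queue_config(harvester_id: str) -> list:
    """Sketch: download this instance's config from PanDA,
    keyed on its unique identifier."""
    resp = requests.get(
        "https://panda.example.org/harvester_config",  # placeholder URL
        params={"harvester_id": harvester_id},
    )
    resp.raise_for_status()
    cfg = resp.json()
    return cfg["queues"]  # e.g. the list of PQs this instance serves
```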

Multiple harvesters per PQ?

It is possible to have multiple harvester instances per PQ. For example, the queue depth can be set dynamically by PanDA for each harvester instance. The easiest solution would be to set the queue depth to 1000 when only one instance is running, then reduce it to 500 when another instance comes up for the same PQ.
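
The arithmetic of that example, as a tiny helper (the function name is illustrative):

```python
def queue_depth_per_instance(total_depth: int, n_instances: int) -> int:
    """Split a PQ's total depth evenly across the harvester
    instances serving it."""
    return total_depth // max(n_instances, 1)

# one instance gets 1000; when a second comes up, each gets 500
assert queue_depth_per_instance(1000, 1) == 1000
assert queue_depth_per_instance(1000, 2) == 500
```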

What exact job-specific attributes are desired from check_workers()?

In the pull model workflow, the status alone would be enough, since the pilot directly reports other information. In the push model workflow, all the information that the pilot normally reports would be desirable.
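
Illustrative shapes of the two return values; the attribute names below are assumptions, not the actual Harvester schema:

```python
# pull model: the pilot reports details itself, so status suffices
pull_result = {"status": "running"}

# push model: everything the pilot would normally report is wanted
push_result = {
    "status": "finished",
    "exit_code": 0,
    "cpu_consumption_time": 5231,
    "start_time": "2017-03-15T10:00:00Z",
    "end_time": "2017-03-15T12:05:00Z",
}
```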

Can we extend job attributes if needed?

Job attributes are stored in a CLOB field in the harvester DB. The field contains a serialized dictionary, so it is easy to add new attributes.
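
For example, because the CLOB holds a serialized dictionary (JSON is an assumption here about the serialization format), extending the attributes needs no schema change:

```python
import json

# field names are illustrative
attrs = {"status": "running", "exit_code": None}
attrs["new_attribute"] = "some value"   # extension is just a new key

clob_value = json.dumps(attrs)          # what gets stored in the DB
restored = json.loads(clob_value)       # what gets read back
```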

How does the harvester monitor work?

The idea is to periodically upload the contents of the harvester DB to Oracle. There will be a full or slimmed mirror table of the harvester DB in Oracle, and BigPandaMon will show views on that table.
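
A speculative sketch of the sync loop, assuming a local SQLite harvester DB and an uploader callable; the table name and interval are invented for illustration:

```python
import sqlite3
import time

def mirror_harvester_db(src_path: str, upload):
    """Sketch: periodically read rows from the local harvester DB
    and hand them to an uploader that writes the Oracle mirror."""
    while True:
        con = sqlite3.connect(src_path)
        rows = con.execute("SELECT * FROM workers").fetchall()
        con.close()
        upload(rows)      # e.g. bulk insert into the Oracle mirror table
        time.sleep(600)   # sync interval is an assumption
```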

Where are API call arguments and return values documented?

See the plugin API specifications.