Nightly TDAQ Release

These are the evolving release notes for the next major TDAQ release.

source /cvmfs/atlas.cern.ch/repo/sw/tdaq/tools/cmake_tdaq/bin/cm_setup.sh nightly
mkdir work
cd work
cp /cvmfs/atlas.cern.ch/repo/sw/tdaq/tools/cmake_tdaq/cmake/templates/CMakeLists.txt .
getpkg owl
mkdir build
cd build
cmake ..
make
make install

Use of Python 2 based system commands

With the move to Python 3 inside the TDAQ software there is now a fundamental incompatibility between the normal system setup and the TDAQ environment. If one wants to call a system command that is implemented in Python (Python 2 on CentOS 7), such as yum or auth-get-sso-cookie, the environment has to be manipulated.

In most cases something like the following will be enough:

env -u PYTHONHOME -u PYTHONPATH auth-get-sso-cookie ...

If the command also loads compiled libraries, it may be necessary to add -u LD_LIBRARY_PATH as well.
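When the system command has to be started from a Python 3 script running inside the TDAQ environment, the same idea can be expressed with the standard subprocess module. The following is a minimal sketch, not part of the TDAQ software; the helper names scrubbed_env and run_system_command are hypothetical.

```python
import os
import subprocess

# Variables set by the TDAQ environment that can confuse a system Python tool.
TDAQ_PYTHON_VARS = ("PYTHONHOME", "PYTHONPATH")


def scrubbed_env(environ=None, extra=()):
    """Return a copy of the environment without the Python-related variables.

    'extra' lists additional variables to drop, e.g. ("LD_LIBRARY_PATH",).
    """
    env = dict(os.environ if environ is None else environ)
    for name in (*TDAQ_PYTHON_VARS, *extra):
        env.pop(name, None)  # ignore variables that are not set
    return env


def run_system_command(argv, extra=()):
    """Run argv with the scrubbed environment, similar to `env -u NAME ... argv`."""
    return subprocess.run(argv, env=scrubbed_env(extra=extra), check=True)
```

Passing extra=("LD_LIBRARY_PATH",) corresponds to adding -u LD_LIBRARY_PATH on the command line.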

BeamSpotUtils

  • Updates to support new HLT histogram naming convention for Run3.
  • Add support for new track-based method.
  • Improvements and refactoring to support easier testing.

CES

Package: CES
Jira: ATDAQCCCES

Live documentation about recoveries and procedures can be found here.

Changes in recoveries and/or procedures:

  • When the TDAQ_CES_FORCE_REMOVAL_AUTOMATIC environment variable is set to true, no acknowledgment is requested from the operator before a stop-less removal action is executed (regardless of the machine or beam mode);
  • Stop-less removal involving the SwRod: the reporting application now always receives the initial list of components back (and not only the valid components, as it was before);
  • After each clock switch, a new command is sent to AFP and ZDC;
  • The BeamSpotArchiver_PerBunchLiveMon application is no longer started at warm stop;
  • The DCM is now notified when a SwROD dies or is restarted (in the same way as it happens with a ROS or a SFO).

Internal changes:

  • Adapted to changes in the MasterTrigger interface;
  • Fixed bug not allowing the auto-pilot to be disabled in some conditions;
  • Fixed bug causing the ERS HoldingTriggerAction message to not be sent.

Igui

The Igui twiki can be found here.

The Igui settings can now be configured via a configuration file. The file must be formatted as a property file. The Igui settings can still be configured using command line (i.e., system) properties and environment variables as described here; at the same time, priorities for configuration items have been defined:

  • Items defined in the configuration file have the highest priority;
  • Items not found in the configuration file are then searched for in system properties;
  • If no properties are defined for a configuration item, then environment variables are used.

The configuration file can be defined via the igui.property.file property. If that property is not defined, then a configuration file is searched in <USER-HOME>/.igui/igui_<TDAQ-RELEASE>.properties.
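The lookup priority for a configuration item (configuration file first, then system properties, then environment variables) can be illustrated with a short Python sketch. This is not the Igui implementation (the Igui is a Java application); the function resolve_setting and its arguments are hypothetical and only model the documented order.

```python
import os


def resolve_setting(prop_name, env_name, file_props, system_props, environ=None):
    """Return the value of a setting following the documented priority.

    1. the configuration file (file_props),
    2. the system properties (system_props),
    3. the environment variable env_name.
    Returns None when the item is not defined anywhere.
    """
    environ = os.environ if environ is None else environ
    if prop_name in file_props:
        return file_props[prop_name]
    if prop_name in system_props:
        return system_props[prop_name]
    return environ.get(env_name)
```

For example, igui.use.browser set in the configuration file wins over both a system property of the same name and the TDAQ_IGUI_USE_BROWSER environment variable.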

Here is an example of a configuration file containing all the available options:

# See https://twiki.cern.ch/twiki/bin/viewauth/Atlas/DaqHltIGUI#Settings for more details

# Environment variables can be used with the following format:
# ${ENV_VAR_NAME:-defaultValue}

# The default ERS subscription
# That can also be defined via the TDAQ_IGUI_ERS_SUBSCRIPTION environment variable
igui.ers.subscription=sev = FATAL

# The location of the OKS file holding the description of the default ERS filter
# That can also be defined via the TDAQ_IGUI_ERS_FILTER_FILE environment variable
igui.ers.filter.file=/atlas/oks/${TDAQ_RELEASE:-tdaq-10-00-00}/combined/sw/Igui-ers.data.xml

# The log-book URL
# That can also be defined via the TDAQ_ELOG_SERVER_URL environment variable
igui.elog.url=http://pc-atlas-www.cern.ch/elisa/api/

# The OKS GIT web view URL
# That can also be defined via the TDAQ_IGUI_DB_GIT_BASE environment variable
igui.git.url=http://pc-tdq-git.cern.ch/gitea/oks/${TDAQ_RELEASE:-tdaq-10-00-00}/commit/

# The browser to be used to open web links
# That can also be defined via the TDAQ_IGUI_USE_BROWSER environment variable
igui.use.browser=firefox

# Whether to show the warning at CONNECT or not
# That can also be defined via the TDAQ_IGUI_NO_WARNING_AT_CONNECT environment variable
igui.hide.connect.warning=true

# Whether to ask for the control resource when the Igui is started
# That can also be defined via the TDAQ_IGUI_USE_RM environment variable
igui.useRM=true

# The timeout (in minutes) after which an Igui in status display mode will show a dialog informing
# the operator that it is going to be terminated
# That can also be defined via the TDAQ_IGUI_FORCEDSTOP_TIMEOUT environment variable
tdaq.igui.forcedstop.timeout=60

# Message to be shown by a confirmation dialog before the STOP command is sent
# That can also be defined via the TDAQ_IGUI_STOP_WARNING environment variable
igui.stop.warning=Do not stop the partition@${TDAQ_PARTITION:-ATLAS}

# Number of rows to be shown by default by the ERS panel
# That can also be defined via the TDAQ_IGUI_ERS_ROWS environment variable
igui.ers.rows=1000

# Panels to forcibly load when the Igui is started
# That can also be defined via the TDAQ_IGUI_PANEL_FORCE_LOAD environment variable
# Panels are identified by their class names; multiple panels can be separated by ":"
igui.panel.force.load=PmgGui.PmgISPanel:IguiPanels.DFPanel.DFPanel

# Name of the main RDB server
# WARNING: do not define this property unless you really know what you are doing
igui.rdb.name=RDB

# Name of the read/write RDB server
# WARNING: do not define this property unless you really know what you are doing
igui.rdb_rw.name=RDB_RW
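To make the ${ENV_VAR_NAME:-defaultValue} format used in the example file more concrete, here is a small Python sketch of how such placeholders could be expanded. It is an illustration only, not the Igui code. The function name expand is hypothetical, and the sketch assumes shell-like semantics, i.e. that the default is used when the variable is unset or empty.

```python
import os
import re

# Matches ${NAME:-default}; the default may be empty but cannot contain "}".
_PLACEHOLDER = re.compile(r"\$\{([A-Za-z_][A-Za-z0-9_]*):-([^}]*)\}")


def expand(value, environ=None):
    """Replace every ${NAME:-default} in value.

    Assumption: as in a POSIX shell, the default is used when NAME is
    unset or set to an empty string.
    """
    environ = os.environ if environ is None else environ
    return _PLACEHOLDER.sub(
        lambda m: environ.get(m.group(1)) or m.group(2), value
    )
```

With TDAQ_RELEASE unset, the igui.git.url value from the example above would resolve to the tdaq-10-00-00 path; with TDAQ_RELEASE=nightly it would point to the nightly one.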

MonInfoGatherer

Alternative implementation for merging non-histogram data (ADHI-4842). This should resolve most of the timing issues we have seen with DCM data in the past. It is enabled by default but can be disabled with the ISDynAnyNG configuration parameter (see ConfigParameters.md).

PartitionMaker

The underlying implementation of the pm_farm.py tool has been changed from ParallelSSH to fabric. The reason is that the former seems to have been mostly unmaintained for the last couple of years.

ProcessManager

The ProcessManager twiki can be found here.

RunControl

This is the link to the main RunControl twiki.

SFOng

  • Added: periodic update of the free buffer counter (IS) even when no data are received. Previously, "0" free buffers was published when no data were received, giving the wrong impression that SFOng was about to assert backpressure.

TriggerCommander

Package: TriggerCommander

Implementations of the MasterTrigger interface

An implementation of the MasterTrigger interface has to implement a new method:

class X : public MasterTrigger {
  ...

  void setPrescalesAndBunchgroup(uint32_t l1p, uint32_t hltp, uint32_t bg) override;
  ...
};

beauty

nightly

Remove automated setting of pBeast server

The initial Beauty implementation provided code automatically setting the pBeast server based on the values of environmental variables. This logic proved to be weak and not easily maintainable:

  • especially in testbed, the pBeast server name has changed many times
  • overloading the pBeast environmental variable for the authentication method for setting the server name is strictly speaking incorrect.

The new implementation completely removes this detection code. It is instead the responsibility of the user to always provide a server name to the Beauty constructor.

Old code relying on the previous implicit mechanism will fail:

>>> import beauty
>>> beauty.Beauty()
Traceback (most recent call last):
  File "<stdin>", line 1, in <module>
TypeError: __init__() missing 1 required positional argument: 'server'

Control pBeast authentication method

Different pBeast servers require different authentication methods. Currently two methods are supported:

  • no authentication required
  • authentication via an existing Kerberos token

The authentication method to be used can be controlled via a dedicated environmental variable (PBEAST_SERVER_SSO_SETUP_TYPE) or via the library API. The latter method is now exposed in the Beauty interface. The Beauty.__init__ method accepts a keyword argument cookie_setup. Valid values are:

  • None → the default behaviour of the pBeast library will be used. The environment variable will be respected, if set
  • Beauty.NOCOOKIE → no authentication required
  • Beauty.AUTOUPDATEKERBEROS → authentication via Kerberos token

>>> import beauty
>>> b = beauty.Beauty('http://someserver', cookie_setup=beauty.Beauty.NOCOOKIE)

coca

  • Tag coca-03-15-09
  • Add type annotations for Python modules.

coldpie

  • tag coldpie-00-05-01
  • add coral.pyi and cool.pyi type-annotated interface specifications

DAQ Tokens

Add a new method to acquire a token if the user has a Kerberos ticket: 'http'. This connects to a URL (default: ${TDAQ_TOKEN_HTTP_URL:-https://vm-atdaq-token.cern.ch/token} ) which is protected by GSSAPI and will return a token if the authentication succeeds. This is intended to replace the gssapi mechanism going forward.

A PEM file can contain more than one public key. All will be processed by verify(). This makes it possible to rotate in a new key while the old one is still valid without changing TDAQ_TOKEN_PUBLIC_KEY_URL manually, which is impossible anyway for already running processes.
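The key rotation idea can be sketched in Python as follows. This is only an illustration of the concept and not the daq_tokens implementation. The names split_public_keys and verify_with_any are hypothetical, and verify_one stands for whatever single-key verification routine is used. The sketch assumes that routine raises ValueError when a key does not match.

```python
import re

# One "PUBLIC KEY" block of a PEM file, including its BEGIN/END markers.
_PUBLIC_KEY_BLOCK = re.compile(
    r"-----BEGIN PUBLIC KEY-----.*?-----END PUBLIC KEY-----", re.DOTALL
)


def split_public_keys(pem_text):
    """Return the individual public key blocks found in a PEM bundle."""
    return _PUBLIC_KEY_BLOCK.findall(pem_text)


def verify_with_any(token, keys, verify_one):
    """Try each key in turn and return the first successful result.

    verify_one(token, key) is a caller-supplied (hypothetical) function that
    returns the decoded token or raises ValueError if the key does not match.
    """
    for key in keys:
        try:
            return verify_one(token, key)
        except ValueError:
            continue  # try the next key, e.g. the newly rotated-in one
    raise ValueError("token not accepted by any of the public keys")
```

During a rotation the PEM file would hold both the old and the new key, so tokens signed with either one are still accepted.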

DVS GUI (graphical UI for DVS)

See also related DVS and TestManager packages.

dynlibs - Load shared libraries

This package is deprecated from tdaq-09-00-00 onwards.

Please use plain dlopen() or boost::dll instead. Note that, unlike in this package, the boost::dll::shared_library object has to stay in scope for as long as the shared library is used!

Example of possible replacement

#include <boost/dll.hpp>

double example()
{
   boost::dll::shared_library lib("libmyplugin.so",
                                  boost::dll::load_mode::type::search_system_folders |
                                  boost::dll::load_mode::type::rtld_now);
   // Get pointer to function with given signature
   auto f = lib.get<double(double, double)>("my_function");
   return f(10.2, 3.141);
}

emon

  • Connections between event samplers and monitors have been optimized. Existing configurations should be adjusted to benefit from that. Previously the NumberOfSamplers parameter defined the number of samplers to be connected by all monitors of a group that uses the same selection criteria. In the new implementation this number defines the number of samplers that each individual monitor has to connect to. That makes no difference for monitors that used to connect to a single sampler and don't form a group. For monitors that share the same selection criteria, for example the Global Monitoring tasks, this number should be changed to the old number divided by the number of monitors in the group. For Athena monitoring the corresponding parameter of a JO file is called KeyCount.
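The adjustment of NumberOfSamplers for a group of monitors is simple arithmetic, sketched below in Python. The function name samplers_per_monitor is hypothetical. Rounding up when the old number is not an exact multiple of the group size, and never going below one sampler, are assumptions of this sketch rather than rules stated by emon.

```python
def samplers_per_monitor(old_number_of_samplers, monitors_in_group):
    """New per-monitor NumberOfSamplers: old group-wide number / group size.

    Assumption: the result is rounded up and is never smaller than 1.
    """
    if monitors_in_group < 1:
        raise ValueError("a group contains at least one monitor")
    # -(-a // b) is integer ceiling division.
    return max(1, -(-old_number_of_samplers // monitors_in_group))
```

For instance, a group of 4 monitors that used to be configured with NumberOfSamplers = 8 would now use 2 for each monitor, while a stand-alone monitor with NumberOfSamplers = 1 keeps its value.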

mda

  • tag mda-07-19-01
  • Add type annotations for Python modules.

swrod

Memory Management

Memory management of the SW ROD fragment builders can now be configured via the MemoryPool OKS class, which was ignored in previous releases. Each SwRodApplication object was already linked with an instance of MemoryPool, but in the new release this instance is used to define the default configuration for all ROBs handled by the given SW ROD application. This configuration can be overridden for a particular ROB by linking another instance of MemoryPool with the corresponding SwRodRob object. The meaning of the two MemoryPool attributes is the following:

  • PageSize - default size of individual memory pages allocated by this memory pool. Note that this value can be overridden by fragment builder implementation, which is explained by the algorithm's description in the updated User's Guide.

  • NumberOfPages - the number of pages that will be pre-allocated before a new run is started. The maximum number of pages that can be allocated by the memory pool is unlimited.

If one memory page is not large enough to hold the data of a particular ROB fragment the fragment builders will allocate extra pages. More information is given in the updated User's Guide.

Note that the MaxMessageSize parameter of the SwRodFragmentBuilder class has now slightly different meaning with respect to the previous SW ROD versions. Now it truly means what its name implies, i.e. it defines the maximum size of a single data packet that will be accepted by the algorithm. Packets with the sizes exceeding this limit will be discarded. Packets of a smaller size are guaranteed to be added to the ROB fragment payload without truncation.
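The effect of PageSize and MaxMessageSize can be illustrated with two small Python helpers. They are a simplified model, not the SW ROD code: both function names are hypothetical, page and fragment headers are ignored, all pages are assumed to have the same size, and a packet whose size is exactly equal to MaxMessageSize is assumed to be accepted.

```python
def pages_needed(fragment_size, page_size):
    """Number of equally sized pages needed to hold fragment_size bytes."""
    if page_size <= 0:
        raise ValueError("page_size must be positive")
    # Ceiling division; in this model even an empty fragment uses one page.
    return max(1, -(-fragment_size // page_size))


def filter_packets(packet_sizes, max_message_size):
    """Split packet sizes into (accepted, discarded) by MaxMessageSize.

    Assumption: a packet of exactly max_message_size bytes is accepted.
    """
    accepted = [s for s in packet_sizes if s <= max_message_size]
    discarded = [s for s in packet_sizes if s > max_message_size]
    return accepted, discarded
```

In this model a 4097-byte fragment with PageSize = 4096 needs one extra page, and with MaxMessageSize = 2000 a 5000-byte packet is dropped while smaller ones are kept whole.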

Support of netio and netio-next

Starting from this release SW ROD no longer supports the legacy netio protocol and therefore cannot be used to receive data from the old felix-core systems. For receiving data from felix-star via the new netio-next protocol one must use the FelixClient interface, which can be configured via the SwRodFelixInput OKS class. The direct use of the netio-next API is also no longer supported. Because of this, both the SwRodNetioInput and the SwRodNetioNextInput classes have been removed from the SW ROD OKS schema file.

transport

Clients using classes from the transport package should look into more modern network libraries like boost::asio until the C++ standard contains an official network library.

webis_server

The JSON based API no longer HTML-escapes the metadata for the object. This is unnecessary since the characters that are invalid in JSON differ from those in HTML, and the JSON encoder does its own escaping anyway.

The ones most affected are histograms requested as raw IS values in JSON - however, this should be a basically non-existent use case. The type names change, e.g., from HistogramData&lt;int&gt; to HistogramData<int>.
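The difference can be reproduced with the Python standard library. This is a sketch of the effect seen by a client, not the webis_server code, and the "type" key used in the dictionary is only an assumed field name for illustration.

```python
import html
import json

type_name = "HistogramData<int>"

# Old behaviour: metadata was HTML-escaped before being JSON-encoded.
old_style = json.dumps({"type": html.escape(type_name)})

# New behaviour: only JSON encoding is applied; "<" and ">" are valid in JSON.
new_style = json.dumps({"type": type_name})

# A client that has to cope with both server versions can normalize the value:
# html.unescape() leaves an already unescaped string unchanged.
normalized_old = html.unescape(json.loads(old_style)["type"])
normalized_new = html.unescape(json.loads(new_style)["type"])
```

Here old_style contains HistogramData&lt;int&gt; while new_style contains HistogramData<int>, and both normalized values are equal to the original type name.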