tdaq-11-02-00
The ATLAS TDAQ software version tdaq-11-02-00
has been released
on December 1st, 2023.
Update: tdaq-11-02-01
has been built on January 31st, 2024. The only
difference is a new ROOT version which fixes a critical bug.
Availability and Installation
Outside of Point 1 the software should be used via CVMFS. Its official location is
/cvmfs/atlas.cern.ch/repo/sw/tdaq/tdaq/tdaq-11-02-00/
At Point 1 the software is as usual available at
/sw/atlas/tdaq/tdaq-11-02-00/
The software can also be installed locally via ayum.
git clone https://gitlab.cern.ch/atlas-sit/ayum.git
source ayum/setup.sh
Modify the prefix
entries in the yum repository files in ayum/etc/yum.repos.d/*.repo
to point to the desired destination.
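The prefix rewrite can be scripted with sed. A minimal sketch, demonstrated on a scratch copy; the destination /opt/tdaq is a placeholder, and in a real checkout you would run the sed command over ayum/etc/yum.repos.d/*.repo instead:

```shell
# Hypothetical sketch: rewrite every 'prefix=' entry to the desired
# install destination. /opt/tdaq and the scratch file are placeholders;
# in a real checkout, run the sed line over ayum/etc/yum.repos.d/*.repo.
mkdir -p /tmp/ayum-demo
cat > /tmp/ayum-demo/lcg.repo <<'EOF'
[lcg-104-el9-x86_64]
prefix=...your-prefix...
EOF
sed -i 's|^prefix=.*|prefix=/opt/tdaq|' /tmp/ayum-demo/lcg.repo
grep '^prefix=' /tmp/ayum-demo/lcg.repo   # prefix=/opt/tdaq
```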
ayum install tdaq-11-02-00_x86_64-el9-gcc13-opt
Note that ayum is not supported beyond CentOS 7. On newer systems, switch to the new atlas-dnf5 tool and pick the latest release.
In case the LCG RPMs are not found with ayum, add this to etc/yum.repos.d/lcg.repo:
[lcg-104-centos7-x86_64]
name=LCG 104 Releases (CentOS 7)
baseurl=https://lcgpackages.web.cern.ch/lcgpackages/lcg/repo/7/x86_64/LCG_104/
prefix=...your-prefix...
gpgcheck=0
enabled=0
protect=0
[lcg-104-el9-x86_64]
name=LCG 104 Releases (EL 9)
baseurl=https://lcgpackages.web.cern.ch/lcgpackages/lcg/repo/9/x86_64/LCG_104/
prefix=...your-prefix...
gpgcheck=0
enabled=0
protect=0
[lcg-packs-centos7-x86_64]
name=LCG Release Packages (CentOS 7)
baseurl=https://lcgpackages.web.cern.ch/lcgpackages/lcg/repo/7/x86_64/Packages/
prefix=...your-prefix...
gpgcheck=0
enabled=0
protect=0
[lcg-packs-centos9-x86_64]
name=LCG Release Packages (CentOS 9)
baseurl=https://lcgpackages.web.cern.ch/lcgpackages/lcg/repo/9/x86_64/Packages/
prefix=...your-prefix...
gpgcheck=0
enabled=0
protect=0
Configurations
The release is available for the following configurations:
- x86_64-el9-gcc13-opt (default at Point 1)
- x86_64-el9-gcc13-dbg (debug version at Point 1)
- x86_64-centos7-gcc11-opt (legacy)
- x86_64-centos7-gcc11-dbg (legacy)
- aarch64-el9-gcc13-opt (experimental)
The EL 9 variant treats CentOS Stream, Rocky Linux, AlmaLinux and Red Hat Enterprise Linux as equivalent. Note that the tag for the operating system has changed from centos9 to el9. This may also require changes in OKS databases that are copied over from older releases.
External Software
LCG_104c
The version of the external LCG software is LCG_104c.
Update: The LCG version for tdaq-11-02-01 is LCG_104d where ROOT was updated to v6.28.12.
TDAQ Specific External Software
Package | Version | Requested by |
---|---|---|
cmzq | 4.2.1 | FELIX |
zyre | 2.0.1 | FELIX |
colorama | 0.4.4 | DCS |
opcua | 0.98.13 | DCS |
parallel-ssh | 2.10.0 | TDAQ (PartitionMaker) |
pugixml | 1.9 | L1Calo, L1CTP |
ipbus-software | 2.8.12 | L1Calo, L1CTP |
microhttpd | 0.9.73 | TDAQ (pbeast) |
mailinglogger | 5.1.0 | TDAQ (SFO) |
Twisted | 22.4.0 | TDAQ (webis_server) |
urwid | 2.1.2 | TDAQ |
jwt-cpp | v0.6.0 | TDAQ |
Flask | 2.1.3 | TDAQ |
flask-sock | 0.6.0 | TDAQ (Phase II) |
gunicorn | 21.2.0 | TDAQ (Phase II) |
python-ldap | 3.4.3 | TDAQ (Phase II) |
Note that libfabric
has been removed from the TDAQ externals.
New and Removed Packages
The following packages have been added since tdaq-10-00-00:
- FELIXPyTools
- ers2idl
- sso-helper
The following packages have been removed since tdaq-10-00-00:
- ROSTCPNP
- felixbus
- netio
- pmaker
- trp_gui
- TriggerTool
- TrigDb
- BunchGroupUpdate
OCI and Apptainer Images
You can mostly replace podman
with docker
in the following examples.
The OCI images used for building and testing the TDAQ software are available here:
podman pull gitlab-registry.cern.ch/atlas-tdaq-software/tdaq_ci:x86_64-el9
podman pull gitlab-registry.cern.ch/atlas-tdaq-software/tdaq_ci:aarch64-el9
podman pull gitlab-registry.cern.ch/atlas-tdaq-software/tdaq_ci:x86_64-centos7
podman pull gitlab-registry.cern.ch/atlas-tdaq-software/tdaq_ci:aarch64-centos7
A multi-arch image for both Intel and ARM64 is available as
gitlab-registry.cern.ch/atlas-tdaq-software/tdaq_ci:el9
Run it like this:
podman run -it --rm -v /cvmfs:/cvmfs:ro,shared gitlab-registry.cern.ch/atlas-tdaq-software/tdaq_ci:x86_64-el9
The corresponding apptainer/singularity images are available on CVMFS:
/cvmfs/unpacked.cern.ch/gitlab-registry.cern.ch/atlas-tdaq-software/tdaq_ci:x86_64-el9
/cvmfs/unpacked.cern.ch/gitlab-registry.cern.ch/atlas-tdaq-software/tdaq_ci:aarch64-el9
/cvmfs/unpacked.cern.ch/gitlab-registry.cern.ch/atlas-tdaq-software/tdaq_ci:x86_64-centos7
/cvmfs/unpacked.cern.ch/gitlab-registry.cern.ch/atlas-tdaq-software/tdaq_ci:aarch64-centos7
Run it like this:
apptainer shell -p -B /cvmfs /cvmfs/unpacked.cern.ch/gitlab-registry.cern.ch/atlas-tdaq-software/tdaq_ci:x86_64-centos7
The -p
option puts you in a separate PID namespace, so you only see the processes inside the container when
you use ps xuwf
. It will also kill all of your processes when you exit the container.
In case you have no local CVMFS, build your own SIF file:
apptainer build tdaq-el9.sif docker://gitlab-registry.cern.ch/atlas-tdaq-software/tdaq_ci:x86_64-el9
apptainer shell -p -B /cvmfs tdaq-el9.sif
An inclusive multi-arch container that does not need CVMFS and includes the LCG software is
available here (only for the *-el9-gcc13-opt
configuration):
podman pull registry.cern.ch/atlas-tdaq/tdaq:11.2.0
podman run -it --rm registry.cern.ch/atlas-tdaq/tdaq:11.2.0
Graphical UIs in a container
GUIs mostly don't work out of the box in a container. With apptainer
the instructions above should be sufficient.
For podman
use the following additional arguments:
podman run -it -v /tmp/.X11-unix:/tmp/.X11-unix -e DISPLAY=$DISPLAY -h $HOSTNAME --rm registry.cern.ch/atlas-tdaq/tdaq:11.2.0
docker
is the most complicated. If you plan to use the container regularly, the best way is to
create a customized version just for you (replace the uid
and user
arguments with your local
user ID and user name):
FROM registry.cern.ch/atlas-tdaq/tdaq:11.2.0
ARG uid=1000
ARG user=rhauser
RUN useradd -U -u ${uid} -m ${user}
USER ${user}
Then run:
docker build -t my/tdaq:11.2.0 .
Alternatively you can overwrite the arguments from the command line by adding --build-arg uid=2000 --build-arg user=myself
.
Finally run it like this:
docker run -it -v /tmp/.X11-unix:/tmp/.X11-unix -e DISPLAY=$DISPLAY -h $HOSTNAME --rm my/tdaq:11.2.0
Inside the container
The above examples put you into a container where you have the same user ID as outside. This also simplifies the use case where you want to mount e.g. a directory with files from the host into the container.
. /etc/profile.d/tdaq.sh
cm_setup tdaq-11-02-00
export TDAQ_IPC_INIT_REF=file:/tmp/init.ref
cd /tmp
pm_part_hlt.py -p test
setup_daq test.data.xml initial
setup_daq test.data.xml test
You should get the IGui on your screen and be able to interact with it.
Setup Options
In the following we assume some alias like
alias cm_setup='source /cvmfs/atlas.cern.ch/repo/sw/tdaq/tools/cmake_tdaq/bin/cm_setup.sh'
- The cm_setup --list option will show the available releases, including nightlies.
- The cm_setup --clean ... option will bypass all testbed-specific setup. This is useful if you want to use testbed hardware but be completely independent of the existing infrastructure. You have to set your own TDAQ_IPC_INIT_REF path to start a private initial partition, if needed.
- The cm_setup script takes a short version of the CMTCONFIG build configuration as argument, e.g.:
  - cm_setup nightly dbg will set up x86_64-el9-gcc13-dbg
  - cm_setup nightly gcc13 will set up x86_64-el9-gcc13-opt
  - cm_setup nightly gcc12-dbg will set up x86_64-el9-gcc12-dbg
- There is no shortcut for setting the architecture or the OS.
AccessManager
tdaq-11-02-00
Add the TDAQ_AM_CONFIGURATION_FILE
process environment variable to specify the client configuration file. The /sw/tdaq/AccessManager/cfg/client.cfg
file takes priority if it exists.
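A minimal sketch of using this variable; the file path below is illustrative, not an official location:

```shell
# Point AccessManager clients at a private configuration file.
# The path is an example; any readable file location works.
export TDAQ_AM_CONFIGURATION_FILE=$HOME/private/am-client.cfg
# Remember: /sw/tdaq/AccessManager/cfg/client.cfg still takes
# priority if it exists on the node.
echo "$TDAQ_AM_CONFIGURATION_FILE"
```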
Add SETPRESCALESANDBUNCHGROUP
to the TriggerCommander interface commands.
Add an RDB interface with OPEN, CLOSE, UPDATE, RELOAD, OPEN_SESSION and LIST_SESSIONS commands, to be used by the RDB server.
BeamSpotUtils
- Updates to support new HLT histogram naming convention for Run3.
- Add support for new track-based method.
- Improvements and refactoring to support easier testing.
CES
Package: CES
Jira: ATDAQCCCES
Live documentation about recoveries and procedures can be found here.
Changes in recoveries and/or procedures:
- When the TDAQ_CES_FORCE_REMOVAL_AUTOMATIC environment variable is set to true, no acknowledgment is requested from the operator for a stop-less removal action to be executed (regardless of the machine or beam mode);
- Stop-less removal involving the SwRod: the reporting application now always receives the initial list of components back (and not only the valid components, as it was before);
- After each clock switch, a new command is sent to AFP and ZDC;
- The BeamSpotArchiver_PerBunchLiveMon application is not started at warm stop anymore;
- The DCM is now notified when a SwROD dies or is restarted (in the same way as it happens with a ROS or an SFO).
Internal changes:
- Followed changes in the MasterTrigger interface;
- Fixed a bug that prevented the auto-pilot from being disabled in some conditions;
- Fixed a bug causing the ERS HoldingTriggerAction message not to be sent.
Igui
The Igui
twiki can be found here.
The Igui settings can now be configured via a configuration file. The file must be formatted as a property file. The Igui settings can still be configured using command line (i.e., system) properties and environment variables as described here; at the same time, priorities for configuration items have been defined:
- Items defined in the configuration file have the highest priority;
- Items not found in the configuration file are then searched in system properties;
- If no properties are defined for a configuration item, then environment variables are used.
The configuration file can be defined via the igui.property.file
property. If that property is not defined, then
a configuration file is searched in <USER-HOME>/.igui/igui_<TDAQ-RELEASE>.properties
.
Here is an example of a configuration file containing all the available options:
# See https://twiki.cern.ch/twiki/bin/viewauth/Atlas/DaqHltIGUI#Settings for more details
# Environment variables can be used with the following format:
# ${ENV_VAR_NAME:-defaultValue}
# The default ERS subscription
# That can also be defined via the TDAQ_IGUI_ERS_SUBSCRIPTION environment variable
igui.ers.subscription=sev = FATAL
# The location of the OKS file holding the description of the default ERS filter
# That can also be defined via the TDAQ_IGUI_ERS_FILTER_FILE environment variable
igui.ers.filter.file=/atlas/oks/${TDAQ_RELEASE:-tdaq-10-00-00}/combined/sw/Igui-ers.data.xml
# The log-book URL
# That can also be defined via the TDAQ_ELOG_SERVER_URL environment variable
igui.elog.url=http://pc-atlas-www.cern.ch/elisa/api/
# The OKS GIT web view URL
# That can also be defined via the TDAQ_IGUI_DB_GIT_BASE environment variable
igui.git.url=http://pc-tdq-git.cern.ch/gitea/oks/${TDAQ_RELEASE:-tdaq-10-00-00}/commit/
# The browser to be used to open web links
# That can also be defined via the TDAQ_IGUI_USE_BROWSER environment variable
igui.use.browser=firefox
# Whether to show the warning at CONNECT or not
# That can also be defined via the TDAQ_IGUI_NO_WARNING_AT_CONNECT environment variable
igui.hide.connect.warning=true
# Whether to ask for the control resource when the Igui is started
# That can also be defined via the TDAQ_IGUI_USE_RM environment variable
igui.useRM=true
# The timeout (in minutes) after which an Igui in status display mode will show a dialog informing
# the operator that it is going to be terminated
# That can also be defined via the TDAQ_IGUI_FORCEDSTOP_TIMEOUT environment variable
tdaq.igui.forcedstop.timeout=60
# Message to be shown by a confirmation dialog before the STOP command is sent
# That can also be defined via the TDAQ_IGUI_STOP_WARNING environment variable
igui.stop.warning=Do not stop the partition@${TDAQ_PARTITION:-ATLAS}
# Number of rows to be shown by default by the ERS panel
# That can also be defined via the TDAQ_IGUI_ERS_ROWS environment variable
igui.ers.rows=1000
# Panels to forcibly load when the Igui is started
# That can also be defined via the TDAQ_IGUI_PANEL_FORCE_LOAD environment variable
# Panels are identified by their class names; multiple panels can be separated by ":"
igui.panel.force.load=PmgGui.PmgISPanel:IguiPanels.DFPanel.DFPanel
# Name of the main RDB server
# WARNING: do not define this property unless you really know what you are doing
igui.rdb.name=RDB
# Name of the read/write RDB server
# WARNING: do not define this property unless you really know what you are doing
igui.rdb_rw.name=RDB_RW
Jers - Java ERS
tdaq-11-04-00
The use of the unmaintained SPI library (https://github.com/rspilker/spi) for annotating classes that provide the ers.Stream interface, for runtime registration in the ERS StreamFactory, has been replaced with org.reflections (https://github.com/ronmamo/reflections). Stream classes are now declared (annotated) like:
@ErsStreamName(name="null")
public class NullStream extends AbstractOutputStream { ...
The list of packages searched for stream providers is ers
and mts
.
MonInfoGatherer
Alternative implementation for merging non-histogram data (ADHI-4842). This
should resolve most of the timing issues seen with DCM data in the
past. It is enabled by default but can be disabled with the ISDynAnyNG
configuration parameter (see ConfigParameters.md
).
PartitionMaker
The underlying implementation of the pm_farm.py
tool has
been changed from ParallelSSH to fabric,
since the former has been mostly unmaintained
for the last couple of years.
tdaq-11-02-00
Build DALs from OKS files in the release installation area to avoid side effects when using oksgit and rdb.
ProcessManager
The ProcessManager
twiki can be found here.
Resource manager (RM)
tdaq-11-02-00
Important changes
The configuration is loaded when the RM server is started, in accordance with TDAQCCRM-16.
The RM was updated following the results of the review TDAQCCRM-20:
- Check correspondence of the existing design and implementation with known client needs (PMG, IGUI, other systems).
- Identify and remove any branches of code no longer used.
- Follow TDAQ recommendations to use ERS exceptions in C++, Java and IDL.
- Identify needs to add/fix doxygen and javadoc comments that are missing, wrong, incomplete or unclear.
All exceptions mentioned below are in the daq::rmgr
namespace.
IDL updates
Removed exceptions:
- PartitionNotFoundE
- ObjectNotFoundE
- HandleIdNotFoundE
- ProcessNotFoundE
- AcceessDeniedE (only AcceessAnyDeniedE remains in use, to check whether access is available)
Added exception:
- SoftwareNotFoundE
Changes in interface methods
rmConfigUpdate - parameters changed
In the previous version a client could update the configuration on the server using a local database name. Now, in accordance with TDAQCCRM-16, the configuration is loaded only on the server side. This method only initiates a server configuration reload from the same database that was used at start time. New signature: void rmConfigUpdate ( in string partition, in string user ) raises( ConfigExceptionE ); The parameters reflect which user and from which partition the configuration reload was initiated. The exception is raised if the configuration update fails on the server side.
requestResources - exceptions changed
Removed:
- ObjectNotFoundE
Added:
- SoftwareNotFoundE: raised if the RM fails to find the software object with the given id.
requestResourcesForProcess - exceptions changed
Removed:
- ObjectNotFoundE
Added:
- ResourceNotFoundE: raised if the RM fails to find the resource with the given id.
requestResourcesForProcess - removed in accordance with ATDAQCCRM-4
freeProcessResources - exceptions changed
Removed exceptions:
- PartitionNotFoundE
- ProcessNotFoundE
- ObjectNotFoundE
freeAllProcessResources - exceptions changed
Removed exceptions:
- HandleIdNotFoundE
- PartitionNotFoundE
- ProcessNotFoundE
- ObjectNotFoundE
freeResources - exceptions changed
Removed:
- HandleIdNotFoundE
freePartition is deprecated and will be removed in the next release. Use freeAllInPartition instead.
freeAllInPartition
The behavior of this method is the same as freePartition; only the name has been changed to be clearer to users.
freeComputerAll is deprecated and will be removed in the next release. Use freeAllOnComputer instead.
freeAllOnComputer
The behavior of this method is the same as freeComputerAll; only the name has been changed to be clearer to users.
freePartitionResource - exceptions changed
Removed:
- AcceessDeniedE
freeResource - exceptions changed
Removed:
- AcceessDeniedE
getResourceInfo - exceptions changed
Removed exceptions:
- PartitionNotFoundE
- ResourceNotFoundE
getFullResInfo - exceptions changed
Removed:
- ResourceNotFoundE
getHandleInfo - exceptions changed
Removed:
- HandleIdNotFoundE
getResourceOwners - exceptions changed
Removed exceptions:
- PartitionNotFoundE
- ResourceNotFoundE
RM_Client API updates
Most external dependencies were moved from the RM_Client class to the new RM_ClientImpl class, which is outside the client library. The Doxygen method descriptions have been refreshed.
Exceptions policy
Many unthrown exceptions were removed from all methods. While connecting to the RM server, each method can throw one of two exceptions:
- LookupFailed
- IPCException
Every method can also throw:
- CommunicationException
Additional exceptions can be raised in the following methods:
requestResources with signature long requestResources(const std::string& partition, const std::string& swobjectid, const std::string& computerName, const std::string& clientName)
- SoftwareNotFound
- CommunicationException
- UnavailableResources
requestResources with signature long requestResources(const std::string& partition, const std::string& swobjectid, const std::string& computerName, unsigned long mypid, const std::string& clientName)
- SoftwareNotFound
- CommunicationException
- UnavailableResources
requestResource with signature requestResource( const std::string& partition, const std::string& resource, const std::string& computer, const std::string& clientName, int process_id )
- ResourceNotFound
- CommunicationException
- UnavailableResources
requestResourceForMyProcess with signature requestResourceForMyProcess(const std::string& partition, const std::string& resource, const std::string& computerName, const std::string& clientName)
- ResourceNotFound
- CommunicationException
- UnavailableResources
getPartitionInfo
- CannotVerifyToken
- AcceessAnyDenied
- CannotAcquireToken
freePartition
- CannotVerifyToken
- AcceessAnyDenied
- CannotAcquireToken
freeAllInPartition
- CannotVerifyToken
- AcceessAnyDenied
- CannotAcquireToken
freeComputer
- CannotVerifyToken
- AcceessAnyDenied
- CannotAcquireToken
freeAllOnComputer
- CannotVerifyToken
- AcceessAnyDenied
- CannotAcquireToken
freeComputerResource
- CannotVerifyToken
- AcceessAnyDenied
- CannotAcquireToken
freeApplicationResources
- CannotVerifyToken
- AcceessAnyDenied
- CannotAcquireToken
RM_Client methods updates
rmConfigurationUpdate - input parameter changed
Description: the RM server updates the configuration from the DB that was used during RM server start.
Signature:
void rmConfigurationUpdate ( const std::string& partition );
Input parameter: the partition within which the configuration update is activated.
rmConfigurationUpdate with signature void rmConfigurationUpdate ( Configuration * conf )
- removed
requestHWResources - removed in accordance with ATDAQCCRM-4
requestHWResourceForProcess - removed in accordance with ATDAQCCRM-4
requestResource with signature long requestResource( const std::string& partition, const std::string& resource, const std::string& computer, const std::string& clientName, int process_id )
- added new method, see Doxygen documentation for details.
freePartition - deprecated and will be removed in the next release. Use freeAllInPartition instead.
freeAllInPartition - added new method
The behavior of this method is the same as freePartition; only the name has been changed to be clearer to users.
freeComputer - deprecated and will be removed in the next release. Use freeAllOnComputer instead.
freeAllOnComputer - added new method
The behavior of this method is the same as freeComputer; only the name has been changed to be clearer to users.
RM server updates
rmgr_server
Start-up without a preloaded configuration was replaced by a mandatory configuration load, since this is required by the new initial infrastructure implementation. As a result the environment variable TDAQ_RM_CONFIG is no longer used; instead a new parameter was added to rmgr_server: -c [ --configdb ] dbname, the configuration database from which software objects and resources are loaded at startup. It is mandatory; the RM server reports an ERS FATAL error and terminates if the database cannot be loaded.
rmgr_server catches all exceptions thrown in RM_Server, reports an ers::fatal error and returns EXIT_FAILURE.
RunControl
This is the link to the main RunControl twiki.
SFOng
- added: periodic update of the free buffer counter (IS) even when no data are received. Previously "0" free buffers was published when no data were received, giving the wrong impression that SFOng was about to assert backpressure.
TriggerCommander
Package: TriggerCommander
Implementations of the MasterTrigger interface
An implementation of the MasterTrigger
interface has to implement a new method:
class X : public MasterTrigger {
...
void setPrescalesAndBunchgroup(uint32_t l1p, uint32_t hltp, uint32_t bg) override;
...
};
beauty
nightly
Remove automated setting of pBeast server
The initial Beauty
implementation provided code that automatically set the pBeast server based on the values of environment variables. This logic proved to be weak and not easily maintainable:
* especially in the testbed, the pBeast server name has changed many times
* overloading the pBeast environment variable for the authentication method to also set the server name is, strictly speaking, incorrect.
The new implementation removes this detection code completely. It is instead the responsibility of the user to always provide a server name to the Beauty
constructor.
Old code relying on the previous implicit mechanism will fail:
>>> import beauty
>>> beauty.Beauty()
Traceback (most recent call last):
File "<stdin>", line 1, in <module>
TypeError: __init__() missing 1 required positional argument: 'server'
Control pBeast authentication method
Different pBeast servers require different authentication methods. Currently two methods are supported:
* no authentication required
* authentication via an existing Kerberos token
The authentication method to be used can be controlled via a dedicated environment variable (PBEAST_SERVER_SSO_SETUP_TYPE
) or via the library API.
The latter method is now exposed in the Beauty
interface. The Beauty.__init__
method accepts a keyword argument cookie_setup
. Valid values are:
* None
→ the default behaviour of the pBeast library will be used. The environment variable will be respected, if set
* Beauty.NOCOOKIE
→ no authentication required
* Beauty.AUTOUPDATEKERBEROS
→ authentication via Kerberos token
>>> import beauty
>>> b = beauty.Beauty('http://someserver', cookie_setup=beauty.Beauty.NOCOOKIE)
coca
- Tag
coca-03-15-09
- Add type annotations for Python modules.
tdaq-11-02-00
- Tag
coca-03-15-08
- Fix syntax in
coca-migrate-3-to-4.sql
- Tag
coca-03-15-07
- Add missing include file for gcc 13
- Tag
coca-03-15-06
- Enable space cleanup across multiple servers/releases (ADHI-4902)
coldpie
- tag
coldpie-00-06-00
- dropped Python2 support
- tag
coldpie-00-05-01
- add
coral.pyi
andcool.pyi
type-annotated interface specifications
tdaq-11-02-01
tdaq-11-02-00
- tag
coldpie-00-05-00
- Add keyword-only
order
argument to ChannelSelection constructor
tdaq-11-01-00
config
tdaq-11-02-00
- rename Configuration::prefetch_all_data() to Configuration::prefetch_data()
- add Configuration::prefetch_schema() to preload the schema
- always preload the schema when a Configuration is instantiated from Python (see ADTCC-323)
The data are prefetched if the "TDAQ_DB_PREFETCH_ALL_DATA" or "TDAQ_DB_PREFETCH_DATA" process environment variables are defined.
The schema is prefetched if the "TDAQ_DB_PREFETCH_SCHEMA" process environment variable is defined, or if the Configuration object is instantiated by the Python configuration classes.
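For example (a sketch; assuming the code only checks whether the variables are defined, the values themselves are illustrative):

```shell
# Enable data prefetch for a Configuration session; the value is
# illustrative, the presence of the variable is what matters.
export TDAQ_DB_PREFETCH_DATA=1
# Force schema preload even outside the Python configuration classes.
export TDAQ_DB_PREFETCH_SCHEMA=1
```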
dal
tdaq-11-02-00
- add el9 (Alma9) and gcc13 tags
- add the application's segment path into TDAQ_ERS_QUALIFIERS (see ADTCC-324 for details)
- add attribute RunTags to the Partition class to support run modes in OKS (see ATDSUPPORT-456 for details)
- add support for calculation of Python paths by a DAL algorithm (see below)
Python paths by DAL algorithm
See ATDSUPPORT-452 for details.
Users should remove any hardcoded Python paths in OKS and use the new approach described below.
The following lines were added to the Online@SW_Repository
object in daq/sw/repository.data.xml
:
<attr name="PythonPaths" type="string">
<data val="${TAG}/lib"/>
<data val="external/${TAG}/lib/python3.9/site-packages"/>
<data val="share/lib/python"/>
</attr>
The following lines were added to the "TDAQ Common"@SW_Repository
object in daq/sw/tdaq-common-repository.data.xml
:
<attr name="PythonPaths" type="string">
<data val="share/lib/python"/>
<data val="external/${TAG}/lib/python"/>
<data val="${TAG}/lib"/>
<data val="external/${TAG}/lib"/>
</attr>
This results in:
TDAQ_DB_DATA='' dal_dump_apps -d oksconfig:daq/segments/setup.data.xml -p initial -n DefaultRootController | grep PYTHONPATH | sed 's/\:/\n/g;s/=/=\n/'
* PYTHONPATH=
"/cvmfs/atlas-online-nightlies.cern.ch/tdaq/nightlies/tdaq/tdaq-99-00-00/installed/x86_64-el9-gcc13-opt/lib
/cvmfs/atlas-online-nightlies.cern.ch/tdaq/nightlies/tdaq/tdaq-99-00-00/installed/share/lib/python
/cvmfs/atlas-online-nightlies.cern.ch/tdaq/nightlies/tdaq/tdaq-99-00-00/installed/external/x86_64-el9-gcc13-opt/lib/python3.9/site-packages
/cvmfs/atlas-online-nightlies.cern.ch/tdaq/nightlies/tdaq-common/tdaq-common-99-00-00/installed/share/lib/python
/cvmfs/atlas-online-nightlies.cern.ch/tdaq/nightlies/tdaq-common/tdaq-common-99-00-00/installed/external/x86_64-el9-gcc13-opt/lib/python
/cvmfs/atlas-online-nightlies.cern.ch/tdaq/nightlies/tdaq-common/tdaq-common-99-00-00/installed/x86_64-el9-gcc13-opt/lib
/cvmfs/atlas-online-nightlies.cern.ch/tdaq/nightlies/tdaq-common/tdaq-common-99-00-00/installed/external/x86_64-el9-gcc13-opt/lib"
DAQ Tokens
DPoP (Demonstrating Proof of Possession) Support
DAQ tokens will contain an associated DPoP proof which is checked on the receiver side. This makes it possible to associate an access token with a certain operation, so that a stolen or lost token cannot be replayed against a different server for a different functionality.
From the user's point of view this appears as an
additional optional string parameter to acquire()
and
verify()
. The argument should be a URI-like string
characterizing the desired operation. It can be as simple
as a generic service indicator like pmg:
in which case
it provides domain separation (e.g. such a token cannot
be used for changing trigger keys, or run control commands).
Or, it can contain the called method and arguments themselves,
in which case it can only be replayed with the exact same
parameters, e.g. trg://ATLAS/setPrescales/3423/2532.
Furthermore the verify()
function takes an optional
parameter max_age
which specifies the maximum allowed age
of the DPoP proof. The default is 15 seconds (i.e. the token
is useless after 15 seconds). Since a DPoP proof is generated
by the client for each request, its lifetime can be as short
or as long as needed (defined by the server), up to the lifetime
of the access token itself.
This concerns only the implementations of various CORBA services that use the access manager for authorization decisions and is transparent for everyone else.
Helper functions for credentials
#include "daq_tokens/credentials.h"
Use daq::tokens::get_credentials()
to either acquire a token or just
get the current user name.
Use daq::tokens::verify_credentials(creds)
to check the received credentials
and return the associated user name.
Both functions take an optional operation
string parameter for use with
DPoP proofs. The verify function takes another optional integer parameter specifying
the maximum age of the DPoP proof in seconds (default is 15 seconds).
Similarly the daq.tokens.Credentials
package in Java provides static
get()
and verify()
functions with multiple overloads to create
or verify credentials.
tdaq-11-04-00
Support oidc-agent
Add a new method (agent
) to acquire a token from a running oidc-agent
.
To make use of that the oidc-agent-desktop
package has to be installed
on the system (e.g. lxplus). You have to start the agent manually
if your system is not configured to do it automatically.
eval $(oidc-agent-service use)
Note: This does not work on lxplus at the moment; use
eval $(oidc-agent)
Create an entry with the name atlas-tdaq-token
:
oidc-gen --flow=device --client-id atlas-tdaq-token -m --pub atlas-tdaq-token
- Select CERN as issuer.
- Type return for scopes
- Specify a password (not your CERN password...!)
Every time you want to make use of the agent, make sure it is started and then run
oidc-add atlas-tdaq-token
[enter your password]
Set the TDAQ_TOKEN_ACQUIRE
environment variable to agent
.
Use any interactive TDAQ commands as usual.
Support local token cache
Add a new method to acquire token from a refresh token stored in a cache file: 'cache'. This is intended for interactive use or for long running jobs where a refresh token is acquired out of band and provisioned for the specific use case.
Example:
export TDAQ_TOKEN_ACQUIRE="cache browser"
If set in an interactive shell, the first time a token is acquired by any application the cache is empty. The browser will open a window to let the user authenticate as usual. The refresh token returned from this exchange is stored in the cache.
When another application requests a token, it will find the refresh token in the cache and use it to acquire an access token, transparent for the user. When the refresh token expires (at the end of the SSO session), the user will be prompted again via the browser for authentication.
Instead of browser
the device
method can be given if the user session
is e.g. via ssh
and not able to start a browser.
The refresh token can also be acquired out of band, e.g. by using
the sso-helper
functions. In particular one can request a token with
offline_access
that will only expire if it hasn't been used for three months.
source sso-helper.sh
token_from_browser atlas-tdaq-token | jq -r .refresh_token > $HOME/.cache/tdaq-sso-cache
chmod og-rwx $HOME/.cache/tdaq-sso-cache
The default cache location is $HOME/.cache/tdaq-sso-cache
with a fallback to
/tmp/tdaq-sso-cache-$USER
if the home directory does not exist. The file must
be only readable by the owner or it will be ignored. The location can also be
set explicitly by defining the TDAQ_TOKEN_CACHE
variable.
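The lookup and permission rules above can be sketched as follows; this is an illustration of the described behaviour, not the actual daq_tokens code:

```python
import os
import stat

def default_cache_path() -> str:
    """TDAQ_TOKEN_CACHE wins; otherwise $HOME/.cache/tdaq-sso-cache,
    falling back to /tmp/tdaq-sso-cache-$USER if the home directory
    does not exist."""
    explicit = os.environ.get("TDAQ_TOKEN_CACHE")
    if explicit:
        return explicit
    home = os.environ.get("HOME")
    if home and os.path.isdir(home):
        return os.path.join(home, ".cache", "tdaq-sso-cache")
    return "/tmp/tdaq-sso-cache-" + os.environ.get("USER", "")

def cache_file_usable(path: str) -> bool:
    """The cache file is ignored unless it exists and is readable
    only by its owner (no group/other permission bits set)."""
    try:
        mode = os.stat(path).st_mode
    except OSError:
        return False
    return (mode & (stat.S_IRWXG | stat.S_IRWXO)) == 0
```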
All this together can be used for a long running job that needs TDAQ authentication like this:
source sso-helper.sh
token_from_browser atlas-tdaq-token offline_access | jq -r .refresh_token > $HOME/.config/my-sso-cache
chmod og-rwx $HOME/.config/my-sso-cache
and then use it:
cm_setup
export TDAQ_TOKEN_CACHE=$HOME/.config/my-sso-cache
start-my-job
tdaq-11-03-00
Add a new method to acquire a token if the user
has a kerberos ticket: 'http'. This connects
to a URL (default: ${TDAQ_TOKEN_HTTP_URL:-https://vm-atdaq-token.cern.ch/token}
)
which is protected by GSSAPI and will return a token
if the authentication succeeds. This is intended to
replace the gssapi
mechanism going forward.
A PEM file can contain more than one public key. All
will be processed by verify()
. This makes it possible to rotate
in a new key while the old one is still valid without
changing TDAQ_TOKEN_PUBLIC_KEY_URL
manually -
which is anyway impossible for already running processes.
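As an illustration of why this works: a PEM file is just a concatenation of key blocks, so a verifier can split it and try each key in turn. This is a sketch, not the daq_tokens implementation:

```python
# Sketch of handling a PEM file that contains several public keys
# (illustration only, not the daq_tokens code).

BEGIN = "-----BEGIN PUBLIC KEY-----"
END = "-----END PUBLIC KEY-----"

def split_pem_keys(pem_text: str) -> list:
    """Split concatenated PEM blocks into one string per key; a verifier
    can then try each key until one validates the token signature."""
    keys, current = [], []
    for line in pem_text.splitlines():
        current.append(line)
        if line.strip() == END:
            keys.append("\n".join(current))
            current = []
    return keys

# A file with the old and the new key during rotation:
two_keys = "\n".join([BEGIN, "old-key-data", END, BEGIN, "new-key-data", END])
print(len(split_pem_keys(two_keys)))
```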
DVS (Diagnostics and Verification Framework)
tdaq-11-02-00
Do not expose TM/Client.h from manager.h (better separation of implementation from interface).
DVS GUI (graphical UI for DVS)
See also related DVS and TestManager packages.
dynlibs - Load shared libraries
This package is deprecated from tdaq-09-00-00 onwards.
Please use plain dlopen()
or boost::dll
instead. Note that unlike in this package, the boost::dll::shared_library
object has to stay in
scope as long as the shared library is used !
Example of possible replacement
#include <boost/dll.hpp>
double example()
{
boost::dll::shared_library lib("libmyplugin.so",
boost::dll::load_mode::type::search_system_folders |
boost::dll::load_mode::type::rtld_now);
// Get pointer to function with given signature
auto f = lib.get<double(double, double)>("my_function");
return f(10.2, 3.141);
}
emon
- Connections between event samplers and monitors have been optimized. Existing configurations should be adjusted to benefit from that. Previously the NumberOfSamplers parameter defined the number of samplers to be connected by all monitors of a group that use the same selection criteria. In the new implementation this number defines the number of samplers that each individual monitor has to connect to. That makes no difference for monitors that connect to a single sampler and don't form a group. For monitors that share the same selection criteria, like for example the Global Monitoring tasks, this number should be changed to the old number divided by the number of monitors in the group. For Athena monitoring the corresponding parameter in the JO file is called KeyCount.
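As a worked example of the conversion described above (plain arithmetic, not emon code):

```python
# Old semantics: NumberOfSamplers counted samplers for the whole monitor
# group. New semantics: it counts samplers per individual monitor.

def new_number_of_samplers(old_group_value: int, monitors_in_group: int) -> int:
    """New per-monitor value = old group-wide value / number of monitors."""
    return old_group_value // monitors_in_group

# Four monitors that previously shared NumberOfSamplers = 8 should now
# configure 2 samplers each; a stand-alone monitor keeps its old value.
print(new_number_of_samplers(8, 4))
print(new_number_of_samplers(1, 1))
```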
ers2idl
tdaq-11-02-00
New TDAQ package that can be used for passing ERS issues (including chained ones) declared in IDL interfaces between CORBA server and client.
It contains the ERS Issue definition in an IDL file (ers2idl module) that should be included in other IDL files and used in IDL interface definitions (e.g. as an _out
type of parameter for returning an Issue from a remote server). For the implementation, a helper C++ library is provided to convert the IDL-generated structure into an ers::Issue
C++ object on the client side, or in the other direction from ers::Issue
into ers2idl::Issue
for instantiating it on the server side. Available helper functions (namespace daq):
daq::ers2idl_issue(::ers2idl::Issue& message, const ers::Issue& issue)
std::unique_ptr<ers::Issue> daq::idl2ers_issue(const ::ers2idl::Issue& message)
::ers2idl::Issue_var daq::ers2idl_issue(const ers::Issue& issue)
IPC
TLS Support
IPC based servers and clients can use TLS to protect their communication.
The TDAQ_IPC_ENABLE_TLS
environment variable
can be used to turn this capability on
(if value = "1") or off (if value = "0"). This
is typically a global setting as part of the
common environment. If enabled, clients will
be able to talk to servers that require TLS.
For a C++ based server to enforce TLS, two environment variables should be set in its environment (in addition to the variable above):
ORBendPoint=giop:ssl::
ORBendPointPublish=giop:ssl::
An OKS VariableSet
called TDAQ_IPC_TLSSERVER_Environment
is available that can be linked to an
Application
or Binary
object.
For testing, the variables can also be set manually in an interactive environment.
For Java based servers the following property has to be set:
jacorb.security.ssl.server.required_options=20
There is an overhead involved in setting up a TLS connection, as well as for the encryption during the data transfer phase. Therefore it should be used only for servers that receive confidential data like authorization tokens, and it should be avoided for anything where potentially large amounts of data are exchanged, like IS servers.
In practice currently only the following applications require this:
- pmgserver
- run controllers
- master trigger implementations
- RDB writer
Note that the way TLS is used in the IPC package does not provide authentication, only confidentiality and integrity. Authentication has to be handled on the application level.
mda
- tag
mda-07-19-01
- Add type annotations for Python modules.
tdaq-11-02-00
- tag
mda-07-19-00
- Allow MDA app to read histograms from other partitions
MTS
tdaq-11-04-00
Subscription syntax
Extended the subscription syntax, allowing:
- the ':' symbol in tokens
- a wildcard at the beginning of a token
- spaces as part of context values, e.g. in function or file names
Current subscription syntax (quasi-EBNF notation):
key = 'app' | 'msg' | 'qual'
sev = 'sev'
par = '(param|par)(+[a-zA-Z_-])'
context_item = 'app' | 'host' | 'user' | 'package' | 'file' | 'function' | 'line'
context = 'context(context_item)'
sevlev = 'fatal' | 'error' | 'warning' | 'info' | 'information'
token_wildcard = -'*'+[a-zA-Z_.:-]-'*'
token = +[a-zA-Z_.:-]
char = // any character excluding quotes
quoted_string = '"'+char'"' | "'"+char"'"
item = (key ('=' | '!=') token_wildcard) | (sev ('=' | '!=') sevlev) | (par ('=' | '!=') token)
| (context ('=' | '!=') quoted_string)
factor = item | ( expression ) | ('not' factor)
expression = '*' | factor *(('and' expression) | ('or' expression))
An example of subscription is
sev=ERROR or (msg=mtssender::* and (msg=*Died or msg=*::longmessageid) and param(pid)!=666 and context(line)!=1 and context(function) != 'const int daq::function')
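The wildcard rules of the grammar can be sketched with a small hypothetical helper (not the MTS implementation); a '*' is only allowed at the beginning and/or end of a token:

```python
import re

def wildcard_to_regex(token: str) -> "re.Pattern":
    """Turn a token_wildcard such as 'mtssender::*' or '*Died' into a
    regular expression; the token body is matched literally."""
    body = token.strip("*")
    pattern = re.escape(body)
    if token.startswith("*"):
        pattern = ".*" + pattern
    if token.endswith("*"):
        pattern = pattern + ".*"
    return re.compile("^" + pattern + "$")

# 'msg=mtssender::*' matches any message id starting with 'mtssender::',
# '*Died' matches any message id ending in 'Died'.
print(bool(wildcard_to_regex("mtssender::*").match("mtssender::Died")))
print(bool(wildcard_to_regex("*Died").match("mtssender::Died")))
```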
Use new way of declaring jERS stream implementations
Follow the changes in jERS, using the ErsStreamName annotation to declare a class implementing a certain ERS stream:
@ErsStreamName(name="mts")
public class MTSInStream extends ers.AbstractInputStream { ...
tdaq-11-02-00
Use new ers2idl package. Removed dependency on OWLSemaphore.
Added the ers2splunk.sh utility to redirect the output of mts2splunk_receiver to stdout and further to TCP host:port, feeding the splunk indexer on a host. This is to be used on the tbed splunk instance. An application (ers2splunk) is added to the setup segment infrastructure on tbed.
OKS
tdaq-11-02-00
- use tbb::scalable_allocator instead of boost::fast_pool_allocator
- add debug info (.git/oks_proc_info) when cloning an oks repository
- use flock on /var/tmp before entering the oks-commit.sh pull/push section, to avoid issues when a user runs multiple commits on the same node
- add ordered flag to the oks relationship constructor, to be used by rdb's oks copy
oksconfig
tdaq-11-02-00
Add version
parameter to db spec as discussed in ADTCC-328.
As an example, the spec "oksconfig:combined/partitions/ATLAS.data.xml&version=tag:r454833@ATLAS" defines the ATLAS OKS configuration for run 454833.
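The structure of the extended spec can be illustrated by taking it apart (a sketch of the syntax, not the oksconfig parser):

```python
# The db spec is '<plugin>:<file>&version=<version>'.
spec = "oksconfig:combined/partitions/ATLAS.data.xml&version=tag:r454833@ATLAS"

plugin, rest = spec.split(":", 1)          # plugin name before the first ':'
path, _, version = rest.partition("&version=")

print(plugin)   # the configuration plugin
print(path)     # the OKS database file
print(version)  # the requested repository version
```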
P-BEAST
tdaq-11-02-00
There is no longer a need to use the obsolete auth-get-sso-cookie
package from CERN IT to get Point 1 data on GPN. The cookie is created by the DAQ sso-helper library.
To read data on GPN, the proxy kerberos authentication method is recommended. A valid kerberos ticket is needed.
All changes in details:
* limit number of processing threads in repository server (16 by default) to avoid OOM issues in case of heavy load
* fix json syntax issue for special symbols in user data (like quotes in string data), see ADAMATLAS-437 for details
* validate partition, class and attribute parameters in server calls, see ADAMATLAS-438
* add option -V
to specify floating point data precision in output of pbeast command line tools, see ADAMATLAS-442 for details
* if defined, the sampleInterval
parameter in readSeries
REST call has a priority, see ADAMATLAS-450 for details
* support regular expressions for getAnnotations
REST call, see ADAMATLAS-451 for details
* replace obsolete CERN IT's auth-get-sso-cookie
package by DAQ sso-helper library, see ADTCC-326 for details
* support proxy kerberos alternative authentication method (recommended for GPN), see ADHI-4951 for details
RDB
tdaq-11-02-00
Introduce RDB Access Manager policy
Introduce Access Manager policy for the commands changing the state of the RDB server:
* OPEN
* CLOSE
* UPDATE
* RELOAD
* OPEN_SESSION
* LIST_SESSIONS
An appropriate permission has to be granted using AM DAQ:rdb rules. The user running RDB server has permission to send any command.
Pass ERS exceptions using ers2idl package
Use ers2idl::Issue
to pass ERS exceptions from server to client over CORBA and modify exception declarations:
exception NotFound {
ers2idl::Issue issue;
};
exception CannotProceed {
ers2idl::Issue issue;
};
Use DAQ tokens for user authentication
Use DAQ tokens to pass information about the user performing RDB actions and log important user actions into the RDB logs.
SSO Helper
tdaq-11-00-00
The sso-helper
library helps to interact with the CERN Single Sign On (SSO) system. The code has been extracted from various existing private implementations (most notably daq_tokens
) to make it more generally available.
The main motivation is to avoid having to call out to the existing Python scripts like auth-get-sso-cookies
and auth-get-sso-token
from DAQ applications. Apart from the overhead of starting a Python interpreter for this task, the scripts are either incompatible with the TDAQ environment (if taken from the system) or not always up to date (if taken from LCG).
Two command line tools replicate (and extend) the functionality of the above mentioned scripts and can be used from shell scripts:
get-sso - retrieve an SSO protected URL, optionally store cookies for re-use
get-sso-token - retrieve (new) SSO OIDC tokens, refresh them etc.
For more details see the README file.
swrod
Memory Management
Memory management of the SW ROD fragment builders can now be configured via the MemoryPool OKS class, which in previous releases used to be ignored. Each SwRodApplication object was already linked with an instance of MemoryPool; in the new release this instance defines the default configuration for all ROBs handled by the given SW ROD application. This configuration can be overridden for a particular ROB by linking another instance of MemoryPool with the corresponding SwRodRob object. The meaning of the two MemoryPool attributes is the following:
-
PageSize - the default size of individual memory pages allocated by this memory pool. Note that this value can be overridden by a fragment builder implementation, as explained in the algorithm's description in the updated User's Guide.
-
NumberOfPages - the number of pages that will be pre-allocated before a new run is started. The maximum number of pages that can be allocated by the memory pool is unlimited.
If one memory page is not large enough to hold the data of a particular ROB fragment, the fragment builders will allocate extra pages. More information is given in the updated User's Guide.
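The page allocation rule above amounts to simple arithmetic; a sketch (not actual SW ROD code):

```python
import math

def pages_needed(fragment_size: int, page_size: int) -> int:
    """A fragment occupies one page if it fits; otherwise extra pages
    are allocated from the pool to hold the remainder."""
    return max(1, math.ceil(fragment_size / page_size))

# With a PageSize of 4096 bytes, a 100-byte fragment needs one page,
# a 10000-byte fragment needs three.
print(pages_needed(100, 4096))
print(pages_needed(10000, 4096))
```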
Note that the MaxMessageSize parameter of the SwRodFragmentBuilder class now has a slightly different meaning with respect to previous SW ROD versions. Now it truly means what its name implies, i.e. it defines the maximum size of a single data packet that will be accepted by the algorithm. Packets with sizes exceeding this limit will be discarded. Packets of a smaller size are guaranteed to be added to the ROB fragment payload without truncation.
Support of netio and netio-next
Starting from this release SW ROD no longer supports the legacy netio protocol and therefore cannot be used to receive data from the old felix-core systems. For receiving data from felix-star via the new netio-next protocol one must use the FelixClient interface, which can be configured via the SwRodFelixInput OKS class. Direct use of the netio-next API is also no longer supported. Because of this, both the SwRodNetioInput and SwRodNetioNextInput classes have been removed from the SW ROD OKS schema file.
transport
Clients using classes from the transport
package should
look into more modern network libraries like
boost::asio
until the C++ standard contains an official network library.
webdaq
The webdaq
package provides an HTTP based client
API to the basic online TDAQ services like IS, OH, EMON
and OKS. It only depends on curl and nlohmann::json libraries
and can be compiled in a stand-alone mode outside of any
TDAQ specific environment. It is part of tdaq-common to
make it automatically available to offline projects that
don't build against the full TDAQ stack.
Possible use cases are
- Use in Athena without requiring the full TDAQ stack
- Use in a stand-alone SoC environment without TDAQ software
Configuration
The library requires the TDAQ_WEBDAQ_BASE
environment variable
to be set. It should include the URL to access a running
webis_server
instance, e.g. https://atlasop.cern.ch or http://localhost:8080
To publish any kind of information the webis_server
instance has to
be started with the --writable
option. Note that this is not the case for
the instance that is accessible from outside Point 1 under
https://atlasop.cern.ch
API
For the API see the main header file.
webis_server
The JSON based API no longer HTML-escapes the metadata for the object. This is unnecessary since the characters that are invalid in JSON differ from those in HTML, and the JSON encoder does its own escaping anyway.
The ones most affected are histograms requested as raw IS values in JSON - however, this should be a basically non-existent use case. The type names change e.g. from HistogramData&lt;int&gt; to HistogramData<int>.
tdaq-11-00-00
The JSON based API now returns a JSON formatted response in case of an error. It will contain an object with two keys:
{
  "error" : "error_value",
  "error_description" : "human readable error message"
}
The error
value is one of a few defined strings, containing no whitespace. It can
be used to check for certain error types (e.g. invalid_partition
). The
error_description
is a human readable string that can be presented to the user.
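A client can branch on the fixed error string, for example (a sketch; the error_description text here is hypothetical):

```python
import json

# Sketch of client-side handling of the JSON error format shown above.
response_body = '''
{
  "error" : "invalid_partition",
  "error_description" : "no partition with that name is running"
}
'''

reply = json.loads(response_body)
if "error" in reply:
    # 'error' is one of a few defined strings with no whitespace,
    # so it is safe to compare against known values
    print(reply["error"] == "invalid_partition")
    # 'error_description' is free text meant for display to the user
    print(reply["error_description"])
```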