Python Scripting Framework - monitoring
Project summary
Basic reference
EDG Fabric Monitoring -
The EDG WP4 Fabric Monitoring framework provides a facility to collect data from distributed nodes
and to store it centrally. It is composed of agents (Monitoring Sensor Agent - MSA), which
run sensors (Monitoring Sensors - MS) on each monitored node, and a central server (fmonServer)
that collects the data. The server receives samples as they are measured by the MSAs and stores them in
a repository, which is accessible through the repository API. Communication between an MSA
and its sensors is based on a so-called ASCII protocol: each MS communicates
with the MSA through its standard input/output streams. The protocol is specified
in the sensor API document.
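The stdin/stdout arrangement described above can be illustrated with a minimal Python sketch. The line format used here (a request line holding a metric name, answered by an `OK <metric> <timestamp> <value>` or `ERR <metric> <reason>` line) is purely hypothetical and is NOT the EDG fmon ASCII protocol; the real message format is defined in the sensor API document. The function names `handle_request` and `serve` are likewise invented for illustration.

```python
import time


def handle_request(line, metrics, now=time.time):
    """Answer one request line with one reply line.

    ASSUMPTION: the 'OK'/'ERR' reply format is hypothetical and does not
    reproduce the real EDG fmon sensor protocol.
    """
    metric_id = line.strip()
    if metric_id not in metrics:
        return "ERR %s unknown-metric" % metric_id
    try:
        value = metrics[metric_id]()
    except Exception as exc:
        # A failing measurement is reported to the agent, not raised.
        return "ERR %s %s" % (metric_id, exc)
    return "OK %s %d %s" % (metric_id, int(now()), value)


def serve(metrics, stdin, stdout, now=time.time):
    """Read requests line by line until EOF, writing one reply per request."""
    for line in stdin:
        if not line.strip():
            continue  # ignore blank lines
        stdout.write(handle_request(line, metrics, now) + "\n")
        stdout.flush()  # the agent reads replies as they are produced
```

A sensor script built this way would end with a call such as `serve({"uptime": read_uptime}, sys.stdin, sys.stdout)`, where `read_uptime` is any function returning the measured value. Because the streams are passed in as arguments, the loop can be exercised with in-memory streams without a running agent.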
Important materials:
EDG fmon documentation
EDG fmon release notes
EDG fmon tutorial
sensor API
repository API
Python Scripting Framework - monitoring (PSFmon) -
A scripting framework built in Python that aims to facilitate fast creation of
specific sensors/metrics. It fully adheres to the EDG fabric monitoring specification. Its main
purposes are:
- hiding the details of the EDG fmon protocols (communication with agents, error handling,
providing results, timestamps etc.) from the user
- an object-oriented, 'higher-level' way of monitoring (hierarchy of classes, separate
functional layers, recursive computing, complex errors etc.)
- rapid development of ready-to-work sensors/metrics; complete templates are provided
PSFmon comprises a Python package with the appropriate modules,
template sensors/metrics, MSA config files and shell scripts (for execution).
Linode is also attached to PSFmon as an example of a fully functional
library used for sampling data and correlating the results.
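To make the object-oriented idea more concrete, here is a minimal sketch of how a class hierarchy can keep timestamps and error handling out of the metric author's way. It does not reproduce the actual PSFmon package: the names `Metric`, `MetricError`, `FileExists` and `AllOf`, and the dictionary returned by `sample()`, are all assumptions made for illustration.

```python
import os
import time


class MetricError(Exception):
    """Hypothetical error raised when a measurement cannot be taken."""


class Metric:
    """Hypothetical base class: subclasses only implement measure()."""

    name = "metric"

    def measure(self):
        raise NotImplementedError

    def sample(self, now=time.time):
        # Timestamping and error handling live here, once, for all metrics.
        timestamp = int(now())
        try:
            value = self.measure()
        except MetricError as exc:
            return {"metric": self.name, "time": timestamp,
                    "value": None, "error": str(exc)}
        return {"metric": self.name, "time": timestamp,
                "value": value, "error": None}


class FileExists(Metric):
    """Example metric: 1 if the given path exists, else 0."""

    name = "file_exists"

    def __init__(self, path):
        self.path = path

    def measure(self):
        return 1 if os.path.exists(self.path) else 0


class AllOf(Metric):
    """Composite metric: 1 only if every child metric yields a truthy value."""

    name = "all_of"

    def __init__(self, *children):
        self.children = children

    def measure(self):
        for child in self.children:
            result = child.sample()  # children are evaluated recursively
            if result["error"] is not None:
                raise MetricError("%s failed: %s" % (child.name, result["error"]))
            if not result["value"]:
                return 0
        return 1
```

With this layout a new metric is a few lines long: the author writes `measure()` and nothing else. The composite class shows one possible reading of "recursive computing", in which a metric is computed from the samples of other metrics and a child's error is propagated with context.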
Correlation -
Computation performed on samples stored in a repository.
Typically, it is done by C programs - correlation agents - based on the repository API
(repositoryAPI.h), which retrieve data from the repository and draw conclusions
from it according to a certain formula. The outcome may be
written to a log file or may trigger other behaviour, such as launching another program.
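The retrieve-evaluate-report cycle described above can be sketched in a few lines. The real correlation agents are C programs built on repositoryAPI.h; the Python below is only an illustration that uses an in-memory list of `(node, timestamp, value)` tuples in place of the repository. The function names and the formula (a node raises an alarm when its last `min_failures` samples all exceed `threshold`) are assumptions, not taken from Linode.

```python
import logging


def correlate(samples, threshold, min_failures=3):
    """Return the nodes whose last `min_failures` samples all exceed `threshold`.

    ASSUMPTION: `samples` is a list of (node, timestamp, value) tuples that
    stands in for data retrieved from the repository.
    """
    by_node = {}
    # Order samples by timestamp so "last N" means the most recent ones.
    for node, _timestamp, value in sorted(samples, key=lambda s: s[1]):
        by_node.setdefault(node, []).append(value)

    alarms = []
    for node in sorted(by_node):
        recent = by_node[node][-min_failures:]
        if len(recent) == min_failures and all(v > threshold for v in recent):
            alarms.append(node)
    return alarms


def report(alarms, logger=None):
    """Write one warning per alarmed node to the log."""
    logger = logger or logging.getLogger("correl")
    for node in alarms:
        logger.warning("node %s exceeded the threshold", node)
```

Keeping the formula in `correlate` separate from the reaction in `report` mirrors the description: the same conclusion could equally be used to launch another program instead of writing a log entry.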
Linode -
Linux Node Installation monitoring - a set of local and remote tests
set up to mimic the installation of a Linux node. Linode is composed of
Python scripts (modules) conforming to the PSFmon framework and C programs implementing
correlation schemes. They all address a particular set of requirements, defined in a separate
document.
Project status
PSFmon is fully functional. The Python sources for PSFmon are complete.
User documentation is fairly exhaustive, but some technical details still
lack corresponding technical documentation. All files will shortly be placed in the CERN CVS
account.
Linode is complete and deployed for test purposes on the lxnfs4 and
lxnfs5 machines. After testing, it will be moved to production.
Materials
Presentation on PSFmon prepared for the LMD meeting on 22.01.2004,
as a ppt file.
A description of PSFmon is attached to the module. It is also accessible from
the website.
CERN CVS account for PSFmon.
Introductory ppt slides on CDS and PSFmon (together)
All project files are currently stored on my account (marekm), in a publicly accessible
directory (psf). It contains, among others, the following subdirectories:
- doc - project documentation
- linode - linode correlation sources/binaries
- msa-cfg - MSA config files for different purposes
- psfmon - python package for PSF Monitoring
There are also several additional scripts there:
- run_server - runs edg-fmon-server with the specified arguments
- run_agent - runs edg-fmon-agent on a selected machine, pointing
to a specified server
- run_correl - launches the appropriate correlation engine (details are inside)
- kill_agent - kills a previously launched agent
For Linode only, two directories have been created on that account, lxnfs4
and lxnfs5, which store internal data and log files for their corresponding
Linode instances, as well as config files and launch scripts.