TC10.17 - Back Office Scheduler Monitoring, DISCES tool


Test Case Title

TC10.17 - Back Office Scheduler Monitoring, DISCES tool

Goal

I can use DISCES (Distributed Smart City/Cloud Engine Scheduler):

  • To define processes to be executed in the back office periodically or sporadically
  • To stop and restart processes
  • To monitor their activity
  • Processes can be ETL, Java, Python, or C/C++, and can be executed and monitored

Prerequisites

Access to snap4city.org as an authorised user.

The following functionalities are available only to Snap4City users with specific privileges.

Expected successful result

The dashboards reachable through the links can be accessed and viewed, and they show the status of the processes. On these dashboards the owner can set rules that fire and send notifications by email or by other means.

Steps

 

 

Please note that some of the following links could be accessible only for registered users.

Access the Snap4City.org portal as RootAdmin

Click on Management --> Back Office Scheduler as depicted in the figure

From DISCES it is possible to see the status of the processes/jobs it manages, and the performance of access responses on the network over time. This makes it possible to manage the processes, recover them in case of failure, detect problems early, and start maintenance activities in advance.

Views are available at the level of JOBS, TRIGGER, etc.

The trigger is the time rule by which a job is activated. Jobs can be concatenated with each other.
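The relation between a trigger and concatenated jobs can be illustrated with a minimal Python sketch. It is not the DISCES API: the names `IntervalTrigger`, `Job`, and `run_chain` are hypothetical, the trigger is assumed to be a simple fixed-interval rule, and the chain is assumed to stop at the first failed job.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Callable, List, Optional, Tuple


@dataclass
class IntervalTrigger:
    """Hypothetical time rule: fire every `interval`, starting at `start`."""
    start: datetime
    interval: timedelta

    def is_due(self, now: datetime, last_run: Optional[datetime]) -> bool:
        if now < self.start:
            return False              # the rule is not active yet
        if last_run is None:
            return True               # never executed: fire immediately
        return now - last_run >= self.interval


@dataclass
class Job:
    """Hypothetical job; `action` returns True on success."""
    name: str
    action: Callable[[], bool]
    next_job: Optional["Job"] = None  # job concatenated after this one
    last_run: Optional[datetime] = None


def run_chain(job: Job, now: datetime, log: List[Tuple[str, str]]) -> None:
    """Run a job and the jobs concatenated to it.

    Assumption: the chain stops at the first failure.
    """
    current: Optional[Job] = job
    while current is not None:
        ok = current.action()
        current.last_run = now
        log.append((current.name, "SUCCESS" if ok else "FAILED"))
        if not ok:
            break
        current = current.next_job


# Usage: an hourly trigger activating a two-job chain.
load = Job("load", lambda: True)
extract = Job("extract", lambda: True, next_job=load)
hourly = IntervalTrigger(start=datetime(2024, 1, 1), interval=timedelta(hours=1))

log: List[Tuple[str, str]] = []
now = datetime(2024, 1, 1, 0, 0)
if hourly.is_due(now, extract.last_run):
    run_chain(extract, now, log)
print(log)
```

Keeping the time rule separate from the job mirrors the separate JOBS and TRIGGER views: the same job could be activated by a different rule without changing the job itself.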

Selecting “Cluster” opens a view for monitoring the status of the computational nodes involved, as depicted in the figure. Please note the metrics on the health of the solution:

  • CPU, CPU Load
  • Mem Total, Mem Free
  • Cores
  • Jobs/h
  • Jobs Executed
  • Jobs Failed/Success (24 h): percentage of failed jobs in the last 24 hours. Please note the high number of failures. Jobs typically fail because of a lack of connection with the content and data providers.
  • Jobs Failed/Success (7 days)
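As an illustration of the Jobs Failed/Success metrics listed above, the following Python sketch computes the percentage of failed jobs within a time window. It is only a sketch under stated assumptions: the function name `failed_percentage` and the record layout (finish time, success flag) are hypothetical, and the percentage is assumed to be failed runs over all runs completed in the window.

```python
from datetime import datetime, timedelta
from typing import Iterable, Tuple


def failed_percentage(
    runs: Iterable[Tuple[datetime, bool]],
    now: datetime,
    window: timedelta,
) -> float:
    """Percentage of failed runs among those finished within `window` of `now`.

    Each run is a hypothetical (finished_at, succeeded) pair.
    Returns 0.0 when no run falls inside the window.
    """
    recent = [ok for finished_at, ok in runs if now - window <= finished_at <= now]
    if not recent:
        return 0.0
    failed = sum(1 for ok in recent if not ok)
    return 100.0 * failed / len(recent)


# Usage with made-up records: the same data gives the 24 h and 7 days views.
now = datetime(2024, 1, 8, 12, 0)
runs = [
    (now - timedelta(hours=1), True),
    (now - timedelta(hours=5), False),
    (now - timedelta(hours=20), True),
    (now - timedelta(hours=23), True),
    (now - timedelta(days=3), False),
]
last_24h = failed_percentage(runs, now, timedelta(hours=24))
last_7d = failed_percentage(runs, now, timedelta(days=7))
```

Only the window length changes between the two metrics, so a single function can serve both.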

For each of the 5 nodes in this case, a number of metrics are reported about the health of the processor. Clicking on some of them displays the corresponding graphs.