1. Alarm

The Alarm module processes alarm events. Judge writes the alarm events it generates to Redis. Alarm reads those events, processes them, and sends them out through different channels.

1.1. Design Rationale

Processing an alarm event involves more than sending mail and SMS messages. To let events be handled automatically, Alarm needs to call back a user-provided interface when an event is generated. Sometimes there are too many alarm messages and mails, so it is recommended to merge lower-priority alarms. All of this logic is implemented in Alarm.

When you configure an alarm policy, you assign it a priority such as P0, P1, or P2. Each priority maps to its own queue. Alarm should read from the P0 queue first, then P1, and finally P5, so that higher-priority events are processed first. To achieve this we use the Redis BRPOP command, which checks the given keys in the order they are listed.
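The priority ordering described above can be sketched as follows. This is a minimal illustration, not Alarm's actual code: `FakeRedis` is a hypothetical in-memory stand-in that only mimics the key ordering of BRPOP (it does not block), `next_event` is an illustrative helper, and the JSON event payload is an assumption. The queue names come from the sample configuration.

```python
import json
from collections import deque

# Queue names copied from the sample cfg.json, highest priority first.
QUEUES = ["event:p0", "event:p1", "event:p2",
          "event:p3", "event:p4", "event:p5", "event:p6"]


class FakeRedis:
    """Hypothetical in-memory stand-in that mimics BRPOP key ordering."""

    def __init__(self):
        self.lists = {}

    def lpush(self, key, value):
        # LPUSH adds to the head of the list.
        self.lists.setdefault(key, deque()).appendleft(value)

    def brpop(self, keys, timeout=0):
        # Like Redis, scan keys in the given order and pop from the tail
        # of the first non-empty list. Real BRPOP would block when all
        # lists are empty; this stand-in returns None immediately.
        for key in keys:
            queue = self.lists.get(key)
            if queue:
                return key, queue.pop()
        return None


def next_event(client, queues=QUEUES, timeout=1):
    """Return (queue_name, event) for the highest-priority pending event."""
    item = client.brpop(queues, timeout=timeout)
    if item is None:
        return None
    queue, payload = item
    return queue, json.loads(payload)
```

Because the keys are passed in priority order, an event waiting in "event:p0" is always returned before one in "event:p3", regardless of which was pushed first. A real Redis client object exposing `brpop(keys, timeout)` could be passed to `next_event` in place of `FakeRedis`.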

Alarms that have already been sent are saved in MySQL, so that users can review the alarm history in Dashboard. Multiple alarms triggered by one policy at the same time are stored in aggregated form in MySQL. By default the alarm history of the past 7 days is kept, but you can change this retention period.
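The retention behaviour can be illustrated with a small sketch. It rests on several assumptions that are not in the original text: the table name `events`, the columns `id` and `ts` (Unix seconds), and the idea that old rows are deleted in fixed-size batches. SQLite is used here only so that the example is self-contained, whereas Alarm itself stores history in MySQL. The defaults mirror the "event_retention_days" and "event_delete_batch" values in the sample configuration.

```python
import sqlite3

SECONDS_PER_DAY = 86400


def purge_old_events(conn, now_ts, retention_days=7, batch=100):
    """Delete events older than the retention window, `batch` rows per statement.

    Assumes a hypothetical table events(id INTEGER PRIMARY KEY, ts INTEGER).
    """
    cutoff = now_ts - retention_days * SECONDS_PER_DAY
    deleted = 0
    while True:
        # Deleting in small batches keeps each statement short-lived.
        cur = conn.execute(
            "DELETE FROM events WHERE id IN "
            "(SELECT id FROM events WHERE ts < ? LIMIT ?)",
            (cutoff, batch),
        )
        conn.commit()
        if cur.rowcount == 0:
            break
        deleted += cur.rowcount
    return deleted
```

The function returns the total number of rows removed. Rows newer than the cutoff are left untouched.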

1.2. Deployment Notes

Alarm is a single point of failure. Unrecovered alarms are held in Alarm's memory, and Alarm also has to aggregate alarms, so only one instance of Alarm can be deployed. Make sure the Alarm process itself is monitored.

1.3. Configuration

The configuration file must be named "cfg.json". Create it by copying "cfg.example.json" and editing the copy.

{
    "log_level": "debug",
    "http": {
        "enabled": true,
        "listen": "0.0.0.0:9912"
    },
    "redis": {
        "addr": "127.0.0.1:6379",
        "maxIdle": 5,
        "highQueues": [
            "event:p0",
            "event:p1",
            "event:p2"
        ],
        "lowQueues": [
            "event:p3",
            "event:p4",
            "event:p5",
            "event:p6"
        ],
        "userIMQueue": "/queue/user/im",
        "userSmsQueue": "/queue/user/sms",
        "userMailQueue": "/queue/user/mail"
    },
    "api": {
        "im": "http://127.0.0.1:10086/wechat",  // gateway address for sending WeChat messages
        "sms": "http://127.0.0.1:10086/sms",  // gateway address for sending SMS
        "mail": "http://127.0.0.1:10086/mail",  // gateway address for sending mail
        "dashboard": "http://127.0.0.1:8081",  // address of the Dashboard module
        "plus_api": "http://127.0.0.1:8080",  // address of the Falcon-plus API module
        "plus_api_token": "default-token-used-in-server-side"  // token used for server-side authentication with the Falcon-plus API module
    },
    "falcon_portal": {
        "addr": "root:@tcp(127.0.0.1:3306)/alarms?charset=utf8&loc=Asia%2FChongqing",
        "idle": 10,
        "max": 100
    },
    "worker": {
        "im": 10,
        "sms": 10,
        "mail": 50
    },
    "housekeeper": {
        "event_retention_days": 7,  // number of days to keep alarm history
        "event_delete_batch": 100
    }
}

1.4. Process Management

# Start
./open-falcon start alarm

# Stop
./open-falcon stop alarm

# Check log
./open-falcon monitor alarm

1.5. Alarm Aggregation

If a core service crashes, it is likely to trigger a large number of alarms. To reduce the number of alarm SMS messages, we developed an alarm aggregation feature. Alarm writes the alarm information to the Dashboard module, Dashboard returns a URL to Alarm, and Alarm sends that URL to users. When users open the link in the message they receive, they see all the aggregated alarms together.

Events in the queues listed under highQueues are not aggregated, because they are high priority. Only events in the queues listed under lowQueues are aggregated. If you do not want any events to be aggregated, list all event queues under highQueues.
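The rule above amounts to a membership check on the queue an event came from. The following sketch shows that decision only. The function names `should_aggregate` and `partition` are hypothetical, and how Alarm actually groups and merges events is not specified here. The queue lists are taken from the sample configuration.

```python
# Queue lists copied from the "redis" section of the sample cfg.json.
REDIS_CFG = {
    "highQueues": ["event:p0", "event:p1", "event:p2"],
    "lowQueues": ["event:p3", "event:p4", "event:p5", "event:p6"],
}


def should_aggregate(queue, redis_cfg=REDIS_CFG):
    """Only events read from a queue listed under lowQueues are aggregated."""
    return queue in redis_cfg["lowQueues"]


def partition(events, redis_cfg=REDIS_CFG):
    """Split (queue, event) pairs into immediate sends and an aggregation batch."""
    immediate, batch = [], []
    for queue, event in events:
        if should_aggregate(queue, redis_cfg):
            batch.append(event)
        else:
            immediate.append(event)
    return immediate, batch
```

Moving every queue name into "highQueues" and leaving "lowQueues" empty makes `should_aggregate` return False for all events, which matches the advice for disabling aggregation.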

Copyright 2015 - 2018 Xiaomi Inc. All rights reserved. Powered by Gitbook. File last revised: 2022-05-30 16:56:30