From b6436b09ce83fce35b9db03628fb30ce90a61115 Mon Sep 17 00:00:00 2001
From: laiwei
Date: Sun, 17 Jul 2022 21:57:16 +0800
Subject: [PATCH 1/9] update community governance (#1056)
* update readme to add badge
* update community gov
---
README.md | 118 +++++++++++++++++++++---------------
doc/active-contributors.md | 0
doc/committers.md | 0
doc/community-governance.md | 68 ++++++++++++++-------
doc/end-users.md | 0
doc/pmc.md | 0
6 files changed, 114 insertions(+), 72 deletions(-)
create mode 100644 doc/active-contributors.md
create mode 100644 doc/committers.md
create mode 100644 doc/end-users.md
create mode 100644 doc/pmc.md
diff --git a/README.md b/README.md
index 4430d14f3..8ba17b436 100644
--- a/README.md
+++ b/README.md
@@ -1,66 +1,78 @@
-
+
+
+
+
夜莺是一款开源的云原生监控系统,采用 all-in-one 的设计,提供企业级的功能特性,开箱即用的产品体验。推荐升级您的 Prometheus + AlertManager + Grafana 组合方案到夜莺
+
+
+
+
+
+
+
+
+
+
+
+
+
+
[English](./README_EN.md) | [中文](./README.md)
-> 夜莺是一款开源的云原生监控系统,采用 All-In-One 的设计,提供企业级的功能特性,开箱即用的产品体验。推荐升级您的 Prometheus + AlertManager + Grafana 组合方案到夜莺。
+## Highlighted Features
-**夜莺监控具有以下特点:**
+- **开箱即用**
+ - 支持 Docker、Helm Chart 等多种部署方式,内置多种监控大盘、快捷视图、告警规则模板,导入即可快速使用,活跃、专业的社区用户也在持续迭代和沉淀更多的最佳实践于产品中;
+- **兼容并包**
+ - 支持 [Categraf](https://github.com/flashcatcloud/categraf)、Telegraf、Grafana-agent 等多种采集器,支持 Prometheus、VictoriaMetrics、M3DB 等各种时序数据库,支持对接 Grafana,与云原生生态无缝集成;
+ - 集数据采集、可视化、监控告警、数据分析于一体,与云原生生态紧密集成,提供开箱即用的企业级监控分析和告警能力;
+- **开放社区**
+ - 托管于[中国计算机学会开源发展委员会](https://www.ccf.org.cn/kyfzwyh/),有[快猫星云](https://flashcat.cloud)的持续投入,和数千名社区用户的积极参与,以及夜莺监控项目清晰明确的定位,都保证了夜莺开源社区健康、长久的发展;
+- **高性能**
+ - 得益于夜莺的多数据源管理引擎,和夜莺引擎侧优秀的架构设计,借助于高性能时序库,可以满足数亿时间线的采集、存储、告警分析场景,节省大量成本;
+- **高可用**
+ - 夜莺监控组件均可水平扩展,无单点,已在上千家企业部署落地,经受了严苛的生产实践检验。众多互联网头部公司,夜莺集群机器达百台,处理十亿级时间线,重度使用夜莺监控;
+- **灵活扩展**
+ - 夜莺监控,可部署在1核1G的云主机,可在上百台机器部署集群,可运行在K8s中;也可将时序库、告警引擎等组件下沉到各机房、各region,兼顾边缘部署和中心化管理;
-#### 1. 开箱即用
-支持 Docker、Helm Chart 等多种部署方式,内置多种监控大盘、快捷视图、告警规则模板,导入即可快速使用,活跃、专业的社区用户也在持续迭代和沉淀更多的最佳实践于产品中;
-
-#### 2. 兼容并包
-支持 [Categraf](https://github.com/flashcatcloud/categraf)、Telegraf、Grafana-agent 等多种采集器,支持 Prometheus、VictoriaMetrics、M3DB 等各种时序数据库,支持对接 Grafana,与云原生生态无缝集成;
-
-#### 3. 开放社区
-托管于[中国计算机学会开源发展委员会](https://www.ccf.org.cn/kyfzwyh/),有[快猫星云](https://flashcat.cloud)的持续投入,和数千名社区用户的积极参与,以及夜莺监控项目清晰明确的定位,都保证了夜莺开源社区健康、长久的发展;
-
-#### 4. 高性能
-得益于夜莺的多数据源管理引擎,和夜莺引擎侧优秀的架构设计,借助于高性能时序库,可以满足数亿时间线的采集、存储、告警分析场景,节省大量成本;
-
-#### 5. 高可用
-夜莺监控组件均可水平扩展,无单点,已在上千家企业部署落地,经受了严苛的生产实践检验。众多互联网头部公司,夜莺集群机器达百台,处理十亿级时间线,重度使用夜莺监控;
-
-#### 6. 灵活扩展
-夜莺监控,可部署在1核1G的云主机,可在上百台机器部署集群,可运行在K8s中;也可将时序库、告警引擎等组件下沉到各机房、各region,兼顾边缘部署和中心化管理;
-
-
-#### 如果您在使用 Prometheus 过程中,有以下的一个或者多个需求场景,推荐您升级到夜莺:
+> 如果您在使用 Prometheus 过程中,有以下的一个或者多个需求场景,推荐您升级到夜莺:
- Prometheus、Alertmanager、Grafana 等多个系统较为割裂,缺乏统一视图,无法开箱即用;
- 通过修改配置文件来管理 Prometheus、Alertmanager 的方式,学习曲线大,协同有难度;
- 数据量过大而无法扩展您的 Prometheus 集群;
- 生产环境运行多套 Prometheus 集群,面临管理和使用成本高的问题;
-#### 如果您在使用 Zabbix,有以下的场景,推荐您升级到夜莺:
+> 如果您在使用 Zabbix,有以下的场景,推荐您升级到夜莺:
- 监控的数据量太大,希望有更好的扩展解决方案;
- 学习曲线高,多人多团队模式下,希望有更好的协同使用效率;
- 微服务和云原生架构下,监控数据的生命周期多变、监控数据维度基数高,Zabbix 数据模型不易适配;
-#### 如果您在使用 [open-falcon](https://github.com/open-falcon/falcon-plus),我们更推荐您升级到夜莺:
-- 关于open-falcon和夜莺的详细介绍,请参考阅读[《云原生监控的十个特点和趋势》](https://mp.weixin.qq.com/s?__biz=MzkzNjI5OTM5Nw==&mid=2247483738&idx=1&sn=e8bdbb974a2cd003c1abcc2b5405dd18&chksm=c2a19fb0f5d616a63185cd79277a79a6b80118ef2185890d0683d2bb20451bd9303c78d083c5#rd)。
+> 如果您在使用 [Open-Falcon](https://github.com/open-falcon/falcon-plus),我们更推荐您升级到夜莺:
+
+- 关于 Open-Falcon 和夜莺的详细介绍,请参考阅读[《云原生监控的十个特点和趋势》](https://mp.weixin.qq.com/s?__biz=MzkzNjI5OTM5Nw==&mid=2247483738&idx=1&sn=e8bdbb974a2cd003c1abcc2b5405dd18&chksm=c2a19fb0f5d616a63185cd79277a79a6b80118ef2185890d0683d2bb20451bd9303c78d083c5#rd)。
-#### 我们推荐您使用 [Categraf](https://github.com/flashcatcloud/categraf) 作为首选的监控数据采集器:
-- Categraf 是夜莺监控的默认采集器,采用开放插件机制和 all-in-one 的设计,同时支持 metric、log、trace、event 的采集。Categraf 不仅可以采集 CPU、内存、网络等系统层面的指标,也集成了众多开源组件的采集能力,支持K8s生态。Categraf 内置了对应的仪表盘和告警规则,开箱即用。
+> 我们推荐您使用 [Categraf](https://github.com/flashcatcloud/categraf) 作为首选的监控数据采集器:
+
+- [Categraf](https://github.com/flashcatcloud/categraf) 是夜莺监控的默认采集器,采用开放插件机制和 all-in-one 的设计,同时支持 metric、log、trace、event 的采集。Categraf 不仅可以采集 CPU、内存、网络等系统层面的指标,也集成了众多开源组件的采集能力,支持K8s生态。Categraf 内置了对应的仪表盘和告警规则,开箱即用。
-## 资料链接
+## Getting Started
- [快速安装](https://mp.weixin.qq.com/s/iEC4pfL1TgjMDOWYh8H-FA)
- [详细文档](https://n9e.github.io/)
- [社区分享](https://n9e.github.io/docs/prologue/share/)
-## 产品演示
+## Screenshots
-## 架构介绍
-
+## Architecture
-Nightingale 可以接收各种采集器上报的监控数据(比如 [Categraf](https://github.com/flashcatcloud/categraf)、telegraf、grafana-agent、Prometheus),并写入多种流行的时序数据库中(可以支持Prometheus、M3DB、VictoriaMetrics、Thanos、TDEngine等),提供告警规则、屏蔽规则、订阅规则的配置能力,提供监控数据的查看能力,提供告警自愈机制(告警触发之后自动回调某个webhook地址或者执行某个脚本),提供历史告警事件的存储管理、分组查看的能力。
+
+夜莺监控可以接收各种采集器上报的监控数据(比如 [Categraf](https://github.com/flashcatcloud/categraf)、telegraf、grafana-agent、Prometheus),并写入多种流行的时序数据库中(可以支持Prometheus、M3DB、VictoriaMetrics、Thanos、TDEngine等),提供告警规则、屏蔽规则、订阅规则的配置能力,提供监控数据的查看能力,提供告警自愈机制(告警触发之后自动回调某个webhook地址或者执行某个脚本),提供历史告警事件的存储管理、分组查看的能力。
@@ -72,31 +84,39 @@ Nightingale 可以接收各种采集器上报的监控数据(比如 [Categraf]
如果单机版本的 Prometheus 性能不够或容灾较差,我们推荐使用 [VictoriaMetrics](https://github.com/VictoriaMetrics/VictoriaMetrics),VictoriaMetrics 架构较为简单,性能优异,易于部署和运维,架构图如上。VictoriaMetrics 更详尽的文档,还请参考其[官网](https://victoriametrics.com/)。
-## 如何参与
+## Community
-开源项目要更有生命力,离不开开放的治理架构和源源不断的开发者和用户共同参与,我们致力于建立开放、中立的开源治理架构,吸纳更多来自企业、高校等各方面对云原生监控感兴趣、有热情的计算机专业人士,打造专业、有活力的开发者社区。关于《夜莺开源项目和社区治理架构(草案)》,请查阅 [doc/community-governance.md](./doc/community-governance.md).
+开源项目要更有生命力,离不开开放的治理架构和源源不断的开发者和用户共同参与,我们致力于建立开放、中立的开源治理架构,吸纳更多来自企业、高校等各方面对云原生监控感兴趣、有热情的开发者,一起打造有活力的夜莺开源社区。关于《夜莺开源项目和社区治理架构(草案)》,请查阅 **[COMMUNITY GOVERNANCE](./doc/community-governance.md)**。
**我们欢迎您以各种方式参与到夜莺开源项目和开源社区中来,工作包括不限于**:
-- 补充和完善文档 => [n9e.github.io](https://n9e.github.io/);
-- 分享您在使用夜莺监控过程中的最佳实践和经验心得 => [文章分享](https://n9e.github.io/docs/prologue/share/);
-- 提交产品建议 =》 [github issue](https://github.com/ccfos/nightingale/issues/new?assignees=&labels=kind%2Ffeature&template=enhancement.md);
-- 提交代码,让夜莺监控更快、更稳、更好用 => [github pull request](https://github.com/didi/nightingale/pulls);
-
+- 补充和完善文档 => [n9e.github.io](https://n9e.github.io/)
+- 分享您在使用夜莺监控过程中的最佳实践和经验心得 => [文章分享](https://n9e.github.io/docs/prologue/share/)
+- 提交产品建议 => [github issue](https://github.com/ccfos/nightingale/issues/new?assignees=&labels=kind%2Ffeature&template=enhancement.md)
+- 提交代码,让夜莺监控更快、更稳、更好用 => [github pull request](https://github.com/didi/nightingale/pulls)
**尊重、认可和记录每一位贡献者的工作**是夜莺开源社区的第一指导原则,我们提倡**高效的提问**,这既是对开发者时间的尊重,也是对整个社区知识沉淀的贡献:
-1. 提问之前请先查阅 [FAQ](https://www.gitlink.org.cn/ccfos/nightingale/wiki/faq) ;
-2. 提问之前请先搜索 [github issue](https://github.com/ccfos/nightingale/issues);
-3. 我们优先推荐通过提交 github issue 来提问,如果[有问题点击这里](https://github.com/ccfos/nightingale/issues/new?assignees=&labels=kind%2Fbug&template=bug_report.yml) | [有需求建议点击这里](https://github.com/ccfos/nightingale/issues/new?assignees=&labels=kind%2Ffeature&template=enhancement.md);
-4. 最后,我们推荐你加入微信群,针对相关开放式问题,相互交流咨询 (请先加好友:[UlricGO](https://www.gitlink.org.cn/UlricQin/gist/tree/master/self.jpeg) 备注:夜莺加群+姓名+公司,交流群里会有开发者团队和专业、热心的群友回答问题);
+- 提问之前请先查阅 [FAQ](https://www.gitlink.org.cn/ccfos/nightingale/wiki/faq)
+- 提问之前请先搜索 [github issue](https://github.com/ccfos/nightingale/issues)
+- 我们优先推荐通过提交 github issue 来提问,如果[有问题点击这里](https://github.com/ccfos/nightingale/issues/new?assignees=&labels=kind%2Fbug&template=bug_report.yml) | [有需求建议点击这里](https://github.com/ccfos/nightingale/issues/new?assignees=&labels=kind%2Ffeature&template=enhancement.md)
+- 最后,我们推荐你加入微信群,针对相关开放式问题,相互交流咨询 (请先加好友:[UlricGO](https://www.gitlink.org.cn/UlricQin/gist/tree/master/self.jpeg) 备注:夜莺加群+姓名+公司,交流群里会有开发者团队和专业、热心的群友回答问题)
-## 联系我们
-- 推荐您关注夜莺监控公众号,及时获取相关产品和社区动态
+## Who is using
-
+您可以通过在 **[Who is Using Nightingale](https://github.com/ccfos/nightingale/issues/897)** 登记您的使用情况,分享您的使用经验。
-## Stargazers over time
+## Stargazers
[![Stargazers over time](https://starchart.cc/ccfos/nightingale.svg)](https://starchart.cc/ccfos/nightingale)
+## Contributors
+
+
+
+
## License
-- [Apache License V2.0](https://github.com/didi/nightingale/blob/main/LICENSE)
\ No newline at end of file
+[Apache License V2.0](https://github.com/didi/nightingale/blob/main/LICENSE)
+
+## Contact Us
+推荐您关注夜莺监控公众号,及时获取相关产品和社区动态:
+
+
\ No newline at end of file
diff --git a/doc/active-contributors.md b/doc/active-contributors.md
new file mode 100644
index 000000000..e69de29bb
diff --git a/doc/committers.md b/doc/committers.md
new file mode 100644
index 000000000..e69de29bb
diff --git a/doc/community-governance.md b/doc/community-governance.md
index 146f22aa8..25918dd5f 100644
--- a/doc/community-governance.md
+++ b/doc/community-governance.md
@@ -1,52 +1,74 @@
# 夜莺开源项目和社区治理架构(草案)
-#### 用户(User)
+## 社区架构
->欢迎任何个人、公司以及组织,使用 Nightingale,并积极的反馈 bug、提交功能需求、以及相互帮助,我们推荐使用 github issue 来跟踪 bug 和管理需求。
+### 用户(User)
-#### 贡献者(Contributer)
+> 欢迎任何个人、公司以及组织,使用夜莺监控,并积极的反馈 bug、提交功能需求、以及相互帮助,我们推荐使用 [github issue](https://github.com/ccfos/nightingale/issues) 来跟踪 bug 和管理需求。
->欢迎每一位用户,包括但不限于以下列方式参与到 Nightingale 开源项目并做出贡献:
->1. 在 [github issue](https://github.com/ccfos/nightingale/issues) 中积极参与讨论;
->2. 提交代码补丁;
->3. 修订、补充和完善文档;
->4. 提交建议 / 批评;
+社区用户,可以通过在 **[Who is Using Nightingale](https://github.com/ccfos/nightingale/issues/897)** 登记您的使用情况,并分享您使用夜莺监控的经验,将会自动进入 **[END USERS](./end-users.md)** 列表,并获得社区的 **VIP Support**。
-#### 提交者(Committer)
+### 贡献者(Contributor)
->Committer 是指拥有 Nightingale 代码仓库写操作权限的贡献者,而且他们也签署了 Nightingale 项目贡献者许可协议(CLA),他们拥有 ccf.org.cn 为后缀的邮箱地址。原则上 Committer 能够自主决策某个代码补丁是否可以合入到 Nightingale 代码仓库,但是项目管委会拥有最终的决策权。
+> 欢迎每一位用户,包括但不限于以下列方式参与到夜莺开源社区并做出贡献:
-#### 项目管委会成员(PMC Member)
+1. 在 [github issue](https://github.com/ccfos/nightingale/issues) 中积极参与讨论,参与社区活动;
+1. 提交代码补丁;
+1. 翻译、修订、补充和完善[文档](https://n9e.github.io);
+1. 分享夜莺监控的使用经验,积极布道;
+1. 提交建议 / 批评;
-> 项目管委会成员,从贡献者或者 Committer 中选举产生,他们拥有 Nightingale 代码仓库的写操作权限,拥有 ccf.org.cn 为后缀的邮箱地址,拥有 Nightingale 社区相关事务的投票权、以及提名 Committer 候选人的权利。 项目管委会作为一个实体,为整个项目的发展全权负责。
+年度累计向 [CCFOS/NIGHTINGALE](https://github.com/ccfos/nightingale) 提交 **5** 个PR(被合并),或者因为其他贡献被**项目管委会**一致认可,将会自动进入到 **[ACTIVE CONTRIBUTORS](./active-contributors.md)** 列表,并获得 **[CCF ODC](https://www.ccf.org.cn/kyfzwyh/)** 颁发的电子证书,享有夜莺开源社区一定的权益和福利。
-#### 项目管委会主席(PMC Chair)
-> 项目管委会主席采用任命制,由 [CCF ODC](https://www.ccf.org.cn/kyfzwyh/) 从项目管委会成员中任命产生。项目管委会作为一个统一的实体,来管理和领导 Nightingale 项目。管委会主席是 CCF ODC 和项目管委会之间的沟通桥梁,履行特定的项目管理职责。
+### 提交者(Committer)
-# 沟通机制(Communication)
+> Committer 是指拥有 [CCFOS/NIGHTINGALE](https://github.com/ccfos/nightingale) 代码仓库写操作权限的贡献者,他们拥有 ccf.org.cn 为后缀的邮箱地址(待上线)。原则上 Committer 能够自主决策某个代码补丁是否可以合入到夜莺代码仓库,但是项目管委会拥有最终的决策权。
+
+Committer 承担以下一个或多个职责:
+- 积极回应 Issues;
+- Review PRs;
+- 参加开发者例行会议,积极讨论项目规划和技术方案;
+- 代表夜莺开源社区出席相关技术会议并做演讲;
+
+Committer 记录并公示于 **[COMMITTERS](./committers.md)** 列表,并获得 **[CCF ODC](https://www.ccf.org.cn/kyfzwyh/)** 颁发的电子证书,以及享有夜莺开源社区的各种权益和福利。
+
+
+### 项目管委会成员(PMC Member)
+
+> 项目管委会成员,从 Contributor 或者 Committer 中选举产生,他们拥有 [CCFOS/NIGHTINGALE](https://github.com/ccfos/nightingale) 代码仓库的写操作权限,拥有 ccf.org.cn 为后缀的邮箱地址(待上线),拥有 Nightingale 社区相关事务的投票权、以及提名 Committer 候选人的权利。 项目管委会作为一个实体,为整个项目的发展全权负责。项目管委会成员记录并公示于 **[PMC](./pmc.md)** 列表。
+
+### 项目管委会主席(PMC Chair)
+
+> 项目管委会主席采用任命制,由 **[CCF ODC](https://www.ccf.org.cn/kyfzwyh/)** 从项目管委会成员中任命产生。项目管委会作为一个统一的实体,来管理和领导夜莺项目。管委会主席是 CCF ODC 和项目管委会之间的沟通桥梁,履行特定的项目管理职责。
+
+## 沟通机制(Communication)
1. 我们推荐使用邮件列表来反馈建议(待发布);
2. 我们推荐使用 [github issue](https://github.com/ccfos/nightingale/issues) 跟踪 bug 和管理需求;
3. 我们推荐使用 [github milestone](https://github.com/ccfos/nightingale/milestones) 来管理项目进度和规划;
-4. 我们推荐使用腾讯会议来定期召开项目例会;
+4. 我们推荐使用腾讯会议来定期召开项目例会(会议 ID 待发布);
-# 文档(Documentation)
+## 文档(Documentation)
1. 我们推荐使用 [github pages](https://n9e.github.io) 来沉淀文档;
2. 我们推荐使用 [gitlink wiki](https://www.gitlink.org.cn/ccfos/nightingale/wiki/faq) 来沉淀FAQ;
-# 运营机制(Operation)
+## 运营机制(Operation)
+
1. 我们定期组织用户、贡献者、项目管委会成员之间的沟通会议,讨论项目开发的目标、方案、进度,以及讨论相关需求的合理性、优先级等议题;
2. 我们定期组织 meetup (线上&线下),创造良好的用户交流分享环境,并沉淀相关内容到文档站点;
-3. 我们定期组织 Nightingale 开发者大会,分享 best user story、同步年度开发目标和计划、讨论新技术方向等;
+3. 我们定期组织夜莺开发者大会,分享 best user story、同步年度开发目标和计划、讨论新技术方向等;
-# 社区指导原则(Philosophy)
-- 尊重、认可和记录每一位贡献者的工作;
+## 社区指导原则(Philosophy)
+
+**尊重、认可和记录每一位贡献者的工作。**
+
+## 关于提问的原则
-# 关于提问的原则
按照**尊重、认可、记录每一位贡献者的工作**原则,我们提倡**高效的提问**,这既是对开发者时间的尊重,也是对整个社区的知识沉淀的贡献:
1. 提问之前请先查阅 [FAQ](https://www.gitlink.org.cn/ccfos/nightingale/wiki/faq) ;
2. 提问之前请先搜索 [github issue](https://github.com/ccfos/nightingale/issues);
3. 我们优先推荐通过提交 github issue 来提问,如果[有问题点击这里](https://github.com/ccfos/nightingale/issues/new?assignees=&labels=kind%2Fbug&template=bug_report.yml) | [有需求建议点击这里](https://github.com/ccfos/nightingale/issues/new?assignees=&labels=kind%2Ffeature&template=enhancement.md);
-4. 最后,我们推荐你加入微信群,针对相关开放式问题,相互交流咨询 (请先加好友:[UlricGO](https://www.gitlink.org.cn/UlricQin/gist/tree/master/self.jpeg) 备注:夜莺加群+姓名+公司,交流群里会有开发者团队和专业、热心的群友回答问题);
\ No newline at end of file
+
+最后,我们推荐你加入微信群,针对相关开放式问题,相互交流咨询 (请先加好友:[UlricGO](https://www.gitlink.org.cn/UlricQin/gist/tree/master/self.jpeg) 备注:夜莺加群+姓名+公司,交流群里会有开发者团队和专业、热心的群友回答问题);
\ No newline at end of file
diff --git a/doc/end-users.md b/doc/end-users.md
new file mode 100644
index 000000000..e69de29bb
diff --git a/doc/pmc.md b/doc/pmc.md
new file mode 100644
index 000000000..e69de29bb
From 65439df7fb34536a9da77a56ec3bfe6d9e254244 Mon Sep 17 00:00:00 2001
From: Yening Qin <710leo@gmail.com>
Date: Mon, 18 Jul 2022 14:37:31 +0800
Subject: [PATCH 2/9] fix event push api (#1057)
---
src/server/router/router_event.go | 37 +++++++++++++++++++++++++++++--
1 file changed, 35 insertions(+), 2 deletions(-)
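Reviewer note: the tag handling this patch adds to pushEventToQueue can be tried in isolation with the sketch below. The helper names parseTags and joinTags are hypothetical and not part of the patch. One deliberate difference, stated as an assumption about the desired behavior: the sketch uses strings.SplitN so that a value containing "=" is kept, whereas the patch uses strings.Split and skips such pairs.

```go
package main

import (
	"fmt"
	"strings"
)

// parseTags turns "key=value" strings into a map. Blank entries and entries
// without "=" are skipped. SplitN keeps any later "=" inside the value;
// the patch itself uses strings.Split and drops such pairs.
func parseTags(pairs []string) map[string]string {
	tags := make(map[string]string, len(pairs))
	for _, raw := range pairs {
		pair := strings.TrimSpace(raw)
		if pair == "" {
			continue
		}
		kv := strings.SplitN(pair, "=", 2)
		if len(kv) != 2 {
			continue
		}
		tags[kv[0]] = kv[1]
	}
	return tags
}

// joinTags mirrors the ",," separator the handler uses when it builds event.Tags.
func joinTags(pairs []string) string {
	return strings.Join(pairs, ",,")
}

func main() {
	in := []string{"host=web01", " ", "novalue", "expr=a=b"}
	fmt.Println(parseTags(in))
	fmt.Println(joinTags([]string{"host=web01", "region=bj"}))
}
```

Whether pairs with an embedded "=" should be kept or dropped is a product decision; the sketch only shows that keeping them is a one-line change.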
diff --git a/src/server/router/router_event.go b/src/server/router/router_event.go
index e7da20ce1..969ae971b 100644
--- a/src/server/router/router_event.go
+++ b/src/server/router/router_event.go
@@ -2,6 +2,7 @@ package router
import (
"fmt"
+ "strings"
"github.com/didi/nightingale/v5/src/models"
"github.com/didi/nightingale/v5/src/server/config"
@@ -14,14 +15,46 @@ import (
)
func pushEventToQueue(c *gin.Context) {
- var event models.AlertCurEvent
+ var event *models.AlertCurEvent
ginx.BindJSON(c, &event)
if event.RuleId == 0 {
ginx.Bomb(200, "event is illegal")
}
+ event.TagsMap = make(map[string]string)
+ for i := 0; i < len(event.TagsJSON); i++ {
+ pair := strings.TrimSpace(event.TagsJSON[i])
+ if pair == "" {
+ continue
+ }
+
+ arr := strings.Split(pair, "=")
+ if len(arr) != 2 {
+ continue
+ }
+
+ event.TagsMap[arr[0]] = arr[1]
+ }
+
+ if err := event.ParseRuleNote(); err != nil {
+ event.RuleNote = fmt.Sprintf("failed to parse rule note: %v", err)
+ }
+
+ // If rule_note starts with ";", use the rest of rule_note to replace the tags
+ if strings.HasPrefix(event.RuleNote, ";") {
+ event.RuleNote = strings.TrimPrefix(event.RuleNote, ";")
+ event.Tags = strings.ReplaceAll(event.RuleNote, " ", ",,")
+ event.TagsJSON = strings.Split(event.Tags, ",,")
+ } else {
+ event.Tags = strings.Join(event.TagsJSON, ",,")
+ }
+
+ event.Callbacks = strings.Join(event.CallbacksJSON, " ")
+ event.NotifyChannels = strings.Join(event.NotifyChannelsJSON, " ")
+ event.NotifyGroups = strings.Join(event.NotifyGroupsJSON, " ")
+
promstat.CounterAlertsTotal.WithLabelValues(config.C.ClusterName).Inc()
- engine.LogEvent(&event, "http_push_queue")
+ engine.LogEvent(event, "http_push_queue")
if !engine.EventQueue.PushFront(event) {
msg := fmt.Sprintf("event:%+v push_queue err: queue is full", event)
ginx.Bomb(200, msg)
From 2847a315b1c809b2ed9fe99b8ec1e872364a6c33 Mon Sep 17 00:00:00 2001
From: Ulric Qin
Date: Mon, 18 Jul 2022 17:05:45 +0800
Subject: [PATCH 3/9] add server-dash.json
---
doc/server-dash.json | 234 +++++++++++++++++++++++++++++++++++++++++++
1 file changed, 234 insertions(+)
create mode 100644 doc/server-dash.json
diff --git a/doc/server-dash.json b/doc/server-dash.json
new file mode 100644
index 000000000..bbf8ad227
--- /dev/null
+++ b/doc/server-dash.json
@@ -0,0 +1,234 @@
+{
+ "name": "夜莺大盘",
+ "tags": "",
+ "configs": {
+ "var": [],
+ "panels": [
+ {
+ "targets": [
+ {
+ "refId": "A",
+ "expr": "rate(n9e_server_samples_received_total[1m])"
+ }
+ ],
+ "name": "每秒接收的数据点个数",
+ "options": {
+ "tooltip": {
+ "mode": "all",
+ "sort": "none"
+ },
+ "legend": {
+ "displayMode": "hidden"
+ },
+ "standardOptions": {},
+ "thresholds": {}
+ },
+ "custom": {
+ "drawStyle": "lines",
+ "lineInterpolation": "smooth",
+ "fillOpacity": 0.5,
+ "stack": "off"
+ },
+ "version": "2.0.0",
+ "type": "timeseries",
+ "layout": {
+ "h": 4,
+ "w": 12,
+ "x": 0,
+ "y": 0,
+ "i": "53fcb9dc-23f9-41e0-bc5e-121eed14c3a4",
+ "isResizable": true
+ },
+ "id": "53fcb9dc-23f9-41e0-bc5e-121eed14c3a4"
+ },
+ {
+ "targets": [
+ {
+ "refId": "A",
+ "expr": "rate(n9e_server_alerts_total[10m])"
+ }
+ ],
+ "name": "每秒产生的告警事件个数",
+ "options": {
+ "tooltip": {
+ "mode": "all",
+ "sort": "none"
+ },
+ "legend": {
+ "displayMode": "hidden"
+ },
+ "standardOptions": {},
+ "thresholds": {}
+ },
+ "custom": {
+ "drawStyle": "lines",
+ "lineInterpolation": "smooth",
+ "fillOpacity": 0.5,
+ "stack": "off"
+ },
+ "version": "2.0.0",
+ "type": "timeseries",
+ "layout": {
+ "h": 4,
+ "w": 12,
+ "x": 12,
+ "y": 0,
+ "i": "47fc6252-9cc8-4b53-8e27-0c5c59a47269",
+ "isResizable": true
+ },
+ "id": "f70dcb8b-b58b-4ef9-9e48-f230d9e17140"
+ },
+ {
+ "targets": [
+ {
+ "refId": "A",
+ "expr": "n9e_server_alert_queue_size"
+ }
+ ],
+ "name": "告警事件内存队列长度",
+ "options": {
+ "tooltip": {
+ "mode": "all",
+ "sort": "none"
+ },
+ "legend": {
+ "displayMode": "hidden"
+ },
+ "standardOptions": {},
+ "thresholds": {}
+ },
+ "custom": {
+ "drawStyle": "lines",
+ "lineInterpolation": "smooth",
+ "fillOpacity": 0.5,
+ "stack": "off"
+ },
+ "version": "2.0.0",
+ "type": "timeseries",
+ "layout": {
+ "h": 4,
+ "w": 12,
+ "x": 0,
+ "y": 4,
+ "i": "ad1af16c-de0c-45f4-8875-cea4e85d51d0",
+ "isResizable": true
+ },
+ "id": "caf23e58-d907-42b0-9ed6-722c8c6f3c5f"
+ },
+ {
+ "targets": [
+ {
+ "refId": "A",
+ "expr": "n9e_server_http_request_duration_seconds_sum/n9e_server_http_request_duration_seconds_count"
+ }
+ ],
+ "name": "数据接收接口平均响应时间(单位:秒)",
+ "options": {
+ "tooltip": {
+ "mode": "all",
+ "sort": "desc"
+ },
+ "legend": {
+ "displayMode": "hidden"
+ },
+ "standardOptions": {},
+ "thresholds": {}
+ },
+ "custom": {
+ "drawStyle": "lines",
+ "lineInterpolation": "smooth",
+ "fillOpacity": 0.5,
+ "stack": "noraml"
+ },
+ "version": "2.0.0",
+ "type": "timeseries",
+ "layout": {
+ "h": 4,
+ "w": 12,
+ "x": 12,
+ "y": 4,
+ "i": "64c3abc2-404c-4462-a82f-c109a21dac91",
+ "isResizable": true
+ },
+ "id": "6b8d2db1-efca-4b9e-b429-57a9d2272bc5"
+ },
+ {
+ "targets": [
+ {
+ "refId": "A",
+ "expr": "n9e_server_sample_queue_size"
+ }
+ ],
+ "name": "内存数据队列长度",
+ "options": {
+ "tooltip": {
+ "mode": "all",
+ "sort": "desc"
+ },
+ "legend": {
+ "displayMode": "hidden"
+ },
+ "standardOptions": {},
+ "thresholds": {}
+ },
+ "custom": {
+ "drawStyle": "lines",
+ "lineInterpolation": "smooth",
+ "fillOpacity": 0.5,
+ "stack": "off"
+ },
+ "version": "2.0.0",
+ "type": "timeseries",
+ "layout": {
+ "h": 4,
+ "w": 12,
+ "x": 0,
+ "y": 8,
+ "i": "1c7da942-58c2-40dc-b42f-983e4a35b89b",
+ "isResizable": true
+ },
+ "id": "bd41677d-40d3-482e-bb6e-fbd25df46d87"
+ },
+ {
+ "targets": [
+ {
+ "refId": "A",
+ "expr": "avg(n9e_server_forward_duration_seconds_sum/n9e_server_forward_duration_seconds_count)"
+ }
+ ],
+ "name": "数据发往TSDB平均耗时(单位:秒)",
+ "options": {
+ "tooltip": {
+ "mode": "all",
+ "sort": "desc"
+ },
+ "legend": {
+ "displayMode": "hidden"
+ },
+ "standardOptions": {
+ "decimals": 8
+ },
+ "thresholds": {}
+ },
+ "custom": {
+ "drawStyle": "lines",
+ "lineInterpolation": "smooth",
+ "fillOpacity": 0.5,
+ "stack": "noraml"
+ },
+ "version": "2.0.0",
+ "type": "timeseries",
+ "layout": {
+ "h": 4,
+ "w": 12,
+ "x": 12,
+ "y": 8,
+ "i": "eed94a0b-954f-48ac-82e5-a2eada1c8a3d",
+ "isResizable": true
+ },
+ "id": "c8642e72-f384-46a5-8410-1e6be2953c3c"
+ }
+ ],
+ "version": "2.0.0"
+ }
+}
\ No newline at end of file
From ba6f089c78a9091c09ef883249e596c6fb37944e Mon Sep 17 00:00:00 2001
From: Yening Qin <710leo@gmail.com>
Date: Tue, 19 Jul 2022 12:10:02 +0800
Subject: [PATCH 4/9] fix: get alert rules by api (#1059)
* fix event push api
---
src/models/alert_rule.go | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/src/models/alert_rule.go b/src/models/alert_rule.go
index f2a034ae3..bb5f336a4 100644
--- a/src/models/alert_rule.go
+++ b/src/models/alert_rule.go
@@ -337,7 +337,7 @@ func AlertRuleGetsByCluster(cluster string) ([]*AlertRule, error) {
}
func AlertRulesGetsBy(prods []string, query string) ([]*AlertRule, error) {
- session := DB().Where("disabled = ? and prod in (?)", 0, prods)
+ session := DB().Where("prod in (?)", prods)
if query != "" {
arr := strings.Fields(query)
From 04cb501ab446963942984582a3409cf227af6661 Mon Sep 17 00:00:00 2001
From: hwloser
Date: Thu, 21 Jul 2022 14:46:27 +0800
Subject: [PATCH 5/9] [fix] fix the docker problem of apple chip (#1060)
Co-authored-by: huanwei
---
docker/docker-compose.yaml | 1 +
1 file changed, 1 insertion(+)
diff --git a/docker/docker-compose.yaml b/docker/docker-compose.yaml
index 4bb5baa38..589e7aa37 100644
--- a/docker/docker-compose.yaml
+++ b/docker/docker-compose.yaml
@@ -6,6 +6,7 @@ networks:
services:
mysql:
+ platform: linux/x86_64
image: "mysql:5.7"
container_name: mysql
hostname: mysql
From c45cbd02cc984216932a9889ab6c1031b7de6e05 Mon Sep 17 00:00:00 2001
From: lsy1990
Date: Fri, 22 Jul 2022 17:02:49 +0800
Subject: [PATCH 6/9] supply plugin to notify maintainer (#1063)
---
etc/script/notify/notify.go | 8 ++++
src/server/engine/notify.go | 1 +
src/server/engine/notify_maintainer.go | 58 +++++++++++++++++++++++---
3 files changed, 62 insertions(+), 5 deletions(-)
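Reviewer note: on the plugin side, NotifyMaintainer receives the JSON produced from NoticeMaintainer. A minimal sketch of decoding that payload follows, using the JSON keys as they appear in this patch. The type maintainerNotice and the helper decodeNotice are hypothetical, and the users are kept as raw JSON because the fields of models.User are not shown in this series.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// maintainerNotice mirrors the JSON keys of NoticeMaintainer in this patch.
// Users stay as raw JSON because the fields of models.User are not shown here.
type maintainerNotice struct {
	Users   []json.RawMessage `json:"notify_user_obj"`
	Title   string            `json:"title"`
	Content string            `json:"content"`
}

// decodeNotice is a hypothetical helper a plugin's NotifyMaintainer could call.
func decodeNotice(bs []byte) (maintainerNotice, error) {
	var n maintainerNotice
	err := json.Unmarshal(bs, &n)
	return n, err
}

func main() {
	// Sample payload invented for illustration only.
	payload := []byte(`{"notify_user_obj":[{"id":1}],"title":"db down","content":"details"}`)
	n, err := decodeNotice(payload)
	if err != nil {
		fmt.Println("decode failed:", err)
		return
	}
	fmt.Printf("%d recipient(s), title=%q\n", len(n.Users), n.Title)
}
```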
diff --git a/etc/script/notify/notify.go b/etc/script/notify/notify.go
index faa65a526..f3bca2e0d 100644
--- a/etc/script/notify/notify.go
+++ b/etc/script/notify/notify.go
@@ -11,6 +11,7 @@ import (
type inter interface {
Descript() string
Notify([]byte)
+ NotifyMaintainer([]byte)
}
// N9E complete
@@ -37,6 +38,13 @@ func (n *N9EPlugin) Notify(bs []byte) {
}
}
+func (n *N9EPlugin) NotifyMaintainer(bs []byte) {
+ fmt.Println("do something... begin")
+ result := string(bs)
+ fmt.Println("%T",result)
+ fmt.Println("do something... end")
+}
+
// will be loaded for alertingCall , The first letter must be capitalized to be exported
var N9eCaller = N9EPlugin{
Name: "n9e",
diff --git a/src/server/engine/notify.go b/src/server/engine/notify.go
index 359504e37..535aacb16 100644
--- a/src/server/engine/notify.go
+++ b/src/server/engine/notify.go
@@ -401,6 +401,7 @@ func alertingCallScript(stdinBytes []byte) {
type Notifier interface {
Descript() string
Notify([]byte)
+ NotifyMaintainer([]byte)
}
// call notify.so via golang plugin build
diff --git a/src/server/engine/notify_maintainer.go b/src/server/engine/notify_maintainer.go
index c2a1cae89..c67886306 100644
--- a/src/server/engine/notify_maintainer.go
+++ b/src/server/engine/notify_maintainer.go
@@ -1,8 +1,12 @@
package engine
import (
+ "encoding/json"
+ "plugin"
+ "runtime"
"time"
+ "github.com/didi/nightingale/v5/src/models"
"github.com/didi/nightingale/v5/src/server/common/sender"
"github.com/didi/nightingale/v5/src/server/config"
"github.com/didi/nightingale/v5/src/server/memsto"
@@ -10,20 +14,65 @@ import (
"github.com/toolkits/pkg/logger"
)
-// notify to maintainer to handle the error
-func notifyToMaintainer(e error, title string) {
+type NoticeMaintainer struct {
+ NotifyUsersObj []*models.User `json:"notify_user_obj" gorm:"-"`
+ Title string `json:"title"`
+ Content string `json:"content"`
+}
- logger.Errorf("notifyToMaintainer,title:%s, error:%v", title, e)
+func noticeCallPlugin(stdinBytes []byte) {
+ if !config.C.Alerting.CallPlugin.Enable {
+ return
+ }
- if len(config.C.Alerting.NotifyBuiltinChannels) == 0 {
+ if runtime.GOOS == "windows" {
+ logger.Errorf("call notify plugin on unsupported os: %s", runtime.GOOS)
+ return
+ }
+
+ p, err := plugin.Open(config.C.Alerting.CallPlugin.PluginPath)
+ if err != nil {
+ logger.Errorf("failed to open notify plugin: %v", err)
+ return
+ }
+ caller, err := p.Lookup(config.C.Alerting.CallPlugin.Caller)
+ if err != nil {
+ logger.Errorf("failed to load caller: %v", err)
+ return
+ }
+ notifier, ok := caller.(Notifier)
+ if !ok {
+ logger.Errorf("notifier interface not implemented): %v", err)
return
}
+ notifier.NotifyMaintainer(stdinBytes)
+ logger.Debugf("noticeCallPlugin done. %s", notifier.Descript())
+}
+
+// notify to maintainer to handle the error
+func notifyToMaintainer(e error, title string) {
+
+ logger.Errorf("notifyToMaintainer,title:%s, error:%v", title, e)
+ var noticeMaintainer NoticeMaintainer
maintainerUsers := memsto.UserCache.GetMaintainerUsers()
if len(maintainerUsers) == 0 {
return
}
+ triggerTime := time.Now().Format("2006/01/02 - 15:04:05")
+ noticeMaintainer.NotifyUsersObj = maintainerUsers
+ noticeMaintainer.Content = "【内部处理错误】当前标题: " + title + "\n【内部处理错误】当前异常: " + e.Error() + "\n【内部处理错误】发送时间: " + triggerTime
+ noticeMaintainer.Title = title
+ stdinBytes, err := json.Marshal(noticeMaintainer)
+ if err != nil {
+ logger.Errorf("notifyToMaintainer: failed to marshal noticeMaintainer: %v", err)
+ } else {
+ noticeCallPlugin(stdinBytes)
+ }
+ if len(config.C.Alerting.NotifyBuiltinChannels) == 0 {
+ return
+ }
emailset := make(map[string]struct{})
phoneset := make(map[string]struct{})
wecomset := make(map[string]struct{})
@@ -62,7 +111,6 @@ func notifyToMaintainer(e error, title string) {
}
phones := StringSetKeys(phoneset)
- triggerTime := time.Now().Format("2006/01/02 - 15:04:05")
for _, ch := range config.C.Alerting.NotifyBuiltinChannels {
switch ch {
From 17c73616203b1a1b1f29e2616b1ac3d39143e885 Mon Sep 17 00:00:00 2001
From: ulricqin
Date: Fri, 22 Jul 2022 17:56:52 +0800
Subject: [PATCH 7/9] code refactor notify plugin (#1065)
---
etc/script/notify/notify.go | 6 +--
src/notifier/notifier.go | 9 ++++
src/server/config/config.go | 31 ++++++++++++
src/server/engine/notify.go | 36 ++-----------
src/server/engine/notify_maintainer.go | 70 ++++++++++++--------------
src/server/engine/worker.go | 4 +-
6 files changed, 80 insertions(+), 76 deletions(-)
create mode 100644 src/notifier/notifier.go
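Reviewer note: this refactor resolves the plugin once at startup and shares it through notifier.Instance instead of reopening it on every call. The sketch below illustrates that pattern without a real .so file. stubPlugin, register and notify are hypothetical names; register applies the same type assertion that MustLoad applies to the looked-up symbol, and the nil guard in notify is an extra safety check that the patch does not contain (the patched callers appear to rely on the CallPlugin.Enable check instead).

```go
package main

import (
	"errors"
	"fmt"
)

// Notifier matches the interface added in src/notifier/notifier.go.
type Notifier interface {
	Descript() string
	Notify([]byte)
	NotifyMaintainer([]byte)
}

// instance plays the role of notifier.Instance: it is set once at startup.
var instance Notifier

// stubPlugin is a hypothetical stand-in for the symbol a real plugin exports.
type stubPlugin struct{ calls int }

func (s *stubPlugin) Descript() string        { return "stub" }
func (s *stubPlugin) Notify([]byte)           { s.calls++ }
func (s *stubPlugin) NotifyMaintainer([]byte) { s.calls++ }

// register checks that the symbol implements Notifier before storing it.
func register(sym interface{}) error {
	n, ok := sym.(Notifier)
	if !ok {
		return errors.New("notifier interface not implemented")
	}
	instance = n
	return nil
}

// notify forwards the payload and reports whether a notifier was registered.
func notify(bs []byte) bool {
	if instance == nil {
		return false
	}
	instance.Notify(bs)
	return true
}

func main() {
	fmt.Println(notify([]byte("early"))) // nothing registered yet
	if err := register(&stubPlugin{}); err != nil {
		fmt.Println(err)
		return
	}
	fmt.Println(notify([]byte("event")))
	fmt.Println(register(42)) // an int does not implement Notifier
}
```

Passing a pointer to register matters here because the methods have pointer receivers, which is also how a looked-up plugin variable is handed back by the Go plugin package.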
diff --git a/etc/script/notify/notify.go b/etc/script/notify/notify.go
index f3bca2e0d..86e584129 100644
--- a/etc/script/notify/notify.go
+++ b/etc/script/notify/notify.go
@@ -41,13 +41,13 @@ func (n *N9EPlugin) Notify(bs []byte) {
func (n *N9EPlugin) NotifyMaintainer(bs []byte) {
fmt.Println("do something... begin")
result := string(bs)
- fmt.Println("%T",result)
+ fmt.Println(result)
fmt.Println("do something... end")
}
// will be loaded for alertingCall , The first letter must be capitalized to be exported
var N9eCaller = N9EPlugin{
- Name: "n9e",
- Description: "演示告警通过动态链接库方式通知",
+ Name: "N9EPlugin",
+ Description: "Notification by lib",
BuildAt: time.Now().Local().Format("2006/01/02 15:04:05"),
}
diff --git a/src/notifier/notifier.go b/src/notifier/notifier.go
new file mode 100644
index 000000000..3fdda89ef
--- /dev/null
+++ b/src/notifier/notifier.go
@@ -0,0 +1,9 @@
+package notifier
+
+type Notifier interface {
+ Descript() string
+ Notify([]byte)
+ NotifyMaintainer([]byte)
+}
+
+var Instance Notifier
diff --git a/src/server/config/config.go b/src/server/config/config.go
index 2980e34b9..2cf280d84 100644
--- a/src/server/config/config.go
+++ b/src/server/config/config.go
@@ -2,8 +2,11 @@ package config
import (
"fmt"
+ "log"
"net"
"os"
+ "plugin"
+ "runtime"
"strings"
"sync"
"time"
@@ -11,6 +14,7 @@ import (
"github.com/gin-gonic/gin"
"github.com/koding/multiconfig"
+ "github.com/didi/nightingale/v5/src/notifier"
"github.com/didi/nightingale/v5/src/pkg/httpx"
"github.com/didi/nightingale/v5/src/pkg/logx"
"github.com/didi/nightingale/v5/src/pkg/ormx"
@@ -100,6 +104,33 @@ func MustLoad(fpaths ...string) {
}
}
+ if C.Alerting.CallPlugin.Enable {
+ if runtime.GOOS == "windows" {
+ fmt.Println("notify plugin on unsupported os:", runtime.GOOS)
+ os.Exit(1)
+ }
+
+ p, err := plugin.Open(C.Alerting.CallPlugin.PluginPath)
+ if err != nil {
+ fmt.Println("failed to load plugin:", err)
+ os.Exit(1)
+ }
+
+ caller, err := p.Lookup(C.Alerting.CallPlugin.Caller)
+ if err != nil {
+ fmt.Println("failed to lookup plugin Caller:", err)
+ os.Exit(1)
+ }
+
+ ins, ok := caller.(notifier.Notifier)
+ if !ok {
+ log.Println("notifier interface not implemented")
+ os.Exit(1)
+ }
+
+ notifier.Instance = ins
+ }
+
if C.WriterOpt.QueueMaxSize <= 0 {
C.WriterOpt.QueueMaxSize = 100000
}
diff --git a/src/server/engine/notify.go b/src/server/engine/notify.go
index 535aacb16..3328289fc 100644
--- a/src/server/engine/notify.go
+++ b/src/server/engine/notify.go
@@ -9,8 +9,6 @@ import (
"net/http"
"os/exec"
"path"
- "plugin"
- "runtime"
"strings"
"time"
@@ -22,6 +20,7 @@ import (
"github.com/toolkits/pkg/slice"
"github.com/didi/nightingale/v5/src/models"
+ "github.com/didi/nightingale/v5/src/notifier"
"github.com/didi/nightingale/v5/src/pkg/sys"
"github.com/didi/nightingale/v5/src/pkg/tplx"
"github.com/didi/nightingale/v5/src/server/common/sender"
@@ -103,7 +102,6 @@ func alertingRedisPub(bs []byte) {
func handleNotice(notice Notice, bs []byte) {
alertingCallScript(bs)
-
alertingCallPlugin(bs)
if len(config.C.Alerting.NotifyBuiltinChannels) == 0 {
@@ -398,12 +396,6 @@ func alertingCallScript(stdinBytes []byte) {
logger.Infof("event_notify: exec %s output: %s", fpath, buf.String())
}
-type Notifier interface {
- Descript() string
- Notify([]byte)
- NotifyMaintainer([]byte)
-}
-
// call notify.so via golang plugin build
// ig. etc/script/notify/notify.so
func alertingCallPlugin(stdinBytes []byte) {
@@ -411,26 +403,8 @@ func alertingCallPlugin(stdinBytes []byte) {
return
}
- if runtime.GOOS == "windows" {
- logger.Errorf("call notify plugin on unsupported os: %s", runtime.GOOS)
- return
- }
-
- p, err := plugin.Open(config.C.Alerting.CallPlugin.PluginPath)
- if err != nil {
- logger.Errorf("failed to open notify plugin: %v", err)
- return
- }
- caller, err := p.Lookup(config.C.Alerting.CallPlugin.Caller)
- if err != nil {
- logger.Errorf("failed to load caller: %v", err)
- return
- }
- notifier, ok := caller.(Notifier)
- if !ok {
- logger.Errorf("notifier interface not implemented): %v", err)
- return
- }
- notifier.Notify(stdinBytes)
- logger.Debugf("alertingCallPlugin done. %s", notifier.Descript())
+ logger.Debugf("alertingCallPlugin begin")
+ logger.Debugf("payload: %s", string(stdinBytes))
+ notifier.Instance.Notify(stdinBytes)
+ logger.Debugf("alertingCallPlugin done")
}
diff --git a/src/server/engine/notify_maintainer.go b/src/server/engine/notify_maintainer.go
index c67886306..0d5a135d6 100644
--- a/src/server/engine/notify_maintainer.go
+++ b/src/server/engine/notify_maintainer.go
@@ -2,11 +2,11 @@ package engine
import (
"encoding/json"
- "plugin"
"runtime"
"time"
"github.com/didi/nightingale/v5/src/models"
+ "github.com/didi/nightingale/v5/src/notifier"
"github.com/didi/nightingale/v5/src/server/common/sender"
"github.com/didi/nightingale/v5/src/server/config"
"github.com/didi/nightingale/v5/src/server/memsto"
@@ -14,13 +14,13 @@ import (
"github.com/toolkits/pkg/logger"
)
-type NoticeMaintainer struct {
- NotifyUsersObj []*models.User `json:"notify_user_obj" gorm:"-"`
- Title string `json:"title"`
- Content string `json:"content"`
+type MaintainMessage struct {
+ Tos []*models.User `json:"tos"`
+ Title string `json:"title"`
+ Content string `json:"content"`
}
-func noticeCallPlugin(stdinBytes []byte) {
+func notifyMaintainerWithPlugin(e error, title, triggerTime string, users []*models.User) {
if !config.C.Alerting.CallPlugin.Enable {
return
}
@@ -30,56 +30,48 @@ func noticeCallPlugin(stdinBytes []byte) {
return
}
- p, err := plugin.Open(config.C.Alerting.CallPlugin.PluginPath)
- if err != nil {
- logger.Errorf("failed to open notify plugin: %v", err)
- return
- }
- caller, err := p.Lookup(config.C.Alerting.CallPlugin.Caller)
+ stdinBytes, err := json.Marshal(MaintainMessage{
+ Tos: users,
+ Title: title,
+ Content: "Title: " + title + "\nContent: " + e.Error() + "\nTime: " + triggerTime,
+ })
+
if err != nil {
- logger.Errorf("failed to load caller: %v", err)
+ logger.Error("failed to marshal MaintainMessage:", err)
return
}
- notifier, ok := caller.(Notifier)
- if !ok {
- logger.Errorf("notifier interface not implemented): %v", err)
- return
- }
- notifier.NotifyMaintainer(stdinBytes)
- logger.Debugf("noticeCallPlugin done. %s", notifier.Descript())
+
+ notifier.Instance.NotifyMaintainer(stdinBytes)
+ logger.Debugf("notify maintainer with plugin done")
}
// notify to maintainer to handle the error
func notifyToMaintainer(e error, title string) {
+ logger.Errorf("notifyToMaintainer, title:%s, error:%v", title, e)
- logger.Errorf("notifyToMaintainer,title:%s, error:%v", title, e)
-
- var noticeMaintainer NoticeMaintainer
- maintainerUsers := memsto.UserCache.GetMaintainerUsers()
- if len(maintainerUsers) == 0 {
+ users := memsto.UserCache.GetMaintainerUsers()
+ if len(users) == 0 {
return
}
+
triggerTime := time.Now().Format("2006/01/02 - 15:04:05")
- noticeMaintainer.NotifyUsersObj = maintainerUsers
- noticeMaintainer.Content = "【内部处理错误】当前标题: " + title + "\n【内部处理错误】当前异常: " + e.Error() + "\n【内部处理错误】发送时间: " + triggerTime
- noticeMaintainer.Title = title
- stdinBytes, err := json.Marshal(noticeMaintainer)
- if err != nil {
- logger.Errorf("notifyToMaintainer: failed to marshal noticeMaintainer: %v", err)
- } else {
- noticeCallPlugin(stdinBytes)
- }
+ notifyMaintainerWithPlugin(e, title, triggerTime, users)
+ notifyMaintainerWithBuiltin(e, title, triggerTime, users)
+}
+
+func notifyMaintainerWithBuiltin(e error, title, triggerTime string, users []*models.User) {
if len(config.C.Alerting.NotifyBuiltinChannels) == 0 {
return
}
+
emailset := make(map[string]struct{})
phoneset := make(map[string]struct{})
wecomset := make(map[string]struct{})
dingtalkset := make(map[string]struct{})
feishuset := make(map[string]struct{})
- for _, user := range maintainerUsers {
+ for _, user := range users {
if user.Email != "" {
emailset[user.Email] = struct{}{}
}
@@ -118,13 +110,13 @@ func notifyToMaintainer(e error, title string) {
if len(emailset) == 0 {
continue
}
- content := "【内部处理错误】当前标题: " + title + "\n【内部处理错误】当前异常: " + e.Error() + "\n【内部处理错误】发送时间: " + triggerTime
+ content := "Title: " + title + "\nContent: " + e.Error() + "\nTime: " + triggerTime
sender.WriteEmail(title, content, StringSetKeys(emailset))
case "dingtalk":
if len(dingtalkset) == 0 {
continue
}
- content := "**【内部处理错误】当前标题: **" + title + "\n**【内部处理错误】当前异常: **" + e.Error() + "\n**【内部处理错误】发送时间: **" + triggerTime
+ content := "**Title: **" + title + "\n**Content: **" + e.Error() + "\n**Time: **" + triggerTime
sender.SendDingtalk(sender.DingtalkMessage{
Title: title,
Text: content,
@@ -135,7 +127,7 @@ func notifyToMaintainer(e error, title string) {
if len(wecomset) == 0 {
continue
}
- content := "**【内部处理错误】当前标题: **" + title + "\n**【内部处理错误】当前异常: **" + e.Error() + "\n**【内部处理错误】发送时间: **" + triggerTime
+ content := "**Title: **" + title + "\n**Content: **" + e.Error() + "\n**Time: **" + triggerTime
sender.SendWecom(sender.WecomMessage{
Text: content,
Tokens: StringSetKeys(wecomset),
@@ -145,7 +137,7 @@ func notifyToMaintainer(e error, title string) {
continue
}
- content := "【内部处理错误】当前标题: " + title + "\n【内部处理错误】当前异常: " + e.Error() + "\n【内部处理错误】发送时间: " + triggerTime
+ content := "Title: " + title + "\nContent: " + e.Error() + "\nTime: " + triggerTime
sender.SendFeishu(sender.FeishuMessage{
Text: content,
AtMobiles: phones,
diff --git a/src/server/engine/worker.go b/src/server/engine/worker.go
index 352b085c4..6f254a78b 100644
--- a/src/server/engine/worker.go
+++ b/src/server/engine/worker.go
@@ -116,8 +116,7 @@ func (r RuleEval) Work() {
value, warnings, err = reader.Client.Query(context.Background(), promql, time.Now())
if err != nil {
logger.Errorf("rule_eval:%d promql:%s, error:%v", r.RuleID(), promql, err)
- // 告警查询prometheus逻辑出错,发告警信息给管理员
- notifyToMaintainer(err, "查询prometheus出错")
+ notifyToMaintainer(err, "failed to query prometheus")
return
}
@@ -190,7 +189,6 @@ func (ws *WorkersType) Build(rids []int64) {
elst, err := models.AlertCurEventGetByRule(rules[hash].Id)
if err != nil {
logger.Errorf("worker_build: AlertCurEventGetByRule failed: %v", err)
- notifyToMaintainer(err, "AlertCurEventGetByRule Error,ruleID="+fmt.Sprint(rules[hash].Id))
continue
}
From 0bd7ba95490afaeeb3f740c9eb7c715e531cd03c Mon Sep 17 00:00:00 2001
From: ulricqin
Date: Fri, 22 Jul 2022 18:12:42 +0800
Subject: [PATCH 8/9] code refactor notify (#1066)
---
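Note: the `inter` declaration removed from the plugin below is not needed for the plugin to be usable, because Go interfaces are satisfied implicitly by a type's method set. A minimal sketch of that point follows. The `Notifier` method set and the `N9EPlugin` field names are copied from this series; the `received` field, the method bodies, and the `Descript` output format are assumptions made only for illustration.

```go
package main

import "fmt"

// Notifier mirrors the three methods listed in the removed `inter` interface.
type Notifier interface {
	Descript() string
	Notify([]byte)
	NotifyMaintainer([]byte)
}

// N9EPlugin never mentions Notifier; its method set alone makes it one.
type N9EPlugin struct {
	Name        string
	Description string
	received    [][]byte // hypothetical: records payloads for the demo
}

// Descript returns a short label; the format is an assumption of this sketch.
func (n *N9EPlugin) Descript() string {
	return fmt.Sprintf("%s: %s", n.Name, n.Description)
}

// Notify stores the alert payload instead of sending it anywhere.
func (n *N9EPlugin) Notify(bs []byte) {
	n.received = append(n.received, bs)
}

// NotifyMaintainer stores the maintainer payload the same way.
func (n *N9EPlugin) NotifyMaintainer(bs []byte) {
	n.received = append(n.received, bs)
}

// Compile-time check: the build fails if *N9EPlugin stops satisfying Notifier.
var _ Notifier = (*N9EPlugin)(nil)

func main() {
	var inst Notifier = &N9EPlugin{Name: "N9EPlugin", Description: "Notify by lib"}
	inst.Notify([]byte(`{"title":"demo"}`))
	fmt.Println(inst.Descript())
}
```

The blank-identifier assertion is one common way to keep the compile-time guarantee without declaring a duplicate interface inside the plugin.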
etc/script/notify/notify.go | 9 +--------
src/server/engine/notify_maintainer.go | 6 ------
2 files changed, 1 insertion(+), 14 deletions(-)
diff --git a/etc/script/notify/notify.go b/etc/script/notify/notify.go
index 86e584129..2d8afa9cd 100644
--- a/etc/script/notify/notify.go
+++ b/etc/script/notify/notify.go
@@ -7,13 +7,6 @@ import (
"github.com/tidwall/gjson"
)
-// the caller can be called for alerting notify by complete this interface
-type inter interface {
- Descript() string
- Notify([]byte)
- NotifyMaintainer([]byte)
-}
-
// N9E complete
type N9EPlugin struct {
Name string
@@ -48,6 +41,6 @@ func (n *N9EPlugin) NotifyMaintainer(bs []byte) {
// will be loaded for alertingCall , The first letter must be capitalized to be exported
var N9eCaller = N9EPlugin{
Name: "N9EPlugin",
- Description: "Notification by lib",
+ Description: "Notify by lib",
BuildAt: time.Now().Local().Format("2006/01/02 15:04:05"),
}
diff --git a/src/server/engine/notify_maintainer.go b/src/server/engine/notify_maintainer.go
index 0d5a135d6..9f61708c3 100644
--- a/src/server/engine/notify_maintainer.go
+++ b/src/server/engine/notify_maintainer.go
@@ -2,7 +2,6 @@ package engine
import (
"encoding/json"
- "runtime"
"time"
"github.com/didi/nightingale/v5/src/models"
@@ -25,11 +24,6 @@ func notifyMaintainerWithPlugin(e error, title, triggerTime string, users []*mod
return
}
- if runtime.GOOS == "windows" {
- logger.Errorf("call notify plugin on unsupported os: %s", runtime.GOOS)
- return
- }
-
stdinBytes, err := json.Marshal(MaintainMessage{
Tos: users,
Title: title,
From ba7ff133e630c39cb914d1553988462cdf84135f Mon Sep 17 00:00:00 2001
From: ulricqin
Date: Sat, 23 Jul 2022 17:50:16 +0800
Subject: [PATCH 9/9] modify prometheus query batch response format (#1068)
---
src/webapi/router/router.go | 2 --
src/webapi/router/router_prometheus.go | 19 +++++--------------
2 files changed, 5 insertions(+), 16 deletions(-)
diff --git a/src/webapi/router/router.go b/src/webapi/router/router.go
index 262039487..27a001066 100644
--- a/src/webapi/router/router.go
+++ b/src/webapi/router/router.go
@@ -101,11 +101,9 @@ func configRoute(r *gin.Engine, version string) {
if config.C.AnonymousAccess.PromQuerier {
pages.Any("/prometheus/*url", prometheusProxy)
-
pages.POST("/query-range-batch", promBatchQueryRange)
} else {
pages.Any("/prometheus/*url", auth(), prometheusProxy)
-
pages.POST("/query-range-batch", auth(), promBatchQueryRange)
}
diff --git a/src/webapi/router/router_prometheus.go b/src/webapi/router/router_prometheus.go
index 624ed936a..5bb2b87b0 100644
--- a/src/webapi/router/router_prometheus.go
+++ b/src/webapi/router/router_prometheus.go
@@ -32,21 +32,15 @@ type batchQueryForm struct {
func promBatchQueryRange(c *gin.Context) {
xcluster := c.GetHeader("X-Cluster")
if xcluster == "" {
- c.String(500, "X-Cluster is blank")
- return
+ ginx.Bomb(http.StatusBadRequest, "header(X-Cluster) is blank")
}
var f batchQueryForm
- err := c.BindJSON(&f)
- if err != nil {
- c.String(500, err.Error())
- return
- }
+ ginx.Dangerous(c.BindJSON(&f))
cluster, exist := prom.Clusters.Get(xcluster)
if !exist {
- c.String(http.StatusBadRequest, "cluster(%s) not found", xcluster)
- return
+ ginx.Bomb(http.StatusBadRequest, "cluster(%s) not found", xcluster)
}
var lst []model.Value
@@ -59,15 +53,12 @@ func promBatchQueryRange(c *gin.Context) {
}
resp, _, err := cluster.PromClient.QueryRange(context.Background(), item.Query, r)
- if err != nil {
- c.String(500, err.Error())
- return
- }
+ ginx.Dangerous(err)
lst = append(lst, resp)
}
- c.JSON(200, lst)
+ ginx.NewRender(c).Data(lst, nil)
}
func prometheusProxy(c *gin.Context) {