监控面板的使用相关-Grafuna

Feb 2, 2024

假如我们现在起了一个服务，用户会来调用我们写的接口，我们可以通过我们的数据库，log，服务器状态，等等方式来查看到我们起的服务的运行情况。

那么，更加直观一点的方式有吗

有的

Prometheus 和 Grafana 是一对常常一起使用的监控和可视化工具，它们在构建和管理监控系统时发挥着不同的角色。

Prometheus： https://github.com/prometheus/prometheus

是一个开源的系统监控和警报工具。
使用拉模型从目标服务（例如应用程序、数据库、服务器）中收集指标数据。
使用自定义的查询语言 PromQL 来查询和处理这些指标数据。
提供了强大的数据存储和查询功能，支持多维度数据（标签）。

Grafana： https://github.com/grafana/grafana

是一个开源的可视化和仪表板工具。
可以与多种数据源集成，其中包括 Prometheus。
提供了灵活的可视化选项，包括折线图、柱状图、仪表盘等。
允许用户创建和定制仪表板，将多个数据源的信息集成在一个界面上。

Prometheus Pushgateway：https://github.com/prometheus/pushgateway

是一个开源的属于Prometheus的推送网关
通过将数据先推送给Pushgateway，再统一由Prometheus进行拉取

大致就是这么个流程：

APP —数据推送—>Prometheus Pushgateway<—数据拉取—Prometheus——Grafuna数据显示

orbstack-docker

这里可以看到我本地起了这三个服务的docker

上报数据的方式

上报数据有两种一种是通过pull的方式，prometheus自动从配置文件的指定源去拉取数据，另一种是从我们的应用程序上报到pushgateway，然后从pushgateway推送数据

Docker

grafana（数据可视化工具，用于可视化监控数据）配置：/Users/siky/yaml/grafana.ini

prometheus（数据源）配置： /Users/siky/yaml/prometheus.yml

pushgateway （属于prometheus的网关）

这是我之前记下的可以写的配置

如果暂时没有测试的接口或是数据，我们可以使用Exporter‣

Prometheus会周期性的从Exporter暴露的HTTP服务地址（通常是/metrics）拉取监控样本数据

比如这是我prometheus.yaml里的配置

global:
  scrape_interval: 15s

scrape_configs:
  - job_name: "prometheus"
    static_configs:
      - targets: ['docker.for.mac.localhost:9090']
  - job_name: "push-metrics"
    static_configs:
      - targets: ['docker.for.mac.localhost:9091']
        labels:
          instance: pushgateway
  - job_name: 'node'
    static_configs:
      - targets: ['docker.for.mac.localhost:9100']
  - job_name: go_get_prometheus_qps
    static_configs:
      - targets: ['docker.for.mac.localhost:9000']

我运行了一个job_name为node的东西，这就是上面说的Exporter

这是我们访问本地的http://localhost:9100/metrics可以看到我们本地的硬件运行信息比如CPU, 内存，磁盘等

node-metrics

当然了他也会上报到Pushgateway

我们可以通过访问prometheus，来搜索这些数据

prometheus-data

prometheus也具备一定数据图形化的功能，但更主要的还是作为数据源

Grafuna

在启动Grafuna的docker后

访问http://localhost:3000/

第一次登录默认用户名和密码为 admin:admin

Grafuna-view

我们可以在这个页面上进行数据源的设置，进行数据面板的设置

可以看到他支持Prometheus，Graphite，influxDB，openTSDB作为数据源，现在我们先用Prometheus

choose-data-type

由于我们是Docker启动的

这里的9090是你的Prometheus的启动端口

在mac上可以使用http://docker.for.mac.localhost:9090

在windows上可以使用http://host.docker.internal:9090

选择数据源地址

然后直接save就好了

然后我们创建数据面板

数据面板创建

在第一个位置选择数据源

在第二个位置选择运行的命令

在第三个位置点击运行，就会出现你想要的数据面板了，记得apply

比如我现在换了个命令，换成看内存，数据就不一样了

memory

Prometheus的数据类型

Counter：只增不减的计数器
Gauge：可增可减的仪表盘
Histogram 直方图数据分布情况
Summary 摘要数据分布情况

Counter 一般用于统计比如：http请求总数(统计QPS)

Gauge 一般用于查看变化比如：半小时内内存可用情况的变动

比如我们要在一个Go程序里进行数据的上报

写一个路由

// 添加用于公开Prometheus指标的路由，WrapH 是 Gin 框架提供的一个函数，用于将标准库 http.Handler 包装成 Gin 的 gin.HandlerFunc
	r.GET("/metrics", gin.WrapH(promhttp.Handler()))

- job_name: go_get_prometheus_qps
    static_configs:
      - targets: ['docker.for.mac.localhost:9000']

增加一个配置文件：表示Prometheus会从9000端口抓取数据

Prometheus-9000

比如需要对应用进行每个接口进行QPS统计和耗时统计

先定义变量

const (
	PrometheusUrl          = "http://127.0.0.1:9091" //Prometheus PushGateway 的地址，用于推送指标数据。
	PrometheusJob          = "go_get_prometheus_qps" //Prometheus job 名称，用于标识这个应用程序的任务。,需要和yaml的job_name一致
	PrometheusNamespace    = "go_get_data"           //Prometheus 指标的命名空间。
	EndpointsDataSubsystem = "endpoints"             //Prometheus 指标的子系统，用于更细粒度地标识指标
)

var (
	Pusher *push.Pusher //Prometheus pusher，用于将指标数据推送到 Prometheus PushGateway。

	//一个 Counter 向量，用于记录接口 QPS 的统计信息.用于记录每个接口的请求数量。
	EndpointsQPSMonitor = prometheus.NewCounterVec(
		prometheus.CounterOpts{
			Namespace: PrometheusNamespace,    //命名空间
			Subsystem: EndpointsDataSubsystem, //子系统
			Name:      "QPS_statistic",        //指标名称
			Help:      "统计QPS数据",              //帮助信息
		}, []string{EndpointsDataSubsystem}, //用于标识不同的接口
	)
	//一个 Histogram 向量，用于记录接口耗时的统计信息。用于记录每个接口的耗时。
	EndpointsLantencyMonitor = prometheus.NewHistogramVec(
		prometheus.HistogramOpts{
			Namespace: PrometheusNamespace,                                      //命名空间
			Subsystem: EndpointsDataSubsystem,                                   //子系统
			Name:      "lantency_statistic",                                     //指标名称
			Help:      "统计耗时数据",                                                 //帮助信息
			Buckets:   []float64{1, 5, 10, 20, 50, 100, 500, 1000, 5000, 10000}, //耗时区间，如果一个请求的耗时在某个区间内，就会被记录到对应的桶中。
		}, []string{EndpointsDataSubsystem}, //用于标识不同的接口
	)
)

我们使用之前提到的Prometheus内的Counter和Histogram类型新建两个监控指标

然后进行指标的初始化

// 初始化指标
func Init() {
	// 创建一个新的pusher实例，使用PrometheusUrl和PrometheusJob进行配置
	Pusher = push.New(PrometheusUrl, PrometheusJob)

	// 使用prometheus.MustRegister函数注册两个监控指标，分别是EndpointsQPSMonitor和EndpointsLantencyMonitor
	prometheus.MustRegister(
		EndpointsQPSMonitor,
		EndpointsLantencyMonitor,
	)

	// 将EndpointsQPSMonitor和EndpointsLantencyMonitor添加到Pusher实例中的收集器列表中
	Pusher.Collector(EndpointsQPSMonitor)
	Pusher.Collector(EndpointsLantencyMonitor)
}

这里我们写一个方法用Pusher包里的Add方法进行15秒一次数据的上报给pushgateway

func PushGateway() {
	// 每15秒上报一次数据
	for range time.Tick(15 * time.Second) {
		if err := Pusher.Add(); err != nil {
			log.Println(err)
		}
		log.Println("push ")
	}
}

由于我们需要对我们的每个路由进行监控，所以我们需要给每个路由用到监控的中间件，这里调用我们刚刚定义好的监控指标 EndpointsQPSMonitor 和 EndpointsLantencyMonitor

// 用于在接口被调用时递增相应的 QPS 计数器，并将 endpoint 作为 label。
func HandleEndpointQps() gin.HandlerFunc {
	return func(c *gin.Context) {
		//获取当前请求的接口路径
		endpoint := c.Request.URL.Path
		fmt.Println(endpoint)
		//递增 QPS 计数器，将当前请求的 endpoint 作为 label。然后，通过 Inc 方法递增 QPS 计数器。
		// 排除 /metrics 路由
		// if endpoint != "/metrics" {
		metrics.EndpointsQPSMonitor.With(prometheus.Labels{metrics.EndpointsDataSubsystem: endpoint}).Inc()
		// }
		c.Next()
	}
}

func HandleEndpointLantency() gin.HandlerFunc {
	return func(c *gin.Context) {
		// 获取当前请求的接口路径
		endpoint := c.Request.URL.Path
		// 记录当前时间，作为请求处理开始的时间点
		start := time.Now()
		defer func(c *gin.Context) {
			// 计算请求处理耗时
			lantency := time.Since(start)
			// 将耗时直接转换为毫秒，保留三位小数
			lantencyFloat64 := float64(lantency.Nanoseconds()) / 1e6

			fmt.Printf("请求接口:%v，请求处理耗时：%f ms\n", endpoint, lantencyFloat64)
			// 将当前请求的 endpoint 作为 label。然后通过 Observe 方法记录耗时。
			metrics.EndpointsLantencyMonitor.With(prometheus.Labels{metrics.EndpointsDataSubsystem: endpoint}).Observe(lantencyFloat64)
		}(c)
		c.Next()
	}
}

这样就大功告成了

运行！

last-1

last-2