Tencent Cloud Elasticsearch Cluster Planning and Performance Optimization in Practice (5)

Date: 2021-05-23 Category: Programmer Life

PUT /index_name/_settings
{
  "index": {
    "routing": {
      "allocation": {
        "require": {
          "temperature": "warm"
        }
      }
    }
  }
}

(5) A node rejoined the cluster after being offline for a long time and brought stale data with it

cannot allocate because all found copies of the shard are either stale or corrupt

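Solution: use the reroute API to allocate a primary shard from one of the stale copies. Note that this accepts possible data loss: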

POST /_cluster/reroute?pretty
{
  "commands": [
    {
      "allocate_stale_primary": {
        "index": "article",
        "shard": 1,
        "node": "98365000222032",
        "accept_data_loss": true
      }
    }
  ]
}
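The reroute request can also be assembled programmatically, which helps avoid quoting mistakes when the command is issued from a script. The following is a minimal sketch, not an official client: the function name `build_allocate_stale_primary` is hypothetical, and the index, shard, and node values are simply the ones used in the example above. It only builds the JSON body; sending it to `POST /_cluster/reroute` is left to whatever HTTP client you already use.

```python
import json


def build_allocate_stale_primary(index, shard, node):
    """Build the body for POST /_cluster/reroute (hypothetical helper).

    accept_data_loss must be true for allocate_stale_primary, because the
    chosen copy may be missing recent writes.
    """
    if not index or not node:
        raise ValueError("index and node must be non-empty")
    if shard < 0:
        raise ValueError("shard number must be non-negative")
    return {
        "commands": [
            {
                "allocate_stale_primary": {
                    "index": index,
                    "shard": shard,
                    "node": node,
                    "accept_data_loss": True,
                }
            }
        ]
    }


if __name__ == "__main__":
    # Values taken from the example in the article.
    body = build_allocate_stale_primary("article", 1, "98365000222032")
    print(json.dumps(body, indent=2))
```

Because this command discards any writes the stale copy never received, it is worth confirming first that no better copy of the shard can be recovered.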

(6) Too many unassigned shards: the shard recovery limit has been reached, so the remaining shards wait in a queue

reached the limit of incoming shard recoveries [2], cluster setting [cluster.routing.allocation.node_concurrent_incoming_recoveries=2] (can also be set via [cluster.routing.allocation.node_concurrent_recoveries])

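This usually happens after a full cluster restart or the restart of a single node, when the configured limit on concurrent shard recoveries is low and the remaining shards have to queue.

Solution: to bring the cluster back to a healthy state sooner, raise the shard recovery throughput and concurrency with the API call below: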

PUT /_cluster/settings
{
  "transient": {
    "cluster.routing.allocation.node_concurrent_recoveries": "20",
    "indices.recovery.max_bytes_per_sec": "100mb"
  }
}
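Higher recovery limits put more load on disks and the network, so it may be reasonable to restore the defaults once the cluster is green again. The sketch below builds both request bodies for `PUT /_cluster/settings`; the function names are hypothetical, and the values 20 and "100mb" are just the ones from the example above. Setting a cluster setting to `null` resets it to its default.

```python
import json

# Setting names as they appear in the article's request.
CONCURRENT_KEY = "cluster.routing.allocation.node_concurrent_recoveries"
BANDWIDTH_KEY = "indices.recovery.max_bytes_per_sec"


def recovery_settings(concurrent_recoveries, max_bytes_per_sec):
    """Body that temporarily raises recovery concurrency and bandwidth."""
    if concurrent_recoveries < 1:
        raise ValueError("concurrent_recoveries must be at least 1")
    return {
        "transient": {
            # Sent as a string, mirroring the request in the article.
            CONCURRENT_KEY: str(concurrent_recoveries),
            BANDWIDTH_KEY: max_bytes_per_sec,
        }
    }


def reset_recovery_settings():
    """Body that resets both settings to their defaults (JSON null)."""
    return {"transient": {CONCURRENT_KEY: None, BANDWIDTH_KEY: None}}


if __name__ == "__main__":
    print(json.dumps(recovery_settings(20, "100mb"), indent=2))
    print(json.dumps(reset_recovery_settings(), indent=2))
```

Transient settings do not survive a full cluster restart, so an explicit reset mainly matters when the cluster keeps running after recovery completes.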

Conclusion

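This article presented criteria for sizing a cluster and planning index configuration. Planning the cluster ahead of time against these criteria helps keep it stable and available, and simplifies otherwise complex operations work.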

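It also covered common recommendations and techniques for optimizing write performance, which can further improve the cluster's write throughput and stability. Finally, it described methods and approaches for troubleshooting cluster problems that come up in day-to-day operations. We hope this article is useful to every ES customer on Tencent Cloud.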

When reposting, please credit the source: https://www.heiqu.com/wpgjfd.html
