Table of contents (mind map)

## ClickHouse PaaS: a cloud-native multi-tenant platform (Altinity.Cloud)

Official site: https://altinity.cloud

### PaaS architecture overview

Designing a high-performance, low-cost distributed middleware service with cloud-native orchestration, multi-cloud deployment, automated operations, elastic scaling and self-healing, plus enterprise-grade capabilities such as tenant isolation, permission management and operation auditing, is genuinely hard. Altinity.Cloud delivers all of this to users in a SaaS model.

## Sentry Snuba: architecture overview of the event big-data analytics engine

Snuba is a service on top of ClickHouse that provides a rich data model, fast ingestion consumers and a query optimizer; it is the search and aggregation engine for Sentry event data.

The data is stored entirely in ClickHouse tables and materialized views. It is ingested through input streams (currently only Kafka topics) and can be queried either with point-in-time queries or with streaming queries (subscriptions).

Docs: https://getsentry.github.io/snuba/architecture/overview.html

## Kubernetes ClickHouse Operator

### What is a Kubernetes Operator?

A Kubernetes Operator is a way to package, deploy and manage a Kubernetes application. We use the Kubernetes API (application programming interface) and the kubectl tool to deploy and manage applications on Kubernetes.

https://kubernetes.io/zh-cn/docs/concepts/extend-kubernetes/operator/

### Altinity Operator for ClickHouse

Altinity is the leading open-source provider of the ClickHouse Operator.

- Altinity: https://altinity.com
- GitHub: https://github.com/Altinity/clickhouse-operator
- YouTube: https://www.youtube.com/Altinity

Of course, the multi-tenant, isolated ClickHouse middleware PaaS platform itself is something companies and cloud vendors almost never open-source.

### RadonDB ClickHouse

- https://github.com/radondb/radondb-clickhouse-operator
- https://github.com/radondb/radondb-clickhouse-kubernetes

A cloud vendor (QingCloud) customization based on altinity-clickhouse-operator, with some optimizations for quickly deploying production clusters.

## Helm + Operator: quickly bringing a ClickHouse cluster to the cloud

### Cloud-native lab environment

- VKE K8S Cluster: Vultr managed cluster (v1.23.14)
- KubeSphere v3.3.1: visual cluster management, a full-stack Kubernetes container-cloud PaaS solution.
- Longhorn 1.14: cloud-native distributed block storage for Kubernetes.

### Deploy clickhouse-operator

Here we use the Operator customized by RadonDB.

values.operator.yaml customizes the following two parameters:

```yaml
# let the operator watch clickhouse deployments in all namespaces of the cluster
watchAllNamespaces: true
# enable operator metrics monitoring
enablePrometheusMonitor: true
```

Deploy the operator with helm:

```bash
cd vip-k8s-paas/10-cloud-native-clickhouse

# deploy into kube-system
helm install clickhouse-operator ./clickhouse-operator -f values.operator.yaml -n kube-system

kubectl -n kube-system get po | grep clickhouse-operator
# clickhouse-operator-6457c6dcdd-szgpd   1/1   Running   0   3m33s

kubectl -n kube-system get svc | grep clickhouse-operator
# clickhouse-operator-metrics   ClusterIP   10.110.129.244   <none>   8888/TCP   4m18s

kubectl api-resources | grep clickhouse
# clickhouseinstallations            chi        clickhouse.radondb.com/v1   true   ClickHouseInstallation
# clickhouseinstallationtemplates    chit       clickhouse.radondb.com/v1   true   ClickHouseInstallationTemplate
# clickhouseoperatorconfigurations   chopconf   clickhouse.radondb.com/v1   true   ClickHouseOperatorConfiguration
```

### Deploy clickhouse-cluster

Here we use the clickhouse-cluster helm charts customized by RadonDB.

We quickly deploy a cluster with 2 shards, 2 replicas and 3 zk nodes.

values.cluster.yaml customization:

```yaml
clickhouse:
  clusterName: snuba-ck-nodes
  shardscount: 2
  replicascount: 2
...
zookeeper:
  install: true
  replicas: 3
```

Deploy clickhouse-cluster with helm:

```bash
kubectl create ns cloud-clickhouse
helm install clickhouse ./clickhouse-cluster -f values.cluster.yaml -n cloud-clickhouse

kubectl get po -n cloud-clickhouse
# chi-clickhouse-snuba-ck-nodes-0-0-0   3/3   Running   5 (6m13s ago)   16m
# chi-clickhouse-snuba-ck-nodes-0-1-0   3/3   Running   1 (5m33s ago)   6m23s
# chi-clickhouse-snuba-ck-nodes-1-0-0   3/3   Running   1 (4m58s ago)   5m44s
# chi-clickhouse-snuba-ck-nodes-1-1-0   3/3   Running   1 (4m28s ago)   5m10s
# zk-clickhouse-0                       1/1   Running   0               17m
# zk-clickhouse-1                       1/1   Running   0               17m
# zk-clickhouse-2                       1/1   Running   0               17m
```

### Scale out the ClickHouse shard cluster with the Operator

Use the following command and change shardsCount to 3:

```bash
kubectl edit chi/clickhouse -n cloud-clickhouse
```

Check the pods:

```bash
kubectl get po -n cloud-clickhouse
NAME                                  READY   STATUS    RESTARTS       AGE
chi-clickhouse-snuba-ck-nodes-0-0-0   3/3     Running   5 (24m ago)    34m
chi-clickhouse-snuba-ck-nodes-0-1-0   3/3     Running   1 (23m ago)    24m
chi-clickhouse-snuba-ck-nodes-1-0-0   3/3     Running   1 (22m ago)    23m
chi-clickhouse-snuba-ck-nodes-1-1-0   3/3     Running   1 (22m ago)    23m
chi-clickhouse-snuba-ck-nodes-2-0-0   3/3     Running   1 (108s ago)   2m33s
chi-clickhouse-snuba-ck-nodes-2-1-0   3/3     Running   1 (72s ago)    119s
zk-clickhouse-0                       1/1     Running   0              35m
zk-clickhouse-1                       1/1     Running   0              35m
zk-clickhouse-2                       1/1     Running   0              35m
```

Two new pods appear, chi-clickhouse-snuba-ck-nodes-2-0-0 and chi-clickhouse-snuba-ck-nodes-2-1-0: the new shard and its replica were created automatically by the Operator.
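If you prefer to script the scale-out rather than edit the resource interactively, the same change can be applied with `kubectl patch`. This is a minimal sketch under the assumption that the RadonDB CHI spec keeps the upstream Altinity field layout (`spec.configuration.clusters[0].layout.shardsCount`); check the field path against your own `kubectl get chi clickhouse -o yaml` output before using it.

```bash
# Non-interactive alternative to `kubectl edit` (a sketch, not taken from the chart docs):
# assumes the RadonDB CHI spec follows the upstream Altinity layout,
# i.e. spec.configuration.clusters[0].layout.shardsCount.
kubectl -n cloud-clickhouse patch chi clickhouse --type=json \
  -p '[{"op": "replace", "path": "/spec/configuration/clusters/0/layout/shardsCount", "value": 3}]'

# then watch the operator create the new shard's pods
kubectl -n cloud-clickhouse get po -w
```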
## A quick test drive: a multi-shard, multi-replica cluster with ReplicatedMergeTree + Distributed + ZooKeeper

### Connect to ClickHouse

We enter the pods and connect with the native command-line client, clickhouse-client.

```bash
kubectl exec -it chi-clickhouse-snuba-ck-nodes-0-0-0 -n cloud-clickhouse -- bash
kubectl exec -it chi-clickhouse-snuba-ck-nodes-0-1-0 -n cloud-clickhouse -- bash
kubectl exec -it chi-clickhouse-snuba-ck-nodes-1-0-0 -n cloud-clickhouse -- bash
kubectl exec -it chi-clickhouse-snuba-ck-nodes-1-1-0 -n cloud-clickhouse -- bash
kubectl exec -it chi-clickhouse-snuba-ck-nodes-2-0-0 -n cloud-clickhouse -- bash
kubectl exec -it chi-clickhouse-snuba-ck-nodes-2-1-0 -n cloud-clickhouse -- bash
```

We open a terminal into each of these 6 pods and then test:

```bash
clickhouse-client --multiline -u username -h ip --password password
# or simply
clickhouse-client -m
```

### Create a distributed database

1. Check system.clusters:

```sql
select * from system.clusters;
```

2. Create a database named test:

```sql
create database test on cluster 'snuba-ck-nodes';
-- to drop it:
drop database test on cluster 'snuba-ck-nodes';
```

3. Check on each node: the test database now exists everywhere.

```sql
show databases;
```

### Create the local tables (ReplicatedMergeTree)

The DDL is shown below. In the test database of every node in the cluster we create the t_local local table, using the ReplicatedMergeTree engine, which takes two parameters:

- zoo_path: the path of the table in ZooKeeper (/clickhouse/tables/{shard}/test/t_local); different replicas of the same shard of a table must use the same path.
- replica_name: the name of the table's replica in ZooKeeper.

```sql
CREATE TABLE test.t_local on cluster 'snuba-ck-nodes'
(
    EventDate DateTime,
    CounterID UInt32,
    UserID UInt32
)
ENGINE = ReplicatedMergeTree('/clickhouse/tables/{shard}/test/t_local', '{replica}')
PARTITION BY toYYYYMM(EventDate)
ORDER BY (CounterID, EventDate, intHash32(UserID))
SAMPLE BY intHash32(UserID);
```

Macro placeholders:

The macro placeholders in the DDL (such as {replica}) are substituted with the values from the macros section of the configuration file.

Check the ConfigMaps of the ClickHouse shard/replica nodes in the cluster:

```bash
kubectl get configmap -n cloud-clickhouse | grep clickhouse

NAME                                             DATA   AGE
chi-clickhouse-common-configd                    6      20h
chi-clickhouse-common-usersd                     6      20h
chi-clickhouse-deploy-confd-snuba-ck-nodes-0-0   2      20h
chi-clickhouse-deploy-confd-snuba-ck-nodes-0-1   2      20h
chi-clickhouse-deploy-confd-snuba-ck-nodes-1-0   2      20h
chi-clickhouse-deploy-confd-snuba-ck-nodes-1-1   2      20h
chi-clickhouse-deploy-confd-snuba-ck-nodes-2-0   2      19h
chi-clickhouse-deploy-confd-snuba-ck-nodes-2-1   2      19h
```

Check the values configured for a node:

```bash
kubectl describe configmap chi-clickhouse-deploy-confd-snuba-ck-nodes-0-0 -n cloud-clickhouse
```

### Create the corresponding distributed table (Distributed)

```sql
CREATE TABLE test.t_dist on cluster 'snuba-ck-nodes'
(
    EventDate DateTime,
    CounterID UInt32,
    UserID UInt32
)
ENGINE = Distributed('snuba-ck-nodes', test, t_local, rand());

-- drop table test.t_dist on cluster 'snuba-ck-nodes';
```

The four parameters of the Distributed engine here are:

- cluster: the cluster name from the configuration (snuba-ck-nodes)
- database: the remote database name (test)
- table: the remote table name (t_local)
- sharding_key (optional): the sharding key (CounterID / rand())

Check the tables, for example:

```sql
use test;
show tables;
-- t_dist
-- t_local
```

Insert a few rows through the distributed table:

```sql
-- insert
INSERT INTO test.t_dist VALUES ('2022-12-16 00:00:00', 1, 1), ('2023-01-01 00:00:00', 2, 2), ('2023-02-01 00:00:00', 3, 3);
```

Query the data from any node:

```sql
select * from test.t_dist;
```
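As a quick sanity check of the setup above (a sketch reusing the pod, database and table names from this walkthrough), you can confirm that the `{shard}`/`{replica}` macros were resolved on each node and that the distributed table sees all rows while each local table holds only that shard's subset; how the three rows split across shards depends on `rand()`.

```bash
# macros as resolved on this node (the values the operator rendered into its config)
kubectl -n cloud-clickhouse exec chi-clickhouse-snuba-ck-nodes-0-0-0 -- \
  clickhouse-client --query "SELECT * FROM system.macros"

# total rows visible through the distributed table
kubectl -n cloud-clickhouse exec chi-clickhouse-snuba-ck-nodes-0-0-0 -- \
  clickhouse-client --query "SELECT count() FROM test.t_dist"

# rows physically stored on this shard only
kubectl -n cloud-clickhouse exec chi-clickhouse-snuba-ck-nodes-0-0-0 -- \
  clickhouse-client --query "SELECT count() FROM test.t_local"
```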
## Hands-on: providing ClickHouse PaaS for the Snuba engine

### Breaking down and analyzing the Sentry Helm Charts

Before migrating to the Kubernetes Operator, let's first break down and analyze the clickhouse and zookeeper charts bundled with sentry-charts.

Unofficial Sentry Helm Charts: https://github.com/sentry-kubernetes/charts

Its Chart.yaml looks like this:

```yaml
apiVersion: v2
appVersion: 22.11.0
dependencies:
- condition: sourcemaps.enabled
  name: memcached
  repository: https://charts.bitnami.com/bitnami
  version: 6.1.5
- condition: redis.enabled
  name: redis
  repository: https://charts.bitnami.com/bitnami
  version: 16.12.1
- condition: kafka.enabled
  name: kafka
  repository: https://charts.bitnami.com/bitnami
  version: 16.3.2
- condition: clickhouse.enabled
  name: clickhouse
  repository: https://sentry-kubernetes.github.io/charts
  version: 3.2.0
- condition: zookeeper.enabled
  name: zookeeper
  repository: https://charts.bitnami.com/bitnami
  version: 9.0.0
- alias: rabbitmq
  condition: rabbitmq.enabled
  name: rabbitmq
  repository: https://charts.bitnami.com/bitnami
  version: 8.32.2
- condition: postgresql.enabled
  name: postgresql
  repository: https://charts.bitnami.com/bitnami
  version: 10.16.2
- condition: nginx.enabled
  name: nginx
  repository: https://charts.bitnami.com/bitnami
  version: 12.0.4
description: A Helm chart for Kubernetes
maintainers:
- name: sentry-kubernetes
name: sentry
type: application
version: 17.9.0
```

This sentry-charts approach couples all middleware helm charts together as dependencies of a single deployment, which does not suit scaling the middleware clusters behind Sentry's microservices. A more advanced approach is to give each middleware its own customized Kubernetes Operator (such as clickhouse-operator) and an independent K8S cluster, forming a middleware PaaS platform that serves consumers.

Here we split the middleware charts into independent namespaces (or separate clusters) for operation. The design is:

- ZooKeeper namespace: cloud-zookeeper-paas
- ClickHouse namespace: cloud-clickhouse-paas

### Deploy the ZooKeeper Helm Chart independently

The zookeeper chart used here is bitnami/zookeeper; its repositories are:

- https://github.com/bitnami/charts/tree/master/bitnami/zookeeper
- https://github.com/bitnami/containers/tree/main/bitnami/zookeeper
- A ZooKeeper Operator will be covered in a dedicated follow-up article.

Create the namespace:

```bash
kubectl create ns cloud-zookeeper-paas
```

A few simple customizations to values.yaml:

```yaml
# expose the service needed for prometheus monitoring
metrics:
  containerPort: 9141
  enabled: true
....
....
service:
  annotations: {}
  clusterIP: ""
  disableBaseClientPort: false
  externalTrafficPolicy: Cluster
  extraPorts: []
  headless:
    annotations: {}
    publishNotReadyAddresses: true
  loadBalancerIP: ""
  loadBalancerSourceRanges: []
  nodePorts:
    client: ""
    tls: ""
  ports:
    client: 2181
    election: 3888
    follower: 2888
    tls: 3181
  sessionAffinity: None
  type: ClusterIP
```

Note: on cloud providers that support external load balancers, setting the Service type to LoadBalancer provisions a load balancer for the Service. Traffic from the external load balancer is redirected directly to the backend Pods; exactly how that works depends on the cloud provider.

Deploy with helm:

```bash
helm install zookeeper ./zookeeper -f values.yaml -n cloud-zookeeper-paas
```

Inside the cluster, zookeeper.cloud-zookeeper-paas.svc.cluster.local:2181 can now serve clients.

Connect to ZooKeeper with zkCli:

```bash
export POD_NAME=$(kubectl get pods --namespace cloud-zookeeper-paas -l "app.kubernetes.io/name=zookeeper,app.kubernetes.io/instance=zookeeper,app.kubernetes.io/component=zookeeper" -o jsonpath="{.items[0].metadata.name}")

kubectl -n cloud-zookeeper-paas exec -it $POD_NAME -- zkCli.sh

# test
[zk: localhost:2181(CONNECTED) 0] ls /
[zookeeper]
[zk: localhost:2181(CONNECTED) 1] ls /zookeeper
[config, quota]
[zk: localhost:2181(CONNECTED) 2] quit

# external access
# kubectl port-forward --namespace cloud-zookeeper-paas svc/zookeeper 2181: & zkCli.sh 127.0.0.1:2181
```

Check zoo.cfg:

```bash
kubectl -n cloud-zookeeper-paas exec -it $POD_NAME -- cat /opt/bitnami/zookeeper/conf/zoo.cfg
```

```ini
# The number of milliseconds of each tick
tickTime=2000
# The number of ticks that the initial
# synchronization phase can take
initLimit=10
# The number of ticks that can pass between
# sending a request and getting an acknowledgement
syncLimit=5
# the directory where the snapshot is stored.
# do not use /tmp for storage, /tmp here is just
# example sakes.
dataDir=/bitnami/zookeeper/data
# the port at which the clients will connect
clientPort=2181
# the maximum number of client connections.
# increase this if you need to handle more clients
maxClientCnxns=60
#
# Be sure to read the maintenance section of the
# administrator guide before turning on autopurge.
#
# https://zookeeper.apache.org/doc/current/zookeeperAdmin.html#sc_maintenance
#
# The number of snapshots to retain in dataDir
autopurge.snapRetainCount=3
# Purge task interval in hours
# Set to "0" to disable auto purge feature
autopurge.purgeInterval=0

## Metrics Providers
#
# https://prometheus.io Metrics Exporter
metricsProvider.className=org.apache.zookeeper.metrics.prometheus.PrometheusMetricsProvider
metricsProvider.httpHost=0.0.0.0
metricsProvider.httpPort=9141
metricsProvider.exportJvmInfo=true
preAllocSize=65536
snapCount=100000
maxCnxns=0
reconfigEnabled=false
quorumListenOnAllIPs=false
4lw.commands.whitelist=srvr, mntr, ruok
maxSessionTimeout=40000
admin.serverPort=8080
admin.enableServer=true
server.1=zookeeper-0.zookeeper-headless.cloud-zookeeper-paas.svc.cluster.local:2888:3888;2181
server.2=zookeeper-1.zookeeper-headless.cloud-zookeeper-paas.svc.cluster.local:2888:3888;2181
server.3=zookeeper-2.zookeeper-headless.cloud-zookeeper-paas.svc.cluster.local:2888:3888;2181
```
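Before pointing ClickHouse at this ensemble, it can be worth checking that all three members are up and that a leader has been elected. A minimal sketch, assuming the stock bitnami image layout (zkServer.sh lives next to the zkCli.sh used above, under /opt/bitnami/zookeeper/bin) and the default pod names zookeeper-0/1/2 that match the server entries in zoo.cfg:

```bash
# Ensemble health check (a sketch): each member should report Mode: leader or follower.
for i in 0 1 2; do
  kubectl -n cloud-zookeeper-paas exec zookeeper-$i -- \
    /opt/bitnami/zookeeper/bin/zkServer.sh status
done
```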
### Deploy the ClickHouse Helm Chart independently

The clickhouse chart used here is the version maintained by sentry-kubernetes/charts itself:

- sentry snuba does not yet play well with clickhouse 21.x and above; the image version here is yandex/clickhouse-server:20.8.19.4.
- https://github.com/sentry-kubernetes/charts/tree/develop/clickhouse
- ClickHouse Operator + ClickHouse Keeper will be covered in dedicated follow-up articles.

This bundled clickhouse chart has a few issues: the Service part needs a small modification to allow configuring type: LoadBalancer or type: NodePort.

Note: on cloud providers that support external load balancers, setting the Service type to LoadBalancer provisions a load balancer for the Service. Traffic from the external load balancer is redirected directly to the backend Pods; exactly how that works depends on the cloud provider.

Create the namespace:

```bash
kubectl create ns cloud-clickhouse-paas
```

A few simple customizations to values.yaml. Note the addresses of the 3 zookeeper instances in zoo.cfg above:

```
server.1=zookeeper-0.zookeeper-headless.cloud-zookeeper-paas.svc.cluster.local:2888:3888;2181
server.2=zookeeper-1.zookeeper-headless.cloud-zookeeper-paas.svc.cluster.local:2888:3888;2181
server.3=zookeeper-2.zookeeper-headless.cloud-zookeeper-paas.svc.cluster.local:2888:3888;2181
```

```yaml
# modify zookeeper_servers
clickhouse:
  configmap:
    zookeeper_servers:
      config:
      - hostTemplate: "zookeeper-0.zookeeper-headless.cloud-zookeeper-paas.svc.cluster.local"
        index: "clickhouse"
        port: "2181"
      - hostTemplate: "zookeeper-1.zookeeper-headless.cloud-zookeeper-paas.svc.cluster.local"
        index: "clickhouse"
        port: "2181"
      - hostTemplate: "zookeeper-2.zookeeper-headless.cloud-zookeeper-paas.svc.cluster.local"
        index: "clickhouse"
        port: "2181"
      enabled: true
      operation_timeout_ms: "10000"
      session_timeout_ms: "30000"

# expose the service needed for prometheus monitoring
metrics:
  enabled: true
```

The headless Service is not strictly required here: since this is internal access between different namespaces of the same cluster, the ClusterIP-type Service can simply be used instead:

```yaml
# modify zookeeper_servers
clickhouse:
  configmap:
    zookeeper_servers:
      config:
      - hostTemplate: "zookeeper.cloud-zookeeper-paas.svc.cluster.local"
        index: "clickhouse"
        port: "2181"
      enabled: true
      operation_timeout_ms: "10000"
      session_timeout_ms: "30000"

# expose the service needed for prometheus monitoring
metrics:
  enabled: true
```

Deploy with helm:

```bash
helm install clickhouse ./clickhouse -f values.yaml -n cloud-clickhouse-paas
```

Connect to clickhouse:

```bash
kubectl -n cloud-clickhouse-paas exec -it clickhouse-0 -- clickhouse-client --multiline --host="clickhouse-1.clickhouse-headless.cloud-clickhouse-paas"
```

Verify the cluster:

```sql
show databases;
select * from system.clusters;
select * from system.zookeeper where path = '/clickhouse';
```

The ConfigMaps of the current ClickHouse cluster:

```bash
kubectl get configmap -n cloud-clickhouse-paas | grep clickhouse

clickhouse-config    1   28h
clickhouse-metrica   1   28h
clickhouse-users     1   28h
```

clickhouse-config (config.xml):

```xml
<yandex>
    <path>/var/lib/clickhouse/</path>
    <tmp_path>/var/lib/clickhouse/tmp/</tmp_path>
    <user_files_path>/var/lib/clickhouse/user_files/</user_files_path>
    <format_schema_path>/var/lib/clickhouse/format_schemas/</format_schema_path>

    <include_from>/etc/clickhouse-server/metrica.d/metrica.xml</include_from>

    <users_config>users.xml</users_config>

    <display_name>clickhouse</display_name>
    <listen_host>0.0.0.0</listen_host>
    <http_port>8123</http_port>
    <tcp_port>9000</tcp_port>
    <interserver_http_port>9009</interserver_http_port>
    <max_connections>4096</max_connections>
    <keep_alive_timeout>3</keep_alive_timeout>
    <max_concurrent_queries>100</max_concurrent_queries>
    <uncompressed_cache_size>8589934592</uncompressed_cache_size>
    <mark_cache_size>5368709120</mark_cache_size>
    <timezone>UTC</timezone>
    <umask>022</umask>
    <mlock_executable>false</mlock_executable>
    <remote_servers incl="clickhouse_remote_servers" optional="true" />
    <zookeeper incl="zookeeper-servers" optional="true" />
    <macros incl="macros" optional="true" />
    <builtin_dictionaries_reload_interval>3600</builtin_dictionaries_reload_interval>
    <max_session_timeout>3600</max_session_timeout>
    <default_session_timeout>60</default_session_timeout>
    <disable_internal_dns_cache>1</disable_internal_dns_cache>

    <query_log>
        <database>system</database>
        <table>query_log</table>
        <partition_by>toYYYYMM(event_date)</partition_by>
        <flush_interval_milliseconds>7500</flush_interval_milliseconds>
    </query_log>

    <query_thread_log>
        <database>system</database>
        <table>query_thread_log</table>
        <partition_by>toYYYYMM(event_date)</partition_by>
        <flush_interval_milliseconds>7500</flush_interval_milliseconds>
    </query_thread_log>

    <distributed_ddl>
        <path>/clickhouse/task_queue/ddl</path>
    </distributed_ddl>

    <logger>
        <level>trace</level>
        <log>/var/log/clickhouse-server/clickhouse-server.log</log>
        <errorlog>/var/log/clickhouse-server/clickhouse-server.err.log</errorlog>
        <size>1000M</size>
        <count>10</count>
    </logger>
</yandex>
```

clickhouse-metrica (metrica.xml):

```xml
<yandex>
    <zookeeper-servers>
        <node index="clickhouse">
            <host>zookeeper-0.zookeeper-headless.cloud-zookeeper-paas.svc.cluster.local</host>
            <port>2181</port>
        </node>
        <node index="clickhouse">
            <host>zookeeper-1.zookeeper-headless.cloud-zookeeper-paas.svc.cluster.local</host>
            <port>2181</port>
        </node>
        <node index="clickhouse">
            <host>zookeeper-2.zookeeper-headless.cloud-zookeeper-paas.svc.cluster.local</host>
            <port>2181</port>
        </node>
        <session_timeout_ms>30000</session_timeout_ms>
        <operation_timeout_ms>10000</operation_timeout_ms>
        <root></root>
        <identity></identity>
    </zookeeper-servers>
    <clickhouse_remote_servers>
        <clickhouse>
            <shard>
                <replica>
                    <internal_replication>true</internal_replication>
                    <host>clickhouse-0.clickhouse-headless.cloud-clickhouse-paas.svc.cluster.local</host>
                    <port>9000</port>
                    <user>default</user>
                    <compression>true</compression>
                </replica>
            </shard>
            <shard>
                <replica>
                    <internal_replication>true</internal_replication>
                    <host>clickhouse-1.clickhouse-headless.cloud-clickhouse-paas.svc.cluster.local</host>
                    <port>9000</port>
                    <user>default</user>
                    <compression>true</compression>
                </replica>
            </shard>
            <shard>
                <replica>
                    <internal_replication>true</internal_replication>
                    <host>clickhouse-2.clickhouse-headless.cloud-clickhouse-paas.svc.cluster.local</host>
                    <port>9000</port>
                    <user>default</user>
                    <compression>true</compression>
                </replica>
            </shard>
        </clickhouse>
    </clickhouse_remote_servers>

    <macros>
        <replica from_env="HOSTNAME"></replica>
        <shard from_env="SHARD"></shard>
    </macros>
</yandex>
```

clickhouse-users (users.xml):

```xml
<yandex>
</yandex>
```
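To cross-check that these ConfigMaps are what the servers actually load, you can read the rendered include file from inside a pod (config.xml above points `include_from` at `/etc/clickhouse-server/metrica.d/metrica.xml`) and confirm that the `from_env` macros resolved per pod. A small sketch reusing the connection pattern from above:

```bash
# the metrica.xml actually mounted into the pod (should match the ConfigMap shown above)
kubectl -n cloud-clickhouse-paas exec clickhouse-0 -- \
  cat /etc/clickhouse-server/metrica.d/metrica.xml

# the shard/replica macros as resolved from the HOSTNAME and SHARD environment variables
kubectl -n cloud-clickhouse-paas exec clickhouse-0 -- \
  clickhouse-client --query "SELECT * FROM system.macros"
```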
## Sentry Helm Charts: plugging into the ClickHouse PaaS (single cluster, multiple nodes)

We make a few simple changes to values.yml.

Disable the clickhouse and zookeeper bundled with sentry-charts:

```yaml
clickhouse:
  enabled: false
zookeeper:
  enabled: false
```

Modify externalClickhouse:

```yaml
externalClickhouse:
  database: default
  host: "clickhouse.cloud-clickhouse-paas.svc.cluster.local"
  httpPort: 8123
  password: ""
  singleNode: false
  clusterName: "clickhouse"
  tcpPort: 9000
  username: default
```

Note:

- Here we simply attach one multi-node sharded cluster from inside the Kubernetes cluster, whereas Snuba's design lets you attach multiple ClickHouse clusters (multi-node, multi-shard, multi-replica) and spread schemas across them to reach very large throughput.
- Since this is internal access between namespaces of the same cluster, a ClusterIP-type Service is enough for host.
- Note that singleNode must be set to false here: we are multi-node, and we also need to provide clusterName.
  - From the source code, this is used to determine:
    - which migrations are run (local-only, or local plus distributed tables);
    - differences in queries, for example whether the _local or the _dist table is selected, and which ClickHouse table engines are used.
  - ClickHouse itself is of course a separate technical topic that we will not expand on here.

Deploy:

```bash
helm install sentry ./sentry -f values.yaml -n sentry
```

Verify the _local and _dist tables and system.zookeeper:

```bash
kubectl -n cloud-clickhouse-paas exec -it clickhouse-0 -- clickhouse-client --multiline --host="clickhouse-1.clickhouse-headless.cloud-clickhouse-paas"
```

```sql
show databases;
show tables;
select * from system.zookeeper where path = '/clickhouse';
```

## Advanced: very high throughput with a multi-cluster, multi-node, multi-shard, multi-replica ClickHouse middleware PaaS

Deploy multiple independent sets of VKE LoadBalancer + VKE K8S Cluster + ZooKeeper-Operator + ClickHouse-Operator, and spread schemas across different clusters and multi-node shards.

### Analyzing the Snuba system design

Read the test-case source code to understand the system design and advanced configuration.

As for read/write load balancing and connection pooling across the shards and replicas of a ClickHouse cluster, Snuba has already given these full consideration, and optimized for them, in its system design and code.

Advanced topics such as multiple independent cloud-native orchestration clusters with the ClickHouse Operator and the Snuba system design will be covered separately in the VIP column live sessions.

## More

- WeChat official account: 黑客下午茶 (live-stream announcements)
- Cloud-native middleware PaaS practice: https://k8s-paas.hackerlinner.com