Author archive: Analyser

ClickHouse Backward-Incompatible Changes

From ClickHouse release v19.17.6.36, 2019-12-27, onward

ClickHouse release v20.11.2.1, 2020-11-11

  • If some profile was specified in the distributed_ddl config section, this profile could overwrite the settings of the default profile on server startup. This is fixed: settings of distributed DDL queries no longer affect global server settings. #16635 (tavplubix).
  • Restrict the use of non-comparable data types (like AggregateFunction) in keys (sorting key, primary key, partition key, and so on). #16601 (alesapin).
  • Remove the ANALYZE and AST queries, and make the setting enable_debug_queries obsolete, since this functionality is now part of the full-featured EXPLAIN query. #16536 (Ivan).
  • Aggregate functions boundingRatio, rankCorr, retention, timeSeriesGroupSum, timeSeriesGroupRateSum, windowFunnel were erroneously made case-insensitive. Now their names are made case sensitive as designed. Only functions that are specified in SQL standard or made for compatibility with other DBMS or functions similar to those should be case-insensitive. #16407 (alexey-milovidov).
  • Make rankCorr function return nan on insufficient data https://github.com/ClickHouse/ClickHouse/issues/16124. #16135 (hexiaoting).
  • When upgrading from versions older than 20.5 with a rolling update, a cluster may temporarily contain nodes both older and newer than 20.5. If a node running the old version is restarted while newer versions are present in the cluster, Part ... intersects previous part errors may occur. To prevent this, first install the newer clickhouse-server packages on all cluster nodes and only then restart them, so that each restarted server starts up with the new version.

ClickHouse release v20.10.3.30, 2020-10-28

  • Make multiple_joins_rewriter_version obsolete. Remove first version of joins rewriter. #15472 (Artem Zuikov).
  • Change the default value of the format_regexp_escaping_rule setting (related to the Regexp format) to Raw (meaning: read the whole subpattern as a value) to make the behaviour closer to what users expect. #15426 (alexey-milovidov).
  • Add support for nested multiline comments /* comment /* comment */ */ in SQL. This conforms to the SQL standard. #14655 (alexey-milovidov).
  • Added MergeTree settings (max_replicated_merges_with_ttl_in_queue and max_number_of_merges_with_ttl_in_pool) to control the number of merges with TTL in the background pool and replicated queue. This change breaks compatibility with older versions only if you use delete TTL; otherwise, replication stays compatible. You can avoid incompatibility issues if you update all shard replicas at once, or execute SYSTEM STOP TTL MERGES until you finish updating all replicas. If you get an incompatible entry in the replication queue, first execute SYSTEM STOP TTL MERGES, then ALTER TABLE ... DETACH PARTITION ... for the partition where the incompatible TTL merge was assigned, and attach it back on a single replica. #14490 (alesapin).
  • When upgrading from versions older than 20.5 with a rolling update, a cluster may temporarily contain nodes both older and newer than 20.5. If a node running the old version is restarted while newer versions are present in the cluster, Part ... intersects previous part errors may occur. To prevent this, first install the newer clickhouse-server packages on all cluster nodes and only then restart them, so that each restarted server starts up with the new version.

ClickHouse release v20.9.2.20, 2020-09-22

  • When upgrading from versions older than 20.5 with a rolling update, a cluster may temporarily contain nodes both older and newer than 20.5. If a node running the old version is restarted while newer versions are present in the cluster, Part ... intersects previous part errors may occur. To prevent this, first install the newer clickhouse-server packages on all cluster nodes and only then restart them, so that each restarted server starts up with the new version.

ClickHouse release v20.8.2.3-stable, 2020-09-08

  • Now the OPTIMIZE FINAL query doesn’t recalculate TTL for parts that were added before the TTL was created. Use ALTER TABLE ... MATERIALIZE TTL once to calculate them; after that, OPTIMIZE FINAL will evaluate TTLs properly. This behavior never worked for replicated tables. #14220 (alesapin).
  • Extend parallel_distributed_insert_select setting, adding an option to run INSERT into local table. The setting changes type from Bool to UInt64, so the values false and true are no longer supported. If you have these values in server configuration, the server will not start. Please replace them with 0 and 1, respectively. #14060 (Azat Khuzhin).
  • Remove support for the ODBCDriver input/output format. This was a deprecated format once used for communication with the ClickHouse ODBC driver, now long superseded by the ODBCDriver2 format. Resolves #13629. #13847 (hexiaoting).
  • When upgrading from versions older than 20.5 with a rolling update, a cluster may temporarily contain nodes both older and newer than 20.5. If a node running the old version is restarted while newer versions are present in the cluster, Part ... intersects previous part errors may occur. To prevent this, first install the newer clickhouse-server packages on all cluster nodes and only then restart them, so that each restarted server starts up with the new version.

ClickHouse release v20.7.2.30-stable, 2020-08-31

  • The function modulo (operator %) with at least one floating-point argument now calculates the remainder of division directly on the floating-point numbers, without converting both arguments to integers. This makes the behaviour compatible with most DBMS. It is also applicable to the Date and DateTime data types. Added the alias mod. This closes #7323. #12585 (alexey-milovidov).
  • Deprecate special printing of zero Date/DateTime values as 0000-00-00 and 0000-00-00 00:00:00. #12442 (alexey-milovidov).
  • The groupArrayMoving* functions were not working for distributed queries. Their result was calculated within an incorrect data type (without promotion to the largest type). The function groupArrayMovingAvg returned an integer number that was inconsistent with the avg function. This fixes #12568. #12622 (alexey-milovidov).
  • Add sanity check for MergeTree settings. If the settings are incorrect, the server will refuse to start or to create a table, printing detailed explanation to the user. #13153 (alexey-milovidov).
  • Protect from the cases when user may set background_pool_size to value lower than number_of_free_entries_in_pool_to_execute_mutation or number_of_free_entries_in_pool_to_lower_max_size_of_merge. In these cases ALTERs won’t work or the maximum size of merge will be too limited. It will throw exception explaining what to do. This closes #10897. #12728 (alexey-milovidov).
  • When upgrading from versions older than 20.5 with a rolling update, a cluster may temporarily contain nodes both older and newer than 20.5. If a node running the old version is restarted while newer versions are present in the cluster, Part ... intersects previous part errors may occur. To prevent this, first install the newer clickhouse-server packages on all cluster nodes and only then restart them, so that each restarted server starts up with the new version.
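The float modulo change in this release can be illustrated with plain Python: math.fmod follows the same C fmod semantics that the new operator uses. This is an illustrative sketch, not ClickHouse code.

```python
import math

# Pre-20.7 behaviour: both arguments were truncated to integers first.
old_style = int(7.5) % int(2.5)   # evaluates 7 % 2
# 20.7+ behaviour: the remainder is computed directly on the floats,
# like C's fmod, so 7.5 is an exact multiple of 2.5.
new_style = math.fmod(7.5, 2.5)

print(old_style, new_style)
```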

ClickHouse release v20.6.3.28-stable

  • When upgrading from versions older than 20.5 with a rolling update, a cluster may temporarily contain nodes both older and newer than 20.5. If a node running the old version is restarted while newer versions are present in the cluster, Part ... intersects previous part errors may occur. To prevent this, first install the newer clickhouse-server packages on all cluster nodes and only then restart them, so that each restarted server starts up with the new version.

ClickHouse release v20.5.2.7-stable 2020-07-02

  • Return a non-Nullable result from COUNT(DISTINCT) and the uniq family of aggregate functions. If all passed values are NULL, return zero instead. This improves SQL compatibility. #11661 (alexey-milovidov).
  • Added a check for the case when a user-level setting is specified in a wrong place. User-level settings should be specified in users.xml inside the <profile> section for a specific user profile (or in <default> for default settings). If they are not, the server will refuse to start, writing an exception message to the log. This fixes #9051. If you want to skip the check, you can either move the settings to the appropriate place or add <skip_check_for_incorrect_settings>1</skip_check_for_incorrect_settings> to config.xml. #11449 (alexey-milovidov).
  • The setting input_format_with_names_use_header is enabled by default. It will affect parsing of input formats -WithNames and -WithNamesAndTypes. #10937 (alexey-milovidov).
  • Remove experimental_use_processors setting. It is enabled by default. #10924 (Nikolai Kochetov).
  • Update zstd to 1.4.4. It has some minor improvements in performance and compression ratio. If you run replicas with different versions of ClickHouse, you may see reasonable error messages Data after merge is not byte-identical to data on another replicas. with an explanation. These messages are OK and you should not worry about them. This change is backward compatible, but we list it here in the changelog in case you wonder about these messages. #10663 (alexey-milovidov).
  • Added a check for meaningless codecs and a setting allow_suspicious_codecs to control this check. This closes #4966. #10645 (alexey-milovidov).
  • Several Kafka settings change their default values. See #11388.
  • When upgrading from versions older than 20.5 with a rolling update, a cluster may temporarily contain nodes both older and newer than 20.5. If a node running the old version is restarted while newer versions are present in the cluster, Part ... intersects previous part errors may occur. To prevent this, first install the newer clickhouse-server packages on all cluster nodes and only then restart them, so that each restarted server starts up with the new version.

ClickHouse release v20.4.2.9, 2020-05-12

  • System tables (e.g. system.query_log, system.trace_log, system.metric_log) now use the compact data part format for parts smaller than 10 MiB. The compact data part format has been supported since version 20.3. If you are going to downgrade to a version older than 20.3, you should manually delete the table data for the system logs in /var/lib/clickhouse/data/system/.
  • When a string comparison involves FixedString and the compared arguments are of different sizes, the comparison is done as if the smaller string were padded to the length of the larger. This is intended for SQL compatibility, if we imagine that the FixedString data type corresponds to SQL CHAR. This closes #9272. #10363 (alexey-milovidov)
  • Make SHOW CREATE TABLE multiline. Now it is more readable and more like MySQL. #10049 (Azat Khuzhin)
  • Added a setting validate_polygons that is used in pointInPolygon function and enabled by default. #9857 (alexey-milovidov)
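The FixedString comparison rule above can be sketched in a few lines of Python. This assumes zero-byte padding (FixedString values are stored zero-padded); it is an illustration of the rule, not ClickHouse code.

```python
def fixedstring_eq(a: bytes, b: bytes) -> bool:
    """Compare two FixedString-like values of different lengths by
    padding the shorter one with zero bytes to the longer length,
    mimicking the 20.4 comparison rule (illustrative sketch)."""
    n = max(len(a), len(b))
    return a.ljust(n, b"\x00") == b.ljust(n, b"\x00")

# FixedString(5) containing 'abc' compares equal to the plain string 'abc':
print(fixedstring_eq(b"abc", b"abc\x00\x00"))
```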

ClickHouse release v20.3.2.1, 2020-03-12

  • Fixed the issue file name too long when sending data for Distributed tables for a large number of replicas. Fixed the issue of replica credentials being exposed in the server log. The format of the directory name on disk was changed to [shard{shard_index}[_replica{replica_index}]]. #8911 (Mikhail Korotov) After you upgrade to the new version, you will not be able to downgrade without manual intervention, because the old server version does not recognize the new directory format. If you want to downgrade, you have to manually rename the corresponding directories to the old format. This change is relevant only if you have used asynchronous INSERTs to Distributed tables. In version 20.3.3 we will introduce a setting that will allow you to enable the new format gradually.
  • Changed the format of replication log entries for mutation commands. You have to wait for old mutations to process before installing the new version.
  • Implement simple memory profiler that dumps stacktraces to system.trace_log every N bytes over soft allocation limit #8765 (Ivan) #9472 (alexey-milovidov) The column of system.trace_log was renamed from timer_type to trace_type. This will require changes in third-party performance analysis and flamegraph processing tools.
  • Use the OS thread id everywhere instead of the internal thread number. This fixes #7477. An old clickhouse-client cannot receive logs that are sent from the server when the setting send_logs_level is enabled, because the names and types of the structured log messages were changed. On the other hand, different server versions can send logs with different types to each other. If you don’t use the send_logs_level setting, this does not concern you. #8954 (alexey-milovidov)
  • Remove indexHint function #9542 (alexey-milovidov)
  • Remove findClusterIndex, findClusterValue functions. This fixes #8641. If you were using these functions, send an email to [email protected] #9543 (alexey-milovidov)
  • Now it’s not allowed to create columns or add columns with SELECT subquery as default expression. #9481 (alesapin)
  • Require aliases for subqueries in JOIN. #9274 (Artem Zuikov)
  • Improved the ALTER MODIFY/ADD query logic. Now you cannot ADD a column without a type, MODIFY of a default expression doesn’t change the type of the column, and MODIFY of the type doesn’t lose the default expression value. Fixes #8669. #9227 (alesapin)
  • Require server to be restarted to apply the changes in logging configuration. This is a temporary workaround to avoid the bug where the server logs to a deleted log file (see #8696). #8707 (Alexander Kuzmenkov)
  • The setting experimental_use_processors is enabled by default. This setting enables usage of the new query pipeline. This is an internal refactoring and we expect no visible changes. If you see any issues, set it back to zero. #8768 (alexey-milovidov)

ClickHouse release v20.1.2.4, 2020-01-22

  • Make the setting merge_tree_uniform_read_distribution obsolete. The server still recognizes this setting but it has no effect. #8308 (alexey-milovidov)
  • Changed return type of the function greatCircleDistance to Float32 because now the result of calculation is Float32. #7993 (alexey-milovidov)
  • Now it’s expected that query parameters are represented in “escaped” format. For example, to pass string a<tab>b you have to write a\tb or a\<tab>b and respectively, a%5Ctb or a%5C%09b in URL. This is needed to add the possibility to pass NULL as \N. This fixes #7488. #8517 (alexey-milovidov)
  • Enable use_minimalistic_part_header_in_zookeeper setting for ReplicatedMergeTree by default. This will significantly reduce amount of data stored in ZooKeeper. This setting is supported since version 19.1 and we already use it in production in multiple services without any issues for more than half a year. Disable this setting if you have a chance to downgrade to versions older than 19.1. #6850 (alexey-milovidov)
  • Data skipping indices are production ready and enabled by default. The settings allow_experimental_data_skipping_indices, allow_experimental_cross_to_join_conversion and allow_experimental_multiple_joins_emulation are now obsolete and do nothing. #7974 (alexey-milovidov)
  • Add new ANY JOIN logic for StorageJoin consistent with the JOIN operation. To upgrade without changes in behaviour, you need to add SETTINGS any_join_distinct_right_table_keys = 1 to the Engine Join tables' metadata, or recreate these tables after the upgrade. #8400 (Artem Zuikov)
  • Require server to be restarted to apply the changes in logging configuration. This is a temporary workaround to avoid the bug where the server logs to a deleted log file (see #8696). #8707 (Alexander Kuzmenkov)
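The "escaped" query-parameter format described above can be sketched with plain Python; urllib produces the percent-encoded URL form. This is an illustration, not a ClickHouse client.

```python
from urllib.parse import quote

# To pass the literal string a<tab>b as a query parameter, send it in
# "escaped" form: a\tb in the request body, or percent-encoded in a URL.
escaped = r"a\tb"          # the escaped form the server expects
url_form = quote(escaped)  # backslash becomes %5C
print(url_form)
```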

The OCFS2 Manual


OCFS2 – a Shared-Disk Cluster File System for Linux

Introduction

OCFS2 is a file system. It allows users to store and retrieve data. The data is stored in files that are organized in a hierarchical directory tree. It is a POSIX-compliant file system that supports the standard interfaces and behavioral semantics described in that specification.

It is also a shared-disk cluster file system, one that allows multiple nodes to access the same disk at the same time. This is where the fun begins, as allowing a file system to be accessible on multiple nodes opens a can of worms. What if the nodes have different architectures? What if a node dies while writing to the file system? What data consistency can processes on two nodes expect when they read and write concurrently? What if one node deletes a file while it is still in use on another node?


Oracle Cluster File System (OCFS2) User's Guide


1. Introduction

A cluster file system allows all nodes in a cluster to concurrently access a device through the standard file system interface. This makes it easy to manage applications that need to run across a cluster.

OCFS (version 1) was released in December 2002. It enabled Oracle Real Application Cluster (RAC) users to run a clustered database without having to deal with RAW devices. The file system was designed to store database-related files, such as data files, control files, redo logs, archive logs, and so on.

OCFS2 is the "next generation" Oracle Cluster File System. It was designed as a general-purpose cluster file system. With it, one can store not only database-related files on a shared disk, but also Oracle binaries and configuration files (a shared Oracle home), making RAC management even easier.


Liu Junning: The Market Economy and Limited Government

The question we now face is: what fundamental, indispensable changes must our society make for the market economy to take root in China, so that the social and political conditions compatible with it are in place?

Among the handful of countries in the 20th century that tried to avoid the market economy entirely, China was undoubtedly one of the most persistent, willing to pay an enormous price for it. Today, although China has set out on the road of no return toward a market economy, it still faces many enormous obstacles, both tangible and intangible. On the other hand, the market economy is an irreversible process in China, and the Chinese people's choice of it is a choice made without looking back. The question we now face is: what fundamental, indispensable changes must our society make for the market economy to take root in China, so that the social and political conditions compatible with it are in place?

The importance of such change lies in this: if we are not prepared to meet these conditions, the market economy cannot be established in China, and the choice of a market economy that we made after such great difficulty may be abandoned halfway. What, then, are the social and political conditions the market economy requires? This relates to another question people often ask: what is the greatest challenge facing China today? The problems China currently faces are countless: political, economic, demographic, resource-related, and those of social morality and cultural tradition; there are challenges from home as well as from abroad. But in my view, the greatest challenge China faces today arises from the tension between the necessity of implementing a market economy and the absence of the social and political conditions for realizing it; in other words, it is the challenge of building the limited government (Limited Government) indispensable to a market economy, of transforming the unlimited government suited to a planned economy into the limited government suited to a market economy. In our country, from the household contract responsibility system of 1978, to the formal announcement in 1992 that the market economy would replace the planned economy, to the formal announcement shortly before 1997 of a major reform of public ownership, the establishment of the market economy in China has become a historical inevitability. A new and major question that follows is: how do we construct an order of limited government to accompany the market economy, ushering in an unprecedented and unavoidable era of limited government? And how do we resolve the contradiction between the necessity of limiting government and the difficulty of doing so?

Judging from the experience of the countries that adopted the market economy early (such as Britain and the United States), people at the time did not know there was such a thing as a market economy, and no constitution or official founding document stipulated that one be implemented. But once a constitution (whether written or unwritten) protected citizens' property rights and economic freedom, and effectively limited the power of government through representative institutions, constitutionalism, checks and balances, and judicial independence, that is, limited government; once the market economy was in place and limited government was in place, democracy followed naturally. For the first wave of democratizing countries, democracy was merely an honor conferred after the fact, not a deliberately pursued achievement. So in the relationship between the market economy on one side and citizens' rights, freedoms, and limited government on the other, the former is the effect and the latter the cause. With only the desire to implement a market economy, but without establishing the corresponding civil rights and freedoms and limits on government, the market economy can only be a castle in the air. Conversely, if property rights, economic freedom, and other civil rights and liberties are recognized and protected, and government is limited, the market economy will come about of its own accord. Whether a market economy can be implemented therefore depends not on whether the desire exists, but on whether the necessary accompanying conditions that are its cause are in place. Without the arrival of an era of limited government, there will be no successful establishment of a market economy.

Let us first look at the relationship between the two core elements of the market economy, property rights and economic freedom, and limited government.


Dr. Bangbang (棒棒医生): The Collapse of Evidence-Based Medicine

  • Author: 棒棒医生 (Dr. Bangbang)

Evidence-based medicine was already pitifully weak in China; in the face of the brutal novel coronavirus, it is on the verge of total collapse.

The principle that clinical decisions must be based on the current best evidence has been mercilessly trampled and cast aside by "SARS's little sister" (the novel coronavirus).

Arbidol and darunavir had only been through in-vitro cell experiments, yet people dared to declare them effective against COVID-19, even its "nemesis," and to use them widely in the clinic; a certain academician even called for adding them to the sixth edition of the National Health Commission's treatment plan. Bear in mind that between in-vitro cell experiments and clinical application lie animal experiments and phase I, II, and III clinical trials, a journey of a hundred thousand miles, and this long process weeds out at least 99% of supposedly "effective" drugs. If in-vitro efficacy counted as efficacy, what would be difficult about inventing ten thousand effective drugs overnight?

Oseltamivir is a "specific" drug for treating influenza, and even its effect amounts only to this: taken within 24 hours of onset, it can shorten the course of illness and relieve symptoms (by no more than 40%). The key point is that this drug was precisely designed at the molecular level against the three-dimensional structure of the influenza virus neuraminidase; how could it possibly work against the novel coronavirus as well? Yet after the top domestic hospitals on the Wuhan front line applied it heavily, it has blossomed everywhere at the grassroots level, and even the national plan's refusal to recommend it is of no avail.

That antibacterial drugs are ineffective against viral pneumonia is an absolute international consensus and common medical knowledge, and all five editions of the national plan repeatedly stress that "blind or inappropriate use must be avoided." But how can that withstand the vanguard example set by the top hospitals? In grassroots hospitals today, antibiotics are no longer reserved for severe and critical patients with bacterial co-infections, but are given simultaneously to almost all mild and ordinary cases, in the name of "prevention." The achievements of the state's decade of painstaking work on the "special rectification of antibacterial drugs" campaign have been crushed beyond recognition in this epidemic.

High-dose vitamin C against free radicals: this therapy reportedly came from a Chinese-American expert in the United States. However, it was quickly revealed that this expert's specialty had nothing to do with viruses or infectious diseases, and that of the dozens of papers he published, not only were they unrelated to viruses, he was not even first or corresponding author on them. Yet what such a person said not only spread wildly among the public; professional doctors, whether half-doubting or firmly convinced, put it straight into clinical use. When a therapy with no clinical evidence and no theoretical basis can be applied to patients on a large scale on the strength of hearsay, the spirit of evidence-based medicine has vanished without a trace.

Convalescent plasma therapy is the newest "specific therapy," already officially recommended for use. But it rests on the observation of a mere 10 severe patients in Wuhan, with no controls, no stated enrollment criteria, and no credible endpoints; in evidence-based medicine such "efficacy" does not even count as the lowest grade of evidence. A search of prior evidence shows that plasma therapy had long been rejected in multiple large international studies of Ebola, influenza, and other diseases. And in theory, if plasma therapy works, it works through the antibodies in convalescent blood, which act specifically on the virus; it would therefore be effective only when applied early, in mild cases. By the time a patient is critical, the virus is no longer the problem, and antibodies cannot cope with the severe inflammatory response. In addition, the safety and availability of blood are hard to guarantee. A therapy of this kind, previously shown ineffective and carrying high safety risks, should first undergo rigorously designed clinical research to ensure the safety of its subjects, and only then can "efficacy" be claimed and clinical use recommended. Unfortunately, all of these procedures were waived as "special measures for special times."

Amid this parade of actors, each taking the stage as the last one exits, I see only one drug still holding to the evidence-based faith: remdesivir. It has shown "striking" effects in individual cases abroad, and it had already passed in-vitro experiments, animal experiments, and phase I and II clinical trials proving it safe; yet it still dares not claim to be "effective" against COVID-19, and will not draw a conclusion until the trial is unblinded in April. Why does it not dare? Because this is a line modern drugs must never cross! If it could be crossed at will, the edifice of modern medicine would collapse in an instant!

As for the many more traditional Chinese medicines, No. 1, No. 2, and so on up to No. N, they are another matter entirely: they are too broad and profound, and shallow evidence-based medicine has no binding force on them.

How far will the collapse go before it ends? I do not know. I hope it can at least be stopped at the edge of the cliff.

Free SSL Cert from Let’s Encrypt! It’s REALLY FRAGRANT!!

Traefik Docker Compose Example:

version: '3.7'
networks:
  livedignet:
    external: true
services:
  traefik:
    image: "traefik:2.1"
    container_name: "traefik"
    networks:
      - livedignet
    command:
    # - "--log.level=DEBUG"
      - "--api.insecure=true"
      - "--providers.docker=true"
      - "--providers.docker.exposedbydefault=false"
      - "--entrypoints.web.address=:80"
      - "--entrypoints.websecure.address=:443"
      - "--certificatesresolvers.ldhttpchallenge.acme.httpchallenge=true"
      - "--certificatesresolvers.ldhttpchallenge.acme.httpchallenge.entrypoint=web"
    # While testing, uncomment the line below to use the Let's Encrypt staging CA server.
    # - "--certificatesresolvers.ldhttpchallenge.acme.caserver=https://acme-staging-v02.api.letsencrypt.org/directory"
      - "--certificatesresolvers.ldhttpchallenge.acme.email=you@example.com" # your registration e-mail
      - "--certificatesresolvers.ldhttpchallenge.acme.storage=/letsencrypt/acme.json"
    ports:
      - "80:80/tcp"
      - "443:443/tcp"
      - "8080:8080/tcp"
    volumes:
      - type: bind
        source: "./letsencrypt"
        target: "/letsencrypt"
      - type: bind
        source: "/var/run/docker.sock"
        target: "/var/run/docker.sock"

Web Application Example:

version: '3.7'
networks:
  livedignet:
    external: true
services:
  livedig:
    image: 'wordpress:latest'
    container_name: "livedig"
    networks:
      - livedignet
    external_links:
      - mysql
    environment:
      WORDPRESS_DB_HOST:      'mysql:3306'
      WORDPRESS_DB_USER:      'mysql_username'
      WORDPRESS_DB_PASSWORD:  'mysql_password'
      WORDPRESS_DB_NAME:      'mysql_dbname'
      WORDPRESS_TABLE_PREFIX: 'wp_'
    working_dir: '/var/www/html'
    labels:
      - "traefik.enable=true"

      - "traefik.http.routers.livedig_http.rule=Host(`livedig.com`)"
      - "traefik.http.routers.livedig_http.entrypoints=web"
      - "traefik.http.routers.livedig_http.middlewares=redirect-to-https"
      - "traefik.http.middlewares.redirect-to-https.redirectscheme.scheme=https"

      - "traefik.http.routers.livedig.rule=Host(`livedig.com`)"
      - "traefik.http.routers.livedig.entrypoints=websecure"
      - "traefik.http.routers.livedig.tls.certresolver=ldhttpchallenge"
    ports:
      - '80'
    volumes:
      - type: bind
        source: ./wp-content
        target: /var/www/html/wp-content
        read_only: false

ClickHouse Mutation


In ClickHouse, operations such as ALTER TABLE ... UPDATE/DELETE are called Mutations.

I. Notes on the asynchronous execution of Mutations

Mutations are executed asynchronously. If any subsequent query, whether an INSERT, SELECT, ALTER, or anything else, has conditions that depend on the result of a Mutation, it should wait until the Mutation has fully completed before running.

To confirm whether a Mutation has completed, query the system.mutations table for related rows with is_done = 0.

Detect whether there are unfinished Mutations:

SELECT COUNT() FROM `system`.mutations
WHERE `database`='${db_name}' AND `table`='${tbl_name}' AND is_done=0;
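Building on that check, a small polling helper can wait for a table's Mutations to finish before dependent queries run. This is a sketch: `execute` stands for any callable that runs a parameterized query and returns rows (for example, Client.execute from the third-party clickhouse-driver package; that choice is an assumption, not part of ClickHouse itself).

```python
import time

def wait_for_mutations(execute, db_name, tbl_name, poll_secs=2.0, timeout=600.0):
    """Poll system.mutations until the table has no unfinished mutation.

    `execute` is any callable taking (query, params) and returning rows,
    e.g. clickhouse-driver's Client.execute (an assumed client, for
    illustration). Returns True on completion, False on timeout.
    """
    query = ("SELECT count() FROM system.mutations "
             "WHERE database = %(db)s AND table = %(tbl)s AND is_done = 0")
    deadline = time.monotonic() + timeout
    while time.monotonic() < deadline:
        # Result is a single row with a single count column.
        [(pending,)] = execute(query, {"db": db_name, "tbl": tbl_name})
        if pending == 0:
            return True
        time.sleep(poll_secs)
    return False
```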

II. Removing a stuck Mutation

A colleague once assigned a wrong value in an ALTER UPDATE, writing NULL into a non-Nullable column. The Mutation could not execute correctly, so it hung in the system indefinitely, producing a constant stream of errors:

2019.09.05 10:46:11.450867 [ 9 ] {} <Error> db_name.tbl_name: DB::StorageReplicatedMergeTree::queueTask()::<lambda(DB::StorageReplicatedMergeTree::LogEntryPtr&)>: Code: 349, e.displayText() = DB::Exception: Cannot convert NULL value to non-Nullable type, Stack trace:

At this point the stuck Mutation needs to be cleaned up.

First, run

SELECT * FROM `system`.mutations
WHERE `database`='${db_name}' AND `table`='${tbl_name}' AND is_done=0;

to obtain the mutation_id and other details of the records related to the stuck Mutation. Then clean it up.

There are two ways to do the cleanup:

1. Built-in cleanup

Syntax:

KILL MUTATION [ON CLUSTER cluster]
  WHERE <where expression to SELECT FROM system.mutations query>
  [TEST]
  [FORMAT format]

Examples:

-- Cancel and remove all Mutations of a single table:
KILL MUTATION WHERE database = '${db_name}' AND table = '${tbl_name}'

-- Cancel one specific Mutation:
KILL MUTATION WHERE database = '${db_name}' AND table = '${tbl_name}' AND mutation_id = '${mutation_id}'

Official documentation: KILL MUTATION

2. Manual cleanup

There are two cases: replicated tables and non-replicated tables.

2.1 For replicated tables, handle the corresponding ZNode in ZooKeeper.

In ZooKeeper, find the znode /${path_to_table}/mutations/${mutation_id} and delete it.

2.2 For non-replicated tables

First, detach the table:

DETACH TABLE `${tbl_name}`;

${clickhouse_data_dir}/${db_name}/${tbl_name}/ 目录中,删除 mutation_${mutation_id}.txt 文件。

Then re-attach the table:

ATTACH TABLE `${tbl_name}`;

Grafana Active Directory LDAP configuration

Grafana Active Directory LDAP configuration examples.

The configuration example below allows your Active Directory member users to log in to your Grafana service with their sAMAccountName.

You manage the Admin/Editor/Viewer roles in AD by adding users to the specified AD groups.

Remember: DNs are case sensitive. This is very important.

# Set to true to log user information returned from LDAP
verbose_logging = false

[[servers]]
# Ldap server host (specify multiple hosts space separated)
host = "${livedig.yourServersIPorFQDN}"
# Default port is 389 or 636 if use_ssl = true
port = 389
# Set to true if ldap server supports TLS
use_ssl = false
# Set to true to connect to the ldap server with the STARTTLS pattern (create the connection insecurely, then upgrade to a secure connection with TLS)
start_tls = false
# set to true if you want to skip ssl cert validation
ssl_skip_verify = false
# set to the path to your root CA certificate or leave unset to use system defaults
# root_ca_cert = "/path/to/certificate.crt"

# Search user bind dn
bind_dn = "CN=robot,CN=IT System,CN=Users,DC=example,DC=io"
# Search user bind password
# If the password contains # or ; you have to wrap it with triple quotes. Ex: """#password;"""
bind_password = '${livedig.urUserBaseDNPassword}'

# User search filter, for example "(cn=%s)" or "(sAMAccountName=%s)" or "(uid=%s)"
search_filter = "(&(objectCategory=Person)(sAMAccountName=%s)(!(UserAccountControl:1.2.840.113556.1.4.803:=2)))"

# An array of base dns to search through
search_base_dns = ["CN=Users,DC=example,DC=io"]

# In POSIX LDAP schemas, without memberOf attribute a secondary query must be made for groups.
# This is done by enabling group_search_filter below. You must also set member_of= "cn"
# in [servers.attributes] below.

## Group search filter, to retrieve the groups of which the user is a member (only set if memberOf attribute is not available)
#group_search_filter = ""
## An array of the base DNs to search through for groups. Typically uses ou=groups
#group_search_base_dns = [""]

# Specify names of the ldap attributes your ldap uses
[servers.attributes]
name = "givenName"
surname = "sn"
username = "sAMAccountName"
member_of = "memberOf"
email =  "mail"

# Map ldap groups to grafana org roles
[[servers.group_mappings]]
group_dn = "CN=Grafana Admin,CN=IT System,CN=Users,DC=example,DC=io"
org_role = "Admin"
# The Grafana organization database id, optional, if left out the default org (id 1) will be used.  Setting this allows for multiple group_dn's to be assigned to the same org_role provided the org_id differs
# org_id = 1

[[servers.group_mappings]]
group_dn = "CN=Grafana Editor,CN=IT System,CN=Users,DC=example,DC=io"
org_role = "Editor"

[[servers.group_mappings]]
# If you want to match all (or no ldap groups) then you can use wildcard
group_dn = "CN=Grafana Viewer,CN=IT System,CN=Users,DC=example,DC=io"
org_role = "Viewer"

An explanation of "if __name__ == '__main__':" in Python


At the bottom of Python source files you will often see a statement of the form if __name__ == '__main__':. Its purpose is explained below.

Every Python module has a built-in attribute __name__, whose value depends on how the module is used.

  1. If the module is imported, __name__ is usually the module's file name, without the path or the file extension.
  2. If the module is run directly, __name__ is '__main__'.

So if __name__ == '__main__' is used to determine how the module is being used.

For example:

class UsageTest:
    def __init__(self):
        pass

    def f(self):
        print('Hello, World!')

if __name__ == '__main__':
    UsageTest().f()

Run it directly from the terminal:

$ python UsageTest.py
Hello, World!

Import it instead:

$ python
  >>> import UsageTest
  >>> UsageTest.__name__  # the UsageTest module's __name__
  'UsageTest'
  >>> __name__  # the current program's __name__
  '__main__'