Category Archives: Information Technology

ClickHouse Backward Incompatible Changes

Backward incompatible changes from ClickHouse release 19.17.6.36 (2019-12-27) onward

ClickHouse release v20.11.2.1, 2020-11-11

  • If a profile was specified in the distributed_ddl config section, that profile could overwrite the settings of the default profile on server startup. This has been fixed; settings of distributed DDL queries no longer affect global server settings. #16635 (tavplubix).
  • Restrict the use of non-comparable data types (like AggregateFunction) in keys (sorting key, primary key, partition key, and so on). #16601 (alesapin).
  • Remove ANALYZE and AST queries, and make the setting enable_debug_queries obsolete, since this functionality is now part of the full-featured EXPLAIN query. #16536 (Ivan).
  • Aggregate functions boundingRatio, rankCorr, retention, timeSeriesGroupSum, timeSeriesGroupRateSum, windowFunnel were erroneously made case-insensitive. Their names are now case sensitive, as designed. Only functions that are specified in the SQL standard, provided for compatibility with other DBMSs, or similar to such functions should be case-insensitive. #16407 (alexey-milovidov).
  • Make the rankCorr function return nan on insufficient data https://github.com/ClickHouse/ClickHouse/issues/16124. #16135 (hexiaoting).
  • When upgrading from a version older than 20.5 with a rolling update, the cluster will temporarily contain nodes both older and newer than 20.5. If a node running the old version is restarted while newer versions are present in the cluster, it may produce Part ... intersects previous part errors. To prevent this, first install the newer clickhouse-server packages on all cluster nodes and only then perform the restarts, so that each restarted clickhouse-server starts up on the new version.

ClickHouse release v20.10.3.30, 2020-10-28

  • Make multiple_joins_rewriter_version obsolete. Remove the first version of the joins rewriter. #15472 (Artem Zuikov).
  • Change the default value of the format_regexp_escaping_rule setting (related to the Regexp format) to Raw (meaning: read the whole subpattern as a value) to make the behaviour closer to what users expect. #15426 (alexey-milovidov).
  • Add support for nested multiline comments /* comment /* comment */ */ in SQL. This conforms to the SQL standard. #14655 (alexey-milovidov).
  • Added MergeTree settings (max_replicated_merges_with_ttl_in_queue and max_number_of_merges_with_ttl_in_pool) to control the number of merges with TTL in the background pool and replicated queue. This change breaks compatibility with older versions only if you use delete TTL; otherwise, replication will stay compatible. You can avoid incompatibility issues if you update all shard replicas at once or execute SYSTEM STOP TTL MERGES until you finish updating all replicas. If you get an incompatible entry in the replication queue, first execute SYSTEM STOP TTL MERGES and then ALTER TABLE ... DETACH PARTITION ... for the partition where the incompatible TTL merge was assigned. Attach it back on a single replica. #14490 (alesapin).
  • When upgrading from a version older than 20.5 with a rolling update, the cluster will temporarily contain nodes both older and newer than 20.5. If a node running the old version is restarted while newer versions are present in the cluster, it may produce Part ... intersects previous part errors. To prevent this, first install the newer clickhouse-server packages on all cluster nodes and only then perform the restarts, so that each restarted clickhouse-server starts up on the new version.

ClickHouse release v20.9.2.20, 2020-09-22

  • When upgrading from a version older than 20.5 with a rolling update, the cluster will temporarily contain nodes both older and newer than 20.5. If a node running the old version is restarted while newer versions are present in the cluster, it may produce Part ... intersects previous part errors. To prevent this, first install the newer clickhouse-server packages on all cluster nodes and only then perform the restarts, so that each restarted clickhouse-server starts up on the new version.

ClickHouse release v20.8.2.3-stable, 2020-09-08

  • Now the OPTIMIZE FINAL query doesn't recalculate TTL for parts that were added before the TTL was created. Use ALTER TABLE ... MATERIALIZE TTL once to calculate them; after that, OPTIMIZE FINAL will evaluate TTLs properly. This behavior never worked for replicated tables. #14220 (alesapin).
  • Extend the parallel_distributed_insert_select setting, adding an option to run INSERT into a local table. The setting changes type from Bool to UInt64, so the values false and true are no longer supported. If you have these values in the server configuration, the server will not start. Please replace them with 0 and 1, respectively. #14060 (Azat Khuzhin).
  • Remove support for the ODBCDriver input/output format. This was a deprecated format once used for communication with the ClickHouse ODBC driver, now long superseded by the ODBCDriver2 format. Resolves #13629. #13847 (hexiaoting).
  • When upgrading from a version older than 20.5 with a rolling update, the cluster will temporarily contain nodes both older and newer than 20.5. If a node running the old version is restarted while newer versions are present in the cluster, it may produce Part ... intersects previous part errors. To prevent this, first install the newer clickhouse-server packages on all cluster nodes and only then perform the restarts, so that each restarted clickhouse-server starts up on the new version.

ClickHouse release v20.7.2.30-stable, 2020-08-31

  • The function modulo (operator %) with at least one floating point argument now calculates the remainder of division directly on floating point numbers, without converting both arguments to integers. This makes the behaviour compatible with most other DBMSs. This is also applicable to Date and DateTime data types. Added the alias mod. This closes #7323. #12585 (alexey-milovidov).
  • Deprecate special printing of zero Date/DateTime values as 0000-00-00 and 0000-00-00 00:00:00. #12442 (alexey-milovidov).
  • The groupArrayMoving* functions were not working for distributed queries. Their result was calculated with an incorrect data type (without promotion to the largest type). The groupArrayMovingAvg function returned an integer number that was inconsistent with the avg function. This fixes #12568. #12622 (alexey-milovidov).
  • Add a sanity check for MergeTree settings. If the settings are incorrect, the server will refuse to start or to create a table, printing a detailed explanation to the user. #13153 (alexey-milovidov).
  • Protect against cases where a user may set background_pool_size to a value lower than number_of_free_entries_in_pool_to_execute_mutation or number_of_free_entries_in_pool_to_lower_max_size_of_merge. In these cases, ALTERs won't work or the maximum size of merges will be too limited. The server will throw an exception explaining what to do. This closes #10897. #12728 (alexey-milovidov).
  • When upgrading from a version older than 20.5 with a rolling update, the cluster will temporarily contain nodes both older and newer than 20.5. If a node running the old version is restarted while newer versions are present in the cluster, it may produce Part ... intersects previous part errors. To prevent this, first install the newer clickhouse-server packages on all cluster nodes and only then perform the restarts, so that each restarted clickhouse-server starts up on the new version.

ClickHouse release v20.6.3.28-stable

  • When upgrading from a version older than 20.5 with a rolling update, the cluster will temporarily contain nodes both older and newer than 20.5. If a node running the old version is restarted while newer versions are present in the cluster, it may produce Part ... intersects previous part errors. To prevent this, first install the newer clickhouse-server packages on all cluster nodes and only then perform the restarts, so that each restarted clickhouse-server starts up on the new version.

ClickHouse release v20.5.2.7-stable 2020-07-02

  • Return a non-Nullable result from COUNT(DISTINCT) and the uniq family of aggregate functions. If all passed values are NULL, return zero instead. This improves SQL compatibility. #11661 (alexey-milovidov).
  • Added a check for the case when a user-level setting is specified in the wrong place. User-level settings should be specified in users.xml inside a <profile> section for a specific user profile (or in <default> for default settings). If such a setting is misplaced, the server won't start, leaving an exception message in the log. This fixes #9051. If you want to skip the check, you can either move the settings to the appropriate place or add <skip_check_for_incorrect_settings>1</skip_check_for_incorrect_settings> to config.xml. #11449 (alexey-milovidov).
  • The setting input_format_with_names_use_header is enabled by default. It will affect parsing of the input formats -WithNames and -WithNamesAndTypes. #10937 (alexey-milovidov).
  • Remove the experimental_use_processors setting. It is enabled by default. #10924 (Nikolai Kochetov).
  • Update zstd to 1.4.4. It has some minor improvements in performance and compression ratio. If you run replicas with different versions of ClickHouse, you may see reasonable error messages Data after merge is not byte-identical to data on another replicas. with an explanation. These messages are OK and you should not worry about them. This change is backward compatible, but we list it here in the changelog in case you wonder about these messages. #10663 (alexey-milovidov).
  • Added a check for meaningless codecs and a setting allow_suspicious_codecs to control this check. This closes #4966. #10645 (alexey-milovidov).
  • Several Kafka settings changed their defaults. See #11388.
  • When upgrading from a version older than 20.5 with a rolling update, the cluster will temporarily contain nodes both older and newer than 20.5. If a node running the old version is restarted while newer versions are present in the cluster, it may produce Part ... intersects previous part errors. To prevent this, first install the newer clickhouse-server packages on all cluster nodes and only then perform the restarts, so that each restarted clickhouse-server starts up on the new version.

ClickHouse release v20.4.2.9, 2020-05-12

  • System tables (e.g. system.query_log, system.trace_log, system.metric_log) now use the compact data part format for parts smaller than 10 MiB in size. The compact data part format is supported since version 20.3. If you are going to downgrade to a version earlier than 20.3, you should manually delete the table data for system logs in /var/lib/clickhouse/data/system/.
  • When string comparison involves FixedString and the compared arguments are of different sizes, do the comparison as if the smaller string were padded to the length of the larger one. This is intended for SQL compatibility if we imagine that the FixedString data type corresponds to SQL CHAR. This closes #9272. #10363 (alexey-milovidov)
  • Make SHOW CREATE TABLE multiline. Now it is more readable and more like MySQL. #10049 (Azat Khuzhin)
  • Added a setting validate_polygons that is used in the pointInPolygon function and is enabled by default. #9857 (alexey-milovidov)

ClickHouse release v20.3.2.1, 2020-03-12

  • Fixed the issue file name too long when sending data for Distributed tables for a large number of replicas. Fixed the issue that replica credentials were exposed in the server log. The format of the directory name on disk was changed to [shard{shard_index}[_replica{replica_index}]]. #8911 (Mikhail Korotov) After you upgrade to the new version, you will not be able to downgrade without manual intervention, because the old server version does not recognize the new directory format. If you want to downgrade, you have to manually rename the corresponding directories to the old format. This change is relevant only if you have used asynchronous INSERTs to Distributed tables. In version 20.3.3 we will introduce a setting that will allow you to enable the new format gradually.
  • Changed the format of replication log entries for mutation commands. You have to wait for old mutations to be processed before installing the new version.
  • Implement a simple memory profiler that dumps stacktraces to system.trace_log every N bytes over the soft allocation limit #8765 (Ivan) #9472 (alexey-milovidov) The column of system.trace_log was renamed from timer_type to trace_type. This will require changes in third-party performance analysis and flamegraph processing tools.
  • Use the OS thread id everywhere instead of the internal thread number. This fixes #7477. Old clickhouse-client cannot receive logs that are sent from the server when the setting send_logs_level is enabled, because the names and types of the structured log messages were changed. On the other hand, different server versions can send logs with different types to each other. If you don't use the send_logs_level setting, you should not care. #8954 (alexey-milovidov)
  • Remove the indexHint function #9542 (alexey-milovidov)
  • Remove the findClusterIndex and findClusterValue functions. This fixes #8641. If you were using these functions, send an email to [email protected] #9543 (alexey-milovidov)
  • Now it's not allowed to create columns or add columns with a SELECT subquery as the default expression. #9481 (alesapin)
  • Require aliases for subqueries in JOIN. #9274 (Artem Zuikov)
  • Improved ALTER MODIFY/ADD queries logic. Now you cannot ADD a column without a type, MODIFY default expression doesn't change the type of the column, and MODIFY type doesn't lose the default expression value. Fixes #8669. #9227 (alesapin)
  • Require the server to be restarted to apply changes in the logging configuration. This is a temporary workaround to avoid the bug where the server logs to a deleted log file (see #8696). #8707 (Alexander Kuzmenkov)
  • The setting experimental_use_processors is enabled by default. This setting enables usage of the new query pipeline. This is an internal refactoring and we expect no visible changes. If you see any issues, set it back to zero. #8768 (alexey-milovidov)

ClickHouse release v20.1.2.4, 2020-01-22

  • Make the setting merge_tree_uniform_read_distribution obsolete. The server still recognizes this setting, but it has no effect. #8308 (alexey-milovidov)
  • Changed the return type of the function greatCircleDistance to Float32 because the result of the calculation is now Float32. #7993 (alexey-milovidov)
  • Now it's expected that query parameters are represented in the "escaped" format. For example, to pass the string a<tab>b you have to write a\tb or a\<tab>b and, respectively, a%5Ctb or a%5C%09b in a URL. This is needed to add the possibility to pass NULL as \N. This fixes #7488. #8517 (alexey-milovidov)
  • Enable the use_minimalistic_part_header_in_zookeeper setting for ReplicatedMergeTree by default. This will significantly reduce the amount of data stored in ZooKeeper. This setting has been supported since version 19.1, and we have already used it in production in multiple services without any issues for more than half a year. Disable this setting if you may need to downgrade to a version older than 19.1. #6850 (alexey-milovidov)
  • Data skipping indices are production ready and enabled by default. The settings allow_experimental_data_skipping_indices, allow_experimental_cross_to_join_conversion and allow_experimental_multiple_joins_emulation are now obsolete and do nothing. #7974 (alexey-milovidov)
  • Add new ANY JOIN logic for StorageJoin consistent with the JOIN operation. To upgrade without changes in behaviour, you need to add SETTINGS any_join_distinct_right_table_keys = 1 to Engine Join tables' metadata or recreate these tables after the upgrade. #8400 (Artem Zuikov)
  • Require the server to be restarted to apply changes in the logging configuration. This is a temporary workaround to avoid the bug where the server logs to a deleted log file (see #8696). #8707 (Alexander Kuzmenkov)

The OCFS2 Manual

OCFS2 – A Shared-Disk Cluster File System for Linux

Introduction

OCFS2 is a file system. It allows users to store and retrieve data. The data is stored in files that are organized in a hierarchical directory tree. It is a POSIX-compliant file system that supports the standard interfaces and behavioral semantics spelled out by that specification.

It is also a shared-disk cluster file system, one that allows multiple nodes to access the same disk at the same time. This is where the fun begins, as allowing a file system to be accessible on multiple nodes opens a can of worms. What if the nodes are of different architectures? What if a node dies while writing to the file system? What data consistency can processes on two nodes expect when reading and writing concurrently? What if one node removes a file while it is still being used on another node?

Oracle Cluster File System (OCFS2) User's Guide

1. Introduction

A cluster file system allows all nodes in a cluster to concurrently access a device via the standard file system interface. This allows for easy management of applications that need to run across a cluster.

OCFS (version 1) was released in December 2002 to enable Oracle Real Application Cluster (RAC) users to run the clustered database without having to deal with RAW devices. The file system was designed to store database-related files, such as data files, control files, redo logs, archive logs, and so on.

OCFS2 is the next generation of the Oracle Cluster File System. It has been designed as a general-purpose cluster file system. With it, one can store not only database-related files on a shared disk, but also Oracle binaries and configuration files (a shared Oracle home), making management of RAC even easier.

Free SSL Cert from Let’s Encrypt! It’s REALLY FRAGRANT!!

Traefik Docker Compose Example:

version: '3.7'
networks:
  livedignet:
    external: true
services:
  traefik:
    image: "traefik:2.1"
    container_name: "traefik"
    networks:
      - livedignet
    command:
    # - "--log.level=DEBUG"
      - "--api.insecure=true"
      - "--providers.docker=true"
      - "--providers.docker.exposedbydefault=false"
      - "--entrypoints.web.address=:80"
      - "--entrypoints.websecure.address=:443"
      - "--certificatesresolvers.ldhttpchallenge.acme.httpchallenge=true"
      - "--certificatesresolvers.ldhttpchallenge.acme.httpchallenge.entrypoint=web"
    # While testing, uncomment the line below to use the Let's Encrypt staging CA server.
    # - "--certificatesresolvers.ldhttpchallenge.acme.caserver=https://acme-staging-v02.api.letsencrypt.org/directory"
      - "--certificatesresolvers.ldhttpchallenge.acme.email=${your_email}"
      - "--certificatesresolvers.ldhttpchallenge.acme.storage=/letsencrypt/acme.json"
    ports:
      - "80:80/tcp"
      - "443:443/tcp"
      - "8080:8080/tcp"
    volumes:
      - type: bind
        source: "./letsencrypt"
        target: "/letsencrypt"
      - type: bind
        source: "/var/run/docker.sock"
        target: "/var/run/docker.sock"

Web Application Example:

version: '3.7'
networks:
  livedignet:
    external: true
services:
  livedig:
    image: 'wordpress:latest'
    container_name: "livedig"
    networks:
      - livedignet
    external_links:
      - mysql
    environment:
      WORDPRESS_DB_HOST:      'mysql:3306'
      WORDPRESS_DB_USER:      'mysql_usrname'
      WORDPRESS_DB_PASSWORD:  'mysql_password'
      WORDPRESS_DB_NAME:      'mysql_dbname'
      WORDPRESS_TABLE_PREFIX: 'wp_'
    working_dir: '/var/www/html'
    labels:
      - "traefik.enable=true"

      - "traefik.http.routers.livedig_http.rule=Host(`livedig.com`)"
      - "traefik.http.routers.livedig_http.entrypoints=web"
      - "traefik.http.routers.livedig_http.middlewares=redirect-to-https"
      - "traefik.http.middlewares.redirect-to-https.redirectscheme.scheme=https"

      - "traefik.http.routers.livedig.rule=Host(`livedig.com`)"
      - "traefik.http.routers.livedig.entrypoints=websecure"
      - "traefik.http.routers.livedig.tls.certresolver=ldhttpchallenge"
    ports:
      - '80'
    volumes:
      - type: bind
        source: ./wp-content
        target: /var/www/html/wp-content
        read_only: false
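
Since livedignet is declared as an external network in both files, it must exist before either stack starts. A minimal bring-up sketch, assuming the two files above are saved as traefik/docker-compose.yml and livedig/docker-compose.yml (the paths are illustrative):

docker network create livedignet
docker-compose -f traefik/docker-compose.yml up -d
docker-compose -f livedig/docker-compose.yml up -d

Once both stacks are up, Traefik answers the HTTP-01 challenge on port 80 and stores the issued certificate in ./letsencrypt/acme.json on the Traefik host.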

ClickHouse Mutation

In ClickHouse, statements such as ALTER UPDATE and ALTER DELETE are called mutations.

1. Notes on Asynchronous Mutation Execution

Mutations execute asynchronously. If any subsequent query, whether INSERT, SELECT, ALTER, or anything else, depends on the result of a mutation, you should wait until the mutation has fully finished before running it.

To confirm whether a mutation has finished, check the system.mutations table for related records with is_done = 0.

To detect unfinished mutations:

SELECT COUNT() FROM `system`.mutations
WHERE `database`='${db_name}' AND `table`='${tbl_name}' AND is_done=0;

2. Removing a Stuck Mutation

A colleague once assigned a wrong value in an ALTER UPDATE, setting NULL on a non-Nullable column. The mutation could not execute, hung in the system, and kept logging errors like this:

2019.09.05 10:46:11.450867 [ 9 ] {} <Error> db_name.tbl_name: DB::StorageReplicatedMergeTree::queueTask()::<lambda(DB::StorageReplicatedMergeTree::LogEntryPtr&)>: Code: 349, e.displayText() = DB::Exception: Cannot convert NULL value to non-Nullable type, Stack trace:
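
For illustration, the offending statement had roughly this shape (the column name is hypothetical):

-- Assigning NULL to a non-Nullable column makes the mutation fail on every retry:
ALTER TABLE `${db_name}`.`${tbl_name}` UPDATE non_nullable_col = NULL WHERE 1;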

The stuck mutation then has to be cleaned up.

First, run

SELECT * FROM `system`.mutations
WHERE `database`='${db_name}' AND `table`='${tbl_name}' AND is_done=0;

to obtain the mutation_id field value and other details of the records related to the stuck mutation. Then proceed with the cleanup.

There are two ways to clean it up:

1. Built-in cleanup

Syntax:

KILL MUTATION [ON CLUSTER cluster]
  WHERE <where expression to SELECT FROM system.mutations query>
  [TEST]
  [FORMAT format]

Examples:

-- Cancel and remove all mutations of a single table:
KILL MUTATION WHERE database = '${db_name}' AND table = '${tbl_name}'

-- Cancel a specific mutation:
KILL MUTATION WHERE database = '${db_name}' AND table = '${tbl_name}' AND mutation_id = '${mutation_id}'

Official documentation: KILL MUTATION

2. Manual cleanup

There are two cases: replicated tables and non-replicated tables.

2.1 For replicated tables, handle the corresponding znode in ZooKeeper.

In ZooKeeper, find the znode /${path_to_table}/mutations/${mutation_id} and delete it.
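
With the zkCli.sh tool that ships with ZooKeeper, this looks roughly like the following (server address and paths are placeholders):

zkCli.sh -server ${zk_host}:2181
delete /${path_to_table}/mutations/${mutation_id}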

2.2 For non-replicated tables

First, detach the table:

DETACH TABLE `${tbl_name}`;

${clickhouse_data_dir}/${db_name}/${tbl_name}/ 目录中,删除 mutation_${mutation_id}.txt 文件。

Then attach the table again:

ATTACH TABLE `${tbl_name}`;

Grafana Active Directory LDAP configuration

Grafana Active Directory LDAP configuration examples.

The configuration example below allows your Active Directory member users to log into your Grafana service with their sAMAccountName.

You manage the Admin/Editor/Viewer roles in AD by adding users to the designated AD groups.

Remember: DNs are case sensitive here, and this is very important.

# Set to true to log user information returned from LDAP
verbose_logging = false

[[servers]]
# Ldap server host (specify multiple hosts space separated)
host = "${livedig.yourServersIPorFQDN}"
# Default port is 389 or 636 if use_ssl = true
port = 389
# Set to true if ldap server supports TLS
use_ssl = false
# Set to true to connect to the LDAP server using STARTTLS (create the connection insecurely, then upgrade it to a secure connection with TLS)
start_tls = false
# set to true if you want to skip ssl cert validation
ssl_skip_verify = false
# set to the path to your root CA certificate or leave unset to use system defaults
# root_ca_cert = "/path/to/certificate.crt"

# Search user bind dn
bind_dn = "CN=robot,CN=IT System,CN=Users,DC=example,DC=io"
# Search user bind password
# If the password contains # or ; you have to wrap it with triple quotes. Ex """#password;"""
bind_password = '${livedig.urUserBaseDNPassword}'

# User search filter, for example "(cn=%s)" or "(sAMAccountName=%s)" or "(uid=%s)"
search_filter = "(&(objectCategory=Person)(sAMAccountName=%s)(!(UserAccountControl:1.2.840.113556.1.4.803:=2)))"

# An array of base dns to search through
search_base_dns = ["CN=Users,DC=example,DC=io"]

# In POSIX LDAP schemas, without memberOf attribute a secondary query must be made for groups.
# This is done by enabling group_search_filter below. You must also set member_of= "cn"
# in [servers.attributes] below.

## Group search filter, to retrieve the groups of which the user is a member (only set if memberOf attribute is not available)
#group_search_filter = ""
## An array of the base DNs to search through for groups. Typically uses ou=groups
#group_search_base_dns = [""]

# Specify names of the ldap attributes your ldap uses
[servers.attributes]
name = "givenName"
surname = "sn"
username = "sAMAccountName"
member_of = "memberOf"
email =  "mail"

# Map ldap groups to grafana org roles
[[servers.group_mappings]]
group_dn = "CN=Grafana Admin,CN=IT System,CN=Users,DC=example,DC=io"
org_role = "Admin"
# The Grafana organization database id, optional, if left out the default org (id 1) will be used.  Setting this allows for multiple group_dn's to be assigned to the same org_role provided the org_id differs
# org_id = 1

[[servers.group_mappings]]
group_dn = "CN=Grafana Editor,CN=IT System,CN=Users,DC=example,DC=io"
org_role = "Editor"

[[servers.group_mappings]]
# If you want to match all (or no ldap groups) then you can use wildcard
group_dn = "CN=Grafana Viewer,CN=IT System,CN=Users,DC=example,DC=io"
org_role = "Viewer"
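
For completeness: this configuration normally lives in its own file (commonly /etc/grafana/ldap.toml) and is enabled from grafana.ini; a minimal sketch, assuming the default paths:

# grafana.ini
[auth.ldap]
enabled = true
config_file = /etc/grafana/ldap.toml
allow_sign_up = true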

An Explanation of if __name__ == '__main__': in Python

In Python source code, you will often see a statement of the form if __name__ == '__main__': at the bottom of the file. Its purpose is explained below.

Every Python module has a built-in attribute __name__, whose value depends on how the module is used.

  1. If the module is imported, __name__ is usually the module's file name, without the path or the file extension.
  2. If the module is run directly, __name__ is __main__.

So, if __name__ == '__main__' is used to determine how the module is being used.

For example:

class UsageTest:
    def __init__(self):
        pass

    def f(self):
        print('Hello, World!')

if __name__ == '__main__':
    UsageTest().f()

Run it directly in a terminal:

$ python UsageTest.py
Hello, World!

Via import:

$ python
  >>>import UsageTest
  >>>UsageTest.__name__ # the UsageTest module's __name__
  'UsageTest'
  >>>__name__ # the current program's __name__
  '__main__'

Building with Docker

Source: https://gitlab.com/gitlab-org/gitlab-ce/blob/master/doc/ci/docker/using_docker_build.md

GitLab CI allows you to use Docker Engine to build and test Docker-based projects.

This also allows you to use docker-compose and other docker-enabled tools.

One of the new trends in continuous integration/deployment is to:

  1. create an application image,
  2. run tests against the created image,
  3. push the image to a remote registry, and
  4. deploy to a server from the pushed image.

It's also useful when your application already has a Dockerfile that can be used to create and test an image:

$ docker build -t my-image dockerfiles/
$ docker run my-image /script/to/run/tests
$ docker tag my-image my-registry:5000/my-image
$ docker push my-registry:5000/my-image

This requires special configuration of GitLab Runner to enable docker support during jobs.

Runner Configuration

There are three methods to enable the use of docker build and docker run during jobs; each has its own considerations.

Using the Shell executor

The simplest approach is to install GitLab Runner in shell execution mode. GitLab Runner then executes job scripts as the gitlab-runner user.

  1. Install GitLab Runner.

  2. During GitLab Runner installation, select shell as the method of executing job scripts, or use the command:

    sudo gitlab-ci-multi-runner register -n \
      --url https://gitlab.com/ \
      --registration-token REGISTRATION_TOKEN \
      --executor shell \
      --description "My Runner"
    
  3. Install Docker Engine on the server.

    For more information on how to install Docker Engine on different systems, see Supported installations.

  4. Add the gitlab-runner user to the docker group:

    sudo usermod -aG docker gitlab-runner
    
  5. Verify that gitlab-runner can access Docker:
    sudo -u gitlab-runner -H docker info
    

    You can now verify that everything works by adding docker info to .gitlab-ci.yml:

    before_script:
      - docker info
    
    build_image:
      script:
        - docker build -t my-docker-image .
        - docker run my-docker-image /script/to/run/tests
    
  6. You can now use docker commands, and install docker-compose if needed (see the sketch below).

Note:
* By adding gitlab-runner to the docker group, you are effectively granting it full root permissions. For more information, please read On Docker security: docker group considered harmful.
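
If you do install docker-compose, one common approach at the time was downloading the static binary from its GitHub releases page; a sketch (the pinned version is illustrative, pick one that suits you):

sudo curl -L "https://github.com/docker/compose/releases/download/1.21.2/docker-compose-$(uname -s)-$(uname -m)" -o /usr/local/bin/docker-compose
sudo chmod +x /usr/local/bin/docker-compose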

Using the docker-in-docker executor

The second approach is to use the special docker-in-docker (dind) Docker image, which has all tools installed (docker and docker-compose), and run the job script in the context of that image in privileged mode.

In order to do that, follow these steps:

  1. Install GitLab Runner.

  2. Register GitLab Runner from the command line to use docker and privileged mode:

    sudo gitlab-ci-multi-runner register -n \
      --url https://gitlab.com/ \
      --registration-token REGISTRATION_TOKEN \
      --executor docker \
      --description "My Docker Runner" \
      --docker-image "docker:latest" \
      --docker-privileged
    

    The above command registers a new Runner to use the special docker:latest image provided by Docker. Notice that it uses privileged mode to start the build and service containers. If you want to use docker-in-docker mode, you always have to use privileged = true in your Docker containers.

    The above command will create a config.toml entry similar to this:

    [[runners]]
      url = "https://gitlab.com/"
      token = TOKEN
      executor = "docker"
      [runners.docker]
        tls_verify = false
        image = "docker:latest"
        privileged = true
        disable_cache = false
        volumes = ["/cache"]
      [runners.cache]
        Insecure = false
    
  3. You can now use docker in the build script (note the inclusion of the docker:dind service):
    image: docker:latest
    
    # When using dind, it's wise to use the overlayfs driver for
    # improved performance.
    variables:
      DOCKER_DRIVER: overlay
    
    services:
      - docker:dind
    
    before_script:
      - docker info
    
    build:
      stage: build
      script:
        - docker build -t my-docker-image .
        - docker run my-docker-image /script/to/run/tests
    

Docker-in-Docker works well, and is the recommended configuration, but it is not without its own challenges:

  • Enabling --docker-privileged effectively disables all the security mechanisms of containers and exposes your host to privilege escalation, which can lead to container breakout (of the host-container barrier). For more information, check out the official Docker documentation on Runtime privilege and Linux capabilities.
  • When using docker-in-docker, each job runs in a clean environment without past history. Concurrent jobs work fine because every build gets its own instance of Docker Engine, so they won't conflict with each other. But this also means jobs can be slower because there's no layer caching.
  • By default, docker:dind uses --storage-driver vfs, which is the slowest form. To use a different driver, see Using the OverlayFS driver.

An example project using this approach can be found here: https://gitlab.com/gitlab-examples/docker.

Using Docker socket binding

The third approach is to bind-mount /var/run/docker.sock into the container so that docker is available in the context of that image.

In order to do that, follow these steps:

  1. Install GitLab Runner.

  2. Register GitLab Runner from the command line to use docker and share /var/run/docker.sock:

    sudo gitlab-ci-multi-runner register -n \
      --url https://gitlab.com/ \
      --registration-token REGISTRATION_TOKEN \
      --executor docker \
      --description "My Docker Runner" \
      --docker-image "docker:latest" \
      --docker-volumes /var/run/docker.sock:/var/run/docker.sock
    

    The above command registers a new Runner to use the special docker:latest image provided by Docker. Notice that it uses the Docker daemon of the Runner itself, so any containers spawned by docker commands will be siblings of the Runner rather than children of it. This may have complications and limitations that are unsuitable for your workflow.

    The above command will create a config.toml entry similar to this:

    [[runners]]
      url = "https://gitlab.com/"
      token = REGISTRATION_TOKEN
      executor = "docker"
      [runners.docker]
        tls_verify = false
        image = "docker:latest"
        privileged = false
        disable_cache = false
        volumes = ["/var/run/docker.sock:/var/run/docker.sock", "/cache"]
      [runners.cache]
        Insecure = false
    
  3. You can now use docker in the build script (note that you don't need to include the docker:dind service when using Docker with the Docker executor this way):
    image: docker:latest
    
    before_script:
    - docker info
    
    build:
      stage: build
      script:
      - docker build -t my-docker-image .
      - docker run my-docker-image /script/to/run/tests
    

While the above method avoids using Docker in privileged mode, you should be aware of the following implications:

  • By sharing the docker daemon, you are effectively disabling all the security mechanisms of containers and exposing your host to privilege escalation, which can lead to container breakout. For example, if a project ran docker rm -f $(docker ps -a -q), it would remove the GitLab Runner containers.
  • Concurrent jobs may not work; if your tests create containers with specific names, they may conflict with each other.
  • Sharing files and directories from the source repository into containers may not work as expected, since volume mounting is done in the context of the host machine, not the build container. For example:
    docker run --rm -t -i -v $(pwd)/src:/home/app/src test-image:latest run_app_tests
    

Using the OverlayFS driver

By default, when using docker:dind, Docker uses the vfs storage driver, which copies the file system on every run. This is very disk-intensive and can be avoided by using a different driver, for example overlay.

  1. Make sure you use a recent kernel, preferably >= 4.2.
  2. Check whether the overlay module is loaded:
    sudo lsmod | grep overlay
    

    If you see no result, then it isn't loaded. To load it:

    sudo modprobe overlay
    

    If everything went fine, you also need to make sure the module is loaded on reboot. On Ubuntu systems, this is done by editing /etc/modules. Add the following line:

    overlay
    
  3. Define a variable at the top of .gitlab-ci.yml to use this driver:
    variables:
      DOCKER_DRIVER: overlay
    

Using the GitLab Container Registry

Note:
– This feature requires GitLab 8.8 and GitLab Runner 1.2.
– Starting with GitLab 8.12, if you have two-factor authentication enabled on your account, you need to pass a personal access token instead of your password in order to log in to GitLab's Container Registry.

Once you've built a Docker image, you can push it to the built-in GitLab Container Registry. For example, if you're using docker-in-docker on your runners, your .gitlab-ci.yml could look like:

 build:
   image: docker:latest
   services:
   - docker:dind
   stage: build
   script:
     - docker login -u gitlab-ci-token -p $CI_JOB_TOKEN registry.example.com
     - docker build -t registry.example.com/group/project/image:latest .
     - docker push registry.example.com/group/project/image:latest

You have to use the special gitlab-ci-token user, created for you, in order to push to the Registry connected to your project. Its password is provided in the $CI_JOB_TOKEN variable. This allows you to automate building and deployment of your Docker images.

You can also make use of other variables to avoid hardcoding:

services:
  - docker:dind

variables:
  IMAGE_TAG: $CI_REGISTRY_IMAGE:$CI_COMMIT_REF_NAME

before_script:
  - docker login -u gitlab-ci-token -p $CI_JOB_TOKEN $CI_REGISTRY

build:
  stage: build
  script:
    - docker build -t $IMAGE_TAG .
    - docker push $IMAGE_TAG

Here, $CI_REGISTRY_IMAGE resolves to the address of the Registry tied to this project, and $CI_COMMIT_REF_NAME resolves to the branch or tag name of the job. We also declare our own variable, $IMAGE_TAG, combining the two to save us some typing in the script section.

Here's a more elaborate example that splits the tasks into 4 pipeline stages, including two tests that run in parallel. The build is stored in the Container Registry and used by subsequent stages, which download the image when needed. Changes to master are also tagged as latest and deployed with an application-specific deploy script:

image: docker:latest
services:
  - docker:dind

stages:
  - build
  - test
  - release
  - deploy

variables:
  CONTAINER_TEST_IMAGE: registry.example.com/my-group/my-project/my-image:$CI_COMMIT_REF_NAME
  CONTAINER_RELEASE_IMAGE: registry.example.com/my-group/my-project/my-image:latest

before_script:
  - docker login -u gitlab-ci-token -p $CI_JOB_TOKEN registry.example.com

build:
  stage: build
  script:
    - docker build --pull -t $CONTAINER_TEST_IMAGE .
    - docker push $CONTAINER_TEST_IMAGE

test1:
  stage: test
  script:
    - docker pull $CONTAINER_TEST_IMAGE
    - docker run $CONTAINER_TEST_IMAGE /script/to/run/tests

test2:
  stage: test
  script:
    - docker pull $CONTAINER_TEST_IMAGE
    - docker run $CONTAINER_TEST_IMAGE /script/to/run/another/test

release-image:
  stage: release
  script:
    - docker pull $CONTAINER_TEST_IMAGE
    - docker tag $CONTAINER_TEST_IMAGE $CONTAINER_RELEASE_IMAGE
    - docker push $CONTAINER_RELEASE_IMAGE
  only:
    - master

deploy:
  stage: deploy
  script:
    - ./deploy.sh
  only:
    - master

Some things to keep in mind when using the Container Registry:

  • You must log in to the Container Registry before running commands. Putting this in before_script runs it before each job.
  • Using docker build --pull ensures that Docker fetches any changes to base images before building, in case your cache is stale. It takes slightly longer, but it means you won't end up with base images that are missing security patches.
  • Doing an explicit docker pull before each docker run ensures you fetch the latest image that was just built. This is especially important if you are using multiple Runners that cache images locally. Using the git SHA in the image tag makes this less necessary, since each job will be unique and you should never have a stale image; but it is still possible, for example if you rebuild a given commit after a dependency change.
  • You don't want to build directly to the latest tag when multiple jobs may be happening simultaneously.

GitLab Runners

Original URL: https://gitlab.com/gitlab-org/gitlab-ce/blob/master/doc/ci/runners/README.md

In GitLab CI, Runners run your YAML. A Runner is an isolated (virtual) machine that picks up jobs through the coordinator API of GitLab CI.

A Runner can be specific to a certain project or serve any project in GitLab CI. A Runner that serves all projects is called a shared Runner.

Ideally, GitLab Runner should not be installed on the same machine as GitLab. Read the requirements documentation for more information.

Shared vs. Specific Runners

A specific Runner only runs for the specified projects. A shared Runner can run jobs for every project that has enabled the option Allow shared Runners.

Shared Runners are useful for jobs with similar requirements across multiple projects. Rather than having many Runners idling for many projects, you can have one or a few Runners that handle multiple projects. This makes it easier to maintain and update Runners.

Specific Runners are useful for jobs with special requirements or for projects with a particular demand. If a job has certain requirements, you can set up a specific Runner for it without having to do so for all Runners. For example, if you want to deploy a certain project, you can set up a specific Runner with the right credentials for it.

Projects with high CI activity can also benefit from using specific Runners. Dedicated Runners ensure the Runner is not blocked by other projects' jobs.

You can set up a specific Runner to be used by multiple projects. The difference from a shared Runner is that each of those projects must be explicitly enabled for the Runner before it can run their jobs.

Specific Runners are not automatically shared with forked projects. A fork does copy the CI settings (jobs, allow shared, etc.) of the cloned repository.

Creating and Registering a Runner

There are several ways to create a Runner; it is only upon registration that its status as Shared or Specific is determined.

See the documentation for the different methods of installing a Runner instance.

After installing the Runner, you can register it as either Shared or Specific. You can only register a shared Runner if you have admin access to the GitLab instance.

Registering a Shared Runner

You can only register a shared Runner if you are an admin on the linked GitLab instance.

Grab the shared-Runner token on the admin/runners page of your GitLab CI instance.

[Screenshot: shared token]

Now simply register the Runner:

sudo gitlab-ci-multi-runner register

As of GitLab 8.2, shared Runners are enabled by default, but they can be disabled with the DISABLE SHARED RUNNERS button. In previous versions of GitLab, shared Runners were disabled by default.

Registering a Specific Runner

Registering a specific Runner can be done in two ways:

  1. Creating a Runner with the project registration token
  2. Converting a shared Runner into a specific Runner (one-way, admin only)

There are several ways to create a Runner instance. The steps below only concern registering the Runner on GitLab CI.

Registering a Specific Runner with a Project Registration token

To create a specific Runner without having admin rights to the GitLab instance, visit the project you want to make the Runner work for in GitLab CI.

Click on the Runner tab and use the registration token you find there to set up a specific Runner for this project.

[Screenshot: project Runners in GitLab CI]

To register the Runner, run the command below and follow instructions:

sudo gitlab-ci-multi-runner register

Lock a specific Runner from being enabled for other projects

You can configure a Runner to assign it exclusively to a project. When a Runner is locked this way, it can no longer be enabled for other projects. This setting is available on each Runner in Project Settings > Runners.

Making an existing Shared Runner Specific

If you are an admin on your GitLab instance, you can make any shared Runner a specific Runner, but you cannot make a specific Runner a shared Runner.

To make a shared Runner specific, go to the Runner page (/admin/runners) and find your Runner. Add any projects on the left to make this Runner run exclusively for these projects, therefore making it a specific Runner.

[Screenshot: making a shared Runner specific]

Using Shared Runners Effectively

If you are planning to use shared Runners, there are several things you should keep in mind.

Use Tags

You must set up a Runner to be able to run all the different types of jobs that it may encounter on the projects it's shared over. This would be problematic for large numbers of projects, if it weren't for tags.

By tagging a Runner for the types of jobs it can handle, you can make sure shared Runners will only run the jobs they are equipped to run.

For instance, at GitLab we have Runners tagged with “rails” if they contain the appropriate dependencies to run Rails test suites.
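
As a sketch, a job then opts into such Runners by listing the tag in its .gitlab-ci.yml (the job name and command are illustrative):

# Only Runners tagged "rails" will pick up this job.
rspec:
  script:
    - bundle exec rspec
  tags:
    - rails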

Prevent Runner with tags from picking jobs without tags

You can configure a Runner to prevent it from picking jobs with tags when the Runner does not have tags assigned. This setting is available on each Runner in Project Settings > Runners.

Be careful with sensitive information

If you can run a job on a Runner, you can get access to any code it runs and get the token of the Runner. With shared Runners, this means that anyone who runs jobs on the Runner can access the code of anyone else whose jobs run on that Runner.

In addition, because you can get access to the Runner token, it is possible to create a clone of a Runner and submit false jobs, for example.

The above is easily avoided by restricting the usage of shared Runners on large public GitLab instances and controlling access to your GitLab instance.

Forks

Whenever a project is forked, it copies the settings of the jobs that relate to it. This means that if you have shared Runners set up for a project and someone forks that project, the shared Runners will also serve jobs of this project.

Attack vectors in Runners

As mentioned briefly earlier, several aspects of Runners can be exploited. We're always looking for contributions that can mitigate these security considerations.

Running GitLab Runner in a Container

This describes how to run GitLab Runner inside a Docker container.

Docker image installation and configuration

First, install Docker:

curl -sSL https://get.docker.com/ | sh

You need to mount a volume into the gitlab-runner container for configuration and other resources:

docker run -d --name gitlab-runner --restart always \
  -v /srv/gitlab-runner/config:/etc/gitlab-runner \
  -v /var/run/docker.sock:/var/run/docker.sock \
  gitlab/gitlab-runner:latest

Alternatively, you can use a config container to mount your custom data volume:

docker run -d --name gitlab-runner-config \
    -v /etc/gitlab-runner \
    busybox:latest \
    /bin/true

docker run -d --name gitlab-runner --restart always \
    --volumes-from gitlab-runner-config \
    gitlab/gitlab-runner:latest

If you plan to use Docker as the method of spawning Runners, mount the docker socket as well:

docker run -d --name gitlab-runner --restart always \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v /srv/gitlab-runner/config:/etc/gitlab-runner \
  gitlab/gitlab-runner:latest

Register the runner (see the Runner documentation for how to obtain a token):

docker exec -it gitlab-runner gitlab-runner register

Please enter the gitlab-ci coordinator URL (e.g. https://gitlab.com )
https://gitlab.com
Please enter the gitlab-ci token for this runner
xxx
Please enter the gitlab-ci description for this runner
my-runner
INFO[0034] fcf5c619 Registering runner... succeeded
Please enter the executor: shell, docker, docker-ssh, ssh?
docker
Please enter the Docker image (eg. ruby:2.1):
ruby:2.1
INFO[0037] Runner registered successfully. Feel free to start it, but if it's
running already the config should be automatically reloaded!

The Runner should now be started and ready to build your projects!

Make sure you read the FAQ about the most common problems with GitLab Runner.

Updating

Pull the latest version:

docker pull gitlab/gitlab-runner:latest

Stop and remove the existing container:

docker stop gitlab-runner && docker rm gitlab-runner

Start the container as you did originally:

docker run -d --name gitlab-runner --restart always \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v /srv/gitlab-runner/config:/etc/gitlab-runner \
  gitlab/gitlab-runner:latest

Note: you need to mount the volumes the same way you did originally (-v /srv/gitlab-runner/config:/etc/gitlab-runner or --volumes-from gitlab-runner-config).

Installing trusted SSL server certificates

If your GitLab CI server is using self-signed SSL certificates, make sure the GitLab CI server certificate is trusted by the gitlab-ci-multi-runner container so that they can communicate with each other.

The gitlab/gitlab-runner image is configured to look for trusted SSL certificates at /etc/gitlab-runner/certs/ca.crt; this can, however, be changed using the -e "CA_CERTIFICATES_PATH=/DIR/CERT" configuration option.

Copy the ca.crt file into the certs directory on the data volume (or container). The ca.crt file should contain the root certificates of all the servers you want gitlab-ci-multi-runner to trust. The gitlab-ci-multi-runner container will import the ca.crt file on startup, so if your container is already running you may need to restart it for the changes to take effect.

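As a sketch, with the bind-mounted config volume from the earlier examples, this amounts to:

mkdir -p /srv/gitlab-runner/config/certs
cp ca.crt /srv/gitlab-runner/config/certs/ca.crt
docker restart gitlab-runner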

Alpine Linux

You can also use the alternative image based on Alpine Linux, which is much smaller:

gitlab/gitlab-runner    latest              3e8077e209f5        13 hours ago        304.3 MB
gitlab/gitlab-runner    alpine              7c431ac8f30f        13 hours ago        25.98 MB

The Alpine Linux image is designed to use only Docker as the method of spawning Runners.

The original gitlab/gitlab-runner:latest is based on Ubuntu 14.04 LTS.

SELinux

Some distributions (CentOS, Red Hat, Fedora) use SELinux by default to enhance the security of the underlying system.

Special care must be taken when dealing with such a configuration.

  1. If you want to use the Docker executor to run builds in containers, you need access to /var/run/docker.sock.
    However, if SELinux is in enforcing mode, you will see Permission denied when accessing /var/run/docker.sock.
    Install selinux-dockersock to resolve the issue: https://github.com/dpw/selinux-dockersock.

  2. Make sure that a persistent directory is created on the host: mkdir -p /srv/gitlab-runner/config.

  3. Run docker with :Z on volumes:

docker run -d --name gitlab-runner --restart always \
  -v /var/run/docker.sock:/var/run/docker.sock \
  -v /srv/gitlab-runner/config:/etc/gitlab-runner:Z \
  gitlab/gitlab-runner:latest

More information about the cause and resolution can be found here:
http://www.projectatomic.io/blog/2015/06/using-volumes-with-docker-can-cause-problems-with-selinux/