当前位置：首页 > news >正文

网站建设要做什么京东云擎 wordpress 二级域名

news 2025/12/20 14:41:39

网站建设要做什么,京东云擎 wordpress 二级域名,龙口市建设局网站,西部网站管理助手本文介绍如何在 Doris 中导入 JSON 格式的数据文件。Doris 支持导入标准 JSON 格式数据#xff0c;通过配置相关参数#xff0c;可以灵活地处理不同的 JSON 数据结构#xff0c;并支持从 JSON 数据中抽取字段、处理嵌套结构等场景。导入方式以下导入方式支持 JSON 格式…本文介绍如何在 Doris 中导入 JSON 格式的数据文件。Doris 支持导入标准 JSON 格式数据通过配置相关参数可以灵活地处理不同的 JSON 数据结构并支持从 JSON 数据中抽取字段、处理嵌套结构等场景。导入方式以下导入方式支持 JSON 格式的数据导入 Stream LoadBroker LoadRoutine LoadINSERT INTO FROM S3 TVFINSERT INTO FROM HDFS TVF 支持的 JSON 格式 Doris 支持以下三种 JSON 格式以 Array 表示的多行数据适用于批量导入多行数据要求根节点必须是数组数组中每个元素是一个对象表示一行数据必须设置 strip_outer_arraytrue 示例数据 [{id: 123, city: beijing},{id: 456, city: shanghai} ]// 支持嵌套结构 [{id: 123, city: {name: beijing, region: haidian}},{id: 456, city: {name: beijing, region: chaoyang}} ]以 Object 表示的单行数据适用于单行数据导入要求根节点必须是对象整个对象表示一行数据示例数据 {id: 123, city: beijing}// 支持嵌套结构 {id: 123, city: {name: beijing, region: haidian}}注意通常用于 Routine Load 导入方式如 Kafka 中的单条消息。以固定分隔符分隔的多行 Object 数据适用于批量导入多行数据要求每行是一个完整的 JSON 对象必须设置 read_json_by_linetrue可通过 line_delimiter 参数指定行分隔符默认为 \n 示例数据 {id: 123, city: beijing} {id: 456, city: shanghai}参数配置参数支持情况下表列出了各种导入方式支持的 JSON 格式参数参数默认值Stream LoadBroker LoadRoutine LoadTVFjson paths无jsonpathsproperties.jsonpathsproperties.jsonpathsjsonpathsjson root无json_rootproperties.json_rootproperties.json_rootjson_rootstrip outer arrayfalsestrip_outer_arrayproperties.strip_outer_arrayproperties.strip_outer_arraystrip_outer_arrayread json by linefalseread_json_by_line不支持配置都为true不支持read_json_by_line, 默认为truefuzzy parsefalsefuzzy_parseproperties.fuzzy_parse不支持fuzzy_parsenum as stringfalsenum_as_stringproperties.num_as_stringproperties.num_as_stringnum_as_string 注意 Stream Load参数直接通过 HTTP Header 指定如-H jsonpaths: $.dataBroker Load参数通过 PROPERTIES 指定如PROPERTIES(jsonpaths$.data)Routine Load参数通过 PROPERTIES 指定如PROPERTIES(jsonpaths$.data)TVF参数通过 TVF 语句指定如S3(jsonpaths$.data) 参数说明 JSON Path 作用指定如何从 JSON 数据中抽取字段类型字符串数组默认值无默认使用列名匹配使用示例 -- 基本用法 [$.id, $.city]-- 嵌套结构 [$.id, $.info.city, $.data[0].name]JSON Root 作用指定 JSON 数据的解析起点类型字符串默认值无默认从根节点开始解析使用示例 -- 原始数据 {data: {id: 123,city: beijing} }-- 设置 json_root json_root $.dataStrip Outer Array 作用指定是否去除最外层的数组结构类型布尔值默认值false使用示例 -- 原始数据 [{id: 1, city: beijing},{id: 2, city: shanghai} ]-- 设置 strip_outer_arraytrueRead JSON By Line 作用指定是否按行读取 JSON 数据类型布尔值默认值false使用示例 -- 原始数据每行一个完整的 JSON 对象 {id: 1, city: beijing} {id: 2, city: shanghai}-- 设置 read_json_by_linetrueFuzzy Parse 作用加速 JSON 数据的导入效率类型布尔值默认值false限制 Array 中每行数据的字段顺序必须完全一致通常与 strip_outer_array 配合使用性能可提升 3-5 倍导入效率 Num As String 作用指定是否将 JSON 中的数值类型以字符串形式解析类型布尔值默认值false使用场景处理超出数值范围的大数避免数值精度损失使用示例 -- 原始数据 {id: 12345678901234567890,price: 99999999.999999 } -- 设置 num_as_stringtrue price 字段将以字符串形式解析 JSON Path 和 Columns 的关系在数据导入过程中JSON Path 和 Columns 各自承担不同的职责 JSON Path定义数据抽取规则从 JSON 数据中按指定路径抽取字段抽取的字段按 JSON Path 中定义的顺序进行重排列 Columns定义数据映射规则将抽取的字段映射到目标表的列可以进行列的重排和转换这两个参数的处理过程是串行的首先 JSON Path 从源数据中抽取字段并形成有序的数据集然后 Columns 将这些数据映射到表的列中。如果不指定 Columns抽取的字段将按照表的列顺序直接映射。使用示例仅使用 JSON Path 表结构和数据 -- 表结构 CREATE TABLE example_table (k2 int,k1 int );-- JSON 数据 {k1: 1, k2: 2}导入命令 curl -v ... -H format: json \-H jsonpaths: [\$.k2\, \$.k1\] \-T example.json \http://fe_host:fe_http_port/api/db_name/table_name/_stream_load 导入结果 ------------ | k1 | k2 | ------------ | 2 | 1 | ------------使用 JSON Path Columns 使用相同的表结构和数据添加 columns 参数导入命令 curl -v ... -H format: json \-H jsonpaths: [\$.k2\, \$.k1\] \-H columns: k2, k1 \-T example.json \http://fe_host:fe_http_port/api/db_name/table_name/_stream_load导入结果 ------------ | k1 | k2 | ------------ | 1 | 2 | ------------字段重复使用表结构和数据 -- 表结构 CREATE TABLE example_table (k2 int,k1 int,k1_copy int );-- JSON 数据 {k1: 1, k2: 2}导入命令 curl -v ... -H format: json \-H jsonpaths: [\$.k2\, \$.k1\, \$.k1\] \-H columns: k2, k1, k1_copy \-T example.json \http://fe_host:fe_http_port/api/db_name/table_name/_stream_load导入结果 --------------------- | k2 | k1 | k1_copy | --------------------- | 2 | 1 | 1 | ---------------------嵌套字段映射表结构和数据 -- 表结构 CREATE TABLE example_table (k2 int,k1 int,k1_nested1 int,k1_nested2 int );-- JSON 数据 {k1: 1,k2: 2,k3: {k1: 31,k1_nested: {k1: 32}} }导入命令 curl -v ... -H format: json \-H jsonpaths: [\$.k2\, \$.k1\, \$.k3.k1\, \$.k3.k1_nested.k1\] \-H columns: k2, k1, k1_nested1, k1_nested2 \-T example.json \http://fe_host:fe_http_port/api/db_name/table_name/_stream_load导入结果 ------------------------------------ | k2 | k1 | k1_nested1 | k1_nested2 | ------------------------------------ | 2 | 1 | 31 | 32 | ------------------------------------使用示例本节展示了不同导入方式下的 JSON 格式使用方法。 Stream Load 导入 # 使用 JSON Path curl --location-trusted -u user:passwd \-H format: json \-H jsonpaths: [\$.id\, \$.city\] \-T example.json \http://fe_host:fe_http_port/api/example_db/example_table/_stream_load# 指定 JSON root curl --location-trusted -u user:passwd \-H format: json \-H json_root: $.events \-T example.json \http://fe_host:fe_http_port/api/example_db/example_table/_stream_load# 按行读取 JSON curl --location-trusted -u user:passwd \-H format: json \-H read_json_by_line: true \-T example.json \http://fe_host:fe_http_port/api/example_db/example_table/_stream_loadBroker Load 导入 -- 使用 JSON Path LOAD LABEL example_db.example_label (DATA INFILE(s3://bucket/path/example.json)INTO TABLE example_tableFORMAT AS jsonPROPERTIES(jsonpaths [\$.id\, \$.city\]) ) WITH S3 (... );-- 指定 JSON root LOAD LABEL example_db.example_label (DATA INFILE(s3://bucket/path/example.json)INTO TABLE example_tableFORMAT AS jsonPROPERTIES(json_root $.events) ) WITH S3 (... );Routine Load 导入 -- 使用 JSON Path CREATE ROUTINE LOAD example_db.example_job ON example_table PROPERTIES (format json,jsonpaths [\$.id\, \$.city\] ) FROM KAFKA (... );TVF 导入 -- 使用 JSON Path INSERT INTO example_table SELECT * FROM S3 (uri s3://bucket/example.json,format json,jsonpaths [\$.id\, \$.city\],... );-- 指定 JSON root INSERT INTO example_table SELECT * FROM S3 (uri s3://bucket/example.json,format json,json_root $.events,... );

查看全文

http://www.w-s-a.com/news/577697/