Commit b3d0f9c

Merge branch 'main' into in-improve
2 parents 5a1713c + 6aa586b

File tree: 4 files changed (+67 −87 lines)


docs/doc/01-guides/index.md

Lines changed: 1 addition & 1 deletion

```diff
@@ -25,7 +25,7 @@ These tutorials are intended to help you get started with Databend:
 ## Loading Data into Databend
 
 * [How to Load Data from Local File System](../21-load-data/00-local.md)
-* [How to Load Data from Remote Files](../21-load-data/04-remote.md)
+* [How to Load Data from Remote Files](../21-load-data/04-http.md)
 * [How to Load Data from Amazon S3](../21-load-data/01-s3.md)
 * [How to Load Data from Databend Stages](../21-load-data/02-stage.md)
 * [How to Load Data from MySQL](../21-load-data/03-mysql.md)
```

docs/doc/21-load-data/04-remote.md renamed to docs/doc/21-load-data/04-http.md

Lines changed: 3 additions & 3 deletions

````diff
@@ -7,7 +7,7 @@ description:
 
 This tutorial explains how to load data into a table from remote files.
 
-The [COPY INTO `<table>` FROM REMOTE FILES](../30-reference/30-sql/10-dml/dml-copy-into-table-url.md) command allows you to load data into a table from one or more remote files by their URL. The supported file types include CSV, JSON, NDJSON, and PARQUET.
+The [COPY INTO `<table>` FROM REMOTE FILES](../30-reference/30-sql/10-dml/dml-copy-into-table.md) command allows you to load data into a table from one or more remote files by their URL. The supported file types include CSV, JSON, NDJSON, and PARQUET.
 
 ### Before You Begin
 
@@ -38,7 +38,7 @@ COPY INTO books FROM 'https://datafuse-1253727613.cos.ap-hongkong.myqcloud.com/d
 
 :::tip
 
-The command can also load data from multiple files that are sequentially named. See [COPY INTO `<table>` FROM REMOTE FILES](../30-reference/30-sql/10-dml/dml-copy-into-table-url.md) for details.
+The command can also load data from multiple files that are sequentially named. See [COPY INTO `<table>`](../30-reference/30-sql/10-dml/dml-copy-into-table.md) for details.
 
 :::
@@ -52,4 +52,4 @@ SELECT * FROM books;
 | Transaction Processing | Jim Gray | 1992 |
 | Readings in Database Systems | Michael Stonebraker | 2004 |
 +------------------------------+----------------------+-------+
-```
+```
````
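For reference, the load step in this renamed tutorial follows the shape below. The table name `books` and the URL host come from the hunk header above; the file path is a placeholder (the diff truncates the real URL), and the `FILE_FORMAT` clause is assumed from the `COPY INTO <table>` syntax documented in dml-copy-into-table.md:

```sql
-- Sketch only: the file path is a placeholder, not the tutorial's real URL.
COPY INTO books
FROM 'https://datafuse-1253727613.cos.ap-hongkong.myqcloud.com/<path-to-file>.csv'
FILE_FORMAT = (type = 'CSV');
```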

docs/doc/30-reference/30-sql/10-dml/dml-copy-into-table-url.md

Lines changed: 0 additions & 74 deletions
This file was deleted.

docs/doc/30-reference/30-sql/10-dml/dml-copy-into-table.md

Lines changed: 63 additions & 9 deletions

````diff
@@ -1,8 +1,6 @@
 ---
-title: 'COPY INTO <table> FROM STAGED FILES'
-sidebar_label: 'COPY INTO <table> FROM STAGED FILES'
-description:
-  'Loads data from staged files'
+title: 'COPY INTO <table>'
+sidebar_label: 'COPY INTO <table>'
 ---
 
 `COPY` moves data between Databend tables and object storage systems (AWS S3 compatible object storage services and Azure Blob storage).
@@ -11,9 +9,7 @@ This command loads data into a table from files staged in one of the following l
 
 * Named internal stage, files can be staged using the [PUT to Stage](../../00-api/10-put-to-stage.md).
 * Named external stage that references an external location (AWS S3 compatible object storage services and Azure Blob storage).
-* External location. This includes AWS S3 compatible object storage services and Azure Blob storage.
-
-`COPY` can also load data into a table from one or more remote files by their URL. See [COPY INTO \<table\> FROM REMOTE FILES](dml-copy-into-table-url.md).
+* External location. This includes AWS S3 compatible object storage services, Azure Blob storage, Google Cloud Storage, and Huawei OBS.
 
 ## Syntax
 
@@ -42,7 +38,7 @@ externalStage ::= @<external_stage_name>[/<path>]
 
 ### externalLocation
 
-AWS S3 compatible object storage services:
+**AWS S3 Compatible Object Storage Service**
 
 ```sql
 externalLocation ::=
@@ -65,7 +61,7 @@ externalLocation ::=
 | REGION                    | AWS region name. For example, us-east-1.                            | Optional |
 | ENABLE_VIRTUAL_HOST_STYLE | If you use virtual hosting to address the bucket, set it to "true". | Optional |
 
-Azure Blob storage
+**Azure Blob storage**
 
 ```sql
 externalLocation ::=
@@ -84,6 +80,53 @@ externalLocation ::=
 | ACCOUNT_NAME | Your account name for connecting the Azure Blob storage. If not provided, Databend will access the container anonymously. | Optional |
 | ACCOUNT_KEY  | Your account key for connecting the Azure Blob storage.                                                                   | Optional |
 
+**Google Cloud Storage**
+
+```sql
+externalLocation ::=
+  'gcs://<container>[<path>]'
+  CONNECTION = (
+    ENDPOINT_URL = 'https://<endpoint-URL>'
+    CREDENTIAL = '<your-credential>'
+  )
+```
+
+| Parameter                | Description                                                                                                                                                                              | Required |
+|--------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------|
+| `gcs://<bucket>[<path>]` | External files located at the Google Cloud Storage                                                                                                                                       | Required |
+| ENDPOINT_URL             | The container endpoint URL starting with "https://". To use a URL starting with "http://", set `allow_insecure` to `true` in the [storage] block of the file `databend-query-node.toml`. | Optional |
+| CREDENTIAL               | Your credential for connecting the GCS. If not provided, Databend will access the container anonymously.                                                                                 | Optional |
+
+**Huawei Object Storage**
+
+```sql
+externalLocation ::=
+  'obs://<container>[<path>]'
+  CONNECTION = (
+    ENDPOINT_URL = 'https://<endpoint-URL>'
+    ACCESS_KEY_ID = '<your-access-key-id>'
+    SECRET_ACCESS_KEY = '<your-secret-access-key>'
+  )
+```
+
+| Parameter                | Description                                                                                                                                                                              | Required |
+|--------------------------|------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|----------|
+| `obs://<bucket>[<path>]` | External files located at the OBS                                                                                                                                                        | Required |
+| ENDPOINT_URL             | The container endpoint URL starting with "https://". To use a URL starting with "http://", set `allow_insecure` to `true` in the [storage] block of the file `databend-query-node.toml`. | Optional |
+| ACCESS_KEY_ID            | Your access key ID for connecting the OBS. If not provided, Databend will access the bucket anonymously.                                                                                 | Optional |
+| SECRET_ACCESS_KEY        | Your secret access key for connecting the OBS.                                                                                                                                           | Optional |
+
+**HTTP**
+
+```sql
+externalLocation ::=
+  'https://<url>'
+```
+
+Notably, HTTP locations support glob patterns. For example, use
+
+- `ontime_200{6,7,8}.csv` to represent `ontime_2006.csv`, `ontime_2007.csv`, `ontime_2008.csv`.
+- `ontime_200[6-8].csv` to represent `ontime_2006.csv`, `ontime_2007.csv`, `ontime_2008.csv`.
 
 ### FILES = ( 'file_name' [ , 'file_name' ... ] )
````
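The two glob styles added for HTTP locations name the same three files. This Python sketch is only an illustration of that equivalence, not Databend's actual pattern matcher: the brace form is expanded with a small hypothetical `expand_braces` helper, and the bracket form is matched with the standard-library `fnmatch` module.

```python
import fnmatch
import re

def expand_braces(pattern):
    """Recursively expand one {a,b,c} alternation group at a time."""
    m = re.search(r"\{([^{}]*)\}", pattern)
    if not m:
        return [pattern]
    head, tail = pattern[:m.start()], pattern[m.end():]
    expanded = []
    for alt in m.group(1).split(","):
        expanded.extend(expand_braces(head + alt + tail))
    return expanded

# Brace form enumerates the file names directly.
brace_names = expand_braces("ontime_200{6,7,8}.csv")

# Bracket form is a character range; fnmatch filters candidates against it.
candidates = ["ontime_2005.csv", "ontime_2006.csv",
              "ontime_2007.csv", "ontime_2008.csv"]
bracket_names = [n for n in candidates
                 if fnmatch.fnmatch(n, "ontime_200[6-8].csv")]

# Both patterns select ['ontime_2006.csv', 'ontime_2007.csv', 'ontime_2008.csv'].
```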

````diff
@@ -246,4 +289,15 @@ COPY INTO mytable
     ACCOUNT_NAME = '<account_name>'
     ACCOUNT_KEY = '<account_key>'
   )
+  FILE_FORMAT = (type = 'CSV');
+```
+
+**HTTP**
+
+This example reads data from a CSV file and inserts it into a table:
+
+```sql
+COPY INTO mytable
+FROM 'https://repo.databend.rs/dataset/stateful/ontime_200{6,7,8}_200.csv'
+FILE_FORMAT = (type = 'CSV');
 ```
````
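The examples in this file cover Azure Blob and HTTP. A Google Cloud Storage load under the newly added `externalLocation` grammar would presumably look like the sketch below; the bucket, path, endpoint URL, and credential are all placeholder assumptions, not values from the diff:

```sql
-- Sketch only: bucket, path, endpoint, and credential are placeholders.
COPY INTO mytable
FROM 'gcs://mybucket/data/'
CONNECTION = (
    ENDPOINT_URL = 'https://<endpoint-URL>'
    CREDENTIAL = '<your-credential>'
)
FILE_FORMAT = (type = 'CSV');
```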
