Load Data from Amazon S3 using COPY
tip
Expected time: 5 minutes ⏱
Databend's COPY command can read object files (in CSV or Parquet format) from Amazon S3 buckets. To understand Amazon S3's core concepts, please read the Amazon S3 documentation.
Before you begin
- AWS Account: This Quickstart uses Amazon S3 and requires an AWS account's access key id and secret access key.
- Databend: You will connect to the database and use COPY to pull data from your Amazon S3 bucket; see How to deploy Databend and the example connection command below.
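Once Databend is deployed, you can connect to it with a MySQL-compatible client, just like the session shown later in this Quickstart. The command below is a minimal sketch assuming a default local deployment (127.0.0.1, Databend's default MySQL protocol port 3307, and the root user); adjust the host, port, and user to match your environment:
# Connect to a local Databend deployment with a MySQL-compatible client.
# Host, port, and user are assumptions for a default local setup.
mysql -h 127.0.0.1 -P 3307 -u root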
Part 1: Creating an Amazon S3 Bucket and Adding a File
- On your local machine, create a text file with the following CSV contents and name it books.csv:
books.csv
Transaction Processing,Jim Gray,1992
Readings in Database Systems,Michael Stonebraker,2004
The field delimiter of this CSV file is a comma (,) and the record delimiter is a newline (\n).
- In S3, create a bucket and upload books.csv to the bucket, for example with the AWS CLI as sketched below.
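If you prefer the command line to the S3 console, the following AWS CLI sketch creates a bucket and uploads the file. The bucket name databend-bohu and the data/ prefix are just the example values used later in this Quickstart; replace them with your own (bucket names must be globally unique):
# Create a bucket and upload books.csv to it with the AWS CLI.
# The bucket name and prefix below are example values; use your own.
aws s3 mb s3://databend-bohu
aws s3 cp books.csv s3://databend-bohu/data/books.csv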
Part 2: Creating a Database and Table using COPY
create database book_db;
use book_db;
create table books
(
title VARCHAR(255),
author VARCHAR(255),
date VARCHAR(255)
);
The database and table have now been created. In Part 1 of this Quickstart, you uploaded the books.csv file to your bucket.
To use the COPY data loading, you will need the following information:
- The S3 URI of the bucket path (s3://bucket/to/path/), such as: s3://databend-bohu/data/
- Your AWS account’s access keys, such as:
- Access Key ID: your-access-key-id
- Secret Access Key: your-secret-access-key
Using this URI and keys, execute the following statement, replacing the placeholder values with your own:
copy into books
from 's3://databend-bohu/data/'
credentials=(aws_key_id='<your-access-key-id>' aws_secret_key='<your-secret-access-key>')
pattern ='.*[.]csv'
file_format = (type = 'CSV' field_delimiter = ',' record_delimiter = '\n' skip_header = 0);
Now, let's query the table to make sure the data has actually loaded:
mysql> select * from books;
+------------------------------+----------------------+-------+
| title | author | date |
+------------------------------+----------------------+-------+
| Transaction Processing | Jim Gray | 1992 |
| Readings in Database Systems | Michael Stonebraker | 2004 |
+------------------------------+----------------------+-------+
tip
If the file(s) is large and you want to check whether the file format parses correctly before loading everything, you can use the SIZE_LIMIT option:
copy into books
from 's3://databend-bohu/data/'
credentials=(aws_key_id='<your-access-key-id>' aws_secret_key='<your-secret-access-key>')
pattern ='.*[.]csv'
file_format = (type = 'CSV' field_delimiter = ',' record_delimiter = '\n' skip_header = 0)
size_limit = 1; -- only load 1 row
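After this trial load succeeds, you will likely want to clear the sample row before running the full COPY again. One way is sketched below, assuming your Databend version supports TRUNCATE TABLE:
truncate table books; -- remove the trial row, then re-run COPY without size_limit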