File Import
The file-import command reads data from files and writes it to Redis.
The basic usage for file imports is:
riotx file-import [OPTIONS] FILE... [REDIS COMMAND...]
To show the full usage, run:
riotx file-import --help
RIOT-X will try to determine the file type from its extension (e.g. .csv or .json), but you can specify it explicitly with the --type option.
Gzipped files are supported and the extension before .gz is used (e.g. myfile.json.gz → json).
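The extension rule can be mimicked in a few lines of shell. This is only a sketch of the behavior described above, not RIOT-X's actual implementation; detect_type is a hypothetical helper name introduced for illustration.

```shell
# Hypothetical helper (not part of RIOT-X): guess a file type from a
# path or URL using the extension rule described above.
detect_type() {
  name="${1##*/}"     # drop any directory or URL path prefix
  name="${name%.gz}"  # a trailing .gz is ignored
  printf '%s\n' "${name##*.}"
}

detect_type myfile.json.gz              # prints: json
detect_type http://data.com/file.csv    # prints: csv
```

A name with no usable extension is the kind of input where passing --type makes sense.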
- /path/file.csv
- /path/file-*.csv
- /path/file.json
- http://data.com/file.csv
- http://data.com/file.json.gz
Use - to read from standard input.
Amazon S3 and Google Cloud Storage buckets are supported.
riotx file-import s3://mydata/file.csv --s3-region us-west-1 hset beer:#{id}
riotx file-import gs://mydata/file.json --gcp-key-file key.json hset user:#{id}
Data Structures
If no REDIS COMMAND is specified, it is assumed that the input file(s) contain Redis data structures serialized as JSON or XML. See the File Export section to learn about the expected format and how to generate such files.
riotx file-import myfile.json
Redis Commands
When one or more `REDIS COMMAND`s are specified, these commands are called for each input record.
Redis client options apply to the root command (riotx file-import), not to the Redis commands that follow it. Client options placed after a Redis command will not be taken into account.
Redis command keys are constructed from input records by concatenating the keyspace prefix and key fields.
Import into hashes with keys of the form blah:<id>:
riotx file-import my.json hset blah:#{id}
Import into JSON documents:
riotx file-import myfile.json json.set user:#{id} $ .
Import into hashes and set a TTL on each key:
riotx file-import my.json hset blah:#{id} expire blah:#{id}
Import into hashes with keys of the form blah:<id>, set a TTL on each key, and add each id to a set named myset:
riotx file-import my.json hset blah:#{id} expire blah:#{id} sadd myset --member #{id}
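To make the key construction concrete, the sketch below substitutes one record's id into a key template. It only imitates the substitution for a single field named id; expand_key is a hypothetical helper, and RIOT-X's own expression handling is not shown here.

```shell
# Hypothetical helper (not part of RIOT-X): substitute one record's id
# into a key template such as blah:#{id}.
expand_key() {
  printf '%s\n' "$1" | sed "s/#{id}/$2/g"
}

expand_key 'blah:#{id}' 321   # prints: blah:321
expand_key 'user:#{id}' 42    # prints: user:42
```

Single-quoting the template is a cautious habit when it contains characters such as $ that a shell may interpret.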
Delimited (CSV)
The default delimiter character is a comma (,). It can be changed with the --delimiter option.
If the file has a header, use the --header option to automatically extract field names. Otherwise, specify the field names using the --fields option.
Let’s consider this CSV file:
row | abv | ibu | id | name | style | brewery | ounces |
---|---|---|---|---|---|---|---|
1 | 0.079 | 45 | 321 | Fireside Chat (2010) | Winter Warmer | 368 | 12.0 |
2 | 0.068 | 65 | 173 | Back in Black | American Black Ale | 368 | 12.0 |
3 | 0.083 | 35 | 11 | Monk’s Blood | Belgian Dark Ale | 368 | 12.0 |
The following command imports this CSV into Redis as hashes using beer as the key prefix and id as the primary key.
riotx file-import beers.csv --header hset beer:#{id}
This creates hashes with keys beer:321, beer:173, …
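For a quick local trial, the sample rows can be written to a scratch file and the expected keys previewed before importing. This is a sketch: the scratch directory and the awk preview are illustrative additions, and the import itself only runs if riotx is on the PATH (it also needs a reachable Redis server).

```shell
# Recreate the three sample rows as beers.csv in a scratch directory.
workdir=$(mktemp -d)
cat > "$workdir/beers.csv" <<'EOF'
row,abv,ibu,id,name,style,brewery,ounces
1,0.079,45,321,Fireside Chat (2010),Winter Warmer,368,12.0
2,0.068,65,173,Back in Black,American Black Ale,368,12.0
3,0.083,35,11,Monk’s Blood,Belgian Dark Ale,368,12.0
EOF

# Preview the keys that beer:#{id} should produce (id is the 4th column).
awk -F, 'NR > 1 { print "beer:" $4 }' "$workdir/beers.csv"

# Run the import only if riotx is installed; it needs a reachable Redis.
if command -v riotx >/dev/null 2>&1; then
  riotx file-import "$workdir/beers.csv" --header hset 'beer:#{id}'
fi
```

The preview prints beer:321, beer:173 and beer:11, one per line.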
This command imports a CSV file into a geo set named airportgeo with airport IDs as members:
riotx file-import airports.csv --header geoadd airportgeo --member #{iata} --lon #{lng} --lat #{lat}
Fixed-Length (Fixed-Width)
Fixed-length files can be imported by specifying the width of each field using the --ranges option.
riotx file-import fw.txt --ranges 0-4 5-9 10-20 --names field1 field2 field3 hset record:#{field1}
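The sketch below builds a small fixed-width file and slices it with cut to show what the three ranges would select. It rests on an assumption that is not confirmed by the text above: that 0-4, 5-9 and 10-20 are 0-based, inclusive character offsets (fields of width 5, 5 and 11). The sample values are made up; check riotx file-import --help for the actual range semantics.

```shell
# ASSUMPTION: ranges 0-4 5-9 10-20 are 0-based, inclusive character
# offsets, i.e. fields of width 5, 5 and 11. Sample data is invented.
fw_file=$(mktemp)
printf '%-5s%-5s%-11s\n' A0001 widge 'blue widget' >  "$fw_file"
printf '%-5s%-5s%-11s\n' A0002 gadge 'red gadget'  >> "$fw_file"

# cut counts from 1, so offsets 0-4 become columns 1-5, and so on.
cut -c1-5   "$fw_file"   # field1
cut -c6-10  "$fw_file"   # field2
cut -c11-21 "$fw_file"   # field3
```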
JSON
The expected format for JSON files is:
[
  {
    "...": "..."
  },
  {
    "...": "..."
  }
]
riotx file-import mydata.json hset record:#{id}
JSON records are trees with potentially nested values, which need to be flattened when the target is, for example, a Redis hash.
To that end, RIOT-X uses a field naming convention to flatten JSON objects and arrays.
XML
Here is a sample XML file that can be imported by RIOT-X:
<?xml version="1.0" encoding="UTF-8"?>
<records>
  <trade>
    <isin>XYZ0001</isin>
    <quantity>5</quantity>
    <price>11.39</price>
    <customer>Customer1</customer>
  </trade>
  <trade>
    <isin>XYZ0002</isin>
    <quantity>2</quantity>
    <price>72.99</price>
    <customer>Customer2c</customer>
  </trade>
  <trade>
    <isin>XYZ0003</isin>
    <quantity>9</quantity>
    <price>99.99</price>
    <customer>Customer3</customer>
  </trade>
</records>
riotx file-import trades.xml --xpath //trade hset trade:#{isin}
Parquet
RIOT-X supports Parquet files.
riotx file-import s3://riotx/userdata1.parquet --s3-region us-west-1 hset user:#{id}