File Import
The file-import command reads data from files and writes it to Redis.
The basic usage for file imports is:
riotx file-import [OPTIONS] FILE... [REDIS COMMAND...]
To show the full usage, run:
riotx file-import --help
RIOT-X will try to determine the file type from its extension (e.g. .csv or .json), but you can specify it with the --type option.
Gzipped files are supported and the extension before .gz is used (e.g. myfile.json.gz → json).
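The extension rule described above can be sketched in a few lines of Python. This only illustrates the documented behaviour and is not RIOT-X's actual implementation; the function name `detect_file_type` is hypothetical.

```python
from pathlib import PurePosixPath


def detect_file_type(path: str) -> str:
    """Guess a file type from the extension, skipping a trailing .gz (hypothetical helper)."""
    # Only the final path component is inspected, e.g. "file.json.gz".
    suffixes = [s.lstrip(".").lower() for s in PurePosixPath(path).suffixes]
    # Drop a trailing "gz" so that the extension before it is used.
    if suffixes and suffixes[-1] == "gz":
        suffixes.pop()
    if not suffixes:
        raise ValueError(f"cannot determine file type of {path!r}")
    return suffixes[-1]


print(detect_file_type("myfile.json.gz"))  # json
print(detect_file_type("/path/file.csv"))  # csv
```

When the extension is missing or misleading, the sketch raises an error; in RIOT-X that is the case where the --type option would be given explicitly.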
Example file paths:
- /path/file.csv
- /path/file-*.csv
- /path/file.json
- http://data.com/file.csv
- http://data.com/file.json.gz
Use - to read from standard input.
Amazon S3 and Google Cloud Storage buckets are supported.
riotx file-import s3://mydata/file.csv --s3-region us-west-1 hset beer:#{id}
riotx file-import gs://mydata/file.json --gcp-key-file key.json hset user:#{id}
Data Structures
If no Redis command is specified, it is assumed that the input file(s) contain Redis data structures serialized as JSON or XML. See the File Export section to learn about the expected format and how to generate such files.
riotx file-import myfile.json
Redis Commands
When one or more `REDIS COMMAND`s are specified, these commands are called for each input record.
Redis client options apply to the root command (riotx). Client options that are not given to the root command will not be taken into account.
Redis command keys are constructed from input records by concatenating keyspace prefix and key fields.
Import into hashes with keys of the form blah:<id>:
riotx file-import my.json hset blah:#{id}
riotx file-import myfile.json json.set user:#{id} $ .
riotx file-import my.json hset blah:#{id} expire blah:#{id}
Import into hashes with keys of the form blah:<id>, set a TTL, and add each id to a set named myset:
riotx file-import my.json hset blah:#{id} expire blah:#{id} sadd myset --member #{id}
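To make the key construction concrete, here is a minimal Python sketch of how a template such as blah:#{id} could be expanded for one input record. It is an illustration under stated assumptions, not RIOT-X's implementation: `expand_key` is a hypothetical helper, and it handles only plain field names inside #{...}, not any richer expressions RIOT-X may accept.

```python
import re

# Matches placeholders of the form #{field}; only simple field names are handled.
PLACEHOLDER = re.compile(r"#\{(\w+)\}")


def expand_key(template: str, record: dict) -> str:
    """Replace each #{field} placeholder with the record's value for that field."""
    return PLACEHOLDER.sub(lambda m: str(record[m.group(1)]), template)


record = {"id": 321, "name": "Fireside Chat (2010)"}
print(expand_key("blah:#{id}", record))  # blah:321
```

A record without the referenced field raises a KeyError in this sketch, which makes a mismatch between template and data visible early.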
Delimited (CSV)
The default delimiter character is comma (,).
It can be changed with the --delimiter option.
If the file has a header, use the --header option to automatically extract field names.
Otherwise specify the field names using the --fields option.
Let’s consider this CSV file:
| row | abv | ibu | id | name | style | brewery | ounces |
|---|---|---|---|---|---|---|---|
| 1 | 0.079 | 45 | 321 | Fireside Chat (2010) | Winter Warmer | 368 | 12.0 |
| 2 | 0.068 | 65 | 173 | Back in Black | American Black Ale | 368 | 12.0 |
| 3 | 0.083 | 35 | 11 | Monk’s Blood | Belgian Dark Ale | 368 | 12.0 |
The following command imports this CSV into Redis as hashes using beer as the key prefix and id as primary key.
riotx file-import beers.csv --header hset beer:#{id}
This creates hashes with keys beer:321, beer:173, …
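The mapping from CSV rows to hash keys can be sketched with Python's standard csv module. This is only an approximation of what the import does: `csv_to_hashes` is a hypothetical helper whose `delimiter` and `fields` parameters loosely mirror the --delimiter and --fields options, and it returns plain dictionaries instead of writing to Redis.

```python
import csv
import io

CSV_TEXT = """row,abv,ibu,id,name,style,brewery,ounces
1,0.079,45,321,Fireside Chat (2010),Winter Warmer,368,12.0
2,0.068,65,173,Back in Black,American Black Ale,368,12.0
3,0.083,35,11,Monk’s Blood,Belgian Dark Ale,368,12.0
"""


def csv_to_hashes(text, prefix, key_field, delimiter=",", fields=None):
    """Map each CSV row to a key and a field dict.

    With fields=None the first row is used as the header; otherwise the
    given names are applied and every row is treated as data.
    """
    reader = csv.DictReader(io.StringIO(text), fieldnames=fields, delimiter=delimiter)
    return {f"{prefix}:{row[key_field]}": dict(row) for row in reader}


hashes = csv_to_hashes(CSV_TEXT, prefix="beer", key_field="id")
print(sorted(hashes))  # ['beer:11', 'beer:173', 'beer:321']
```

All values stay strings here, which matches how hash fields are stored in Redis.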
This command imports a CSV file into a geo set named airportgeo with airport IDs as members:
riotx file-import airports.csv --header geoadd airportgeo --member #{iata} --lon #{lng} --lat #{lat}
Fixed-Length (Fixed-Width)
Fixed-length files can be imported by specifying the width of each field using the --ranges option.
riotx file-import fw.txt --ranges 0-4 5-9 10-20 --names field1 field2 field3 hset record:#{field1}
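As an illustration of how column ranges carve a fixed-width line into fields, consider the Python sketch below. It assumes the ranges are 0-based and inclusive at both ends, which this section does not state; the helper names and the sample line are hypothetical.

```python
def parse_ranges(specs):
    """Turn ["0-4", "5-9"] into [(0, 4), (5, 9)]."""
    ranges = []
    for spec in specs:
        start, end = spec.split("-")
        ranges.append((int(start), int(end)))
    return ranges


def split_fixed_width(line, ranges, names):
    """Slice one line into named fields (assumption: 0-based, inclusive ranges)."""
    return {
        name: line[start:end + 1].strip()
        for (start, end), name in zip(ranges, names)
    }


ranges = parse_ranges(["0-4", "5-9", "10-20"])
line = "A0001SmithSpringfield"
print(split_fixed_width(line, ranges, ["field1", "field2", "field3"]))
# {'field1': 'A0001', 'field2': 'Smith', 'field3': 'Springfield'}
```

If the real ranges turn out to be 1-based or end-exclusive, only the slice expression changes.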
JSON
The expected format for JSON files is:
[
{
"...": "..."
},
{
"...": "..."
}
]
riotx file-import mydata.json hset record:#{id}
JSON records are trees with potentially nested values, which need to be flattened when the target is, for example, a Redis hash.
To that end, RIOT-X uses a field naming convention to flatten JSON objects and arrays into single-level field names.
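Flattening of this kind can be sketched as follows. The naming scheme used here, dots for nested objects and bracketed indices for array elements, is an assumption chosen for illustration and may differ from the exact convention RIOT-X applies; `flatten` is a hypothetical helper.

```python
import json


def flatten(value, prefix=""):
    """Flatten nested dicts and lists into a single-level dict of field names."""
    flat = {}
    if isinstance(value, dict):
        for key, child in value.items():
            # Nested object: join parent and child names with a dot (assumed convention).
            name = f"{prefix}.{key}" if prefix else key
            flat.update(flatten(child, name))
    elif isinstance(value, list):
        for index, child in enumerate(value):
            # Array element: append the index in brackets (assumed convention).
            flat.update(flatten(child, f"{prefix}[{index}]"))
    else:
        flat[prefix] = value
    return flat


record = json.loads('{"id": 1, "field": {"sub": "value"}, "tags": [1, 2, 3]}')
print(flatten(record))
# {'id': 1, 'field.sub': 'value', 'tags[0]': 1, 'tags[1]': 2, 'tags[2]': 3}
```

The result is a flat mapping of field names to scalar values, which is the shape a Redis hash expects.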
XML
Here is a sample XML file that can be imported by RIOT-X:
<?xml version="1.0" encoding="UTF-8"?>
<records>
<trade>
<isin>XYZ0001</isin>
<quantity>5</quantity>
<price>11.39</price>
<customer>Customer1</customer>
</trade>
<trade>
<isin>XYZ0002</isin>
<quantity>2</quantity>
<price>72.99</price>
<customer>Customer2c</customer>
</trade>
<trade>
<isin>XYZ0003</isin>
<quantity>9</quantity>
<price>99.99</price>
<customer>Customer3</customer>
</trade>
</records>
riotx file-import trades.xml --xpath //trade hset trade:#{isin}
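To show which records an expression like //trade selects, the sketch below uses Python's standard xml.etree.ElementTree on a shortened copy of the sample. It is an illustration only: ElementTree supports a limited XPath subset and requires the relative form .//trade, and the resulting dictionaries stand in for the hashes that would be written to Redis.

```python
import xml.etree.ElementTree as ET

XML_TEXT = """<records>
  <trade>
    <isin>XYZ0001</isin>
    <quantity>5</quantity>
    <price>11.39</price>
    <customer>Customer1</customer>
  </trade>
  <trade>
    <isin>XYZ0002</isin>
    <quantity>2</quantity>
    <price>72.99</price>
    <customer>Customer2c</customer>
  </trade>
</records>"""

root = ET.fromstring(XML_TEXT)

trades = {}
# Each matching <trade> element becomes one record keyed by its isin value.
for element in root.findall(".//trade"):
    record = {child.tag: child.text for child in element}
    trades[f"trade:{record['isin']}"] = record

print(sorted(trades))  # ['trade:XYZ0001', 'trade:XYZ0002']
```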
Parquet
RIOT-X supports Parquet files.
riotx file-import s3://riotx/userdata1.parquet --s3-region us-west-1 hset user:#{id}