Reading data fixtures from stdin

Data model instances can be serialised to various formats; the resulting serialised data can be used later as a “data fixture” for the ‘loaddata’ management command, to populate the same or a different database.

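As of Django 2.0, the loaddata command can read a fixture from standard input: give a lone dash (-) as the fixture name, and use the --format option to state the serialisation format, since there is no file name from which to infer it.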
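
Many Unix command-line tools treat a lone dash given in place of a file name as “read from stdin”, and loaddata follows the same convention. The sketch below illustrates only that convention; open_source is a hypothetical helper written for this example, not Django's own code.

```python
import sys
from typing import TextIO


def open_source(name: str) -> TextIO:
    """Return a readable text stream for `name`.

    Hypothetical helper illustrating the "-" convention only;
    Django's loaddata is organised differently internally.
    """
    if name == "-":
        # A lone dash means: use the stream this process already has.
        return sys.stdin
    # Anything else is treated as an ordinary file path.
    return open(name, encoding="utf-8")
```

A caller can then treat a named file and stdin identically, which is what makes such a program usable at the receiving end of a pipeline.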

Standard streams

The standard streams – input (stdin), output (stdout), and error (stderr) – are a feature of Unix-family operating systems. By default, every process starts with all three streams open.

This feature allows separate processes to communicate as components in a “pipeline”, one process generating output that is consumed by the next as its input, and so on.
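
To make the idea concrete, here is a minimal sketch of a two-stage pipeline built with Python's standard subprocess module. The PRODUCER and CONSUMER one-liners are stand-ins invented for this illustration; any two programs that write to stdout and read from stdin would behave the same way.

```python
import subprocess
import sys

# Two tiny stand-in programs, each run as a separate Python process.
PRODUCER = "print('lorem'); print('ipsum')"
CONSUMER = "import sys; print(sys.stdin.read().upper(), end='')"


def run_pipeline() -> str:
    """Connect PRODUCER's stdout to CONSUMER's stdin; return the final output."""
    producer = subprocess.Popen(
        [sys.executable, "-c", PRODUCER],
        stdout=subprocess.PIPE,
    )
    consumer = subprocess.Popen(
        [sys.executable, "-c", CONSUMER],
        stdin=producer.stdout,
        stdout=subprocess.PIPE,
        text=True,
    )
    # Close our copy of the pipe so the producer is told about a
    # broken pipe if the consumer exits early.
    producer.stdout.close()
    output, _ = consumer.communicate()
    producer.wait()
    return output


if __name__ == "__main__":
    print(run_pipeline(), end="")
```

Neither stand-in program knows about the other: each only reads its own stdin or writes its own stdout, and the connection between them is made from outside.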

Transfer data fixtures over a pipeline

The existing behaviour of the ‘dumpdata’ management command is to write the serialised data to its stdout stream. This lets the shell handle that output in the usual ways, including redirecting it to a named file:

$ python3 -m manage dumpdata \
    --database lorem_ipsum \
    --format yaml > lorem_ipsum.2017-09-14.yaml

Once that serialised data exists in a file, the loaddata command can specify that file for reading as a fixture:

$ python3 -m manage loaddata \
    --database backup_ipsum \
    lorem_ipsum.2017-09-14.yaml

Now that the loaddata command can be told to read its input from stdin, the two commands form a pair that can be easily connected in a pipeline:

$ python3 -m manage dumpdata \
    --database lorem_ipsum --format yaml \
    | python3 -m manage loaddata \
        --database backup_ipsum --format yaml -

Note that there is no intermediary file for transferring the data; the two processes run at the same time, with loaddata reading what dumpdata writes directly from the pipe. This can simplify data transfer, and it opens up more options for moving the data around.
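
If the transfer needs to be driven from a Python script rather than typed into a shell, the same pipeline can be assembled with the subprocess module. This is only a sketch under some assumptions: build_commands and transfer are hypothetical names, the script is run from the project directory so that “-m manage” resolves, and the database aliases are the ones used in the commands above.

```python
import subprocess
import sys
from typing import List, Tuple


def build_commands(source_db: str, target_db: str,
                   fmt: str = "yaml") -> Tuple[List[str], List[str]]:
    """Return the argv lists for the dumpdata and loaddata processes."""
    manage = [sys.executable, "-m", "manage"]
    dump = manage + ["dumpdata", "--database", source_db, "--format", fmt]
    # "-" tells loaddata to read the fixture from stdin; --format is
    # needed because there is no file extension to infer it from.
    load = manage + ["loaddata", "--database", target_db, "--format", fmt, "-"]
    return dump, load


def transfer(source_db: str, target_db: str, fmt: str = "yaml") -> int:
    """Pipe dumpdata into loaddata; return 0 only if both succeeded."""
    dump_cmd, load_cmd = build_commands(source_db, target_db, fmt)
    dump = subprocess.Popen(dump_cmd, stdout=subprocess.PIPE)
    load = subprocess.Popen(load_cmd, stdin=dump.stdout)
    # Close our copy of the pipe so dumpdata notices if loaddata exits early.
    dump.stdout.close()
    load.wait()
    dump.wait()
    return load.returncode or dump.returncode
```

Calling transfer("lorem_ipsum", "backup_ipsum") would then run the equivalent of the shell pipeline, with the advantage that the exit status of both processes is checked rather than only that of the last one.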
