Management commands

The raw-data app includes the following commands for processing and verifying the raw data released in the CAL-ACCESS nightly exports.

As with any Django app management command, these can be invoked on the command line or called within your Python code.
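
As a minimal sketch of the second route, the commands can be run through Django's call_command function, which accepts flags as command-line style strings. The helper names build_update_args and run_update are hypothetical and not part of the app; the flags are the ones documented for updatecalaccessrawdata. The Django import sits inside the function so the argument builder can be used without a configured project.

```python
def build_update_args(keep_files=False, noinput=False):
    """Return command-line style flags for updatecalaccessrawdata."""
    args = []
    if keep_files:
        args.append("--keep-files")
    if noinput:
        args.append("--noinput")
    return args


def run_update(keep_files=False, noinput=True):
    """Run the update from Python; needs a configured Django project."""
    # Imported here so build_update_args works without Django installed.
    from django.core.management import call_command

    # Hypothetical wrapper: passes the flags exactly as on the command line.
    call_command("updatecalaccessrawdata", *build_update_args(keep_files, noinput))
```

Calling run_update() from a script or a scheduled task should then behave like the equivalent manage.py invocation, provided the app is installed in the active settings.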


updatecalaccessrawdata

This is the master command. It brings together all of the other management commands listed below to download, unzip, clean and load the latest snapshot of the CAL-ACCESS database.

Examples

Running the entire routine is as simple as this.

$ python manage.py updatecalaccessrawdata

This command will either:

  • Update your copy of the CAL-ACCESS data to the latest snapshot on the California Secretary of State’s website, or
  • Complete your previously interrupted update, if possible.

You can skip the download’s confirmation prompt using Django’s standard --noinput option.

$ python manage.py updatecalaccessrawdata --noinput

The source files downloaded as part of the process will be deleted unless the --keep-files option is provided.

$ python manage.py updatecalaccessrawdata --keep-files

The other options are below.

Options

usage: manage.py updatecalaccessrawdata [-h] [--version] [-v {0,1,2,3}]
                                        [--settings SETTINGS]
                                        [--pythonpath PYTHONPATH]
                                        [--traceback] [--no-color]
                                        [--keep-files] [--noinput]
                                        [-a APP_NAME]

Download, unzip, clean and load the latest CAL-ACCESS database ZIP

optional arguments:
  -h, --help            show this help message and exit
  --version             show program's version number and exit
  -v {0,1,2,3}, --verbosity {0,1,2,3}
                        Verbosity level; 0=minimal output, 1=normal output,
                        2=verbose output, 3=very verbose output
  --settings SETTINGS   The Python path to a settings module, e.g.
                        "myproject.settings.main". If this isn't provided, the
                        DJANGO_SETTINGS_MODULE environment variable will be
                        used.
  --pythonpath PYTHONPATH
                        A directory to add to the Python path, e.g.
                        "/home/djangoprojects/myproject".
  --traceback           Raise on CommandError exceptions
  --no-color            Don't colorize the command output.
  --keep-files          Keep zip, unzipped, TSV and CSV files
  --noinput             Update or resume previous update without asking
                        permission
  -a APP_NAME, --app-name APP_NAME
                        Name of Django app with models into which data will be
                        imported (if not calaccess_raw)

Note

The updatecalaccessrawdata command overwrites the previously downloaded, extracted and cleaned files in the application’s download directory.


cleancalaccessrawfile

Clean a source CAL-ACCESS TSV file and reformat it as a CSV. A component of the master updatecalaccessrawdata command.

Examples

Provide the name of the TSV file you would like to process. The command will attempt to find it in the application’s download directory.

$ python manage.py cleancalaccessrawfile RCPT_CD.TSV

The original TSV file will be deleted in favor of the new CSV unless the --keep-file option is provided.

$ python manage.py cleancalaccessrawfile RCPT_CD.TSV --keep-file
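
To illustrate the kind of conversion involved, here is a minimal sketch that rewrites a tab-delimited file as a comma-delimited one with Python's standard csv module. It is not the command's actual cleaning logic, which is not described here. The function name tsv_to_csv and the keep_file parameter are hypothetical and only mirror the --keep-file behavior described above.

```python
import csv
import os


def tsv_to_csv(tsv_path, keep_file=False):
    """Rewrite a TSV file as a CSV alongside it and return the CSV path."""
    csv_path = os.path.splitext(tsv_path)[0] + ".csv"
    # newline="" lets the csv module control line endings itself.
    with open(tsv_path, newline="") as src, open(csv_path, "w", newline="") as dst:
        # QUOTE_NONE treats quote characters in the TSV as ordinary text.
        reader = csv.reader(src, delimiter="\t", quoting=csv.QUOTE_NONE)
        writer = csv.writer(dst)
        for row in reader:
            writer.writerow(row)
    if not keep_file:
        # Mirror the documented default: drop the original TSV.
        os.remove(tsv_path)
    return csv_path
```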

Options

usage: manage.py cleancalaccessrawfile [-h] [--version] [-v {0,1,2,3}]
                                       [--settings SETTINGS]
                                       [--pythonpath PYTHONPATH] [--traceback]
                                       [--no-color] [--keep-file]
                                       file_name

Clean a source CAL-ACCESS TSV file and reformat it as a CSV

positional arguments:
  file_name             Name of the TSV file to be cleaned and discarded for a
                        CSV

optional arguments:
  -h, --help            show this help message and exit
  --version             show program's version number and exit
  -v {0,1,2,3}, --verbosity {0,1,2,3}
                        Verbosity level; 0=minimal output, 1=normal output,
                        2=verbose output, 3=very verbose output
  --settings SETTINGS   The Python path to a settings module, e.g.
                        "myproject.settings.main". If this isn't provided, the
                        DJANGO_SETTINGS_MODULE environment variable will be
                        used.
  --pythonpath PYTHONPATH
                        A directory to add to the Python path, e.g.
                        "/home/djangoprojects/myproject".
  --traceback           Raise on CommandError exceptions
  --no-color            Don't colorize the command output.
  --keep-file           Keep original TSV file

Note

The cleancalaccessrawfile command overwrites the CSV files previously processed from the original TSV files.


downloadcalaccessrawdata

Download the latest CAL-ACCESS database ZIP. A component of the master updatecalaccessrawdata command.

Examples

Here is how to run the command.

$ python manage.py downloadcalaccessrawdata

You will then see a prompt with the release date and size of the latest zip of raw CAL-ACCESS data files available to download from the California Secretary of State.

If your previous download did not complete and the same snapshot is still available to download, you will be prompted to resume your previous download.

You can skip the download’s confirmation prompt using Django’s standard --noinput option.

$ python manage.py downloadcalaccessrawdata --noinput

The other options are below.

The server hosting the ZIP doesn’t always provide the most up-to-date resource (as we have documented). As such, a CommandError will be raised under either of the following conditions:

  • If the actual size of the downloaded ZIP does not match the Content-Length header in the HEAD response.
  • If the Last-Modified headers of the HEAD and GET responses are more than five minutes apart.
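
The two conditions above can be sketched as a small pure function. This is an illustration under assumptions rather than the command's actual code: the function name verify_download, its parameters, and the use of ValueError in place of Django's CommandError are all hypothetical. The five-minute threshold comes from the text.

```python
from datetime import timedelta
from email.utils import parsedate_to_datetime

MAX_DRIFT = timedelta(minutes=5)  # threshold stated in the text above


def verify_download(actual_size, head_headers, get_headers):
    """Raise ValueError if the downloaded ZIP looks truncated or stale.

    head_headers and get_headers are plain dicts of HTTP response headers.
    """
    expected_size = int(head_headers["Content-Length"])
    if actual_size != expected_size:
        raise ValueError(
            "Size mismatch: got %d bytes, expected %d" % (actual_size, expected_size)
        )
    # Parse the HTTP date strings into datetime objects before comparing.
    head_time = parsedate_to_datetime(head_headers["Last-Modified"])
    get_time = parsedate_to_datetime(get_headers["Last-Modified"])
    if abs(head_time - get_time) > MAX_DRIFT:
        raise ValueError("Last-Modified differs by more than five minutes")
```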

Options

usage: manage.py downloadcalaccessrawdata [-h] [--version] [-v {0,1,2,3}]
                                          [--settings SETTINGS]
                                          [--pythonpath PYTHONPATH]
                                          [--traceback] [--no-color]
                                          [--noinput] [--force-restart]

Download the latest CAL-ACCESS database ZIP

optional arguments:
  -h, --help            show this help message and exit
  --version             show program's version number and exit
  -v {0,1,2,3}, --verbosity {0,1,2,3}
                        Verbosity level; 0=minimal output, 1=normal output,
                        2=verbose output, 3=very verbose output
  --settings SETTINGS   The Python path to a settings module, e.g.
                        "myproject.settings.main". If this isn't provided, the
                        DJANGO_SETTINGS_MODULE environment variable will be
                        used.
  --pythonpath PYTHONPATH
                        A directory to add to the Python path, e.g.
                        "/home/djangoprojects/myproject".
  --traceback           Raise on CommandError exceptions
  --no-color            Don't colorize the command output.
  --noinput             Download the ZIP archive without asking permission
  --force-restart, --restart
                        Force re-start (overrides auto-resume).

Note

The downloadcalaccessrawdata command overwrites the previously downloaded zip file.


extractcalaccessrawfiles

Extract the CAL-ACCESS raw data files from the downloaded ZIP. A component of the master updatecalaccessrawdata command.

Examples

Here is how to run the command.

$ python manage.py extractcalaccessrawfiles

The downloaded zip file will be deleted unless the --keep-files option is provided.

$ python manage.py extractcalaccessrawfiles --keep-files
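
For readers who want to see this step in Python terms, here is a minimal sketch that extracts an archive with the standard zipfile module and then removes it unless asked to keep it. The function name extract_zip and its parameters are hypothetical; the command's own implementation and file locations are not described here.

```python
import os
import zipfile


def extract_zip(zip_path, dest_dir, keep_files=False):
    """Extract every member of zip_path into dest_dir and return their names."""
    with zipfile.ZipFile(zip_path) as archive:
        names = archive.namelist()
        # extractall creates dest_dir if it does not exist yet.
        archive.extractall(dest_dir)
    if not keep_files:
        # Mirror the documented default: delete the downloaded ZIP.
        os.remove(zip_path)
    return names
```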

Options

usage: manage.py extractcalaccessrawfiles [-h] [--version] [-v {0,1,2,3}]
                                          [--settings SETTINGS]
                                          [--pythonpath PYTHONPATH]
                                          [--traceback] [--no-color]
                                          [--keep-files]

Extract the CAL-ACCESS raw data files from the database export ZIP

optional arguments:
  -h, --help            show this help message and exit
  --version             show program's version number and exit
  -v {0,1,2,3}, --verbosity {0,1,2,3}
                        Verbosity level; 0=minimal output, 1=normal output,
                        2=verbose output, 3=very verbose output
  --settings SETTINGS   The Python path to a settings module, e.g.
                        "myproject.settings.main". If this isn't provided, the
                        DJANGO_SETTINGS_MODULE environment variable will be
                        used.
  --pythonpath PYTHONPATH
                        A directory to add to the Python path, e.g.
                        "/home/djangoprojects/myproject".
  --traceback           Raise on CommandError exceptions
  --no-color            Don't colorize the command output.
  --keep-files          Keep downloaded zipped files

Note

The extractcalaccessrawfiles command overwrites the previously extracted TSV files.


loadcalaccessrawfile

Load a clean CAL-ACCESS CSV file into a database model. A component of the master updatecalaccessrawdata command.

Examples

The command expects the name of the Django database model where the file will be loaded.

$ python manage.py loadcalaccessrawfile RcptCd

The command will attempt to load the model’s default CSV file unless another is provided with the --csv option.

$ python manage.py loadcalaccessrawfile RcptCd --csv=/home/jerry/Data/MyFile.csv
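
The same invocation can be sketched from Python code. The helper names build_load_args and load_model are hypothetical; the arguments are passed to Django's call_command as command-line style strings, so only the flags documented for this command are used. The Django import sits inside the function so the argument builder works without a configured project.

```python
def build_load_args(model_name, csv_path=None, keep_file=False):
    """Return command-line style arguments for loadcalaccessrawfile."""
    args = [model_name]
    if csv_path:
        args.append("--csv=%s" % csv_path)
    if keep_file:
        args.append("--keep-file")
    return args


def load_model(model_name, csv_path=None, keep_file=False):
    """Load one model from Python; needs a configured Django project."""
    # Imported here so build_load_args works without Django installed.
    from django.core.management import call_command

    call_command("loadcalaccessrawfile", *build_load_args(model_name, csv_path, keep_file))
```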

Options

usage: manage.py loadcalaccessrawfile [-h] [--version] [-v {0,1,2,3}]
                                      [--settings SETTINGS]
                                      [--pythonpath PYTHONPATH] [--traceback]
                                      [--no-color] [--c CSV] [--keep-file]
                                      [-a APP_NAME]
                                      model_name

Load clean CAL-ACCESS CSV file into a database model

positional arguments:
  model_name            Name of the model into which data will be loaded

optional arguments:
  -h, --help            show this help message and exit
  --version             show program's version number and exit
  -v {0,1,2,3}, --verbosity {0,1,2,3}
                        Verbosity level; 0=minimal output, 1=normal output,
                        2=verbose output, 3=very verbose output
  --settings SETTINGS   The Python path to a settings module, e.g.
                        "myproject.settings.main". If this isn't provided, the
                        DJANGO_SETTINGS_MODULE environment variable will be
                        used.
  --pythonpath PYTHONPATH
                        A directory to add to the Python path, e.g.
                        "/home/djangoprojects/myproject".
  --traceback           Raise on CommandError exceptions
  --no-color            Don't colorize the command output.
  --c CSV, --csv CSV    Path to comma-delimited file to be loaded. Defaults to
                        one associated with model.
  --keep-file           Keep clean CSV file after loading
  -a APP_NAME, --app-name APP_NAME
                        Name of Django app with models into which data will be
                        imported (if not calaccess_raw)

Note

The loadcalaccessrawfile command deletes any data previously loaded into the calaccess_raw models before loading in the current data.