Models for tracking updates

The processed-data app also keeps track of each snapshot of the CAL-ACCESS database it processes. This tracking information is stored in the data tables outlined below.

Note

By default, the processed-data app does not archive previous versions of the CAL-ACCESS database. Rather, with each call to the management commands, the data files they process are overwritten.

You can configure the raw-data app to keep each copy of the zip file downloaded from the California Secretary of State, as well as the individual raw .csv files and cleaned .tsv files, by setting CALACCESS_STORE_ARCHIVE to True in settings.py:

# in settings.py
CALACCESS_STORE_ARCHIVE = True

By default, the older copies of these files will be saved to the path specified by your Django project’s MEDIA_ROOT setting (more on that here). However, if you’ve implemented a custom storage system or installed a third-party app (such as django-storages), that should work too.
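As a minimal sketch of how these two settings might sit together in a project, assuming the default file-system storage is in use. The directory path below is a hypothetical placeholder, not something the app requires:

```python
# in settings.py (illustrative sketch; adjust to your own project)

# Keep older copies of the downloaded and processed files
CALACCESS_STORE_ARCHIVE = True

# Hypothetical location; archived files are saved under this directory
# when Django's default file-system storage is used.
MEDIA_ROOT = "/var/www/example/media/"
```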


ProcessedDataVersion

Versions of CAL-ACCESS raw source data, typically released every day.

Fields

Name                     Type       Unique key  Definition
id                       Integer    Yes         Auto-incrementing unique identifier of the version
raw_version_id           Integer    Yes         Foreign key referencing the raw data version processed
process_start_datetime   DateTime   No          Date and time when the processing of the CAL-ACCESS version started
process_finish_datetime  DateTime   No          Date and time when the processing of the CAL-ACCESS version finished
zip_archive              FileField  No          An archive zip of processed files
zip_size                 Integer    No          The expected size (in bytes) of the zip of processed files

Instance methods and properties

.update_completed Check if the database update to the version completed. Return True or False.
.update_stalled Check if the database update to the version started but did not complete. Return True or False.
.pretty_expected_size() Returns a prettified version (e.g., "725M") of the expected size of the downloaded zip.
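The two status properties can be pictured with a small stand-alone sketch. It assumes, without confirmation from the app's source, that both are derived from the start and finish timestamps listed in the fields table. VersionSketch is a hypothetical stand-in, not the app's model class:

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class VersionSketch:
    """Hypothetical stand-in for a processed version; not the real model."""

    process_start_datetime: Optional[datetime] = None
    process_finish_datetime: Optional[datetime] = None

    @property
    def update_completed(self) -> bool:
        # Assumed rule: complete once a finish time has been recorded.
        return self.process_finish_datetime is not None

    @property
    def update_stalled(self) -> bool:
        # Assumed rule: a start time exists but no finish time was recorded.
        return (
            self.process_start_datetime is not None
            and self.process_finish_datetime is None
        )


# One version that started but never finished, and one that finished.
started = VersionSketch(process_start_datetime=datetime(2017, 1, 1, 8, 0))
finished = VersionSketch(
    process_start_datetime=datetime(2017, 1, 1, 8, 0),
    process_finish_datetime=datetime(2017, 1, 1, 9, 30),
)
```

Under these assumed rules, a version is never both completed and stalled, and a version that has not started is neither.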

ProcessedDataFile

A data file included in a processed version of CAL-ACCESS.

Fields

Name                     Type                Unique key  Definition
id                       Integer             Yes         Auto-incrementing unique identifier of the file
version_id               Integer             No          Foreign key referencing the processed version of CAL-ACCESS
file_name                String (up to 100)  No          Name of the processed data file without extension
process_start_datetime   DateTime            No          Date and time when the processing of the file started
process_finish_datetime  DateTime            No          Date and time when the processing of the file finished
records_count            Integer             No          Count of records in the processed file
file_archive             FileField           No          An archive of the processed file
file_size                Integer             No          Size of the processed file (in bytes)

Instance methods and properties

.pretty_file_size() Returns a prettified version (e.g., "725M") of the downloaded file's size.
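To illustrate the kind of output the size-prettifying methods describe, here is a rough sketch of converting a byte count into a short label such as "725M". The helper name pretty_size, the binary (1024-based) units, and the rounding to whole numbers are all assumptions; the app's own formatting may differ:

```python
def pretty_size(num_bytes: int) -> str:
    """Format a byte count with a one-letter suffix (hypothetical helper)."""
    suffixes = ["B", "K", "M", "G", "T"]
    size = float(num_bytes)
    index = 0
    # Divide by 1024 until the value fits the current unit
    # or the largest suffix has been reached.
    while size >= 1024 and index < len(suffixes) - 1:
        size /= 1024
        index += 1
    return f"{size:.0f}{suffixes[index]}"


# A file_size value of 725 MiB, expressed in bytes
label = pretty_size(725 * 1024 * 1024)
```

Storing the size as a plain integer of bytes, as the fields table does, keeps the stored value exact and leaves formatting to display time.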