Models for tracking updates¶
The processed-data app also keeps track of each snapshot of the CAL-ACCESS database it processes. This tracking information is stored in the data tables outlined below.
Note
By default, the processed-data app does not archive previous versions of the CAL-ACCESS database. Rather, with each call to the management commands, the data files they process are overwritten.
You can configure the raw-data app to keep each copy of the zip file downloaded from the California Secretary of State, as well as the individual raw .csv files and cleaned .tsv files, by setting CALACCESS_STORE_ARCHIVE to True in settings.py:
# in settings.py
CALACCESS_STORE_ARCHIVE = True
By default, the older copies of these files will be saved to the path specified by your Django project’s MEDIA_ROOT setting (see the Django settings documentation). However, if you’ve implemented a custom storage system or installed a third-party app (such as django-storages), that should work too.
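As a minimal sketch, the archive flag and the media path might sit together in settings.py as follows. The BASE_DIR value and the "media" directory name are assumptions made for illustration; they are not values prescribed by the app.

```python
# in settings.py (illustrative sketch, not the app's required configuration)
from pathlib import Path

# Assumption: BASE_DIR stands in for your project root.
# Here it simply uses the current working directory.
BASE_DIR = Path.cwd()

# Keep older copies of downloaded and cleaned files instead of overwriting them.
CALACCESS_STORE_ARCHIVE = True

# With Django's default file storage, archived files are saved under this directory.
# The directory name "media" is a hypothetical choice.
MEDIA_ROOT = str(BASE_DIR / "media")
```

If you use a custom storage backend instead, the MEDIA_ROOT line may not apply to your project.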
ProcessedDataVersion¶
Versions of CAL-ACCESS raw source data, typically released every day.
Fields¶
Name | Type | Unique key | Definition |
---|---|---|---|
id | Integer | Yes | Auto-incrementing unique identifier of the version |
raw_version_id | Integer | Yes | Foreign key referencing the raw data version processed |
process_start_datetime | DateTime | No | Date and time when the processing of the CAL-ACCESS version started |
process_finish_datetime | DateTime | No | Date and time when the processing of the CAL-ACCESS version finished |
zip_archive | FileField | No | An archive zip of processed files |
zip_size | Integer | No | The expected size (in bytes) of the zip of processed files |
Instance methods and properties¶
Name | Description |
---|---|
.update_completed | Checks whether the database update to the version completed. Returns True or False. |
.update_stalled | Checks whether the database update to the version started but did not complete. Returns True or False. |
.pretty_expected_size() | Returns a prettified version (e.g., "725M") of the expected size of the downloaded zip. |
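The two status checks can be pictured with a small sketch based on the start and finish timestamps listed in the fields table. The class name VersionSketch and the exact rules in each property are assumptions made for illustration; this is not the app's Django model, and its real implementation may differ.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Optional


@dataclass
class VersionSketch:
    """Hypothetical stand-in for a tracked version (not the app's Django model)."""

    process_start_datetime: Optional[datetime] = None
    process_finish_datetime: Optional[datetime] = None

    @property
    def update_completed(self) -> bool:
        # Assumed rule: a recorded finish time means the update completed.
        return self.process_finish_datetime is not None

    @property
    def update_stalled(self) -> bool:
        # Assumed rule: processing started, but no finish time was recorded.
        return (
            self.process_start_datetime is not None
            and self.process_finish_datetime is None
        )


# One version that started but never finished, and one that finished.
started = VersionSketch(process_start_datetime=datetime(2017, 1, 1, 8, 0))
finished = VersionSketch(
    process_start_datetime=datetime(2017, 1, 1, 8, 0),
    process_finish_datetime=datetime(2017, 1, 1, 9, 30),
)
```

Under these assumed rules, a version with neither timestamp set is neither completed nor stalled.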
ProcessedDataFile¶
A data file included in a processed version of CAL-ACCESS.
Fields¶
Name | Type | Unique key | Definition |
---|---|---|---|
id | Integer | Yes | Auto-incrementing unique identifier of the file |
version_id | Integer | No | Foreign key referencing the processed version of CAL-ACCESS |
file_name | String (up to 100) | No | Name of the processed data file without extension |
process_start_datetime | DateTime | No | Date and time when the processing of the file started |
process_finish_datetime | DateTime | No | Date and time when the processing of the file finished |
records_count | Integer | No | Count of records in the processed file |
file_archive | FileField | No | An archive of the processed file |
file_size | Integer | No | Size of the processed file (in bytes) |
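To show how the per-file fields above could be combined, here is a rough sketch that computes how long each file took to process and the total record count for one version. FileSketch, processing_time, total_records, and the sample file names are all hypothetical; they mirror the field names in the table but are not part of the app.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List, Optional


@dataclass
class FileSketch:
    """Hypothetical plain-Python mirror of a few fields from the table above."""

    file_name: str
    process_start_datetime: Optional[datetime]
    process_finish_datetime: Optional[datetime]
    records_count: int


def processing_time(data_file: FileSketch) -> Optional[timedelta]:
    """Return the elapsed processing time, or None if either timestamp is missing."""
    start = data_file.process_start_datetime
    finish = data_file.process_finish_datetime
    if start is None or finish is None:
        return None
    return finish - start


def total_records(files: List[FileSketch]) -> int:
    """Sum the record counts across all files of one version."""
    return sum(f.records_count for f in files)


# Sample rows with made-up names and values.
files = [
    FileSketch("Candidate", datetime(2017, 1, 1, 8, 0, 0), datetime(2017, 1, 1, 8, 0, 45), 1200),
    FileSketch("Filing", datetime(2017, 1, 1, 8, 1, 0), None, 0),
]
```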
Instance methods and properties¶
Name | Description |
---|---|
.pretty_file_size() | Returns a prettified version (e.g., "725M") of the downloaded file's size. |
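For a sense of what "prettified" means here, the sketch below converts a byte count into a short string such as "725M". The function name pretty_size and the rounding rule (whole units of 1024, truncated) are assumptions; the app may use a different helper or format.

```python
def pretty_size(num_bytes: int) -> str:
    """Format a byte count as a short string using 1024-based units.

    Hypothetical helper: values are truncated to whole units, not rounded.
    """
    units = ["B", "K", "M", "G", "T"]
    size = num_bytes
    for unit in units[:-1]:
        if size < 1024:
            return f"{size}{unit}"
        # Move up to the next unit, discarding the remainder.
        size //= 1024
    # Anything that reaches this point is expressed in the largest unit.
    return f"{size}{units[-1]}"


# 725 * 1024 * 1024 bytes
example = pretty_size(760_217_600)
```

Because the remainder is discarded at each step, a file of 1,535 bytes would be reported as "1K" under this sketch.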