Database tables
| Name | Description |
|---|---|
| | A collection, like the collection of files written to a crawl directory by Kingfisher Collect. See collection table. |
| | A file containing a record package or release package, like the files written by Kingfisher Collect. |
| | A passthrough table. #324 |
| | A collection note. See collection_note table. |
| | The metadata for a record package or release package. The metadata is stored separately, to only store one copy of the same metadata within the same collection or across different collections. |
| | The data for a record, release or compiled release. The data is stored separately, to only store one copy of the same data across different collections of the same publication. |
| | A record, from a record package. |
| | A release, from a release package. |
| | A compiled release. |
| | A record’s check results. |
| | A release’s check results. |
| | Temporary rows to track incomplete operations (load, compile, check) on collection files. |
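
The metadata and data tables above store only one copy of identical content. As a rough illustration of that idea, and not a description of how Kingfisher Process implements it, the sketch below keys each JSON value by a digest of its canonical serialization. The names `content_key` and `DataStore`, and the choice of SHA-256, are assumptions made for this example.

```python
import hashlib
import json


def content_key(value):
    """Return a stable digest for a JSON-serializable value.

    Sorting keys makes the digest independent of key order.
    """
    canonical = json.dumps(value, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class DataStore:
    """Hypothetical in-memory stand-in for a de-duplicating table."""

    def __init__(self):
        self.rows = {}

    def put(self, value):
        # Store the value only if no identical value is already present.
        key = content_key(value)
        self.rows.setdefault(key, value)
        return key


store = DataStore()
first = store.put({"ocid": "example-1", "id": "1"})
second = store.put({"id": "1", "ocid": "example-1"})  # same content, different key order
```

Both calls return the same key, and the store holds a single row, so a release that appears in several collections would be stored once under this scheme.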
The format of the cove_output column of the *_check tables is described in the lib-cove-ocds documentation (also used by the OCDS Data Review Tool), without:
- additional_checks
- records_aggregates
- releases_aggregates
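
To compare a lib-cove-ocds result with a stored cove_output value, it can help to drop the three omitted keys first. The sketch below assumes these keys sit at the top level of the result object; the function name `strip_excluded_keys` and the sample input are made up for illustration.

```python
EXCLUDED_KEYS = ("additional_checks", "records_aggregates", "releases_aggregates")


def strip_excluded_keys(cove_result):
    """Return a shallow copy of the result without the excluded top-level keys."""
    return {key: value for key, value in cove_result.items() if key not in EXCLUDED_KEYS}


# Illustrative input only: "other_key" is a placeholder, not a real result key.
sample = {
    "other_key": [1, 2, 3],
    "additional_checks": {},
    "releases_aggregates": {},
}
stripped = strip_excluded_keys(sample)
```

The input object is left unchanged, so the full result remains available to the caller.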
collection table
| Column | Description |
|---|---|
| | The source from which the files were retrieved, like the spider name from Kingfisher Collect. |
| | The time at which the files were retrieved, like the |
| | Whether the files represent a sample from the source. |
| | One of “compile-releases” or “upgrade-1-0-to-1-1”. |
| | The parent collection from which this collection is derived. |
| | A JSON object for the process manager pattern. |
| | A JSON array with one or more of “upgrade”, “compile”, “check”. |
| | A JSON object like See OCDS Kit’s detect_format() function. |
| | The ID of the job in Scrapyd. Use this column to find the crawl log. |
| | The number of messages to expect from Kingfisher Collect. |
| | The time at which the collection was added. |
| | The time at which the collection was closed. |
| | Whether compilation has started. |
| | The time at which processing completed. |
| | The time at which the collection was deleted. |
| | The number of rows in the |
| | The number of rows in the |
| | The number of rows in the |
collection_note table
All warnings and handled errors are logged as notes.
| Column | Description |
|---|---|
| | The collection to which the note relates. |
| | The note’s message. |
| | Any additional data, as JSON. |
| | The time at which the note was created. |
| | One of “INFO”, “WARNING” or “ERROR”. Use this column to filter by severity. |
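
Filtering notes by severity could look like the sketch below. It uses an in-memory SQLite database as a stand-in for the real database, and the column names (`collection_id`, `note`, `data`, `stored_at`, `level`) are assumptions for this example, since the column names are not shown in the table above. The sample rows are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Assumed column names; check the actual schema before reusing this query.
conn.execute(
    "CREATE TABLE collection_note ("
    "collection_id INTEGER, note TEXT, data TEXT, stored_at TEXT, level TEXT)"
)
conn.executemany(
    "INSERT INTO collection_note VALUES (?, ?, ?, ?, ?)",
    [
        (1, "example info note", "{}", "2024-01-01T00:00:00", "INFO"),
        (1, "example warning note", "{}", "2024-01-01T00:01:00", "WARNING"),
        (1, "example error note", "{}", "2024-01-01T00:02:00", "ERROR"),
        (2, "another warning note", "{}", "2024-01-02T00:00:00", "WARNING"),
    ],
)


def notes_at_level(connection, collection_id, level):
    """Return the messages of one collection's notes at the given severity."""
    cursor = connection.execute(
        "SELECT note FROM collection_note "
        "WHERE collection_id = ? AND level = ? ORDER BY stored_at",
        (collection_id, level),
    )
    return [row[0] for row in cursor]


warnings = notes_at_level(conn, 1, "WARNING")
```

Passing the collection and level as query parameters, rather than formatting them into the SQL string, avoids quoting mistakes.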
For brevity, emoji are used in the tables for:
- 📑 The original collection
- ⬆️ The upgraded collection
- 🗜 The compiled collection
INFO-level notes
| | On | Occurs when | Interpretation |
|---|---|---|---|
| A user-provided note | 📑 | A user creates a collection via Kingfisher Collect or the load command. | Determine who created the collection. |
| | 📑 | Kingfisher Collect closes the spider. | |
| | 📑 | Kingfisher Collect closes the spider. | Check the crawl statistics (in the |
WARNING-level notes
| | On | Occurs when | Interpretation |
|---|---|---|---|
| | 📑 | file_worker skips a file that contains package metadata only. | The data source contains empty packages. |
| party in "{party A's role}" role differs from party in [{party B's roles}] roles: {party A as JSON} {party B as JSON} | ⬆️ | file_worker upgrades the file from OCDS 1.0. | Potential data loss. See OCDS Kit’s upgrade command. |
| Various | 🗜 | record_compiler or release_compiler extends the release schema. | An OCDS extension is not retrievable or is invalid UTF-8, JSON or ZIP. Any merge rules from the extension aren’t applied. |
| | 🗜 | record_compiler or release_compiler creates a compiled release. | An array contains objects with the same ID. Potential data loss, if the duplicates differ. |
| | 🗜 | record_compiler finds many records with the same OCID. | Only one record is compiled for each OCID. Potential data loss, if the duplicates differ. |
| | 🗜 | record_compiler finds releases without a | Only dated releases are compiled. Potential data loss, if the undated releases differ. |
ERROR-level notes
| | On | Occurs when | Interpretation | |
|---|---|---|---|---|
| | 📑 | Kingfisher Collect fails to retrieve a URL. (api_loader) #366 | FileError item | |
| | 📑 | file_worker fails to load the file to the database. | A user deleted the file before it was loaded. | RabbitMQ message |
| | 📑 | file_worker fails to load the file to the database. | The data source doesn’t conform to OCDS, or the spider has a bug to fix. | RabbitMQ message |
| | 📑 | file_worker fails to load the file to the database. | Set a | RabbitMQ message |
| | 🗜 | record_compiler or release_compiler fails to create a compiled release. | The data source doesn’t conform to OCDS. | |
Other compilation notes
These notes on the 🗜 compiled collection are written by record_compiler and prefixed by one of:
- OCID {ocid} has ## linked releases among ## dated releases and ## releases.
- OCID {ocid} has ## releases, all undated.
- OCID {ocid} has 0 releases.
In other words, the record contains either some linked releases, only undated releases or no releases.
In these cases, it’s possible that the data source’s merge routine isn’t correct: that is, the compiled release doesn’t represent individual releases.
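
When reviewing many such notes, the three prefixes can be told apart with regular expressions. The following is a minimal sketch, assuming each `##` placeholder stands for a run of decimal digits; the labels "linked", "undated" and "empty" and the function name `classify_note` are made up for this example.

```python
import re

# One pattern per documented prefix. The labels are hypothetical.
NOTE_PREFIXES = [
    (
        "linked",
        re.compile(
            r"OCID (?P<ocid>.+?) has (?P<linked>\d+) linked releases "
            r"among (?P<dated>\d+) dated releases and (?P<total>\d+) releases\."
        ),
    ),
    ("undated", re.compile(r"OCID (?P<ocid>.+?) has (?P<total>\d+) releases, all undated\.")),
    ("empty", re.compile(r"OCID (?P<ocid>.+?) has 0 releases\.")),
]


def classify_note(note):
    """Return (label, fields) for a recognized prefix, or None otherwise."""
    for label, pattern in NOTE_PREFIXES:
        match = pattern.match(note)
        if match:
            fields = {
                name: value if name == "ocid" else int(value)
                for name, value in match.groupdict().items()
            }
            return label, fields
    return None


result = classify_note(
    "OCID example-1 has 2 linked releases among 3 dated releases and 4 releases. Rest of note."
)
```

Because `match` anchors at the start of the string only, any text that follows the prefix is ignored.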
| code | note | Occurs when | Interpretation |
|---|---|---|---|
| | | Compiling records | The record’s releases are all linked, which is fine. A publisher-generated compiled release is used. |
| | | Compiling records | A publisher-generated compiled release is used. |
| | | Compiling records | A publisher-generated compiled release is used. |
| | | Compiling records | The record is absent from the compiled collection. |