With limited time and resources, deciding which data to publish first is critically important, and it is the first task in making data publicly available. Prioritizing requires balancing which datasets are of the highest value against which datasets are closest to being ready for publication.
Start by reviewing which of your datasets are of the highest value:
- Public Interest: Which datasets would be of the greatest interest to the public? Which datasets are most frequently requested, including through Public Records Act (PRA) requests?
- Supports Department Goals: Which datasets could inform my agency’s current legislative or policy priorities? Which datasets would simplify our work or support our efforts to engage citizens, researchers, or other government officials?
- Geography: Does the dataset describe an important geography, and does it do so at an appropriate level of analysis? More granular data is better: for example, it is better to release census block level data than county level data.
- Timeliness: Does the dataset cover an important, recent, and relevant time period?
- Frequency of updates: Is this data updated frequently? If not, is the data still valuable when it is “stale”?
After reviewing which datasets are of the highest value in the agency, consider which datasets are closest to being ready for publication:
- Completeness: Is the dataset complete? Are the holes in the dataset easily fixed and, if not, is the dataset still valuable?
- Accuracy: Is the dataset accurate? Can I describe the level of accuracy (or inaccuracy) in the metadata and dictionary?
- Metadata: Can I easily create metadata for this dataset (for example, title, author, geographic area of focus, and frequency of update)?
- Data dictionary: Can I describe every column in the dataset and the values that are contained within it?
- Privacy: Does the dataset contain private data that should not be published? If the data has been anonymized, how easily can the data be re-identified? Read the CHHS Data De-Identification Guidelines.
- Machine-readable format: Is the dataset in a machine-readable format or can it easily be edited to be in one?
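Some of the readiness questions above, completeness in particular, can be checked quickly once a dataset is in a machine-readable format such as CSV. The following is a minimal sketch, not an official tool: it assumes the file has a header row, and the function name `profile_csv` and the sample data are hypothetical. It reports, for each column, how many rows have a value. The resulting column list can also serve as a starting point for a data dictionary.

```python
import csv
import io


def profile_csv(text):
    """Return per-column completeness for CSV text that has a header row."""
    reader = csv.DictReader(io.StringIO(text))
    columns = reader.fieldnames or []
    total = 0
    filled = {name: 0 for name in columns}
    for row in reader:
        total += 1
        for name in columns:
            value = row.get(name)
            # Count a cell as filled only if it holds non-blank text.
            if value is not None and value.strip() != "":
                filled[name] += 1
    return {
        name: {
            "rows": total,
            "filled": filled[name],
            "percent_complete": round(100 * filled[name] / total, 1) if total else 0.0,
        }
        for name in columns
    }


# Hypothetical sample data, for illustration only.
sample = "county,year,cases\nAlameda,2020,12\nFresno,,7\nKern,2021,\nYolo,2021,3\n"

report = profile_csv(sample)
for column, stats in report.items():
    print(column, stats["filled"], "of", stats["rows"], "rows filled")
```

A column with many empty cells is a prompt to ask whether the holes are easily fixed and, if not, whether the dataset is still valuable without them.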
These are some of the factors you should consider before choosing which datasets to publish. The rest of this guide will help you publish your data. For more information, refer to the CHHS Open Data Handbook.
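One way to make the balance between value and readiness concrete is a simple scoring sheet. The sketch below is only an illustration under stated assumptions: the 0 to 2 rating scale, the criterion keys, the choice to multiply the two averages, the rule that a privacy rating of 0 removes a dataset from consideration, and the dataset names are all hypothetical and are not prescribed by this guide.

```python
from dataclasses import dataclass

# Criterion keys mirror the value and readiness checklists (names are assumptions).
VALUE_CRITERIA = (
    "public_interest", "department_goals", "geography",
    "timeliness", "update_frequency",
)
READINESS_CRITERIA = (
    "completeness", "accuracy", "metadata",
    "data_dictionary", "privacy", "machine_readable",
)


@dataclass
class Candidate:
    name: str
    value: dict      # criterion -> rating from 0 (poor) to 2 (strong)
    readiness: dict  # criterion -> rating from 0 (poor) to 2 (strong)


def average(ratings, criteria):
    """Average the ratings; a criterion that was not rated counts as 0."""
    return sum(ratings.get(c, 0) for c in criteria) / len(criteria)


def priority(candidate):
    """Combine value and readiness into one score (higher means publish sooner)."""
    # Assumption: an unresolved privacy concern blocks publication outright.
    if candidate.readiness.get("privacy", 0) == 0:
        return 0.0
    # Multiplying favors datasets that do well on both value and readiness.
    return (average(candidate.value, VALUE_CRITERIA)
            * average(candidate.readiness, READINESS_CRITERIA))


# Hypothetical candidates, for illustration only.
all_strong_value = {c: 2 for c in VALUE_CRITERIA}
all_ready = {c: 2 for c in READINESS_CRITERIA}

candidates = [
    Candidate(
        "facility_inspections",
        all_strong_value,
        {"completeness": 1, "accuracy": 1, "metadata": 1,
         "data_dictionary": 0, "privacy": 2, "machine_readable": 1},
    ),
    Candidate(
        "program_enrollment",
        {"public_interest": 2, "department_goals": 1, "geography": 1,
         "timeliness": 2, "update_frequency": 1},
        all_ready,
    ),
    Candidate("client_records", all_strong_value, {**all_ready, "privacy": 0}),
]

ranked = sorted(candidates, key=priority, reverse=True)
for c in ranked:
    print(c.name, round(priority(c), 2))
```

The scores are only a conversation aid. Agencies may weight the criteria differently, and the ratings themselves still depend on the judgment of the people who know the data.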
The next step is to make sure that your data is formatted properly for publication. Otherwise, go back to the guide contents page.