The main purpose of this paper is to document the databases used in the study “Effects of a Severance Tax on Oil Produced in California.” As a case study, it explains problems that can arise in putting a large dataset into a meaningful, usable form, and the basic steps used to prevent some of them. The original data sources and evaluations of their accuracy are discussed. The paper also explains the data processing steps, describes the variables in the final datasets, and indicates how to access these datasets. The final section provides general comments and summarizes the experience of dealing with a large dataset.
Author: R. Yılmaz Argüden