Help:What is a Dataset

From Ruisdael Observatory Data Catalog
Revision as of 08:56, 10 December 2024 by Marc Schleiss (talk | contribs)

The catalog can index many different types of datasets. The sections below describe the main types of datasets and explain what should and should not be indexed.

Acceptable Data Types

  • Model outputs: output from numerical weather prediction models such as DALES and HARMONIE, and from other types of numerical simulations, reanalyses and forecasts. This category partially overlaps with the "Derived/Processed Data" category.
  • In-situ observations: datasets collected directly at the location of interest.
  • Remote-sensing observations: datasets collected from a distance, typically using satellites or ground-based remote sensors.
  • Derived/Processed Data: highly processed datasets such as blended satellite and in-situ data, physical retrievals, reanalyses and higher-level datasets based on other datasets.
  • Geospatial Data: datasets that include spatial information about the Earth's surface, sub-surface and atmosphere, for example GIS layers, coordinate systems, grids and projections.

What should not be indexed?

  • Software: numerical weather models, toolboxes, etc. should not be indexed.
  • Data analysis scripts: Python notebooks and MATLAB, R or C scripts, for example, should not be indexed. Such scripts can instead be documented and distributed through the Ruisdael GitHub page.
  • Articles: peer-reviewed papers, technical reports, presentations and other forms of written/oral publications should not be indexed in the catalog.
  • Unrelated datasets: datasets that do not have any link to the Ruisdael project should not be indexed.
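
The rules on this page can be summarised as a simple eligibility check. The sketch below is purely illustrative and is not part of the catalog software: the Entry class, its fields, the category strings and the should_index function are all hypothetical names chosen for this example.

```python
from dataclasses import dataclass

# Hypothetical category labels mirroring the lists on this page.
ACCEPTED_TYPES = {
    "model output",
    "in-situ observation",
    "remote-sensing observation",
    "derived/processed data",
    "geospatial data",
}
EXCLUDED_TYPES = {"software", "data analysis script", "article"}


@dataclass
class Entry:
    """A candidate catalog entry (hypothetical structure)."""
    title: str
    kind: str                  # one of the category labels above
    linked_to_ruisdael: bool   # does the item have a link to the Ruisdael project?


def should_index(entry: Entry) -> bool:
    """Return True if the entry is an accepted data type linked to Ruisdael."""
    kind = entry.kind.strip().lower()
    if kind in EXCLUDED_TYPES:
        return False
    return kind in ACCEPTED_TYPES and entry.linked_to_ruisdael
```

Under these assumptions, a model run linked to the project would be accepted, while an analysis notebook, or a radar dataset with no link to Ruisdael, would be rejected.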