Ruisdael Data Catalog - PoC - Ruisdael Science Day 2024: Difference between revisions

From Ruisdael Observatory Data Catalog
(Created page with "<span style="font-size:larger">https://ruisdael-catalog.citg.tudelft.nl/</span> <span style="font-size:large">https://tinyurl.com/ys6mu7p9</span> [https://tud365-my.sharepoint.com/:v:/r/personal/cvannoord_tudelft_nl/Documents/Opnamen/Presentation%20proof%20of%20concept%20Ruisdael%20data%20catalog-20240906_100307-Meeting%20Recording.mp4?csf=1&web=1&e=IzqEaR PoC presentation recording - 2024.09.06] __TOC__ =Goals of Data Catalog= * '''index and describe''' every da...")
 
 
(7 intermediate revisions by the same user not shown)
Line 1: Line 1:
<span style="font-size:larger">https://ruisdael-catalog.citg.tudelft.nl/</span>


[[File:rcfrupys-400.png|200px]]


<span style="font-size:large">tinyurl.com/rcfrupys</span>


<span style="font-size:large">https://tinyurl.com/ys6mu7p9</span>
<span style="font-size:larger">https://ruisdael-catalog.citg.tudelft.nl/</span>


[https://tud365-my.sharepoint.com/:v:/r/personal/cvannoord_tudelft_nl/Documents/Opnamen/Presentation%20proof%20of%20concept%20Ruisdael%20data%20catalog-20240906_100307-Meeting%20Recording.mp4?csf=1&web=1&e=IzqEaR PoC presentation recording - 2024.09.06]
__TOC__


__TOC__
=Goals of Data Catalog proof-of-concept=


=Goals of Data Catalog=
Considering that '''''data is the main deliverable''''' of the Ruisdael Observatory,<br/>
the Data Catalog attempts to:


* '''index and describe''' every dataset produced within Ruisdael Observatory, independently from where it is stored or published
* '''index and describe''' datasets produced within Ruisdael Observatory, independently from where it is stored or published - embracing heterogeneity
* '''discover where each dataset is published/stored''' (via [[Property:servedBy]] which points to [[:Category:DataService]] instances)
* '''find where each dataset is published/stored'''<ref>via [[Property:servedBy]] which points to [[:Category:DataService]] instances</ref>
* '''identify the contact-point of each dataset''', special relevant for unpublished datasets (via [[Property:contactPoint]] which points to [[:Category:Foaf:Person]] instances)
* '''identify the contact-point of each dataset''', special relevant for unpublished datasets <ref>via [[Property:contactPoint]] which points to [[:Category:Foaf:Person]] instances</ref>
* '''link campaigns to datasets'''
* '''leverage individual contributions''' onto a collaborative infrastructure


=Wiki: collaborative editing content management system=
=Wiki: collaborative editing content management system=
Line 23: Line 27:
* anyone with or without an account the view the content
* anyone with or without an account the view the content
* forms and templates help guiding users' edits
* forms and templates help guiding users' edits
* semantic properties and categories organize and describe content in machine-readable ways, allowing for structured queries and dynamic content aggregation  
* semantic properties and categories organize and describe content in machine-readable ways, allowing for parametric search and dynamic content aggregation  
* edit-history function allows reverting pages' content to previous versions
* edit-history function allows reverting pages' content to previous versions


Line 74: Line 78:
'''Via search forms''': {{#queryformlink:form=DatasetSearch|link text=Search Datasets}}'''
'''Via search forms''': {{#queryformlink:form=DatasetSearch|link text=Search Datasets}}'''


=Linking Media=
'''Linking Media (images, PDFs, short videos) to Dataset, Data Services, Campaigns, Organizations'''
'''Via File: pages''' ie. [[File:campaign_data_handling_protocol.pdf]]) "Edit with Form" button
=Getting Started=
'''New users can request accounts via [https://github.com/ruisdael-observatory/Ruisdael-Data-Catalog/issues/new?template=WikiAccount.yml Github issues]''' or email to Marc Schleiss <M.A.Schleiss@tudelft.nl> or Niels Jansen <N.H.Jansen@tudelft.nl>.
'''[[Help:Catalog editing]]''' contains information on how to edit and add content to the wiki
'''[[Help:Data Catalog Schema]]''' includes descriptions of the Categories & Properties, forms and templates used to implement the catalog
'''Discussions and development take place over [https://github.com/orgs/ruisdael-observatory/projects/1/ github  Ruisdael Data Catalog project]'''
'''Issues can be reported in https://github.com/ruisdael-observatory/Ruisdael-Data-Catalog/issues'''
''[[Help:Wiki administration]]''' includes information on how to maintain this wiki (for admins)


=References=
=References=

Latest revision as of 13:27, 24 September 2024

rcfrupys-400.png

tinyurl.com/rcfrupys

https://ruisdael-catalog.citg.tudelft.nl/

Goals of Data Catalog proof-of-concept

Considering that data is the main deliverable of the Ruisdael Observatory,
the Data Catalog attempts to:

  • index and describe datasets produced within the Ruisdael Observatory, regardless of where they are stored or published, embracing heterogeneity
  • find where each dataset is published/stored[1]
  • identify the contact point of each dataset, especially relevant for unpublished datasets[2]
  • link campaigns to datasets
  • leverage individual contributions within a collaborative infrastructure

Wiki: collaborative editing content management system

suitable for building and maintaining a data catalog

  • out-of-the-box content management system - Wiki
  • anyone with an account can enter and edit content
  • anyone, with or without an account, can view the content
  • forms and templates help guide users' edits
  • semantic properties and categories organize and describe content in machine-readable ways, allowing for parametric search and dynamic content aggregation
  • edit-history function allows reverting pages' content to previous versions

The software: Mediawiki (MW)

  • Open Source Software
  • Powers all Wikipedia instances and Wikidata
  • Markup language - easily converted to/from Markdown, HTML, ...
  • API
  • ecosystem of extensions, e.g. Page Forms, VisualEditor, SMW
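
As a rough illustration of the API bullet above, the sketch below builds a MediaWiki Action API URL that requests the current wikitext of a single page. The endpoint path (`/api.php` directly under the site root) is an assumption about how this wiki is configured, since many installations serve it from `/w/api.php` instead. The helper name `page_wikitext_url` is hypothetical. No network request is made; the function only assembles the URL.

```python
from urllib.parse import urlencode

# Assumption: the Action API endpoint sits at /api.php on this host.
# Many MediaWiki installations use /w/api.php instead.
API_URL = "https://ruisdael-catalog.citg.tudelft.nl/api.php"


def page_wikitext_url(title: str, api_url: str = API_URL) -> str:
    """Build an Action API URL asking for the latest wikitext of one page."""
    params = {
        "action": "query",      # standard query module
        "prop": "revisions",    # ask for revision data
        "rvprop": "content",    # include the revision text
        "rvslots": "main",      # main content slot
        "titles": title,
        "format": "json",
        "formatversion": "2",
    }
    return f"{api_url}?{urlencode(params)}"


if __name__ == "__main__":
    # Page title taken from the "Entering Data" section of this page.
    print(page_wikitext_url("Micro_Rain_Radar_(Metek)_at_Cabauw"))
```

The resulting URL can be opened in a browser or passed to any HTTP client. Parentheses in the title are percent-encoded by `urlencode`.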

The extension: Semantic Mediawiki (SMW)

  • Semantic annotations via property::value pairs and classes (categories)
  • User defined semantic properties, classes (categories), queries, controlled-vocabularies
  • RDF export
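
To make the property::value idea more concrete: SMW in-text annotations are written as `[[property::value]]`, optionally with a display label as `[[property::value|label]]`. The sketch below pulls such pairs out of a wikitext string with a simplified regular expression. The function name `extract_annotations` and the sample values ("Example Data Service", "Jane Doe") are made up for illustration. The pattern deliberately ignores edge cases such as property names containing a colon or multi-valued annotations.

```python
import re

# Simplified pattern for [[property::value]] and [[property::value|label]].
# Assumption: property names contain no ':' or '|' (not true for every wiki).
ANNOTATION = re.compile(r"\[\[([^\[\]:|]+)::([^\[\]|]+)(?:\|[^\[\]]*)?\]\]")


def extract_annotations(wikitext: str) -> list[tuple[str, str]]:
    """Return (property, value) pairs found in SMW-style in-text annotations."""
    return [
        (match.group(1).strip(), match.group(2).strip())
        for match in ANNOTATION.finditer(wikitext)
    ]


if __name__ == "__main__":
    # Hypothetical page text; servedBy and contactPoint are the property
    # names mentioned in the references of this page, the values are invented.
    sample = (
        "Served by [[servedBy::Example Data Service]], "
        "contact [[contactPoint::Jane Doe|J. Doe]]. "
        "See also [[Main Page]]."
    )
    for prop, value in extract_annotations(sample):
        print(f"{prop} -> {value}")
```

Plain links such as `[[Main Page]]` carry no `::` separator and are therefore skipped.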

Data Catalog structure (schema)

schema-ruisdael-data-catalog.png

The contents of the wiki (Data Catalog) are structured loosely around DCAT-AP [3]

It uses terms from the following vocabularies: DCAT [4], DCMI Metadata Terms [5], and FOAF [6].

Organized around the following categories:


More details in Help:Data_Catalog_Schema

Entering Data

From Main_Page#Ruisdael_Datasets

Create Dataset page with Form:Dataset

Or use "Edit with Form" on an existing page, e.g. Micro_Rain_Radar_(Metek)_at_Cabauw

Displaying / Searching data

Via query-based dynamic tables, e.g. Main Page, or other output formats, e.g. Datasets Map


Via search forms: Search Datasets
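
The query-based tables mentioned above rest on SMW queries, which can also be sent to the wiki through SMW's `ask` API module. The sketch below assembles such a query string and the matching API URL. Several details are assumptions rather than facts about this catalog: the endpoint path `/api.php`, the category name `Dataset`, and the helper names `build_ask_query` and `ask_url`. The property names `servedBy` and `contactPoint` come from the references on this page. Nothing is fetched; the code only builds strings.

```python
from urllib.parse import urlencode

# Assumption: the API endpoint sits at /api.php on this host.
API_URL = "https://ruisdael-catalog.citg.tudelft.nl/api.php"


def build_ask_query(condition: str, printouts: list[str], limit: int = 10) -> str:
    """Join a condition, printout properties, and a limit into one SMW ask query."""
    parts = [condition]
    parts += [f"?{name}" for name in printouts]  # each printout is prefixed with '?'
    parts.append(f"limit={limit}")
    return "|".join(parts)


def ask_url(condition: str, printouts: list[str], limit: int = 10,
            api_url: str = API_URL) -> str:
    """Build the URL for SMW's 'ask' API module with a JSON response format."""
    params = {
        "action": "ask",
        "query": build_ask_query(condition, printouts, limit),
        "format": "json",
    }
    return f"{api_url}?{urlencode(params)}"


if __name__ == "__main__":
    # Assumption: datasets are members of a category named "Dataset".
    print(ask_url("[[Category:Dataset]]", ["servedBy", "contactPoint"], limit=5))
```

Keeping the query builder separate from the URL builder means the same query string could also be pasted into an inline `#ask` query on a wiki page.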


Linking Media

Linking Media (images, PDFs, short videos) to Dataset, Data Services, Campaigns, Organizations

Via File: pages (e.g. File:campaign data handling protocol.pdf), using the "Edit with Form" button

Getting Started

New users can request accounts via GitHub issues or by email to Marc Schleiss <M.A.Schleiss@tudelft.nl> or Niels Jansen <N.H.Jansen@tudelft.nl>.


Help:Catalog editing contains information on how to edit and add content to the wiki


Help:Data Catalog Schema includes descriptions of the Categories & Properties, forms and templates used to implement the catalog


Discussions and development take place in the GitHub Ruisdael Data Catalog project


Issues can be reported at https://github.com/ruisdael-observatory/Ruisdael-Data-Catalog/issues


Help:Wiki administration includes information on how to maintain this wiki (for admins)

References

  1. via Property:servedBy which points to Category:DataService instances
  2. via Property:contactPoint which points to Category:Foaf:Person instances
  3. "DCAT Application profile for data portals in Europe (DCAT-AP) is a specification based on the Data Catalogue vocabulary (DCAT) for describing public sector datasets in Europe." https://op.europa.eu/nl/web/eu-vocabularies/dcat-ap
  4. https://www.w3.org/TR/vocab-dcat-3/
  5. https://www.dublincore.org/specifications/dublin-core/dcmi-terms/
  6. http://xmlns.com/foaf/spec/