Ruisdael Data Catalog - PoC - presentation 2024.09.06: Difference between revisions

From Ruisdael Observatory Data Catalog
<span style="font-size:larger">https://ruisdael-catalog.citg.tudelft.nl/</span>




[[File:ys6mu7p9-400.png|250px]]




<span style="font-size:large">https://tinyurl.com/ys6mu7p9</span>


[https://tud365-my.sharepoint.com/:v:/r/personal/cvannoord_tudelft_nl/Documents/Opnamen/Presentation%20proof%20of%20concept%20Ruisdael%20data%20catalog-20240906_100307-Meeting%20Recording.mp4?csf=1&web=1&e=IzqEaR PoC presentation recording - 2024.09.06]
 
__TOC__
 
=Goals of Data Catalog=


* '''index and describe''' every dataset produced within Ruisdael Observatory, independently from where it is stored or published
* '''discover where each dataset is published/stored''' (via [[Property:servedBy]], which points to [[:Category:DataService]] instances)
* '''identify the contact-point of each dataset''', especially relevant for unpublished datasets (via [[Property:contactPoint]], which points to [[:Category:Foaf:Person]] instances)
 
=Wiki: collaborative editing content management system=
 
'''suitable to build and maintain a data catalog'''
 
* out-of-the-box content management system - Wiki
* anyone with an account can enter and edit content
* anyone, with or without an account, can view the content
* forms and templates help guide users' edits
* semantic properties and categories organize and describe content in machine-readable ways, allowing for structured queries and dynamic content aggregation
* edit-history function allows reverting pages' content to previous versions
 
==The software: ''[https://www.mediawiki.org/wiki/MediaWiki MediaWiki (MW)]''==
* Open Source Software
* Powers all Wikipedia instances and Wikidata
* Markup language - easily converted to/from Markdown, HTML, ...
* API
* ecosystem of extensions, e.g. [https://www.mediawiki.org/wiki/Extension:Page_Forms Page Forms], [https://www.mediawiki.org/wiki/Extension:VisualEditor VisualEditor], [https://www.semantic-mediawiki.org/wiki/Semantic_MediaWiki SMW]
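
As an illustration of the API bullet above, the following minimal sketch builds a request URL for the <code>ask</code> API module that Semantic MediaWiki adds to the MediaWiki API. The endpoint path (<code>/api.php</code> at the site root) and the helper name <code>build_ask_url</code> are assumptions made for this example; the actual script path of this wiki may differ. The sketch only constructs the URL and sends no request.

```python
from urllib.parse import urlencode, urlsplit, parse_qs

# Assumption: the API entry point of the wiki; the real script path may differ.
API_ENDPOINT = "https://ruisdael-catalog.citg.tudelft.nl/api.php"


def build_ask_url(conditions, printouts, limit=50, endpoint=API_ENDPOINT):
    """Return a URL for SMW's `ask` API module (hypothetical helper).

    conditions: query conditions in SMW syntax, e.g. "[[Category:Dataset]]"
    printouts:  property names to return for each result page
    """
    # An ask query is the conditions, then "?Property" printouts and
    # parameters, all separated by "|".
    parts = [conditions] + [f"?{p}" for p in printouts] + [f"limit={limit}"]
    params = {"action": "ask", "query": "|".join(parts), "format": "json"}
    return f"{endpoint}?{urlencode(params)}"


# List datasets together with where they are served and who to contact.
url = build_ask_url("[[Category:Dataset]]", ["servedBy", "contactPoint"])
print(url)
```

Fetching the resulting URL with any HTTP client should return the matching pages as JSON, provided the endpoint assumption holds for this wiki.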
 
==The extension: ''[https://www.semantic-mediawiki.org/wiki/Semantic_MediaWiki Semantic MediaWiki (SMW)]''==
 
* Semantic annotations via ''property::value'' pairs and classes (''categories'')
* User-defined semantic properties, classes (''categories''), queries, controlled vocabularies
* RDF export
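
To make the ''property::value'' notation above concrete, here is a minimal sketch that generates the inline annotations SMW reads from a page's wikitext. The property names <code>servedBy</code> and <code>contactPoint</code> come from this catalog; the helper names and the example values are hypothetical, and in practice the catalog's forms and templates may write these annotations for the user.

```python
def annotate(prop, value):
    """Return an SMW inline annotation of the form [[property::value]]."""
    return f"[[{prop}::{value}]]"


def dataset_annotations(served_by, contact_point):
    """Return the wikitext lines that classify and describe a dataset page.

    Hypothetical helper: combines a category (class) assignment with
    two property::value pairs.
    """
    return "\n".join([
        "[[Category:Dataset]]",
        annotate("servedBy", served_by),
        annotate("contactPoint", contact_point),
    ])


# Example values are placeholders, not pages known to exist in the catalog.
print(dataset_annotations("Example Data Service", "Jane Doe"))
```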
 
=Data Catalog structure (schema)=
 
[[File:schema-ruisdael-data-catalog.png|800px]]
 
The wiki (Data Catalog) contents are structured loosely around DCAT-AP <ref>"DCAT Application profile for data portals in Europe (DCAT-AP) is a specification based on the Data Catalogue vocabulary (DCAT) for describing public sector datasets in Europe." https://op.europa.eu/nl/web/eu-vocabularies/dcat-ap</ref>
 
It uses terms from the following vocabularies:
* Data Catalogue vocabulary (DCAT)<ref>https://www.w3.org/TR/vocab-dcat-3/</ref>: [[MediaWiki:Smw_import_dcat]]
* Dublin Core Metadata Terms (DCMI-Terms)<ref>https://www.dublincore.org/specifications/dublin-core/dcmi-terms/</ref>: [[MediaWiki:Smw_import_dcterms]]
* Friend of a Friend (FOAF)<ref>http://xmlns.com/foaf/spec/</ref>: [[MediaWiki:Smw_import_foaf]]
 
The content is organized around the following categories:
* [[:Category:Dataset]]
* [[:Category:DataService]]
* [[:Category:Foaf:Person]]
* [[:Category:Organization]]
* [[:Category:Campaign]]
 
 
More details in [[Help:Data_Catalog_Schema]]
=Entering Data=
 
From [[Main_Page#Ruisdael_Datasets]]
 
'''Create Dataset page with [[Form:Dataset]]'''
 
'''Or "Edit with Form"''' an existing page, e.g. [[Micro_Rain_Radar_(Metek)_at_Cabauw]]
 
=Displaying / Searching data=
 
'''Via query-based, dynamic tables''', e.g. [[Main Page]], or other output formats, e.g. [[Datasets Map]]
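
A query-based table of this kind is typically produced by an SMW inline <code>#ask</code> query embedded in a page. The sketch below generates such a query as wikitext; the helper name <code>ask_table</code> is hypothetical, and the printouts chosen here are an assumption, not necessarily the ones used on [[Main Page]].

```python
def ask_table(category, printouts, limit=20):
    """Return wikitext for an SMW inline query rendered as a table.

    Hypothetical helper: selects all pages in `category` and shows
    one column per property listed in `printouts`.
    """
    lines = [f"{{{{#ask: [[Category:{category}]]"]
    lines += [f" |?{p}" for p in printouts]
    lines += [" |format=table", f" |limit={limit}", "}}"]
    return "\n".join(lines)


print(ask_table("Dataset", ["servedBy", "contactPoint"]))
```

Because the table is computed from the annotations each time the page is rendered, newly added dataset pages appear in it without any manual edit to the table itself.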


'''Via search forms''': {{#queryformlink:form=DatasetSearch|link text=Search Datasets}}


=References=

Latest revision as of 12:27, 6 September 2024

https://ruisdael-catalog.citg.tudelft.nl/


ys6mu7p9-400.png


https://tinyurl.com/ys6mu7p9

PoC presentation recording - 2024.09.06

Goals of Data Catalog

  • index and describe every dataset produced within Ruisdael Observatory, independently from where it is stored or published
  • discover where each dataset is published/stored (via Property:servedBy which points to Category:DataService instances)
  • identify the contact-point of each dataset, especially relevant for unpublished datasets (via Property:contactPoint, which points to Category:Foaf:Person instances)

Wiki: collaborative editing content management system

suitable to build and maintain a data catalog

  • out-of-the-box content management system - Wiki
  • anyone with an account can enter and edit content
  • anyone, with or without an account, can view the content
  • forms and templates help guide users' edits
  • semantic properties and categories organize and describe content in machine-readable ways, allowing for structured queries and dynamic content aggregation
  • edit-history function allows reverting pages' content to previous versions

The software: MediaWiki (MW)

  • Open Source Software
  • Powers all Wikipedia instances and Wikidata
  • Markup language - easily converted to/from Markdown, HTML, ...
  • API
  • ecosystem of extensions, e.g. Page Forms, VisualEditor, SMW

The extension: Semantic MediaWiki (SMW)

  • Semantic annotations via property::value pairs and classes (categories)
  • User-defined semantic properties, classes (categories), queries, controlled vocabularies
  • RDF export

Data Catalog structure (schema)

schema-ruisdael-data-catalog.png

The wiki (Data Catalog) contents are structured loosely around DCAT-AP [1]

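It uses terms from the following vocabularies:

  • Data Catalogue vocabulary (DCAT) [2]: MediaWiki:Smw_import_dcat
  • Dublin Core Metadata Terms (DCMI-Terms) [3]: MediaWiki:Smw_import_dcterms
  • Friend of a Friend (FOAF) [4]: MediaWiki:Smw_import_foaf

The content is organized around the following categories:

  • Category:Dataset
  • Category:DataService
  • Category:Foaf:Person
  • Category:Organization
  • Category:Campaign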


More details in Help:Data_Catalog_Schema

Entering Data

From Main_Page#Ruisdael_Datasets

Create Dataset page with Form:Dataset

Or "Edit with Form" an existing page, e.g. Micro_Rain_Radar_(Metek)_at_Cabauw

Displaying / Searching data

Via query-based, dynamic tables, e.g. Main Page, or other output formats, e.g. Datasets Map


Via search forms: Search Datasets


References

  1. "DCAT Application profile for data portals in Europe (DCAT-AP) is a specification based on the Data Catalogue vocabulary (DCAT) for describing public sector datasets in Europe." https://op.europa.eu/nl/web/eu-vocabularies/dcat-ap
  2. https://www.w3.org/TR/vocab-dcat-3/
  3. https://www.dublincore.org/specifications/dublin-core/dcmi-terms/
  4. http://xmlns.com/foaf/spec/