Ruisdael Data Catalog - PoC - presentation 2024.09.06
Latest revision as of 12:27, 6 September 2024
https://ruisdael-catalog.citg.tudelft.nl/
PoC presentation recording - 2024.09.06
Goals of Data Catalog
- index and describe every dataset produced within Ruisdael Observatory, independently from where it is stored or published
- discover where each dataset is published/stored (via Property:servedBy which points to Category:DataService instances)
- identify the contact-point of each dataset, especially relevant for unpublished datasets (via Property:contactPoint which points to Category:Foaf:Person instances)
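As a rough illustration of the last two goals, the sketch below builds a request for Semantic MediaWiki's `ask` API module that would list every page in Category:Dataset together with its servedBy and contactPoint values. The `api.php` path under the catalog URL and the helper name `build_ask_url` are assumptions, not taken from the catalog's documentation. Only the URL is constructed; no request is sent.

```python
from urllib.parse import urlencode

# Assumption: the wiki exposes the standard MediaWiki API entry point here.
API_URL = "https://ruisdael-catalog.citg.tudelft.nl/api.php"


def build_ask_url(category, printouts, limit=50, api_url=API_URL):
    """Return a URL for Semantic MediaWiki's `ask` API module.

    The query selects all pages in `category` and requests one
    printout per property name in `printouts`.
    """
    parts = [f"[[Category:{category}]]"]
    parts += [f"?{name}" for name in printouts]
    parts.append(f"limit={limit}")
    params = {"action": "ask", "query": "|".join(parts), "format": "json"}
    return f"{api_url}?{urlencode(params)}"


url = build_ask_url("Dataset", ["servedBy", "contactPoint"])
print(url)
```

Fetching the printed URL should return JSON, provided the API is reachable at the assumed path and anonymous read access is enabled.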
Wiki: collaborative editing content management system
suitable for building and maintaining a data catalog
- out-of-the-box content management system - Wiki
- anyone with an account can enter and edit content
- anyone, with or without an account, can view the content
- forms and templates help guide users' edits
- semantic properties and categories organize and describe content in machine-readable ways, allowing for structured queries and dynamic content aggregation
- the edit history allows reverting a page's content to a previous version
The software: MediaWiki (MW)
- Open Source Software
- Powers all Wikipedia instances and Wikidata
- Markup language - easily converted to/from Markdown, HTML, ...
- API
- ecosystem of extensions, e.g. Page Forms, VisualEditor, SMW
The extension: Semantic MediaWiki (SMW)
- Semantic annotations via property::value pairs and classes (categories)
- User-defined semantic properties, classes (categories), queries, controlled vocabularies
- RDF export
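To make the property::value idea concrete, here is a minimal sketch that renders SMW inline annotations and a category tag as wikitext. The helper name `annotate` and the example values are hypothetical. The catalog itself may set these properties through templates and forms rather than inline annotations.

```python
def annotate(category, properties):
    """Render SMW inline annotations plus a category tag as wikitext.

    `properties` maps a property name to one value or a list of values.
    """
    lines = []
    for name, values in properties.items():
        if isinstance(values, str):
            values = [values]
        for value in values:
            # SMW inline annotation syntax: [[property::value]]
            lines.append(f"[[{name}::{value}]]")
    lines.append(f"[[Category:{category}]]")
    return "\n".join(lines)


# Page and person names below are placeholders, not real catalog entries.
wikitext = annotate(
    "Dataset",
    {"servedBy": "Example Data Service", "contactPoint": "Example Person"},
)
print(wikitext)
```

Saved on a wiki page, each `[[property::value]]` line would be stored by SMW as a machine-readable statement about that page.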
Data Catalog structure (schema)
The contents of the wiki (Data Catalog) are loosely structured around DCAT-AP [1]
It uses terms from the following vocabularies:
- Data Catalog Vocabulary (DCAT)[2]: MediaWiki:Smw_import_dcat
- Dublin Core Metadata Terms (DCMI-Terms)[3]: MediaWiki:Smw_import_dcterms
- Friend of a Friend (FOAF)[4]: MediaWiki:Smw_import_foaf
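Organized around the following categories:
- Category:Dataset
- Category:DataService
- Category:Foaf:Person
- Category:Organization
- Category:Campaign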
More details in Help:Data_Catalog_Schema
Entering Data
From Main_Page#Ruisdael_Datasets
Create Dataset page with Form:Dataset
Or "Edit with Form" an existing page, e.g. Micro_Rain_Radar_(Metek)_at_Cabauw
Displaying / Searching data
Via query-based, dynamic tables, e.g. Main Page, or other output formats, e.g. Datasets Map
Via search forms: Search Datasets
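Outside the wiki, the same kind of query-based table could be rebuilt from the JSON that SMW's `ask` API returns. The sketch below flattens such a response into rows. The response shape (a `results` object keyed by page name, each entry holding `printouts`) is an assumption about the API output, and the sample data is hand-written with placeholder names, not real catalog content.

```python
def rows_from_ask(response, printouts):
    """Flatten an `ask` API response into a list of row dicts."""
    results = response.get("query", {}).get("results", {})
    if not isinstance(results, dict):  # an empty result may arrive as a list
        return []
    rows = []
    for page, data in results.items():
        row = {"page": page}
        for name in printouts:
            values = data.get("printouts", {}).get(name, [])
            # Page-type values are objects with a "fulltext" key; others are kept as-is.
            row[name] = [
                v["fulltext"] if isinstance(v, dict) and "fulltext" in v else v
                for v in values
            ]
        rows.append(row)
    return rows


# Hand-written sample in the assumed response shape; all names are placeholders.
sample = {
    "query": {
        "results": {
            "Example Dataset": {
                "printouts": {
                    "servedBy": [{"fulltext": "Example Data Service"}],
                    "contactPoint": [{"fulltext": "Example Person"}],
                },
                "fulltext": "Example Dataset",
            }
        }
    }
}

rows = rows_from_ask(sample, ["servedBy", "contactPoint"])
for row in rows:
    print(row["page"], "|", ", ".join(row["servedBy"]), "|", ", ".join(row["contactPoint"]))
```

Each row keeps multi-valued properties as lists, since a dataset may be served by more than one data service.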
References
[1] "DCAT Application profile for data portals in Europe (DCAT-AP) is a specification based on the Data Catalogue vocabulary (DCAT) for describing public sector datasets in Europe." https://op.europa.eu/nl/web/eu-vocabularies/dcat-ap
[2] https://www.w3.org/TR/vocab-dcat-3/
[3] https://www.dublincore.org/specifications/dublin-core/dcmi-terms/
[4] http://xmlns.com/foaf/spec/