Ruisdael Data Catalog - PoC - Ruisdael Science Day 2024
Andre Castro
Latest revision as of 13:27, 24 September 2024
tinyurl.com/rcfrupys
https://ruisdael-catalog.citg.tudelft.nl/
Goals of Data Catalog proof-of-concept
Considering that data is the main deliverable of the Ruisdael Observatory,
the Data Catalog attempts to:
- index and describe datasets produced within Ruisdael Observatory, regardless of where they are stored or published, embracing heterogeneity
- find where each dataset is published/stored[1]
- identify the contact point of each dataset, especially relevant for unpublished datasets[2]
- link campaigns to datasets
- bring individual contributions together on a collaborative infrastructure
Wiki: collaborative editing content management system
suitable for building and maintaining a data catalog
- out-of-the-box content management system - Wiki
- anyone with an account can enter and edit content
- anyone, with or without an account, can view the content
- forms and templates help guide users' edits
- semantic properties and categories organize and describe content in machine-readable ways, allowing for parametric search and dynamic content aggregation
- edit-history function allows reverting pages' content to previous versions
The software: Mediawiki (MW)
- Open Source Software
- Powers all Wikipedia instances and Wikidata
- Markup language - easily converted to/from Markdown, HTML, ...
- API
- ecosystem of extensions, e.g. Page Forms, VisualEditor, SMW
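
As a rough illustration of the API bullet above, the sketch below builds a MediaWiki Action API request for the current wikitext of a page. The `action=query` / `prop=revisions` parameters are standard MediaWiki; the location of `api.php` under the catalog's base URL and the helper name `page_content_url` are assumptions, and no request is actually sent.

```python
from urllib.parse import urlencode

# Assumption: the wiki exposes the Action API at <base>/api.php.
# The real script path of the Ruisdael catalog may differ.
API_URL = "https://ruisdael-catalog.citg.tudelft.nl/api.php"


def page_content_url(title: str, api_url: str = API_URL) -> str:
    """Build an Action API URL asking for the latest wikitext of one page."""
    params = {
        "action": "query",
        "prop": "revisions",
        "titles": title,
        "rvprop": "content",
        "rvslots": "main",
        "format": "json",
    }
    return f"{api_url}?{urlencode(params)}"


if __name__ == "__main__":
    # Page title taken from the "Entering Data" section of this page.
    print(page_content_url("Micro_Rain_Radar_(Metek)_at_Cabauw"))
```

The resulting URL can be opened in a browser or fetched with any HTTP client; the JSON response holds the page source in its revisions entry.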
The extension: Semantic Mediawiki (SMW)
- Semantic annotations via property::value pairs and classes (categories)
- User defined semantic properties, classes (categories), queries, controlled-vocabularies
- RDF export
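
To make the property::value idea concrete, here is a minimal sketch that generates SMW in-text annotations as wikitext. The property names `servedBy` and `contactPoint` come from this page's references; the category name `Dataset`, the example values and the helper names are hypothetical. In the catalog itself such annotations are most likely set through forms and templates rather than typed by hand.

```python
def annotate(prop: str, value: str) -> str:
    """Return an SMW in-text annotation of the form [[property::value]]."""
    return f"[[{prop}::{value}]]"


def categorize(category: str) -> str:
    """Return a category (class) assignment in wikitext."""
    return f"[[Category:{category}]]"


def dataset_wikitext(served_by: str, contact_point: str) -> str:
    """Assemble the annotations for one dataset page (illustrative only)."""
    lines = [
        annotate("servedBy", served_by),
        annotate("contactPoint", contact_point),
        # Assumption: dataset pages belong to a category named "Dataset".
        categorize("Dataset"),
    ]
    return "\n".join(lines)


if __name__ == "__main__":
    # Hypothetical values, not real catalog entries.
    print(dataset_wikitext("Example Data Service", "Jane Doe"))
```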
Data Catalog structure (schema)
The wiki (Data Catalog) contents are structured loosely around DCAT-AP[3].
It uses terms from the following vocabularies:
- Data Catalogue vocabulary (DCAT)[4]: MediaWiki:Smw_import_dcat
- Dublin Core Metadata Terms (DCMI-Terms)[5]: MediaWiki:Smw_import_dcterms
- Friend of a Friend (FOAF)[6]: MediaWiki:Smw_import_foaf
The contents are organized around a set of categories; more details in Help:Data_Catalog_Schema
Entering Data
From Main_Page#Ruisdael_Datasets
Create a Dataset page with Form:Dataset
Or use "Edit with Form" on an existing page, e.g. Micro_Rain_Radar_(Metek)_at_Cabauw
Displaying / Searching data
Via query-based, dynamic tables, e.g. Main Page, or other output formats, e.g. Datasets Map
Via search forms: Search Datasets
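
The same kind of query that presumably drives the dynamic tables can also be sent to Semantic MediaWiki's `ask` API module. The sketch below only assembles the query string and URL. The `action=ask` module and the `condition|?printout|limit=n` syntax are standard SMW; the `api.php` path, the category name `Dataset` and the helper names are assumptions.

```python
from urllib.parse import urlencode

# Assumption: the Action API lives at <base>/api.php on this wiki.
API_URL = "https://ruisdael-catalog.citg.tudelft.nl/api.php"


def ask_query(category: str, printouts: list[str], limit: int = 10) -> str:
    """Build an SMW ask query: one category condition plus printout properties."""
    parts = [f"[[Category:{category}]]"]
    parts += [f"?{prop}" for prop in printouts]
    parts.append(f"limit={limit}")
    return "|".join(parts)


def ask_url(query: str, api_url: str = API_URL) -> str:
    """Wrap an ask query in an Action API URL (no request is sent)."""
    return f"{api_url}?{urlencode({'action': 'ask', 'query': query, 'format': 'json'})}"


if __name__ == "__main__":
    # Assumption: datasets are grouped in a category named "Dataset".
    query = ask_query("Dataset", ["servedBy", "contactPoint"])
    print(query)
    print(ask_url(query))
```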
Linking Media
Linking Media (images, PDFs, short videos) to Datasets, Data Services, Campaigns, Organizations
Via the "Edit with Form" button on File: pages (e.g. File:campaign data handling protocol.pdf)
Getting Started
New users can request accounts via GitHub issues or by email to Marc Schleiss <M.A.Schleiss@tudelft.nl> or Niels Jansen <N.H.Jansen@tudelft.nl>.
Help:Catalog editing contains information on how to edit and add content to the wiki
Help:Data Catalog Schema includes descriptions of the Categories & Properties, forms and templates used to implement the catalog
Discussions and development take place in the GitHub Ruisdael Data Catalog project
Issues can be reported at https://github.com/ruisdael-observatory/Ruisdael-Data-Catalog/issues
Help:Wiki administration includes information on how to maintain this wiki (for admins)
References
1. via Property:servedBy, which points to Category:DataService instances
2. via Property:contactPoint, which points to Category:Foaf:Person instances
3. "DCAT Application profile for data portals in Europe (DCAT-AP) is a specification based on the Data Catalogue vocabulary (DCAT) for describing public sector datasets in Europe." https://op.europa.eu/nl/web/eu-vocabularies/dcat-ap
4. https://www.w3.org/TR/vocab-dcat-3/
5. https://www.dublincore.org/specifications/dublin-core/dcmi-terms/
6. http://xmlns.com/foaf/spec/