# Blessed be the Fruit - Project Documentation
<a href="https://orsolamborrini.github.io/blessedfruit/">Blessed be the Fruit</a> is a project developed by Maddalena Ghiotto, Chloe Papadopoulou, and Orsola Maria Borrini for the final exam of the course <a href="https://www.unibo.it/it/didattica/insegnamenti/insegnamento/2022/424645">"Open Access and Digital Ethics"</a>, taught by Professor Monica Palmirani within the <a href="https://corsi.unibo.it/2cycle/DigitalHumanitiesKnowledge">Digital Humanities and Digital Knowledge Master's Degree</a> (University of Bologna), during the A.Y. 2022/2023.

## Introduction
Many existing studies, mainly focused on the US, have related teen pregnancy to a variety of socioeconomic factors that may influence it: among them, low income and poverty, education levels, race or ethnicity and, finally, religion.<br> In this project we wanted to move the focus away from the US, one of the industrialized countries with the highest teen pregnancy and birth rates, to Italy, and to study whether there could be a relation between education, religious observance and <b>pregnancy rates</b> in the Mediterranean country.

As the headquarters of the Catholic Church are in Vatican City, an enclave within Rome, the relationship between Italians and this Church is particularly strong. However, according to the <a href="https://www.gesis.org/en/eurobarometer-data-service/survey-series/standard-special-eb/study-overview/eurobarometer-904-za7556-december-2018">Eurobarometer survey</a> of 2018, 85.6% of Italy’s population is Christian, while 2.6% follow other religions and 11.7% are non-religious. As we wanted to analyse any possible influence on pregnancy rates in young women with no distinction between the various faiths, we decided to set aside the special relation with the Catholic Church and to consider <b>general religious observance</b> instead.

Nevertheless, correlation is not causation, and there are surely many other factors contributing to pregnancy rates in young women: therefore, we have also included the <b>education level</b>, considering the early leavers from higher education (aged 18 to 24).


## Scenario

[<b>Istat</b>](https://www.istat.it/) is the Italian National Institute of Statistics, the main producer of official statistics in the service of citizens and policy-makers. Its data are organised in several different databases, which can be browsed and downloaded for free.<br>
Specifically, we have used both general and specific databases:
<ul>
<li><a href="http://dati.istat.it/?lang=en"><b>I.Stat</b></a>, a data warehouse organised by theme, presented in multidimensional tables and with a wide range of standard metadata</li>
<li><a href="https://esploradati.istat.it/databrowser/#/"><b>IstatData</b></a>, the new database into which all I.Stat content will be gradually migrated (until the data transfer is completed, the two systems will coexist)</li>
<li><a href="https://demo.istat.it/?l=en"><b>Demo - Demography in figures</b></a>, providing official data on resident population in the Italian municipalities and information on main demographic phenomena</li>
</ul>
To make our project as accessible as possible even in the future, we have preferred IstatData over I.Stat whenever possible.


### Statement of responsibility
Team member | Task | Contact
--- | --- | ---
Maddalena Ghiotto | Project ideation; Data retrieval; Mashup datasets; Technical analysis; RDF assertion of the metadata; Sustainability of the update | [contact](mailto:maddalena.ghiotto@studio.unibo.it)
Chloe Papadopoulou | Project ideation; Data retrieval; Ethical analysis; Visualizations | [contact](mailto:chloi.papadopoulou@studio.unibo.it)
Orsola Maria Borrini | Project ideation; Data retrieval; Mashup datasets; Quality and legal analyses; Website development | [contact](mailto:orsolamaria.borrini@studio.unibo.it)


## Original and mashup datasets
The project uses <b>16 different datasets</b> in total, counting both source and mashup datasets.

The <b>7 source datasets</b> have been downloaded in .csv format from different databases belonging to Istat:

Id | Dataset | Description (factor of interest) | Provenience | Link / Path
--- | --- | --- | --- | --- 
D1 | Population estimates 2002-2019 by age and sex at Jan 1st | POPULATION | demo | [Link](https://demo.istat.it/app/?i=RIC&l=en)
D2 | Resident population by age, sex and marital status on 1st January 2022 | POPULATION | demo | [Link](https://demo.istat.it/app/?i=POS&l=en)
D3 | Aspects of daily life: Religious observances - regions and type of municipality | RELIGION | I.Stat | [Link](http://dati.istat.it/index.aspx?queryid=24349)
D4 | Mother - Age and citizenship | PREGNANCY | IstatData | [Link](https://esploradati.istat.it/databrowser/#/en/dw/categories/IT1,POP,1.0/POP_BIRTHFERT/DCIS_NATI1/DCIS_NATI1_PARENT_CHARACT/IT1,25_74_DF_DCIS_NATI1_8,1.0)
D5 | Spontaneous abortions - resignation from the place of the event: Age of women - prov. | PREGNANCY | I.Stat | [Link](http://dati.istat.it/index.aspx?queryid=29218)
D6 | Induced abortions - Migration: Events by region of residence of the woman and region of intervention | PREGNANCY | I.Stat | [Link](http://dati.istat.it/index.aspx?queryid=7098)
D7 | Early leavers from education and training - aged 18 to 24 - previous regulation (until 2020) | EDUCATION | I.Stat | [Link](http://dati.istat.it/Index.aspx?DataSetCode=DCCV_ESL_UNT2020)

During the <b>download phase</b>, we have manually filtered out everything that was not of interest for our research, keeping only the data strictly related to our research question (we have, for example, discarded any information related to marital status in the datasets regarding population).

Still, the source datasets went through an additional <b>clean up phase</b> in which we discarded duplicate (e.g., columns with different names and values but referring to the same information) and irrelevant data and, when necessary, added missing "coded data" to allow for an easier management of the datasets.
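
As a minimal sketch of what such a clean up step could look like (this is not the code in `CLEAN.ipynb`; the column names, values and region codes below are hypothetical and chosen only for illustration), duplicate and irrelevant columns can be dropped and a code can be added to each row:

```python
import csv
import io

# Hypothetical raw export: column names and values are illustrative only.
RAW_CSV = """Territory,Territorio,Marital status,Sex,Value
Piemonte,Piemonte,total,females,1200
Lazio,Lazio,total,females,1500
"""

# Hypothetical lookup used to add the missing "coded data".
REGION_CODES = {"Piemonte": "R01", "Lazio": "R12"}

# Columns kept: "Territorio" duplicates "Territory", "Marital status" is irrelevant.
KEEP = ["Territory", "Sex", "Value"]


def clean_rows(raw_text):
    """Drop duplicate/irrelevant columns and add a region code to each row."""
    cleaned = []
    for row in csv.DictReader(io.StringIO(raw_text)):
        slim = {name: row[name] for name in KEEP}
        slim["Code"] = REGION_CODES[slim["Territory"]]
        slim["Value"] = int(slim["Value"])
        cleaned.append(slim)
    return cleaned


rows = clean_rows(RAW_CSV)
print(rows)
```

Keeping an explicit list of the columns to retain makes the filtering choices visible and easy to review.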

Finally, we proceeded with the <b>mashup phase</b>, creating the final three main mashup datasets used to answer our research question. As with source and clean datasets, we distinguished between the three years of our time span of interest: in this way, we ended up with 9 final mashup datasets (three for each factor of interest).

Id | Dataset | Description (factor of interest) | Original source datasets | Year
--- | --- | --- | --- | ---
MD1_17 | Religious observance in each region | RELIGION - % of religious observance in each region (over the total population) | D1, D3 | 2017
MD1_18 | Religious observance in each region | RELIGION - % of religious observance in each region (over the total population) | D2, D3 | 2018
MD1_19 | Religious observance in each region | RELIGION - % of religious observance in each region (over the total population) | D2, D3 | 2019
MD2_17 | Pregnancy rates in young women in each region | PREGNANCY - % of pregnancies in young women (15-25) in each region (over the total population of young women aged 15-25) | D4, D5, D6 | 2017
MD2_18 | Pregnancy rates in young women in each region | PREGNANCY - % of pregnancies in young women (15-25) in each region (over the total population of young women aged 15-25) | D4, D5, D6 | 2018
MD2_19 | Pregnancy rates in young women in each region | PREGNANCY - % of pregnancies in young women (15-25) in each region (over the total population of young women aged 15-25) | D4, D5, D6 | 2019
MD3_17 | (Higher) education rates in young women in each region | EDUCATION - % of women early leavers (18-24) in each region (over the total population) | D1, D7 | 2017
MD3_18 | (Higher) education rates in young women in each region | EDUCATION - % of women early leavers (18-24) in each region (over the total population) | D2, D7 | 2018
MD3_19 | (Higher) education rates in young women in each region | EDUCATION - % of women early leavers (18-24) in each region (over the total population) | D2, D7 | 2019

The code and more detailed documentation for the clean up and mashup phases are freely downloadable and can be found in `documentation > CLEAN.ipynb` and `documentation > MASHUP.ipynb`.
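
The percentages described in the table above can be illustrated with a short sketch. It is not the code in `MASHUP.ipynb`: all figures are invented, and summing live births, spontaneous abortions and induced abortions to obtain the number of pregnancies is an assumption made here for illustration.

```python
def percentage_by_region(events, population):
    """Return the events as a percentage of the population, region by region."""
    return {
        region: round(100 * count / population[region], 2)
        for region, count in events.items()
    }


# Invented counts for two regions (illustrative only, not Istat figures).
live_births = {"Piemonte": 900, "Lazio": 1100}
spontaneous_abortions = {"Piemonte": 60, "Lazio": 80}
induced_abortions = {"Piemonte": 40, "Lazio": 20}
young_women = {"Piemonte": 200000, "Lazio": 300000}

# Assumption: a pregnancy is counted as any of the three events above.
pregnancies = {
    region: live_births[region]
    + spontaneous_abortions[region]
    + induced_abortions[region]
    for region in live_births
}

rates = percentage_by_region(pregnancies, young_women)
print(rates)
```

The same helper could be reused for the other two factors by passing the relevant counts and the matching population totals.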

## Quality analysis
Following the Italian <a href="https://docs.italia.it/italia/daf/lg-patrimonio-pubblico/it/stabile/aspettiorg.html#qualita-dei-dati"><b>National Guidelines</b></a> ("Linee guida nazionali per la valorizzazione del patrimonio informativo pubblico"), developed in the context of the Data & Analytics Framework project by AgID and the Digital Transformation Team, we have performed a quality analysis of our source datasets to ensure their <b>good condition</b> and their <b>suitability</b> for the intended use.
Specifically, there are four main factors to look for when analysing data quality:
<ul>
<li><b>Accuracy (syntactic and semantic)</b>: the data and its attributes correctly represent the real value of the concept or event they refer to</li>
<li><b>Coherence</b>: the data and its attributes do not present any contradictions with respect to other data in the context of use by the administration owner</li>
<li><b>Completeness</b>: the data are exhaustive for what concerns every expected value and with respect to the related entities (sources) that contribute to the definition of the procedure</li>
<li><b>Timeliness (or promptness of updating)</b>: the data and its attributes refer to the "correct time" (up to date) with respect to the procedure they refer to</li>
</ul>

The following table summarises the quality of each source dataset and highlights possible flaws.

Id | Accuracy | Coherence | Completeness | Timeliness
--- | --- | --- | --- | --- 
D1 - Population 2017 | Satisfied | Satisfied | Satisfied | Satisfied
D2 - Population 2018, 2019 | Satisfied | Satisfied | Satisfied | Satisfied
D3 - Religious observance | Satisfied | Satisfied | Satisfied | Satisfied
D4 - Live births | Satisfied | Satisfied | Satisfied | Satisfied
D5 - Spontaneous abortions | Satisfied | Satisfied | Not satisfied | Satisfied
D6 - Induced abortions | Satisfied | Satisfied | Not satisfied | Satisfied
D7 - Early leavers from education | Satisfied | Satisfied | Satisfied | Satisfied
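
How completeness was assessed is not detailed here. As one possible approach, sketched under the assumption that each record carries a region, a year and a value (the field names and sample records are hypothetical), the expected region and year combinations can be compared with those actually present:

```python
from itertools import product


def missing_cells(records, regions, years):
    """List the (region, year) pairs that have no usable value in `records`."""
    present = {
        (record["region"], record["year"])
        for record in records
        if record.get("value") not in (None, "")
    }
    return [pair for pair in product(regions, years) if pair not in present]


# Hypothetical records: one empty value and one row missing altogether.
sample = [
    {"region": "Piemonte", "year": 2017, "value": 10},
    {"region": "Piemonte", "year": 2018, "value": ""},
    {"region": "Lazio", "year": 2017, "value": 5},
]

gaps = missing_cells(sample, ["Piemonte", "Lazio"], [2017, 2018])
print(gaps)
```

An empty result would suggest that every expected combination has a value, while any listed pair points to a gap worth documenting.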


## Legal analysis
The legal analysis of the source datasets is fundamental to ensure the <b>sustainability over time</b> of the production process and of the publication of the datasets, and to guarantee a <b>balanced service</b> in compliance with the public function and with individual rights. It was carried out using a reference checklist consisting of a series of binary questions on the following topics:
<ul>
<li><b>Privacy issues</b></li>
<li><b>IPR policy</b></li>
<li><b>Licences</b></li>
<li><b>Limitations on public access</b></li>
<li><b>Economic conditions</b></li>
<li><b>Temporal aspects</b></li>
</ul>

### Privacy Issues
To check: | D1 - Population 2017 | D2 - Population 2018, 2019 | D3 - Religious observance | D4 - Live births | D5 - Spontaneous abortions | D6 - Induced abortions | D7 - Early leavers from education
--- | --- | --- | --- | --- | --- | --- | --- 
Is the dataset free of any personal data as defined in the Regulation (EU) 2016/679? | Yes | Yes | Yes | Yes | Yes | Yes | Yes 
Is the dataset free of any indirect personal data that could be used for identifying the natural person? | Yes | Yes | Yes | Yes | Yes | Yes | Yes 
Is the dataset free of any particular personal data (art. 9 GDPR)? | Yes | Yes | Yes | Yes | Yes | Yes | Yes
Is the dataset free of any information that combined with common data available in the web, could identify the person? | Yes | Yes | Yes | Yes | Yes | Yes | Yes 
Is the dataset free of any information related to human rights (e.g., refugees, witness protection, etc.)? | Yes | Yes | Yes | Yes | Yes | Yes | Yes 
Did you use a tool for calculating the range of the risk of deanonymization? | Not needed | Not needed | Not needed | Not needed | Not needed | Not needed | Not needed 
Are you using geolocalization capabilities? | Yes | Yes | Yes | Yes | Yes | Yes | Yes 
Did you check that the open data platform respect all the privacy regulations? | Yes | Yes | Yes | Yes | Yes | Yes | Yes
Do you know who the Controller and the Processor of the system's privacy data are in your open data platform? | Yes | Yes | Yes | Yes | Yes | Yes | Yes
Have you checked the privacy regulation of the country where the datasets are physically stored? | Yes | Yes | Yes | Yes | Yes | Yes | Yes 
Did you have non-personal data? | Yes | Yes | Yes | Yes | Yes | Yes | Yes 

### Intellectual Property Rights
To check: | D1 - Population 2017 | D2 - Population 2018, 2019 | D3 - Religious observance | D4 - Live births | D5 - Spontaneous abortions | D6 - Induced abortions | D7 - Early leavers from education
--- | --- | --- | --- | --- | --- | --- | --- 
Have you created and generated the dataset? | No | No | No | No | No | No | No 
Are you the owner of the dataset? | No | No | No | No | No | No | No 
Is the dataset free from third party licenses or patents? | Yes | Yes | Yes | Yes | Yes | Yes | Yes
Have you checked if there are any limitations in your national legal system for releasing some kind of datasets with open license? | Yes | Yes | Yes | Yes | Yes | Yes | Yes 

### Licences
To check: | D1 - Population 2017 | D2 - Population 2018, 2019 | D3 - Religious observance | D4 - Live births | D5 - Spontaneous abortions | D6 - Induced abortions | D7 - Early leavers from education
--- | --- | --- | --- | --- | --- | --- | --- 
Did you release the dataset with an open data licence? | Yes | Yes | Yes | Yes | Yes | Yes | Yes 
Did you include the clause: "In any case the dataset can't be used for re-identifying the person"? | No | No | No | No | No | No | No 
Did you release the API (in case you have it) with an open source license? | Yes | Yes | Yes | Yes | Yes | Yes | Yes
Have you checked that the open data/API platform licence regime is in compliance with your IPR policy? | Yes | Yes | Yes | Yes | Yes | Yes | Yes


### Limitations on public access
To check: | D1 - Population 2017 | D2 - Population 2018, 2019 | D3 - Religious observance | D4 - Live births | D5 - Spontaneous abortions | D6 - Induced abortions | D7 - Early leavers from education
--- | --- | --- | --- | --- | --- | --- | --- 
Did you check that the dataset concerns your institutional competences, scope and finality? | Yes | Yes | Yes | Yes | Yes | Yes | Yes 
Did you check the limitations for the publication stated by your national legislation or by the EU directives? | Yes | Yes | Yes | Yes | Yes | Yes | Yes 
Did you check if there are some limitations connected to the international relations, public security or national defence? | Yes | Yes | Yes | Yes | Yes | Yes | Yes
Did you check if there are some limitations concerning the public interest? | Yes | Yes | Yes | Yes | Yes | Yes | Yes 
Did you check the international law limitations? | Yes | Yes | Yes | Yes | Yes | Yes | Yes
Did you check the INSPIRE law limitations for the spatial data? | Yes | Yes | Yes | Yes | Yes | Yes | Yes 


### Economic conditions
To check: | D1 - Population 2017 | D2 - Population 2018, 2019 | D3 - Religious observance | D4 - Live births | D5 - Spontaneous abortions | D6 - Induced abortions | D7 - Early leavers from education
--- | --- | --- | --- | --- | --- | --- | --- 
Did you check that the dataset could be released for free? | Yes | Yes | Yes | Yes | Yes | Yes | Yes 
Did you check if there are some agreements with some other partners in order to release the dataset at a reasonable price? | Not needed | Not needed | Not needed | Not needed | Not needed | Not needed | Not needed 
Did you check if the open data platform terms of service include a clause of "non liability agreement" regarding the dataset and API provided? | Yes | Yes | Yes | Yes | Yes | Yes | Yes
In case you decide to release the dataset at a reasonable price, did you check if the limitations imposed by the new directive 2019/1024/EU are respected? | Not needed | Not needed | Not needed | Not needed | Not needed | Not needed | Not needed 
In case you decide to release the dataset at a reasonable price, did you check the e-Commerce directive and regulation? | Not needed | Not needed | Not needed | Not needed | Not needed | Not needed | Not needed


### Temporal aspects
To check: | D1 - Population 2017 | D2 - Population 2018, 2019 | D3 - Religious observance | D4 - Live births | D5 - Spontaneous abortions | D6 - Induced abortions | D7 - Early leavers from education
--- | --- | --- | --- | --- | --- | --- | --- 
Did you have a temporal policy for updating the dataset? | No | Yes | Yes | Yes | Yes | Yes | Yes 
Did you have some mechanism for informing the end-user that the dataset is updated at a given time to avoid mis-usage and so potential risk of damage? | Yes | Yes | Yes | Yes | Yes | Yes | Yes 
Did you check if the dataset for some reason cannot be indexed by the search engines (e.g., Google, Yahoo, etc.)? | Yes | Yes | Yes | Yes | Yes | Yes | Yes
In case of personal data, do you have a reasonable technical mechanism for collecting deletion requests (e.g., right to be forgotten)? | Not needed | Not needed | Not needed | Not needed | Not needed | Not needed | Not needed

## Ethical analysis


For the ethical analysis of our data, we considered the <a href="https://dataethics.eu/data-ethics-principles/">Data Ethics Principles and Guidelines</a> and the detailed <a href="https://theodi2022.wpengine.com/article/the-data-ethics-canvas-2021/#1674123020653-5e9f001c-5eb8" target="_blank">Data Ethics Canvas</a> by the Open Data Institute (ODI) to evaluate the ethical aspects of our data processing.


          
Since both our source and mashup datasets contain information provided exclusively by the Italian National Institute of Statistics (Istat), we first analysed the fairness of Istat's data collection and management, and then established guidelines for addressing ethical concerns when processing the source datasets for our project.
          
            

### Data Ethics Principles
<ul>
<li><b>Human being at the center:</b> As stated on the institution's <a href="https://www4.istat.it/en/about-istat" target="_blank">website</a>, Istat's policy is aligned with ethical and legislative principles. Its primary aim is to publish and communicate statistical information and the results of its analyses effectively, in order to foster awareness of Italy's conditions and to improve the decision-making processes of private subjects and public institutions. Furthermore, Istat is committed to conducting methodological and applied research with the aim of improving statistical production processes and enhancing Italy's statistical literacy.</li>
<li><b>Transparency:</b> The data gathered is managed transparently. Ample <a href="https://siqual.istat.it/SIQual/elencoCompleto.dodocumentation" target="_blank"><b>documentation</b></a> on the data collection methods and on the meaning of the dataset terms, licences and policies is made available to final users in order to avoid misinterpretations.</li>
<li><b>Accountability:</b> Istat's quality policy is coherent with the European framework developed by Eurostat and adheres to the principles of the <a href="https://ec.europa.eu/eurostat/web/quality/european-quality-standards/european-statistics-code-of-practice" target="_blank"><b>European Statistics Code of Practice</b></a>, which ensures and strengthens both the accountability and the governance of the European Statistical System and of the National Statistical Systems within it.</li>
<li><b>Individual data protection:</b> Istat's datasets are anonymised and, as stated in their <a href="https://www.istat.it/it/censimenti/imprese/normativa-e-privacy" target="_blank"><b>regulations and privacy page</b></a>, the institute respects the privacy of respondents, protects the confidentiality of the data that it gathers and carries out its activities in a transparent, independent manner. It is clearly stated that the information collected is protected by statistical confidentiality (Article 9 of Legislative Decree No. 322/1989) and subject to the legislation on the protection of personal data (Regulation (EU) 2016/679, Legislative Decree No. 196/2003, Legislative Decree No. 101/2018).</li>
</ul>

### Ethical concerns and their management

Despite Istat's compliance with the ethical principles of data collection and management, the team placed special importance on the ethical handling of the source information, given the great sensitivity of its contents.

Data related to age, residence, religious observance and reproductive health are indeed highly sensitive, and the ethical aspects of their handling were carefully considered through the following steps:

<ul>
<li>Data <b>integrity</b> and <b>privacy</b> are further protected by aggregating the source dataset values and presenting them as percentages, in order to avoid any correlation with real individuals.</li>
<li>Clear <b>boundaries</b> were set by intentionally omitting some of the data made available by Istat, such as the citizenship distinction, from all of our data, since it could lead to discriminatory behaviour.</li>
<li>The team's <b>purpose</b> was to discover possible patterns, not to make inferences or interpretations.</li>
<li>All relevant <b>documentation</b> regarding the data processing carried out to create the mashup datasets and visualisations is provided on our GitHub website.</li>
</ul>



## Technical analysis

All source datasets have been evaluated against the <b><a href="https://docs.italia.it/italia/daf/lg-patrimonio-pubblico/it/stabile/modellometadati.html" target="_blank">metadata model</a> provided by AgID</b>, which classifies metadata quality on a scale of 4 levels according to two factors: <i>data-metadata bond</i> and <i>detail level</i>.

Source datasets:

Id | Provenience | Format | Metadata | URI | Licence
--- | --- | --- | --- | --- | ---
D1 | [demo](https://demo.istat.it/?l=en) | .csv, .xlsx, .pdf | Level 2: a weak data-metadata bond, since additional information and methodology reports are available only in an external <b>PDF</b>; dataset-level detail, i.e. the information is shared by all the data in the dataset | [Link](https://demo.istat.it/app/?i=RIC&l=en) | CC BY 3.0
D2 | [demo](https://demo.istat.it/?l=en) | .csv, .xlsx, .pdf | Level 1: Not provided |  [Link](https://demo.istat.it/app/?i=POS&l=en) | CC BY 3.0
D3 | [I.Stat](http://dati.istat.it/?lang=en) | .csv, .xlsx, .px, .xml | Level 4: a downloadable SDMX structured file provides a strong data-metadata bond and datum-level detail; the metadata are machine-readable.<br> Level 2: additional metadata giving transparent information about sources and methodologies are available on a separate [webpage](https://siqual.istat.it/SIQual/visualizza.do?id=0058000), accessible through a sidebar menu | [Link](http://dati.istat.it/index.aspx?queryid=24349) | CC BY 3.0
D4 | [IstatData](https://esploradati.istat.it/databrowser/#/) | .json, .xml, .xlsx, .csv | Level 4: a downloadable SDMX structured file provides a strong data-metadata bond and datum-level detail; the metadata are machine-readable. | [Link](https://esploradati.istat.it/databrowser/#/en/dw/categories/IT1,POP,1.0/POP_BIRTHFERT/DCIS_NATI1/DCIS_NATI1_PARENT_CHARACT/IT1,25_74_DF_DCIS_NATI1_8,1.0) | CC BY 3.0
D5 | [I.Stat](http://dati.istat.it/?lang=en) | .csv, .xlsx, .px, .xml | Level 4: a downloadable SDMX structured file provides a strong data-metadata bond and datum-level detail; the metadata are machine-readable.<br> Level 2: additional metadata giving transparent information about sources and methodologies are available on a separate [webpage](https://siqual.istat.it/SIQual/visualizza.do?id=5000132&refresh=true&language=EN), accessible through a sidebar menu | [Link](http://dati.istat.it/index.aspx?queryid=29218) | CC BY 3.0
D6 | [I.Stat](http://dati.istat.it/?lang=en) | .csv, .xlsx, .px, .xml | Level 4: a downloadable SDMX structured file provides a strong data-metadata bond and datum-level detail; the metadata are machine-readable.<br> Level 2: additional metadata giving transparent information about sources and methodologies are available on a separate [webpage](https://siqual.istat.it/SIQual/visualizza.do?id=0038900&refresh=true&language=EN), accessible through a sidebar menu | [Link](http://dati.istat.it/index.aspx?queryid=7098) | CC BY 3.0
D7 | [I.Stat](http://dati.istat.it/?lang=en) | .csv, .xlsx, .px, .xml | Level 4: a downloadable SDMX structured file provides a strong data-metadata bond and datum-level detail; the metadata are machine-readable.<br> Level 2: additional metadata giving transparent information about sources and methodologies are available on a separate [webpage](https://siqual.istat.it/SIQual/sintesi.do?id=5000098), accessible through a sidebar menu | [Link](http://dati.istat.it/Index.aspx?DataSetCode=DCCV_ESL_UNT2020) | CC BY 3.0

Mashup datasets:

Id | Creation date | Format | Metadata | URI | Licence
--- | --- | --- | --- | --- | ---
MD1 | creation_date | .csv | Provided | [MD1_17](https://github.com/OrsolaMBorrini/blessedfruit/blob/main/data/mashupDS/MD1_17.csv), [MD1_18](https://github.com/OrsolaMBorrini/blessedfruit/blob/main/data/mashupDS/MD1_18.csv), [MD1_19](https://github.com/OrsolaMBorrini/blessedfruit/blob/main/data/mashupDS/MD1_19.csv) | CC BY 4.0
MD2 | creation_date | .csv | Provided | [MD2_17](https://github.com/OrsolaMBorrini/blessedfruit/blob/main/data/mashupDS/MD2-PERC-2017.csv), [MD2_18](https://github.com/OrsolaMBorrini/blessedfruit/blob/main/data/mashupDS/MD2-PERC-2018.csv), [MD2_19](https://github.com/OrsolaMBorrini/blessedfruit/blob/main/data/mashupDS/MD2-PERC-2019.csv) | CC BY 4.0
MD3 | creation_date | .csv | Provided | [MD3_17](https://github.com/OrsolaMBorrini/blessedfruit/blob/main/data/mashupDS/MD3_17.csv), [MD3_18](https://github.com/OrsolaMBorrini/blessedfruit/blob/main/data/mashupDS/MD3_18.csv), [MD3_19](https://github.com/OrsolaMBorrini/blessedfruit/blob/main/data/mashupDS/MD3_19.csv) | CC BY 4.0

## Visualizations

The visualisation of the final datasets involved two steps:
<ul>
<li>Data processing, to create JSON files with the data organised in the appropriate structure.</li>
<li>Data visualisation, using different graphs and various libraries.</li>
</ul>
    
In detail, the following libraries were used:
<ul>
<li>leaflet.js, for creating the choropleth maps (<a href="https://github.com/Leaflet/Leaflet/blob/main/LICENSE">BSD 2-Clause "Simplified" License</a>)</li>
<li>plotly.js, for the bar and bubble charts (<a href="https://github.com/plotly/plotly.js/blob/master/LICENSE">MIT License</a>)</li>
<li>amcharts, for the pie charts regarding pregnancy statistics (<a href="https://github.com/amcharts/amcharts5/blob/master/packages/shared/LICENSE">linkware license</a>)</li>
</ul>


The code for creating the JSON files can be found in the <a href="https://github.com/OrsolaMBorrini/blessedfruit/tree/main/visualisations">scripts section</a> of our GitHub repository.

The JavaScript files for creating and displaying the visualisations can be found in the <a href="https://github.com/OrsolaMBorrini/blessedfruit/tree/main/assets/js">assets folder</a> of our website.
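
The data processing step can be pictured with a minimal sketch. It is not the script from the repository: the column names and values are hypothetical, and the JSON layout actually expected by the JavaScript files may differ.

```python
import csv
import io
import json

# Hypothetical mashup content: column names and values are illustrative only.
MASHUP_CSV = """Region,Percentage
Piemonte,0.5
Lazio,0.4
"""


def csv_to_json(csv_text, key_column, value_column):
    """Turn a two-column CSV into a JSON object keyed by region."""
    reader = csv.DictReader(io.StringIO(csv_text))
    data = {row[key_column]: float(row[value_column]) for row in reader}
    return json.dumps(data, indent=2, ensure_ascii=False)


json_text = csv_to_json(MASHUP_CSV, "Region", "Percentage")
print(json_text)
```

Keying the values by region name is a convenient shape for a choropleth map, where each region's polygon looks up its own value.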

## RDF Assertion of the metadata

All the mashup datasets produced have been thoroughly described with metadata, following the specification of the <a href="https://docs.italia.it/italia/daf/linee-guida-cataloghi-dati-dcat-ap-it/it/stabile/index.html" target="_blank"><b>DCAT-AP_IT</b></a> standard, as recommended by <b>AgID's guidelines for the valorisation of public information heritage</b>.

Since all our datasets contain data of specific national interest and are derived from datasets by Istat, an Italian public research institution, we decided to adopt <b>DCAT-AP_IT</b> (2016), the national standard. Although it is based on the first version of the European standard and has more constraints than the more recent and flexible <a href="https://www.w3.org/TR/vocab-dcat-2/" target="_blank">DCAT-AP 2.0</a>, we considered DCAT-AP_IT the more suitable standard for our mashup datasets: on the one hand, because <b>an increasing number of Public Administrations</b> in the Italian public sector <b>are adopting DCAT-AP_IT</b>; on the other hand, because this allowed us to follow more detailed national guidelines and therefore <b>ensure interoperability and harmonisation with other data on a national level</b>.
            
Moreover:

* To describe in full transparency the sources and the activities underlying the creation of our mashup datasets, we adopted <a href="https://www.w3.org/TR/prov-o/" target="_blank"><b>PROV-O, the provenance ontology</b></a>, as strongly recommended at the European level and also allowed by the DCAT-AP_IT <a href="https://docs.italia.it/italia/daf/lg-patrimonio-pubblico/it/stabile/modellometadati.html" target="_blank">metadata model</a>.
* Since all our mashup datasets are series containing one individual dataset per year (2017, 2018, 2019), and only DCAT-AP 3.0 currently provides a <code>dcat:DatasetSeries</code> class with related properties, we again followed the instructions of AgID's <a href="https://docs.italia.it/italia/daf/lg-patrimonio-pubblico/it/stabile/modellometadati.html" target="_blank">metadata model</a> on how to handle relationships between datasets.

We emphasised the individual elements of each series: inside each mashup dataset's RDF assertion, we created a triple whose subject is the series dataset, connected through the Dublin Core property <code>dct:type</code> to the value &lt;http://inspire.ec.europa.eu/metadata-codelist/ResourceType/series&gt;.

Then we specified which datasets belong to the series by means of the <code>dct:hasPart</code> property.

Finally, each individual yearly mashup dataset is in turn connected to its series by means of <code>dct:isPartOf</code>.
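
The series pattern described above can be sketched as follows. This is an illustration only, not our published RDF: the `https://example.org/blessedfruit/` base URI and the dataset identifiers are hypothetical, and the triples are written out as plain N-Triples strings without any RDF library.

```python
DCT = "http://purl.org/dc/terms/"
SERIES_TYPE = "http://inspire.ec.europa.eu/metadata-codelist/ResourceType/series"
BASE = "https://example.org/blessedfruit/"  # hypothetical base URI


def series_triples(series_id, years):
    """Build the triples linking a series dataset to its yearly datasets."""
    series = f"{BASE}{series_id}"
    triples = [(series, DCT + "type", SERIES_TYPE)]
    for year in years:
        part = f"{BASE}{series_id}_{year}"
        triples.append((series, DCT + "hasPart", part))  # series -> yearly dataset
        triples.append((part, DCT + "isPartOf", series))  # yearly dataset -> series
    return triples


def to_ntriples(triples):
    """Serialise (subject, predicate, object) URI triples as N-Triples lines."""
    return "\n".join(f"<{s}> <{p}> <{o}> ." for s, p, o in triples)


md1_triples = series_triples("MD1", ["17", "18", "19"])
print(to_ntriples(md1_triples))
```

Stating both `dct:hasPart` and `dct:isPartOf` makes the relationship navigable from either the series or the single yearly dataset.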

The RDF assertions can be downloaded from <a href="https://orsolamborrini.github.io/blessedfruit/" target="_blank">Blessed be the fruit</a>.

## Sustainability of the update

The source datasets used for "Blessed be the fruit" are provided exclusively by the Italian National Institute of Statistics (Istat), which maintains them in its various databases. Since the content of I.Stat will soon be moved to IstatData, the URIs provided in this project for the source datasets D3, D5, D6, and D7 will eventually become obsolete.

However, "Blessed be the fruit" is the final project developed for the "Open Access and Digital Ethics" course (A.Y. 2022/2023) within the Digital Humanities and Digital Knowledge Master's Degree (University of Bologna) and, as such, is not actively maintained and will not be updated in the future.