<?xml version="1.0" encoding="utf-8" standalone="yes" ?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom">
  <channel>
    <title>Posts | Green Deal Data Observatory</title>
    <link>https://greendeal.dataobservatory.eu/post/</link>
      <atom:link href="https://greendeal.dataobservatory.eu/post/index.xml" rel="self" type="application/rss+xml" />
    <description>Posts</description>
    <generator>Wowchemy (https://wowchemy.com)</generator><language>en-us</language><lastBuildDate>Wed, 09 Nov 2022 11:46:00 +0100</lastBuildDate>
    <image>
      <url>https://greendeal.dataobservatory.eu/media/icon_hu15ef3b829c0a4063327dbf09185a10cc_70008_512x512_fill_lanczos_center_3.png</url>
      <title>Posts</title>
      <link>https://greendeal.dataobservatory.eu/post/</link>
    </image>
    
    <item>
      <title>New Data Curators Wanted</title>
      <link>https://greendeal.dataobservatory.eu/post/2022-11-09_become_data_curator/</link>
      <pubDate>Wed, 09 Nov 2022 11:46:00 +0100</pubDate>
      <guid>https://greendeal.dataobservatory.eu/post/2022-11-09_become_data_curator/</guid>
      <description>&lt;p&gt;A data curator is a contributor in our open collaboration who will be named as a co-creator of tidy, standardized, reusable, FAIR datasets in his/her field of expertise.  Our curators help us vocalize the needs of their domain, be it data-driven beekeeping or detecting algorithmic biases of recommender systems, and evaluate whether the data that we come up with is directly usable and actionable. A data curator is a co-author in the same sense as a “contributor” to open source software or a co-author of a journal article.&lt;/p&gt;
&lt;details class=&#34;toc-inpage d-print-none  &#34; open&gt;
  &lt;summary class=&#34;font-weight-bold&#34;&gt;Table of Contents&lt;/summary&gt;
  &lt;nav id=&#34;TableOfContents&#34;&gt;
  &lt;ul&gt;
    &lt;li&gt;&lt;a href=&#34;#boost-your-career-without-a-conflict-of-interest&#34;&gt;Boost your career without a conflict of interest&lt;/a&gt;&lt;/li&gt;
  &lt;/ul&gt;

  &lt;ul&gt;
    &lt;li&gt;&lt;a href=&#34;#how-to-become-a-data-curator&#34;&gt;How to become a data curator?&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&#34;#find-inspiration-from-other-contributors&#34;&gt;Find inspiration from other contributors&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&#34;#why-data-observatories&#34;&gt;Why data observatories?&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&#34;#good-to-know&#34;&gt;Good to know&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&#34;#watch-our-2-min-introduction&#34;&gt;Watch Our 2-min Introduction&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&#34;#vote-reprex-&#34;&gt;Vote Reprex :)&lt;/a&gt;&lt;/li&gt;
  &lt;/ul&gt;
&lt;/nav&gt;
&lt;/details&gt;

&lt;h2 id=&#34;boost-your-career-without-a-conflict-of-interest&#34;&gt;Boost your career without a conflict of interest&lt;/h2&gt;
&lt;p&gt;Being a data curator does not mean a commercial affiliation with any observatory partners; it is an affiliation to jointly create intellectual property.  All our data curators are identified by their ORCiD IDs and named as co-creators in the open science repositories where we make our data available.&lt;/p&gt;
&lt;p&gt;We create CC0 data that can be used for commercial, academic, and policy purposes.
However, we want to honor the intellectual investment into a shared intellectual property by&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; delaying the release (so that our curators remain competitive in academic publishing if they use the data in new articles, or so that NGOs can use it first in their campaigns)&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; creating hybrid assets for commercial users where some elements, particularly the ones that use their proprietary data, may not become open data.&lt;/li&gt;
&lt;/ul&gt;
&lt;td style=&#34;text-align: center;&#34;&gt;















&lt;figure  id=&#34;figure-to-become-a-data-curator-you-do-not-need-to-be-a-data-scientist-a-statistician-or-a-data-engineer-we-are-looking-for-professionals-researchers-or-citizen-scientists-who-are-interested-in-data-and-its-visualization-and-its-potential-to-form-the-basis-of-informed-business-or-policy-decisions-and-to-provide-scientific-or-legal-evidence-our-ideal-curators-share-a-passion-for-data-driven-evidence-or-visualizations-and-have-a-strong-subjective-idea-about-the-data-that-would-inform-them-in-their-work-more-storieshttpscuratorsdataobservatoryeuinspirationhtml&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;To become a data curator, you do not need to be a data scientist, a statistician, or a data engineer. We are looking for professionals, researchers, or citizen scientists who are interested in data and its visualization, and its potential to form the basis of informed business or policy decisions and to provide scientific or legal evidence. Our ideal curators share a passion for data-driven evidence or visualizations, and have a strong, subjective idea about the data that would inform them in their work. [More stories:](https://curators.dataobservatory.eu/inspiration.html)&#34; srcset=&#34;
               /media/img/blogposts_2022/schmidt_pain_index_hud3f80cc147c2f7adb46afab3af6a506c_252346_e52e4bf4ebb66de2017e05631751b533.webp 400w,
               /media/img/blogposts_2022/schmidt_pain_index_hud3f80cc147c2f7adb46afab3af6a506c_252346_78606c8ad7b3d6c683e968255240eb6d.webp 760w,
               /media/img/blogposts_2022/schmidt_pain_index_hud3f80cc147c2f7adb46afab3af6a506c_252346_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/blogposts_2022/schmidt_pain_index_hud3f80cc147c2f7adb46afab3af6a506c_252346_e52e4bf4ebb66de2017e05631751b533.webp&#34;
               width=&#34;760&#34;
               height=&#34;380&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption&gt;
      To become a data curator, you do not need to be a data scientist, a statistician, or a data engineer. We are looking for professionals, researchers, or citizen scientists who are interested in data and its visualization, and its potential to form the basis of informed business or policy decisions and to provide scientific or legal evidence. Our ideal curators share a passion for data-driven evidence or visualizations, and have a strong, subjective idea about the data that would inform them in their work. &lt;a href=&#34;https://curators.dataobservatory.eu/inspiration.html&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;More stories:&lt;/a&gt;
    &lt;/figcaption&gt;&lt;/figure&gt;&lt;/td&gt;
&lt;h1 id=&#34;fair&#34;&gt;FAIR: Findable, Accessible, Interoperable, and Reusable Digital Assets&lt;/h1&gt;
&lt;p&gt;Our observatories do not &lt;em&gt;only&lt;/em&gt; work with open data.&lt;/p&gt;
&lt;td style=&#34;text-align: center;&#34;&gt;















&lt;figure  id=&#34;figure-we-want-to-make-data-findable-interoperable-to-be-used-in-many-applications-accessible-and-eventually-reusable-fairhttpswwwgo-fairorgfair-principles-but-that-does-not-mean-that-all-data-used-must-be-free--creating-and-especially-regularly-updating-high-quality-data-assets-requires-plenty-of-intellectual-and-monetary-investment&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;We want to make data findable, interoperable (to be used in many applications), accessible, and eventually reusable ([FAIR](https://www.go-fair.org/fair-principles/)), but that does not mean that all data used must be free.  Creating and especially regularly updating high-quality data assets requires plenty of intellectual and monetary investment.&#34; srcset=&#34;
               /media/img/logos/go_fair_hu82cc98f87d90836633a3f79ca5da135b_354091_9118d9a294dbb1c4dc45d41f8a9e30a9.webp 400w,
               /media/img/logos/go_fair_hu82cc98f87d90836633a3f79ca5da135b_354091_8cc792113d369e6e2dcf38f58a42cbcb.webp 760w,
               /media/img/logos/go_fair_hu82cc98f87d90836633a3f79ca5da135b_354091_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/logos/go_fair_hu82cc98f87d90836633a3f79ca5da135b_354091_9118d9a294dbb1c4dc45d41f8a9e30a9.webp&#34;
               width=&#34;760&#34;
               height=&#34;428&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption&gt;
      We want to make data findable, interoperable (to be used in many applications), accessible, and eventually reusable (&lt;a href=&#34;https://www.go-fair.org/fair-principles/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;FAIR&lt;/a&gt;), but that does not mean that all data used must be free.  Creating and especially regularly updating high-quality data assets requires plenty of intellectual and monetary investment.
    &lt;/figcaption&gt;&lt;/figure&gt;&lt;/td&gt;
&lt;p&gt;We gladly add commercially available data to our observatory if we can share a large enough subset that our peer-reviewers can attest to the data’s high quality, usability, and actionability.&lt;/p&gt;
&lt;h2 id=&#34;how-to-become-a-data-curator&#34;&gt;How to become a data curator?&lt;/h2&gt;
















&lt;figure  id=&#34;figure-our-handbook-for-curators-a-bit-of-a-work-in-progress-but-the-onboarding-processhttpscuratorsdataobservatoryeuonboardinghtml-is-clear-do-not-worry-if-you-do-not-use-github-it-is-not-necessary-but-we-story-and-co-create-our-assets-including-the-curators-handbook-on-this-digital-co-working-place&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;Our handbook for curators is a bit of a work in progress, but the [onboarding process](https://curators.dataobservatory.eu/onboarding.html) is clear. Do not worry if you do not use GitHub; it is not necessary, but we store and co-create our assets, including the curator&amp;#39;s handbook, on this digital co-working place.&#34; srcset=&#34;
               /media/img/screenshots/curators_handbook_huef55d2fed0e639025b1c6d353d865d8d_220858_7335ec2e8cd460e0e5b0dd0ac54a5328.webp 400w,
               /media/img/screenshots/curators_handbook_huef55d2fed0e639025b1c6d353d865d8d_220858_37c2621cb494a331a31400030919a138.webp 760w,
               /media/img/screenshots/curators_handbook_huef55d2fed0e639025b1c6d353d865d8d_220858_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/screenshots/curators_handbook_huef55d2fed0e639025b1c6d353d865d8d_220858_7335ec2e8cd460e0e5b0dd0ac54a5328.webp&#34;
               width=&#34;760&#34;
               height=&#34;428&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption&gt;
      Our handbook for curators is a bit of a work in progress, but the &lt;a href=&#34;https://curators.dataobservatory.eu/onboarding.html&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;onboarding process&lt;/a&gt; is clear. Do not worry if you do not use GitHub; it is not necessary, but we store and co-create our assets, including the curator&amp;rsquo;s handbook, on this digital co-working place.
    &lt;/figcaption&gt;&lt;/figure&gt;
&lt;ul&gt;
&lt;li&gt;&lt;input disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; This is an &lt;a href=&#34;https://curators.dataobservatory.eu/onboarding.html&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;open book&lt;/a&gt; that we co-create on GitHub. If you hit any roadblocks, do not understand something, or have a better idea on how to illustrate or explain things, just fork this &lt;a href=&#34;https://github.com/dataobservatory-eu/curators/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;repo&lt;/a&gt;, improve it, add new photos, and send us a pull request. (You need an invite first for editing!)&lt;/li&gt;
&lt;li&gt;&lt;input disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; Here is a starter &lt;a href=&#34;https://github.com/dataobservatory-eu/new-contributors&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;repository&lt;/a&gt; on GitHub. Not mandatory, but if you use GitHub, start here.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;In a nutshell:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; Please read the &lt;a href=&#34;https://www.contributor-covenant.org/version/2/1/code_of_conduct/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;entire covenant
here&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; We need a very brief biography: name, affiliation, education details, and both a one-line and a short biography. Please send back this &lt;a href=&#34;https://raw.githubusercontent.com/dataobservatory-eu/new-contributors/main/biography/bio_template.txt&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;bio_template.txt text
file&lt;/a&gt;. If you know markdown, use &lt;a href=&#34;https://github.com/dataobservatory-eu/new-contributors/blob/main/biography/_index.md&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;this
version&lt;/a&gt;.
The files are identical, but your word processor may not know how to
open an .md file.&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; Your &lt;a href=&#34;https://orcid.org/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;ORCiD&lt;/a&gt; to resolve ambiguity with
similarly named people. If you use other library or publication
service IDs, such as Google Scholar or Publons, you may provide
them, too, but we do need an ORCiD ID, because most of the EU open
science infrastructure and the R ecosystem use this one. If you do
not have one, please create it; it only takes a few minutes. Please
add it to the bio_template.txt.&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; Your LinkedIn ID. Please add it to the bio_template.txt.&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; You should follow our file naming conventions and avoid special
characters in file names at all times: space, &lt;code&gt;$&lt;/code&gt;,
&lt;code&gt;:&lt;/code&gt;, &lt;code&gt;;&lt;/code&gt;, &lt;code&gt;,&lt;/code&gt;, &lt;code&gt;.&lt;/code&gt;, &lt;code&gt;&amp;quot;&lt;/code&gt;, &lt;code&gt;&#39;&lt;/code&gt; (tick), or backtick.&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; You must send a profile picture that is at least 500px wide (jpg or png format). It can be bigger, and preferably not a very &amp;ldquo;narrow&amp;rdquo; cut, as all avatars will be behind a circular mask (see &lt;a href=&#34;https://greendeal.dataobservatory.eu/#partners&#34;&gt;other curators&lt;/a&gt;).&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;find-inspiration-from-other-contributors&#34;&gt;Find inspiration from other contributors&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://greendeal.dataobservatory.eu/post/2021-06-08-data-curator-karel-volckaert/&#34;&gt;Credibility is Enhanced Through Cross Links Between Different Data from Different Domains&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;post/2021-06-10-founder-daniel-antal/&#34;&gt;Open Data is Like Gold in the Mud Below the Chilly Waves of Mountain Rivers&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;post/2021-06-09-team-annette-wong/&#34;&gt;Educate and Train Data Admirers that Data is not&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://greendeal.dataobservatory.eu/post/2021-06-08-developer-botond-vitos/&#34;&gt;Developing an Open API is the Right Direction&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://greendeal.dataobservatory.eu/post/2021-06-07-data-curator-pyry-kantanen/&#34;&gt;Comparing Data to Oil is a Cliché: Crude Oil Has to Go Through a Number of Steps and Pipes Before it Becomes Useful&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href=&#34;https://greendeal.dataobservatory.eu/post/2021-06-07-introducing-suzan-sidal/&#34;&gt;We Need More Reliable Datasets on the Urban Heat Resilience and Disaster Risk Reduction&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;why-data-observatories&#34;&gt;Why data observatories?&lt;/h2&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Our &lt;code&gt;data observatories&lt;/code&gt; (platform products) cover our R&amp;amp;D and platform costs while giving us access to an expanding range of prime clients. We use 21st-century open-source data engineering solutions, a decentralized data governance method, and web 3.0 technologies to avoid conflicts of interest and prevent the data Sisyphus of error-prone human data wrangling.  There is little competition on this service level (there are about 60 UN/EU/OECD recognized data observatories, and almost all of them are managed by a different operator).  This layer is already monetized, and we have proven success. Our unique advantage is a combination of legal and technological skills: understanding legally open data, web 3.0, and data modeling, and the ability to participate in the open-source statistical/scientific software creator community.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;We create &lt;code&gt;open-source software applications&lt;/code&gt; that fuel our data observatories with unprocessed, open, linked data. We create software for the R statistical environment, which is used in official statistics and in many business and academic organizations. The production of R software components is a competitive field, but we believe that our position is strong: the vast majority of R packages are lightly maintained or not maintained at all because of the lack of financing.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
















&lt;figure  id=&#34;figure-reprex-produces-open-source-scientific-softwarehttpsreprexnlreleases-and-various-collaborative-data-engineering-infrastructures-to-get-legally-open-governmental-data-and-open-science-data-in-a-timely-usable-format-to-ecological-researchers-and-ecotech-innovators&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;Reprex produces [open-source scientific software](https://reprex.nl/#releases), and various collaborative data engineering infrastructures to get legally open governmental data and open science data in a timely, usable format to ecological researchers, and ecotech innovators.&#34; srcset=&#34;
               /media/img/blogposts_2022/reprex_comet_white_6x4_hude94dbe45b764017a0281a2d3b53aa2f_47062_56c9b4c03a282adb587dce3e55b03854.webp 400w,
               /media/img/blogposts_2022/reprex_comet_white_6x4_hude94dbe45b764017a0281a2d3b53aa2f_47062_48db3882e8585d62f0962a3ef76c04e4.webp 760w,
               /media/img/blogposts_2022/reprex_comet_white_6x4_hude94dbe45b764017a0281a2d3b53aa2f_47062_1200x1200_fit_q75_h2_lanczos_2.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/blogposts_2022/reprex_comet_white_6x4_hude94dbe45b764017a0281a2d3b53aa2f_47062_56c9b4c03a282adb587dce3e55b03854.webp&#34;
               width=&#34;760&#34;
               height=&#34;507&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption&gt;
      Reprex produces &lt;a href=&#34;https://reprex.nl/#releases&#34;&gt;open-source scientific software&lt;/a&gt;, and various collaborative data engineering infrastructures to get legally open governmental data and open science data in a timely, usable format to ecological researchers, and ecotech innovators.
    &lt;/figcaption&gt;&lt;/figure&gt;
&lt;ol start=&#34;3&#34;&gt;
&lt;li&gt;
&lt;p&gt;We provide &lt;code&gt;bespoke analytics solutions&lt;/code&gt; to our institutional partners in our data observatories. Such bespoke solutions iterate over our existing software components, helping us design better applications within an ever-expanding ecosystem. Providing tailored data-science services would require a large organization without a clear focus. We provide these services on an ad-hoc basis only among institutional partners and users of our data observatories. In these circles, which are often prime clients, we face little or no competition because we are trusted partners and data and solution providers. This is a key to our revenue and market growth.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;We develop high-value &lt;code&gt;software-as-service applications&lt;/code&gt; that leverage our data observatory assets and our software solutions into novel, commercially valuable uses. Our applications are built around our family of open-source software and generalize our bespoke analytics solutions. We are in a late prototype phase where we already have some revenue and are trying to prepare for scaling up at the correct price with three of our applications. All of our applications are entering highly competitive market segments. We are building on our ‘unfair’ advantage: we bundle our solutions with data that is not accessible to competitors, and we can test them in the protected ecosystems of our observatories.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id=&#34;good-to-know&#34;&gt;Good to know&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a href=&#34;https://www.go-fair.org/fair-principles/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;FAIR Principles&lt;/a&gt;:
improve the Findability, Accessibility, Interoperability, and Reuse
of digital assets.&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a href=&#34;https://support.datacite.org/docs/datacite-metadata-schema-44&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;DataCite&lt;/a&gt;:
A persistent, standardized approach to access, identification,
sharing, and re-use of datasets—this is our favored way of
describing data for future use according to the FAIR principles.
Many EU open science repositories will ask you to submit your publications with
this documentation.&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; Biblatex is a standard text file used by citation engines,
bibliography management tools, and in scientific publication
templates. (See for example the Overleaf &lt;a href=&#34;https://www.overleaf.com/learn/latex/Articles/Getting_started_with_BibLaTeX&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Biblatex
tutorial&lt;/a&gt;.)&lt;/li&gt;
&lt;li&gt;&lt;input disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; Dublin Core is an older international standard than DataCite,
but the two standards greatly overlap. Dublin Core was originally
developed by libraries. You often may need to fill out Dublin Core
properties for publication.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;watch-our-2-min-introduction&#34;&gt;Watch Our 2-min Introduction&lt;/h2&gt;

&lt;div style=&#34;position: relative; padding-bottom: 56.25%; height: 0; overflow: hidden;&#34;&gt;
  &lt;iframe src=&#34;https://www.youtube.com/embed/bgp-n55TKCk&#34; style=&#34;position: absolute; top: 0; left: 0; width: 100%; height: 100%; border:0;&#34; allowfullscreen title=&#34;YouTube Video&#34;&gt;&lt;/iframe&gt;
&lt;/div&gt;

&lt;p&gt;⚙️/ Subtitles/ 🇳🇱 🇬🇧 🇧🇦 🇨🇿 🇭🇺 🇩🇪 🇱🇹 🇫🇷 🇸🇰 🇪🇸 🇹🇷 + Catalan.&lt;/p&gt;
&lt;h2 id=&#34;vote-reprex-&#34;&gt;Vote Reprex :)&lt;/h2&gt;
&lt;p&gt;Go to &lt;a href=&#34;https://greendeal.dataobservatory.eu/post/2022-11-07_vote_reprex/&#34;&gt;Cast your vote for The Hague Innovators challenge 2022!&lt;/a&gt; and choose Reprex :)&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Big Data for All: Building Collaborative Data Observatories</title>
      <link>https://greendeal.dataobservatory.eu/post/2022-11-03_ehv_innovation_cafe/</link>
      <pubDate>Thu, 03 Nov 2022 17:30:00 +0000</pubDate>
      <guid>https://greendeal.dataobservatory.eu/post/2022-11-03_ehv_innovation_cafe/</guid>
      <description>&lt;p&gt;Reprex&amp;rsquo;s co-founder, &lt;a href=&#34;https://greendeal.dataobservatory.eu/authors/daniel_antal&#34;&gt;Daniel Antal&lt;/a&gt;, talked at the &lt;a href=&#34;https://www.ehvinnovationcafe.org/past-events/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Eindhoven Innovation Café&lt;/a&gt; about these issues. You can watch the recorded version of the livestream; the talk starts at 5 minutes and 22 seconds:&lt;/p&gt;

&lt;div style=&#34;position: relative; padding-bottom: 56.25%; height: 0; overflow: hidden;&#34;&gt;
  &lt;iframe src=&#34;https://www.youtube.com/embed/kM54gAAbHY0&#34; style=&#34;position: absolute; top: 0; left: 0; width: 100%; height: 100%; border:0;&#34; allowfullscreen title=&#34;YouTube Video&#34;&gt;&lt;/iframe&gt;
&lt;/div&gt;

&lt;p&gt;&lt;em&gt;This is a past event&lt;/em&gt;. Check out our forthcoming &lt;a href=&#34;https://greendeal.dataobservatory.eu/#talks&#34;&gt;events&lt;/a&gt; or write to &lt;a href=&#34;https://www.linkedin.com/in/antaldaniel/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;
  &lt;i class=&#34;fab fa-linkedin  pr-1 fa-fw&#34;&gt;&lt;/i&gt; Daniel Antal&lt;/a&gt;  or to &lt;a href=&#34;https://keybase.io/antaldaniel&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;
  &lt;i class=&#34;fab fa-keybase  pr-1 fa-fw&#34;&gt;&lt;/i&gt; antaldaniel&lt;/a&gt;. Or send an &lt;a href=&#34;https://greendeal.dataobservatory.eu/contact/&#34;&gt;
  &lt;i class=&#34;fas fa-envelope  pr-1 fa-fw&#34;&gt;&lt;/i&gt; email&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&#34;the-event-invitation-text-and-links&#34;&gt;The event invitation text and links&lt;/h2&gt;
&lt;p&gt;&lt;code&gt;Big data and AI create inequalities&lt;/code&gt;. They put historically marginalized people, like ethnic minorities and womxn, at a disadvantage. Because AI and checking on AI require plenty of data, usually only giant corporations, the wealthiest governments, and university entities can make it work for them. Reprex is a Hague-based, international startup that wants to impact various sustainable development goals by enabling smaller organizations to join their smaller datasets, use open data, create linked available data, and collaboratively make a change.&lt;/p&gt;
&lt;p&gt;Reprex is a finalist for the &lt;code&gt;Hague Innovation Award&lt;/code&gt; for impact startup (please 🙏, &lt;a href=&#34;https://reprex.nl/post/2022-10-29_reprex-talk-to-all/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;vote for us&lt;/a&gt;!). Daniel Antal, one of the co-founders, will talk about their approach to building an international coalition of music organizations to pool data and challenge data monopolies using organizational techniques, a collaboration ethos, and data from the open-source developer world.&lt;/p&gt;
&lt;p&gt;Using the example of independent music creators, who often find themselves in a position where it is more expensive to claim their money from global platforms, he will talk about how to reduce inequalities in the world of big data and AI with collaboration on web 3.0. In the Q&amp;amp;A he will take questions on how to apply their know-how, and generally linked open data to other art+tech or creative segments or problems for which everybody is too small, like meeting the Paris Accord greenhouse gas targets bit by bit, small company by small company.&lt;/p&gt;
&lt;h2 id=&#34;in-the-qa-we-can-discuss-many-things&#34;&gt;In the Q&amp;amp;A, we can discuss many things&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; How can Reprex help an individual creator in music, or in fashion and design, or any other area?&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; What sort of help can it give to researchers, research institutes, specialist consultancies, law firms, and other knowledge-based actors?&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;What sort of partners is &lt;a href=&#34;https://reprex.nl/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Reprex&lt;/a&gt; looking for in &lt;code&gt;Eindhoven&lt;/code&gt;?&lt;/p&gt;
&lt;h2 id=&#34;check-out-our-projects&#34;&gt;Check out our projects&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a href=&#34;https://music.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Digital Music Observatory&lt;/a&gt; and &lt;a href=&#34;https://music.dataobservatory.eu/project/listen-local/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Listen Local&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a href=&#34;https://ccsi.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Cultural &amp;amp; Creative Sectors and Industries Observatory&lt;/a&gt; and short call for potential partners.&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; &lt;a href=&#34;https://greendeal.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Green Deal Data Observatory&lt;/a&gt; and simple, connected, financial and sustainability reporting for creative enterprises and others&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;reprex-the-impact-startup&#34;&gt;Reprex: the impact startup&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; Check out our accomplishments since the foundation in 2020&lt;/li&gt;
&lt;/ul&gt;
</description>
    </item>
    
    <item>
      <title>BeeSage: Data-driven Beekeeping for Productivity and Sustainability</title>
      <link>https://greendeal.dataobservatory.eu/post/2022-10-31_beesage/</link>
      <pubDate>Mon, 31 Oct 2022 12:00:00 +0000</pubDate>
      <guid>https://greendeal.dataobservatory.eu/post/2022-10-31_beesage/</guid>
      <description>&lt;p&gt;&lt;strong&gt;BeeSage&lt;/strong&gt; is an early-stage startup that is contesting &lt;a href=&#34;https://www.impactcity.nl/en/service/the-hague-innovators-challenge/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;The Hague Innovators Award&lt;/a&gt; in the pre-startup category. They are evangelizing data-driven beekeeping for productivity and sustainability. &lt;a href=&#34;https://greendeal.dataobservatory.eu/authors/reprex&#34;&gt;Reprex&lt;/a&gt; is an impact scale-up competing in the other category of this competition, for more mature startups; it is building collaborative, open scholarly data infrastructure, so-called data observatories, to support the needs of various data-driven policy, business, or scientific innovation.&lt;/p&gt;
&lt;p&gt;We met in the &lt;code&gt;ImpactCity&lt;/code&gt; initiative of The Hague, and we&amp;rsquo;d like to build on The Hague impact startup ecosystem by combining our respective strengths so we decided to join forces! We both want to win in The Hague Innovators’ Challenge in 2022, but we only compete for the votes of the audience. Through our cooperation, we would like to increase the viability of BeeSage in the pre-startup category and increase the value proposition of Reprex in the startup category to the jury in ImpactFest.&lt;/p&gt;
















&lt;figure  id=&#34;figure-remote-team-of-engineers-in-the-netherlands-latvia-and-portugal&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;Remote team of engineers in the Netherlands, Latvia and Portugal.&#34; srcset=&#34;
               /media/img/blogposts_2022/beesage_team_huedf14d05dcfb520f96938797295da472_403435_8f2ba27f10054a4e4a3174144083d5e5.webp 400w,
               /media/img/blogposts_2022/beesage_team_huedf14d05dcfb520f96938797295da472_403435_109d4c6d8ed41d11124419fc9460ddcb.webp 760w,
               /media/img/blogposts_2022/beesage_team_huedf14d05dcfb520f96938797295da472_403435_1200x1200_fit_q75_h2_lanczos.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/blogposts_2022/beesage_team_huedf14d05dcfb520f96938797295da472_403435_8f2ba27f10054a4e4a3174144083d5e5.webp&#34;
               width=&#34;760&#34;
               height=&#34;570&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption data-pre=&#34;Figure&amp;nbsp;&#34; data-post=&#34;:&amp;nbsp;&#34; class=&#34;numbered&#34;&gt;
      Remote team of engineers in the Netherlands, Latvia and Portugal.
    &lt;/figcaption&gt;&lt;/figure&gt;
&lt;p&gt;BeeSage&amp;rsquo;s modular beehive monitoring system boosts productivity and honey yield to benefit the Earth through data-driven pollination. Their software and hardware, such as Smart Beehive Scales, help beekeepers mitigate risks and make informed decisions, while turning every apiary into a weather station.&lt;/p&gt;
&lt;p&gt;They are also building &lt;code&gt;HiveMap&lt;/code&gt;, a data analytics application that enables beekeeper associations, environmental organizations and other stakeholders to turn beehive and remote sensing data into valuable insights. This product can be greatly enhanced with the latest data from Europe’s Copernicus satellites and from meteorological and air pollution sources. This is where Reprex’s &lt;a href=&#34;https://greendeal.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Green Deal Data Observatory&lt;/a&gt; comes into the picture.&lt;/p&gt;
















&lt;figure  id=&#34;figure-beesage-is-building-hivemap-as-a-data-analytics-software&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;BeeSage is building HiveMap as a data analytics software.&#34; srcset=&#34;
               /media/img/blogposts_2022/BeeSage_hivemap_hu42f2acc367849ed681f57722b67374e5_2528808_b300b2b2a02bbada627b29cee94a0042.webp 400w,
               /media/img/blogposts_2022/BeeSage_hivemap_hu42f2acc367849ed681f57722b67374e5_2528808_f71d20f22daca28bd41fca303e42b8f1.webp 760w,
               /media/img/blogposts_2022/BeeSage_hivemap_hu42f2acc367849ed681f57722b67374e5_2528808_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/blogposts_2022/BeeSage_hivemap_hu42f2acc367849ed681f57722b67374e5_2528808_b300b2b2a02bbada627b29cee94a0042.webp&#34;
               width=&#34;760&#34;
               height=&#34;458&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption data-pre=&#34;Figure&amp;nbsp;&#34; data-post=&#34;:&amp;nbsp;&#34; class=&#34;numbered&#34;&gt;
      BeeSage is building HiveMap as a data analytics software.
    &lt;/figcaption&gt;&lt;/figure&gt;
&lt;p&gt;Beekeeper associations, small research groups, and Ecotech startups cannot afford to build a full team of data engineers, data scientists, and statisticians to tap into the components of raw inflation data (to find honey price or food value chain data), or to create validated and continuously maintained data pipelines from environmental satellites and from the data warehouses of Eurostat and the European Environment Agency. They cannot hire small-area statisticians and ecological regression experts to create ecological and key business indicators for small tracts of land that are directly relevant to the health of a honeybee colony.&lt;/p&gt;
&lt;p&gt;Reprex produces open-source scientific software and various collaborative data engineering infrastructures to get legally open governmental data and open science data in a timely, usable format to ecological researchers, beekeeper associations, and Ecotech startups like BeeSage. The European Union has legally opened up vast arrays of scientific and governmental data sources for commercial and scientific reuse, but re-processing and validating data that was originally collected for a different primary purpose requires plenty of private investment.&lt;/p&gt;
















&lt;figure  id=&#34;figure-reprex-produces-open-source-scientific-softwarehttpsreprexnlreleases-and-various-collaborative-data-engineering-infrastructures-to-get-legally-open-governmental-data-and-open-science-data-in-a-timely-usable-format-to-ecological-researchers-and-ecotech-innovators&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;Reprex produces open-source scientific software and various collaborative data engineering infrastructures to get legally open governmental data and open science data in a timely, usable format to ecological researchers and ecotech innovators.&#34; srcset=&#34;
               /media/img/package_screenshots/regions_package_20221101_16x9_hu243b93523dd1d220293cadf15db845f9_151150_843634b2cd40dbbd4f49d648f820996d.webp 400w,
               /media/img/package_screenshots/regions_package_20221101_16x9_hu243b93523dd1d220293cadf15db845f9_151150_199372ba733c18e0f3a1d7c7ba06cdf5.webp 760w,
               /media/img/package_screenshots/regions_package_20221101_16x9_hu243b93523dd1d220293cadf15db845f9_151150_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/package_screenshots/regions_package_20221101_16x9_hu243b93523dd1d220293cadf15db845f9_151150_843634b2cd40dbbd4f49d648f820996d.webp&#34;
               width=&#34;760&#34;
               height=&#34;376&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption data-pre=&#34;Figure&amp;nbsp;&#34; data-post=&#34;:&amp;nbsp;&#34; class=&#34;numbered&#34;&gt;
      Reprex produces &lt;a href=&#34;https://reprex.nl/#releases&#34;&gt;open-source scientific software&lt;/a&gt; and various collaborative data engineering infrastructures to get legally open governmental data and open science data in a timely, usable format to ecological researchers and ecotech innovators.
    &lt;/figcaption&gt;&lt;/figure&gt;
&lt;p&gt;Reprex’s data observatories, particularly the &lt;a href=&#34;https://greendeal.dataobservatory.eu/#slider&#34;&gt;Green Deal Data Observatory&lt;/a&gt;, are public-private partnerships that foster the collective collection, processing, peer review, and reuse of novel big data, like BeeSage’s beehive data, and of reusable statistical and environmental data. We hope to place the permanent institution of this PPP in The Hague, which is already the &lt;a href=&#34;https://thehague.com/businessagency/the-hague-the-winner-world-smart-city-award-2021&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;World&amp;rsquo;s Smartest City&lt;/a&gt;, and which wants to remain a global centre of excellence in peace, justice, and sustainability in the era of big data and AI.&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Reprex: Big Data For All</title>
      <link>https://greendeal.dataobservatory.eu/post/2022-11-07_vote_reprex/</link>
      <pubDate>Mon, 31 Oct 2022 12:00:00 +0000</pubDate>
      <guid>https://greendeal.dataobservatory.eu/post/2022-11-07_vote_reprex/</guid>
      <description>&lt;p&gt;&lt;a href=&#34;https://greendeal.dataobservatory.eu/authors/reprex&#34;&gt;Reprex&lt;/a&gt; is an impact startup based in The Hague that develops decentralized, modern, web 3.0-compatible data observatories. Our mission is to fulfill parts of SDG 16 and SDG 17: building on the open collaboration method of open-source software development and open knowledge management, we would like to enable impact makers to contribute to other SDGs by making AI and big data work for them.&lt;/p&gt;
&lt;details class=&#34;toc-inpage d-print-none  &#34; open&gt;
  &lt;summary class=&#34;font-weight-bold&#34;&gt;Table of Contents&lt;/summary&gt;
  &lt;nav id=&#34;TableOfContents&#34;&gt;
  &lt;ul&gt;
    &lt;li&gt;&lt;a href=&#34;#watch-our-2-min-introduction&#34;&gt;Watch Our 2-min Introduction&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&#34;#how-to-vote&#34;&gt;How To Vote?&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&#34;#what-do-we-do-in-the-european-green-deal&#34;&gt;What do we do in the European Green Deal?&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&#34;#product&#34;&gt;Product&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&#34;#plans-in-the-hague&#34;&gt;Plans in The Hague?&lt;/a&gt;&lt;/li&gt;
  &lt;/ul&gt;
&lt;/nav&gt;
&lt;/details&gt;

&lt;h2 id=&#34;watch-our-2-min-introduction&#34;&gt;Watch Our 2-min Introduction&lt;/h2&gt;

&lt;div style=&#34;position: relative; padding-bottom: 56.25%; height: 0; overflow: hidden;&#34;&gt;
  &lt;iframe src=&#34;https://www.youtube.com/embed/bgp-n55TKCk&#34; style=&#34;position: absolute; top: 0; left: 0; width: 100%; height: 100%; border:0;&#34; allowfullscreen title=&#34;YouTube Video&#34;&gt;&lt;/iframe&gt;
&lt;/div&gt;

&lt;p&gt;⚙️/ Subtitles/ 🇳🇱 🇬🇧 🇧🇦 🇨🇿 🇭🇺 🇩🇪 🇱🇹 🇫🇷 🇸🇰 🇪🇸 🇹🇷 + Catalan.&lt;/p&gt;
&lt;h2 id=&#34;how-to-vote&#34;&gt;How To Vote?&lt;/h2&gt;
&lt;p&gt;Go to &lt;a href=&#34;https://www.impactcity.nl/en/cast-your-vote-for-the-hague-innovators-challenge-2022/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Cast your vote for The Hague Innovators challenge 2022!&lt;/a&gt; and choose Reprex :)&lt;/p&gt;
&lt;td style=&#34;text-align: center;&#34;&gt;















&lt;figure  id=&#34;figure-make-sure-you-select-reprex-and-write-in-your-email-it-is-safe-here-you-need-to-tick--im-not-a-robot--to-be-able-to-select-companies-further-instructions---herepost2022-10-29_reprex-talk-to-all--magyarul-ittimpactcitymagyar&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;Make sure you select Reprex and write in your email (it is safe here). You need to tick ✅ I&amp;#39;m not a robot to be able to select companies. Further instructions are available in English 🇬🇧 and in Hungarian 🇭🇺.&#34; srcset=&#34;
               /media/img/blogposts_2022/ImpactCity_cast_your_vote_hub222ddc6a4fe6b20adc397d88e79d9e9_136396_9af93cd3518481eb5d2084340f6fa303.webp 400w,
               /media/img/blogposts_2022/ImpactCity_cast_your_vote_hub222ddc6a4fe6b20adc397d88e79d9e9_136396_7683cb60880f0a034952606eaecff611.webp 760w,
               /media/img/blogposts_2022/ImpactCity_cast_your_vote_hub222ddc6a4fe6b20adc397d88e79d9e9_136396_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/blogposts_2022/ImpactCity_cast_your_vote_hub222ddc6a4fe6b20adc397d88e79d9e9_136396_9af93cd3518481eb5d2084340f6fa303.webp&#34;
               width=&#34;760&#34;
               height=&#34;380&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption&gt;
      Make sure you select &lt;strong&gt;Reprex&lt;/strong&gt; and write in your email (it is safe here). You need to tick ✅ &amp;lsquo;I&amp;rsquo;m not a robot&amp;rsquo; to be able to select companies. Further instructions 🇬🇧 &lt;a href=&#34;https://greendeal.dataobservatory.eu/post/2022-10-29_reprex-talk-to-all/&#34;&gt;here&lt;/a&gt; 🇭🇺 &lt;a href=&#34;https://greendeal.dataobservatory.eu/impactcity/magyar/&#34;&gt;magyarul itt&lt;/a&gt;.
    &lt;/figcaption&gt;&lt;/figure&gt;&lt;/td&gt;
&lt;h2 id=&#34;what-do-we-do-in-the-european-green-deal&#34;&gt;What do we do in the European Green Deal?&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; The Green Deal Data Observatory was our first test to bring our information management, data access, and processing know-how to new data sources (such as environmental satellite data, hydrological data, etc.) after &lt;a href=&#34;https://music.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;music&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; Due to the mainstreaming of SDG and ESG reporting, many features are developed with our music partners and overlap with the Competition Data Observatory.&lt;/li&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; We are looking for reliable data that can be processed into computational antitrust and policy evidence in relation to SDG 11 (to enhance inclusive and sustainable urbanization), SDG 12 (to ensure sustainable consumption and production patterns), and SDG 13 (to take urgent action to combat climate change and its impacts), within the partnership approach of SDG 17.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;product&#34;&gt;Product&lt;/h2&gt;
&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Our &lt;code&gt;data observatories&lt;/code&gt; (platform products) cover our R&amp;amp;D and platform costs while giving us access to an expanding range of prime clients. We use 21st-century open-source data engineering solutions, a decentralized data governance method, and web 3.0 technologies to avoid conflicts of interest and prevent the data Sisyphus of error-prone human data wrangling. There is little competition on this service level: there are about 60 UN/EU/OECD-recognized data observatories, and almost all of them are managed by a different operator. This layer is already monetized, and we have proven success. Our unique advantage is a combination of legal and technological skills: understanding legally open data, web 3.0, and data modeling, and the ability to participate in the open-source statistical/scientific software creator community.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;We create &lt;code&gt;open-source software applications&lt;/code&gt; that fuel our data observatories with unprocessed, open, linked data. We create software for the R statistical environment, which is used in both official statistics and in many business and academic organizations. The production of R software components is a competitive field, but we believe that our position is strong: the vast majority of R packages are lightly or not at all serviced because of the lack of financing.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
















&lt;figure  id=&#34;figure-reprex-produces-open-source-scientific-softwarehttpsreprexnlreleases-and-various-collaborative-data-engineering-infrastructures-to-get-legally-open-governmental-data-and-open-science-data-in-a-timely-usable-format-to-ecological-researchers-and-ecotech-innovators&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;Reprex produces open-source scientific software and various collaborative data engineering infrastructures to get legally open governmental data and open science data in a timely, usable format to ecological researchers and ecotech innovators.&#34; srcset=&#34;
               /media/img/blogposts_2022/reprex_comet_white_6x4_hude94dbe45b764017a0281a2d3b53aa2f_47062_56c9b4c03a282adb587dce3e55b03854.webp 400w,
               /media/img/blogposts_2022/reprex_comet_white_6x4_hude94dbe45b764017a0281a2d3b53aa2f_47062_48db3882e8585d62f0962a3ef76c04e4.webp 760w,
               /media/img/blogposts_2022/reprex_comet_white_6x4_hude94dbe45b764017a0281a2d3b53aa2f_47062_1200x1200_fit_q75_h2_lanczos_2.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/blogposts_2022/reprex_comet_white_6x4_hude94dbe45b764017a0281a2d3b53aa2f_47062_56c9b4c03a282adb587dce3e55b03854.webp&#34;
               width=&#34;760&#34;
               height=&#34;507&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption data-pre=&#34;Figure&amp;nbsp;&#34; data-post=&#34;:&amp;nbsp;&#34; class=&#34;numbered&#34;&gt;
      Reprex produces &lt;a href=&#34;https://reprex.nl/#releases&#34;&gt;open-source scientific software&lt;/a&gt; and various collaborative data engineering infrastructures to get legally open governmental data and open science data in a timely, usable format to ecological researchers and ecotech innovators.
    &lt;/figcaption&gt;&lt;/figure&gt;
&lt;ol start=&#34;3&#34;&gt;
&lt;li&gt;
&lt;p&gt;We provide &lt;code&gt;bespoke analytics solutions&lt;/code&gt; to our institutional partners in our data observatories. Such bespoke solutions iterate over our existing software components, helping us design better applications within an ever-expanding ecosystem. Providing tailored data-science services would require a large organization without a clear focus. We provide these services on an ad-hoc basis only among institutional partners and users of our data observatories. In these circles, which are often prime clients, we face little or no competition because we are trusted partners and data and solution providers. This is a key to our revenue and market growth.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;We develop high-value &lt;code&gt;software-as-a-service applications&lt;/code&gt; that leverage our data observatory assets and our software solutions into novel, commercially valuable uses. Our applications are built around our family of open-source software and generalize our bespoke analytics solutions. We are in a late prototype phase where we already have some revenue and are preparing to scale up three of our applications at the correct price. All of our applications are entering highly competitive market segments. We are building on our ‘unfair’ advantage: we bundle our solutions with data that is not accessible to competitors, and we can test them in the protected ecosystems of our observatories.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id=&#34;plans-in-the-hague&#34;&gt;Plans in The Hague?&lt;/h2&gt;
&lt;p&gt;Our message is simple: &lt;code&gt;doing business and doing good&lt;/code&gt; for the city of The Hague means a vote for Reprex. We would like to win The Hague Innovators Challenge in 2022 because we believe we could multiply our growth in partnership with The Hague. We have a significant budget to develop our observatories, and our company is already located in The Hague, in Apollo 14, but most of our team members, not to mention the observatory’s non-data personnel, are not based in our beautiful and smart city. The observatories are important platforms for our growth, and they could create a lot more jobs and impact in the city than our startup company alone. Should we win the prize, we would spend the 25,000 euros on one thing: to develop our observatories into a real public-private partnership in The Hague, with a permanent office in Apollo 14 or the Hague Humanity Hub.&lt;/p&gt;
&lt;p&gt;Reprex’s data observatories, particularly the &lt;a href=&#34;https://greendeal.dataobservatory.eu/#slider&#34;&gt;Green Deal Data Observatory&lt;/a&gt;, are public-private partnerships that foster the collective collection, processing, peer review, and reuse of novel big data, like BeeSage’s beehive data, and of reusable statistical and environmental data. We hope to place the permanent institution of this PPP in The Hague, which is already the &lt;a href=&#34;https://thehague.com/businessagency/the-hague-the-winner-world-smart-city-award-2021&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;World&amp;rsquo;s Smartest City&lt;/a&gt;, and which wants to remain a global centre of excellence in peace, justice, and sustainability in the era of big data and AI.&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Please Choose Reprex in The Hague Innovators Award Online Vote</title>
      <link>https://greendeal.dataobservatory.eu/post/2022-10-29_reprex-talk-to-all/</link>
      <pubDate>Sat, 29 Oct 2022 16:17:00 +0200</pubDate>
      <guid>https://greendeal.dataobservatory.eu/post/2022-10-29_reprex-talk-to-all/</guid>
      <description>&lt;details class=&#34;toc-inpage d-print-none  &#34; open&gt;
  &lt;summary class=&#34;font-weight-bold&#34;&gt;Table of Contents&lt;/summary&gt;
  &lt;nav id=&#34;TableOfContents&#34;&gt;
  &lt;ul&gt;
    &lt;li&gt;&lt;a href=&#34;#how-to-vote&#34;&gt;How to vote?&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&#34;#how-to-share-the-word&#34;&gt;How to share the word?&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&#34;#why-vote-for-us&#34;&gt;Why vote for us?&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&#34;#get-in-touch&#34;&gt;Get in touch&lt;/a&gt;&lt;/li&gt;
  &lt;/ul&gt;
&lt;/nav&gt;
&lt;/details&gt;

&lt;p&gt;🇭🇺 &lt;a href=&#34;https://greendeal.dataobservatory.eu/impactcity/magyar/&#34;&gt;magyarul&lt;/a&gt;&lt;/p&gt;
&lt;h2 id=&#34;how-to-vote&#34;&gt;How to vote?&lt;/h2&gt;
&lt;ol&gt;
&lt;li&gt;Go to &lt;a href=&#34;https://www.impactcity.nl/en/cast-your-vote-for-the-hague-innovators-challenge-2022/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Cast your vote for The Hague Innovators challenge 2022!&lt;/a&gt;&lt;/li&gt;
&lt;/ol&gt;
&lt;td style=&#34;text-align: center;&#34;&gt;















&lt;figure  id=&#34;figure-make-sure-you-select-reprex-and-write-in-your-email-it-is-safe-here-you-need-to-tick--im-not-a-robot--to-be-able-to-select-companies&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;Make sure you select Reprex and write in your email (it is safe here). You need to tick ✅ I&amp;#39;m not a robot to be able to select companies.&#34; srcset=&#34;
               /media/img/blogposts_2022/ImpactCity_cast_your_vote_hub222ddc6a4fe6b20adc397d88e79d9e9_136396_9af93cd3518481eb5d2084340f6fa303.webp 400w,
               /media/img/blogposts_2022/ImpactCity_cast_your_vote_hub222ddc6a4fe6b20adc397d88e79d9e9_136396_7683cb60880f0a034952606eaecff611.webp 760w,
               /media/img/blogposts_2022/ImpactCity_cast_your_vote_hub222ddc6a4fe6b20adc397d88e79d9e9_136396_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/blogposts_2022/ImpactCity_cast_your_vote_hub222ddc6a4fe6b20adc397d88e79d9e9_136396_9af93cd3518481eb5d2084340f6fa303.webp&#34;
               width=&#34;760&#34;
               height=&#34;380&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption&gt;
      Make sure you select &lt;strong&gt;Reprex&lt;/strong&gt; and write in your email (it is safe here). You need to tick ✅ &amp;lsquo;I&amp;rsquo;m not a robot&amp;rsquo; to be able to select companies.
    &lt;/figcaption&gt;&lt;/figure&gt;&lt;/td&gt;
&lt;ol start=&#34;2&#34;&gt;
&lt;li&gt;Your vote is not final yet: you &lt;strong&gt;must click on a confirmation link&lt;/strong&gt; to prove that it was you who voted. Go to your email. (Your email is only recorded to avoid double voting; it will not be added to any marketing database.)&lt;/li&gt;
&lt;/ol&gt;
&lt;td style=&#34;text-align: center;&#34;&gt;















&lt;figure  id=&#34;figure-please-make-sure-you-chose-reprexhttpsreprexnl-and-click-on-link-to-the-confirmation-your-vote&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;Please make sure you chose Reprex and click on the link to confirm your vote.&#34; srcset=&#34;
               /media/img/blogposts_2022/ImpactCity_vote_confirmation_hud3514e4badb7b690e3ae86d1c669c41a_59924_99383fd8efbf4e8b6f09bee2076f5be5.webp 400w,
               /media/img/blogposts_2022/ImpactCity_vote_confirmation_hud3514e4badb7b690e3ae86d1c669c41a_59924_c05ad4bc953009e15cb6c185aaf55b4c.webp 760w,
               /media/img/blogposts_2022/ImpactCity_vote_confirmation_hud3514e4badb7b690e3ae86d1c669c41a_59924_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/blogposts_2022/ImpactCity_vote_confirmation_hud3514e4badb7b690e3ae86d1c669c41a_59924_99383fd8efbf4e8b6f09bee2076f5be5.webp&#34;
               width=&#34;760&#34;
               height=&#34;380&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption&gt;
      Please make sure you chose &lt;a href=&#34;https://reprex.nl/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Reprex&lt;/a&gt; and &lt;strong&gt;click on the link&lt;/strong&gt; to confirm your vote.
    &lt;/figcaption&gt;&lt;/figure&gt;&lt;/td&gt;
&lt;ol start=&#34;3&#34;&gt;
&lt;li&gt;You will receive a message that &lt;strong&gt;your vote is recorded&lt;/strong&gt;, and you have nothing else to do. (Your address is safe with the municipality of The Hague.)&lt;/li&gt;
&lt;/ol&gt;
&lt;td style=&#34;text-align: center;&#34;&gt;















&lt;figure  id=&#34;figure-thank-you-very-much&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;Thank you very much!&#34; srcset=&#34;
               /media/img/blogposts_2022/ImpactCity_je_stem_bevestigd_hu7f10e9d37bedd3ae59b386937c84018f_50555_dfc2df92ace08a5a3c83690d810a1f8a.webp 400w,
               /media/img/blogposts_2022/ImpactCity_je_stem_bevestigd_hu7f10e9d37bedd3ae59b386937c84018f_50555_c189db6dec9384caccd2cdda9fa1dd7c.webp 760w,
               /media/img/blogposts_2022/ImpactCity_je_stem_bevestigd_hu7f10e9d37bedd3ae59b386937c84018f_50555_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/blogposts_2022/ImpactCity_je_stem_bevestigd_hu7f10e9d37bedd3ae59b386937c84018f_50555_dfc2df92ace08a5a3c83690d810a1f8a.webp&#34;
               width=&#34;608&#34;
               height=&#34;304&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption&gt;
      Thank you very much!
    &lt;/figcaption&gt;&lt;/figure&gt;&lt;/td&gt;
&lt;h2 id=&#34;how-to-share-the-word&#34;&gt;How to share the word?&lt;/h2&gt;
&lt;ol&gt;
&lt;li&gt;Please &lt;strong&gt;share our video message&lt;/strong&gt; on &lt;a href=&#34;https://www.youtube.com/watch?v=bgp-n55TKCk&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;YouTube&lt;/a&gt; among your colleagues and friends.&lt;/li&gt;
&lt;/ol&gt;
&lt;td style=&#34;text-align: center;&#34;&gt;















&lt;figure  id=&#34;figure-by-pressing-the--and-choosing-subtitles-you-can-choose-your-language-if-you-are-there-please-leave-a--too-&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;By pressing the ⚙️ and choosing `subtitles` you can choose your language. If you are there, please leave a 👍, too :)&#34; srcset=&#34;
               /media/img/blogposts_2022/Reprex_video_use_captions_hu23b119c32278da78c3e9ff5cca354004_228165_693efb34017bd658b05c857b0f65c42e.webp 400w,
               /media/img/blogposts_2022/Reprex_video_use_captions_hu23b119c32278da78c3e9ff5cca354004_228165_13f78dd8b1a6a2f40197dd2973a214a5.webp 760w,
               /media/img/blogposts_2022/Reprex_video_use_captions_hu23b119c32278da78c3e9ff5cca354004_228165_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/blogposts_2022/Reprex_video_use_captions_hu23b119c32278da78c3e9ff5cca354004_228165_693efb34017bd658b05c857b0f65c42e.webp&#34;
               width=&#34;655&#34;
               height=&#34;465&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption&gt;
      By pressing the ⚙️ and choosing &lt;code&gt;subtitles&lt;/code&gt; you can choose your language. If you are there, please leave a 👍, too :)
    &lt;/figcaption&gt;&lt;/figure&gt;&lt;/td&gt;
&lt;p&gt;The message is the message! We are an ethical data and AI company, and one of our topics is detecting whether algorithms are biased towards English-language speakers. We want to teach the computer to understand small languages and, of course, everyone who is under-represented in data: womxn, former colonial nations.&lt;/p&gt;
&lt;p&gt;🇳🇱 🇬🇧 🇧🇦 🇨🇿 🇭🇺 🇩🇪 🇱🇹 🇫🇷 🇸🇰 🇪🇸 🇹🇷 + Catalan.&lt;/p&gt;
&lt;p&gt;2. &lt;strong&gt;Retweet&lt;/strong&gt; our appeal from one of our observatory Twitter accounts. For music audiences:
&lt;blockquote class=&#34;twitter-tweet&#34;&gt;&lt;p lang=&#34;en&#34; dir=&#34;ltr&#34;&gt;💜 We ask you, humbly, to support us with a vote or by sharing our appeal. We&amp;#39;re part of the &lt;a href=&#34;https://twitter.com/hashtag/opensource?src=hash&amp;amp;ref_src=twsrc%5Etfw&#34;&gt;#opensource&lt;/a&gt;, &lt;a href=&#34;https://twitter.com/hashtag/opendata?src=hash&amp;amp;ref_src=twsrc%5Etfw&#34;&gt;#opendata&lt;/a&gt;, and &lt;a href=&#34;https://twitter.com/hashtag/openscience?src=hash&amp;amp;ref_src=twsrc%5Etfw&#34;&gt;#openscience&lt;/a&gt; movement that depends on your support to stay online and thriving, but many of our users or simply look the other way🤦🏻‍♀️&lt;a href=&#34;https://t.co/Qcdh7saPpW&#34;&gt;https://t.co/Qcdh7saPpW&lt;/a&gt;&lt;/p&gt;&amp;mdash; Competition Data Observatory (@CompDataObs) &lt;a href=&#34;https://twitter.com/CompDataObs/status/1589246698001686529?ref_src=twsrc%5Etfw&#34;&gt;November 6, 2022&lt;/a&gt;&lt;/blockquote&gt;
&lt;script async src=&#34;https://platform.twitter.com/widgets.js&#34; charset=&#34;utf-8&#34;&gt;&lt;/script&gt;
&lt;/p&gt;
&lt;p&gt;The same message for general cultural audiences: &lt;a href=&#34;https://twitter.com/CultDataObs/status/1587482559851761664&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;@CultDataObs&lt;/a&gt;; for music audiences: &lt;a href=&#34;https://twitter.com/DigitalMusicObs/status/1587480876383887369&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;@DigitalMusicObs&lt;/a&gt;; for green audiences: &lt;a href=&#34;https://twitter.com/GreenDealObs/status/1587513316699668482&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;@GreenDealObs&lt;/a&gt;.&lt;/p&gt;
&lt;ol start=&#34;3&#34;&gt;
&lt;li&gt;
&lt;p&gt;Like our &lt;a href=&#34;https://www.linkedin.com/posts/reprexbv_the-hague-innovators-2022-reprex-activity-6993244940323430400-Z5dD&#34;&gt;LinkedIn page&lt;/a&gt; and share our appeal.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Or just &lt;strong&gt;send the link to this post&lt;/strong&gt; from the browser to your colleagues and friends.&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;h2 id=&#34;why-vote-for-us&#34;&gt;Why vote for us?&lt;/h2&gt;
&lt;p&gt;We are finalists in The Hague Innovators Award with a product offering and a message that big data and AI should work for all: ethnic minorities, small nations, small languages, womxn. We are measuring why certain artists are not getting recommended and paid on global streaming platforms, or why NGOs do not find the right data for fighting greenwashing. We want to help small businesses, civil society organizations, and NGOs that cannot hire a data engineer and a data scientist to fight data monopolies, and that cannot defend themselves from the dark patterns of greedy algorithms.&lt;/p&gt;
&lt;h2 id=&#34;get-in-touch&#34;&gt;Get in touch&lt;/h2&gt;
&lt;p&gt;Check out our events or write to &lt;a href=&#34;https://www.linkedin.com/in/antaldaniel/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;
  &lt;i class=&#34;fab fa-linkedin  pr-1 fa-fw&#34;&gt;&lt;/i&gt; Daniel Antal&lt;/a&gt;  or to &lt;a href=&#34;https://keybase.io/antaldaniel&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;
  &lt;i class=&#34;fab fa-keybase  pr-1 fa-fw&#34;&gt;&lt;/i&gt; antaldaniel&lt;/a&gt;. Or send an &lt;a href=&#34;https://greendeal.dataobservatory.eu/contact/&#34;&gt;
  &lt;i class=&#34;fas fa-envelope  pr-1 fa-fw&#34;&gt;&lt;/i&gt; email&lt;/a&gt;. Thank you!&lt;/p&gt;
&lt;iframe style=&#34;border-radius:12px&#34; src=&#34;https://open.spotify.com/embed/track/316FLnQsKc6j6d9IJCMBLH?utm_source=generator&amp;theme=0&#34; width=&#34;100%&#34; height=&#34;352&#34; frameBorder=&#34;0&#34; allowfullscreen=&#34;&#34; allow=&#34;autoplay; clipboard-write; encrypted-media; fullscreen; picture-in-picture&#34; loading=&#34;lazy&#34;&gt;&lt;/iframe&gt;
</description>
    </item>
    
    <item>
      <title>How to Find Locations for Things That Save Waste Heat?</title>
      <link>https://greendeal.dataobservatory.eu/post/2022-10-24_thermowatt/</link>
      <pubDate>Mon, 24 Oct 2022 16:18:00 +0200</pubDate>
      <guid>https://greendeal.dataobservatory.eu/post/2022-10-24_thermowatt/</guid>
      <description>&lt;td style=&#34;text-align: center;&#34;&gt;















&lt;figure  id=&#34;figure-peter-rosbjerg-the-loss-of-winterhttpswwwflickrcomphotospeterrosbjerg4249419898&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;Peter Rosbjerg: [The Loss of Winter...](https://www.flickr.com/photos/peterrosbjerg/4249419898/).&#34; srcset=&#34;
               /media/img/blogposts_2022/4249419898_2ed064f29c_o_hu6d2c265ebc69c1e462b98b86c8da9bbb_1902290_3f47b478d1eaf75125b2338485ed3881.webp 400w,
               /media/img/blogposts_2022/4249419898_2ed064f29c_o_hu6d2c265ebc69c1e462b98b86c8da9bbb_1902290_ef7d2e97f4ba92e240e47fd97d2fbf43.webp 760w,
               /media/img/blogposts_2022/4249419898_2ed064f29c_o_hu6d2c265ebc69c1e462b98b86c8da9bbb_1902290_1200x1200_fit_q75_h2_lanczos.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/blogposts_2022/4249419898_2ed064f29c_o_hu6d2c265ebc69c1e462b98b86c8da9bbb_1902290_3f47b478d1eaf75125b2338485ed3881.webp&#34;
               width=&#34;760&#34;
               height=&#34;760&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption&gt;
      Peter Rosbjerg: &lt;a href=&#34;https://www.flickr.com/photos/peterrosbjerg/4249419898/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;The Loss of Winter&amp;hellip;&lt;/a&gt;.
    &lt;/figcaption&gt;&lt;/figure&gt;&lt;/td&gt;
&lt;p&gt;Europe is preparing for the coldest winter since the Second World War.  We must conserve energy, and use every particle of gas, sunshine, and wind to heat our homes, schools, and hospitals.  This is a great moment to give a chance to new inventors who want to save wasted energy.  Our &lt;a href=&#34;https://greendeal.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Green Deal Data Observatory&lt;/a&gt; would like to help Thermowatt identify the best possible city locations to implement their groundbreaking technology, which turns the heat stranded in our sewage networks into city heating. Thermowatt is addressing one of the most mind-blowing sources of waste that we create and fail to utilize in a city context.&lt;/p&gt;
&lt;p&gt;Washing machines, showers, and even floor-wiping buckets are full of water that is significantly and constantly warmer than the cold European winter. We know from elementary school physics that it is at least theoretically possible to recover the heat going down the drain from our houses and reuse it for heating. Maybe it is even possible to conserve it for future use.  This is how geothermal energy works for heat and power generation. This is exactly what heat pumps do in many houses. But how could we install heat pumps into an 8-story-high residential building in The Hague?  How could we save this wasted heat?&lt;/p&gt;
&lt;p&gt;Converting the internal energy of lukewarm or warm water into hot water is theoretically possible; the question is what would be necessary to make this economical in everyday life.  Saving the energy wasted from a washing machine or a shower is most likely to succeed if we do not need to convert it to electricity (the conversion always loses much of the energy) and instead use the energy of the warm water directly for heating. We need to find places where there is an abundance of lukewarm water in the sewage and a stable need for heat nearby. It also helps if the potential buyer has long-term contracting credibility: a pump that saves energy from lukewarm water drop by drop will need years of operation to turn a profit.&lt;/p&gt;
&lt;p&gt;The inventor of Thermowatt says that his invention will pump back hot water from the sewage if the sewage mainline is not far from the source of the waste water (so the sewage is not yet cold) and the buyer of the energy is nearby.  Searching for such locations is exactly what our Green Deal Data Observatory wants to facilitate.  We want to find building complexes in Europe that need heating on many days (for example, a hospital in a relatively Nordic country) and that are close to a sewage system that is in turn close enough to industrial or residential zones with a large quantity of lukewarm, low-heat water.  This is not a simple Google search!&lt;/p&gt;
















&lt;figure  id=&#34;figure-thermowatt-in-the-budapest-sewage-works-see-image-galleryhttpswwwthermowatthureferencesfovarosi-csatornazasi-muvek-zrt-asztalos-sandor-utcai-telephelye-budapest-viii-kerulet-where-can-we-find-places-for-these-_things_&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;Thermowatt in the Budapest Sewage Works. See [image gallery](https://www.thermowatt.hu/references/fovarosi-csatornazasi-muvek-zrt-asztalos-sandor-utcai-telephelye-budapest-viii-kerulet). Where can we find places for these _things_?&#34; srcset=&#34;
               /media/img/blogposts_2022/1000x750_Kerepesi7_hu90e86328da8eb13e7a23df04b83aeb8e_74889_920e8589ea9f9d79ddfda948157ca6b1.webp 400w,
               /media/img/blogposts_2022/1000x750_Kerepesi7_hu90e86328da8eb13e7a23df04b83aeb8e_74889_1f91605e0842a7ff4b31c7bbf2b6ecb7.webp 760w,
               /media/img/blogposts_2022/1000x750_Kerepesi7_hu90e86328da8eb13e7a23df04b83aeb8e_74889_1200x1200_fit_q75_h2_lanczos.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/blogposts_2022/1000x750_Kerepesi7_hu90e86328da8eb13e7a23df04b83aeb8e_74889_920e8589ea9f9d79ddfda948157ca6b1.webp&#34;
               width=&#34;760&#34;
               height=&#34;570&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption data-pre=&#34;Figure&amp;nbsp;&#34; data-post=&#34;:&amp;nbsp;&#34; class=&#34;numbered&#34;&gt;
      Thermowatt in the Budapest Sewage Works. See &lt;a href=&#34;https://www.thermowatt.hu/references/fovarosi-csatornazasi-muvek-zrt-asztalos-sandor-utcai-telephelye-budapest-viii-kerulet&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;image gallery&lt;/a&gt;. Where can we find places for these &lt;em&gt;things&lt;/em&gt;?
    &lt;/figcaption&gt;&lt;/figure&gt;
&lt;p&gt;To find ideal sites for Thermowatt, we would need to search for buildings and pipelines, not maps or documents. The Internet of Things connects buildings and pipes that have sensors and chips.  Semantic search on the web 3.0 finds buildings and pipelines even if they do not have chips or sensors.  The idea of the semantic web, or web 3.0, is to connect any well-described “thing”, from heating-day statistical tables to building documentation to urban plans of sewage systems, into a single, searchable web.&lt;/p&gt;
&lt;p&gt;In the web 3.0, a thing can be anything that is properly documented: a table, an e-book or a printed book, a photograph or a building, or the description of a building in a city cadastre.   The web 1.0 way would be to google for building databases, heating day data, and sewage pipeline data all over Europe, access those databases, copy them into Thermowatt’s own database, and then run an SQL query for the locations of buildings matching the environmental profile.  The web 3.0 way is to search many databases directly for things, such as hospital buildings, whose coordinates match a certain environmental profile.&lt;/p&gt;
















&lt;figure  id=&#34;figure-searching-for-buildings-in-labskadasternlhttpslabskadasternlsparql&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;Searching for buildings in [labs.kadaster.nl](https://labs.kadaster.nl/sparql/).&#34; srcset=&#34;
               /media/img/blogposts_2022/SPARQL_endpoint_kadaster_hued049f980c9a0289a3d5512d1d42e22c_81515_127b906a2e05a54d196e67718f034c03.webp 400w,
               /media/img/blogposts_2022/SPARQL_endpoint_kadaster_hued049f980c9a0289a3d5512d1d42e22c_81515_8f56202d23048133a09c210f7261ce45.webp 760w,
               /media/img/blogposts_2022/SPARQL_endpoint_kadaster_hued049f980c9a0289a3d5512d1d42e22c_81515_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/blogposts_2022/SPARQL_endpoint_kadaster_hued049f980c9a0289a3d5512d1d42e22c_81515_127b906a2e05a54d196e67718f034c03.webp&#34;
               width=&#34;760&#34;
               height=&#34;364&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption data-pre=&#34;Figure&amp;nbsp;&#34; data-post=&#34;:&amp;nbsp;&#34; class=&#34;numbered&#34;&gt;
      Searching for buildings in &lt;a href=&#34;https://labs.kadaster.nl/sparql/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;labs.kadaster.nl&lt;/a&gt;.
    &lt;/figcaption&gt;&lt;/figure&gt;
&lt;p&gt;The web 3.0 way is to link together existing databases, including databases that fall under the EU Open Data Directive, because such databases can be re-used for commercial purposes for free. The European Union wants to boost the efficiency of business innovation by making all data assets that were originally financed by taxpayers (including companies like Thermowatt itself) available for commercial use after the government has used them for its own purposes.  If the government has placed all hospital and sewage pipeline data into databases, why not open them up for Thermowatt? And why not in a way that avoids database building costs for Thermowatt, which is itself a small, innovative eco-tech company without a data science or large IT team? The web 3.0 makes links to databases, just like you would link to the websites of each and every hospital building where you would like to pitch this installation.&lt;/p&gt;
















&lt;figure  id=&#34;figure-sthe-semantic-web-compared-to-the-traditional-web-by-arbeckhttpscommonswikimediaorgwikifilethe_semantic_web_compared_to_the_traditional_websvg&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;The Semantic Web Compared To The Traditional Web by [Arbeck](https://commons.wikimedia.org/wiki/File:The_Semantic_Web_Compared_To_The_Traditional_Web.svg).&#34; srcset=&#34;
               /media/img/blogposts_2022/semantic_web_compared_to_traditional_web_hu79f06a16cebf70f436af364b39866335_387122_c234bf171fd555c43f389440db1dd2e3.webp 400w,
               /media/img/blogposts_2022/semantic_web_compared_to_traditional_web_hu79f06a16cebf70f436af364b39866335_387122_05f0eee99596ac53f0792c03cb1aeedf.webp 760w,
               /media/img/blogposts_2022/semantic_web_compared_to_traditional_web_hu79f06a16cebf70f436af364b39866335_387122_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/blogposts_2022/semantic_web_compared_to_traditional_web_hu79f06a16cebf70f436af364b39866335_387122_c234bf171fd555c43f389440db1dd2e3.webp&#34;
               width=&#34;760&#34;
               height=&#34;333&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption data-pre=&#34;Figure&amp;nbsp;&#34; data-post=&#34;:&amp;nbsp;&#34; class=&#34;numbered&#34;&gt;
      The Semantic Web Compared To The Traditional Web by &lt;a href=&#34;https://commons.wikimedia.org/wiki/File:The_Semantic_Web_Compared_To_The_Traditional_Web.svg&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Arbeck&lt;/a&gt;.
    &lt;/figcaption&gt;&lt;/figure&gt;
&lt;p&gt;&amp;ldquo;The semantic web is the future of the internet and always will be,&amp;rdquo; joked Peter Norvig, director of research at Google, almost 20 years after the invention of the world wide web, which took only 5 years to go from concept to global hit. While connecting webpages with hypertext URL links over http(s) was an instant success in 1994, connecting databases with the RDF standards has proven to be a much more difficult task. But it is happening.  The European Union releases more and more datasets in this format, and researchers and startups like Reprex are offering cheaper and easier open-source tools to build RDF-compatible ‘dataset resources’.&lt;/p&gt;
&lt;p&gt;Linking datasets together allows them to be searched by meaning (‘give me a building close enough to a sewage main pipe in The Hague’, ‘now find me similar buildings in relatively cool cities with many heating days in the Netherlands… in the Benelux… in Europe’). Google, as much an early evangelist of the web 3.0 as it had been of the web 1.0 15 years earlier, released it commercially under the name ‘Google Knowledge Graph’, allowing users to &lt;a href=&#34;https://blog.google/products/search/introducing-knowledge-graph-things-not/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;search for a thing instead of a string&lt;/a&gt;.  Between 1997 and 2004 it took about 7 years to make the Google search engine the global leader in finding strings in the web 1.0. The commercialization of the web 3.0 is taking a slower pace, but around 2019-2020 it became a mainstream technology for large organizations.  The mission of Reprex is to make knowledge graphs available to small companies, and even to civil society actors, through cooperation in the data observatories.&lt;/p&gt;
&lt;p&gt;The aim of the Green Deal Data Observatory is to create such a knowledge graph, one that connects datasets from Eurostat, the European Environment Agency, national and city cadastres, Wikipedia, OpenStreetMap, and all sorts of other places in a way that ecotech companies like Thermowatt can search for their next site and start reducing energy loss in Europe.&lt;/p&gt;
&lt;p&gt;Creating a knowledge graph is always an open collaboration: we never know what new datasets, blueprints, photos, and heatmaps will be available on web 3.0 in the future.  The &lt;a href=&#34;https://greendeal.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Green Deal Data Observatory&lt;/a&gt; is creating a knowledge graph that serves business and policy purposes related to the European Green Deal.  Reprex, &lt;a href=&#34;https://greendeal.dataobservatory.eu/authors/thermowatt/&#34;&gt;Thermowatt&lt;/a&gt;, &lt;a href=&#34;https://cmbp.hu/?lang=en&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Concorde MB Partners&lt;/a&gt;, and &lt;a href=&#34;https://www.bluedoorconsulting.com/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Blue Door Consulting&lt;/a&gt; are inviting district heating companies, facility operators, sewage utilities, and cities to join our knowledge graph, and start looking for new locations where we can stop the energy loss in the cold winter of 2022/2023.&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>statcodelists: use standard, language-independent variable codes to help international data interoperability and machine reuse in R</title>
      <link>https://greendeal.dataobservatory.eu/post/2022-06-29-statcodelists/</link>
      <pubDate>Wed, 29 Jun 2022 08:12:00 +0100</pubDate>
      <guid>https://greendeal.dataobservatory.eu/post/2022-06-29-statcodelists/</guid>
      <description>&lt;td style=&#34;text-align: center;&#34;&gt;















&lt;figure  id=&#34;figure-visit-the-documentation-website-of-statcodelists-on-statcodelistsdataobservatoryeuhttpsstatcodelistsdataobservatoryeu&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;Visit the documentation website of statcodelists on [statcodelists.dataobservatory.eu/](https://statcodelists.dataobservatory.eu/).&#34; srcset=&#34;
               /media/img/blogposts_2022/statcodelists_website_huef7e1379be389a62e3a47c5a8502e55c_102481_0b514d80337ede30bff4c26cee6a6f11.webp 400w,
               /media/img/blogposts_2022/statcodelists_website_huef7e1379be389a62e3a47c5a8502e55c_102481_1416f7a0950b1cecac8097850d995432.webp 760w,
               /media/img/blogposts_2022/statcodelists_website_huef7e1379be389a62e3a47c5a8502e55c_102481_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/blogposts_2022/statcodelists_website_huef7e1379be389a62e3a47c5a8502e55c_102481_0b514d80337ede30bff4c26cee6a6f11.webp&#34;
               width=&#34;760&#34;
               height=&#34;428&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption&gt;
      Visit the documentation website of statcodelists on &lt;a href=&#34;https://statcodelists.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;statcodelists.dataobservatory.eu/&lt;/a&gt;.
    &lt;/figcaption&gt;&lt;/figure&gt;&lt;/td&gt;
&lt;!-- badges: start --&gt;
&lt;p&gt;&lt;a href=&#34;https://dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;















&lt;figure  &gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img src=&#34;https://img.shields.io/badge/ecosystem-dataobservatory.eu-3EA135.svg&#34; alt=&#34;dataobservatory&#34; loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;/figure&gt;
&lt;/a&gt;&lt;/p&gt;
&lt;!-- badges: end --&gt;
&lt;p&gt;The goal of &lt;code&gt;statcodelists&lt;/code&gt; is to promote the reuse and exchange of statistical information and related metadata by making the internationally standardized SDMX code lists available to R users. SDMX, the &lt;a href=&#34;https://sdmx.org/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Statistical Data and Metadata eXchange&lt;/a&gt;, has been published as an ISO International Standard (ISO 17369). The metadata definitions, including the code lists, are updated regularly according to the standard. The authoritative version of the code lists made available in this package is at &lt;a href=&#34;https://sdmx.org/?page_id=3215/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;https://sdmx.org/?page_id=3215/&lt;/a&gt;.&lt;/p&gt;
&lt;details class=&#34;spoiler &#34;  id=&#34;spoiler-1&#34;&gt;
  &lt;summary&gt;Click to expand table of contents of the post&lt;/summary&gt;
  &lt;p&gt;&lt;details class=&#34;toc-inpage d-print-none  &#34; open&gt;
  &lt;summary class=&#34;font-weight-bold&#34;&gt;Table of Contents&lt;/summary&gt;
  &lt;nav id=&#34;TableOfContents&#34;&gt;
  &lt;ul&gt;
    &lt;li&gt;&lt;a href=&#34;#purpose&#34;&gt;Purpose&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&#34;#installation&#34;&gt;Installation&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&#34;#code-of-conduct&#34;&gt;Code of Conduct&lt;/a&gt;&lt;/li&gt;
  &lt;/ul&gt;
&lt;/nav&gt;
&lt;/details&gt;
&lt;/p&gt;
&lt;/details&gt;
&lt;h2 id=&#34;purpose&#34;&gt;Purpose&lt;/h2&gt;
&lt;p&gt;Cross-domain concepts in the SDMX framework describe concepts relevant to many, if not all, statistical domains. SDMX recommends using these concepts whenever feasible in SDMX structures and messages to promote the reuse and exchange of statistical information and related metadata between organisations.&lt;/p&gt;
&lt;p&gt;Code lists are predefined sets of terms from which some statistical coded concepts take their values. SDMX cross-domain code lists are used to support cross-domain concepts. What are these cross-domain coded concepts?&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Geographical codes, like &lt;code&gt;NL&lt;/code&gt;:  the Netherlands in the &lt;a href=&#34;https://statcodelists.dataobservatory.eu/reference/CL_AREA.html&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;CL_AREA&lt;/a&gt; code list.&lt;/li&gt;
&lt;li&gt;Standard industry codes, like &lt;code&gt;J631&lt;/code&gt; for Data processing, hosting and related activities in Europe (&lt;a href=&#34;https://statcodelists.dataobservatory.eu/reference/CL_ACTIVITY_NACE2.html&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;NACE Rev 2&lt;/a&gt;). Beware, it is &lt;code&gt;J592&lt;/code&gt; in Australia and New Zealand, see &lt;a href=&#34;https://statcodelists.dataobservatory.eu/reference/CL_ACTIVITY_ANZSIC06.html&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;CL_ACTIVITY_ANZSIC06&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Occupations, like &lt;code&gt;OC2521&lt;/code&gt; for &lt;code&gt;Database designers and administrators&lt;/code&gt; in &lt;a href=&#34;https://statcodelists.dataobservatory.eu/reference/CL_OCCUPATION.html&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;CL_OCCUPATION&lt;/a&gt;.&lt;/li&gt;
&lt;li&gt;Time formatting standards, like &lt;code&gt;CCYY&lt;/code&gt; for annual data series in &lt;a href=&#34;https://statcodelists.dataobservatory.eu/reference/CL_TIME_FORMAT.html&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;CL_TIME_FORMAT&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Check out the available code lists on the &lt;a href=&#34;https://statcodelists.dataobservatory.eu/reference/index.html&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;package homepage&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;The use of common code lists will help users to work even more efficiently, easing the maintenance of and reducing the need for mapping systems and interfaces delivering data and metadata to them. A very obvious advantage of using the code systems is that you can retrieve data from national sources regardless of the natural language used in North Macedonia, Japan, the U.S. or the Netherlands. While the data labels may change to be locally human-readable, computers and geeks can read the codes and understand them immediately, provided that the sources use the standard codes.&lt;/p&gt;
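&lt;p&gt;A minimal sketch in base R of why shared codes matter. The two data frames below are hypothetical and their values are invented for illustration: they stand in for extracts from two national sources that label the same areas in different languages but carry the same two-letter area codes. The column names &lt;code&gt;ref_area&lt;/code&gt;, &lt;code&gt;label&lt;/code&gt; and &lt;code&gt;value&lt;/code&gt; are assumptions and are not taken from &lt;code&gt;statcodelists&lt;/code&gt;.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;# Hypothetical extract from a Dutch-language source (invented values).
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;source_nl &amp;lt;- data.frame(
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  ref_area = c(&amp;#34;NL&amp;#34;, &amp;#34;JP&amp;#34;),
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  label    = c(&amp;#34;Nederland&amp;#34;, &amp;#34;Japan&amp;#34;),
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  value    = c(1.5, 2.5)
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;# Hypothetical extract from an English-language source, in a different row order.
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;source_en &amp;lt;- data.frame(
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  ref_area = c(&amp;#34;JP&amp;#34;, &amp;#34;NL&amp;#34;),
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  label    = c(&amp;#34;Japan&amp;#34;, &amp;#34;Netherlands&amp;#34;),
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;  value    = c(20, 10)
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;)
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;# Join on the shared code, not on the human-readable label.
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;merged &amp;lt;- merge(source_nl, source_en, by = &amp;#34;ref_area&amp;#34;, suffixes = c(&amp;#34;_nl&amp;#34;, &amp;#34;_en&amp;#34;))
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;merged
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;The join keeps both rows even though the label for &lt;code&gt;NL&lt;/code&gt; differs between the sources; joining on &lt;code&gt;label&lt;/code&gt; instead would match only the rows whose labels happen to be spelled the same.&lt;/p&gt;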
&lt;p&gt;Our data observatories are rolling out SDMX coding across all datasets to help data ingestion and interoperability, data findability and data reuse. &lt;code&gt;statcodelists&lt;/code&gt; supports the use of standard SDMX codes in your R workflow, both for downloading data from statistical agencies and for producing publication-ready datasets that the rest of the world (and even APIs) will understand.&lt;/p&gt;
&lt;h2 id=&#34;installation&#34;&gt;Installation&lt;/h2&gt;
&lt;p&gt;You can install &lt;code&gt;statcodelists&lt;/code&gt; from CRAN:&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-fallback&#34; data-lang=&#34;fallback&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;install.packages(&amp;#34;statcodelists&amp;#34;)
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;&lt;p&gt;Further recommended code values for expressing general statistical concepts like &lt;code&gt;not applicable&lt;/code&gt;, etc., can be found in section &lt;code&gt;Generic codes&lt;/code&gt; of the &lt;a href=&#34;https://sdmx.org/?page_id=4345&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Guidelines for the creation and management of SDMX Cross-Domain Code Lists&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;For further code lists used by reliable statistical agencies but not harmonized at the SDMX level, please consult the &lt;a href=&#34;https://registry.sdmx.org/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;SDMX Global Registry&lt;/a&gt; &lt;a href=&#34;https://registry.sdmx.org/items/codelist.html&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Codelists&lt;/a&gt; page.&lt;/p&gt;
&lt;p&gt;The creator of this package is not affiliated with SDMX, and this package has not been endorsed by SDMX.&lt;/p&gt;
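&lt;p&gt;After installation, a code list can be loaded and inspected along the following lines. This is only a sketch: it assumes that the code lists are shipped as datasets named after their reference pages (for example &lt;code&gt;CL_AREA&lt;/code&gt;), and it makes no assumption about the column layout, so it simply prints what it finds.&lt;/p&gt;
&lt;div class=&#34;highlight&#34;&gt;&lt;pre tabindex=&#34;0&#34; class=&#34;chroma&#34;&gt;&lt;code class=&#34;language-r&#34; data-lang=&#34;r&#34;&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;library(statcodelists)
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;# Assumption: the area code list is available as a dataset called CL_AREA.
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;data(&amp;#34;CL_AREA&amp;#34;, package = &amp;#34;statcodelists&amp;#34;)
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;names(CL_AREA)  # which columns the code list provides
&lt;/span&gt;&lt;/span&gt;&lt;span class=&#34;line&#34;&gt;&lt;span class=&#34;cl&#34;&gt;head(CL_AREA)   # the first few rows
&lt;/span&gt;&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;&lt;/div&gt;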
&lt;h2 id=&#34;code-of-conduct&#34;&gt;Code of Conduct&lt;/h2&gt;
&lt;p&gt;Please note that the &lt;code&gt;statcodelists&lt;/code&gt; project is released with a &lt;a href=&#34;https://contributor-covenant.org/version/2/1/CODE_OF_CONDUCT.html&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Contributor Code of Conduct&lt;/a&gt;. By contributing to this project, you agree to abide by its terms.&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>100,000 Opinions on the Most Pressing Global Problem</title>
      <link>https://greendeal.dataobservatory.eu/post/2021-11-19_global_problem/</link>
      <pubDate>Thu, 25 Nov 2021 09:41:00 +0100</pubDate>
      <guid>https://greendeal.dataobservatory.eu/post/2021-11-19_global_problem/</guid>
      <description>&lt;td style=&#34;text-align: center;&#34;&gt;















&lt;figure  id=&#34;figure-a-reprezentative-sample-of-n100793-from-5-years-on-the-most-serious-global-problem-get-the-tidy-dataset-from-our-repositoryhttpszenodoorgrecord5711962yz9fnhvmjra-or-api&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;A representative sample of n=100793 from 5 years on the most serious global problem. Get the tidy dataset from [our repository](https://zenodo.org/record/5711962#.YZ9fNHvMJRA) or API.&#34; srcset=&#34;
               /media/img/blogposts_2021/global_problem_1_climate_change_5_plots_hue8b7ea28ffb9d0df039569ac96f076be_37305_4a8b0d559d16fda0b316f86641bb328a.webp 400w,
               /media/img/blogposts_2021/global_problem_1_climate_change_5_plots_hue8b7ea28ffb9d0df039569ac96f076be_37305_86610edc39505a8c207c1542e1f57369.webp 760w,
               /media/img/blogposts_2021/global_problem_1_climate_change_5_plots_hue8b7ea28ffb9d0df039569ac96f076be_37305_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/blogposts_2021/global_problem_1_climate_change_5_plots_hue8b7ea28ffb9d0df039569ac96f076be_37305_4a8b0d559d16fda0b316f86641bb328a.webp&#34;
               width=&#34;760&#34;
               height=&#34;604&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption&gt;
      A representative sample of n=100793 from 5 years on the most serious global problem. Get the tidy dataset from &lt;a href=&#34;https://zenodo.org/record/5711962#.YZ9fNHvMJRA&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;our repository&lt;/a&gt; or API.
    &lt;/figcaption&gt;&lt;/figure&gt;&lt;/td&gt;
&lt;p&gt;Imagine if you could easily compare data from surveys about climate change taken in all European countries, maybe even on other continents, in different years. What if you could work with a sample of not only n=1000, n=4000, or n=10,000 but n=100,000? What granularity would it give you about the perception of climate change or supported policy measures?  That is exactly what our survey harmonization software allows you to do.&lt;/p&gt;
&lt;p&gt;You can use and verify our software: it is perfectly documented, open-source, peer-reviewed scientific software, but for most users it is a bit too difficult to handle.  This is why we are building the Green Deal Data Observatory as a user-centered  API around the software.  The Green Deal Data Observatory processes climate-change related data from various survey, sensor, and satellite data sources, and places it into tidy, easy-to-import datasets and visualizations.&lt;/p&gt;
&lt;p&gt;Survey harmonization means various social science, statistical and data processing steps that make data comparable and joinable from questionnaire answers collected in different countries, languages, and years. To demonstrate the power of retrospective survey harmonization, we have made an indicator, visualizations and a data animation from more than a hundred nationally representative surveys, which asked more than 137,000 Europeans what they considered to be the single most serious problem facing the world as a whole.&lt;/p&gt;
&lt;p&gt;















&lt;figure  &gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img src=&#34;https://greendeal.dataobservatory.eu/media/gif/global_problem_1_climate_change_800.gif&#34; alt=&#34;&#34; loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;p&gt;Survey data harmonization refers to procedures that improve the data comparability or the possibility to make policy or scientific comparisons between data from surveys conducted in different countries or in different years. Our &lt;a href=&#34;https://retroharmonize.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;retroharmonize&lt;/a&gt; software helps with this tedious, laborious, difficult data processing task.&lt;/p&gt;
&lt;p&gt;The result is stunning compared to a survey of 1000, 4000 or even 10,000 people.  In this video we have harmonized the answers from more than 137,000 Europeans surveyed in more than 20 languages. As you can see in the data animation, people got more and more concerned about climate change&amp;hellip; until Covid struck.&lt;/p&gt;
&lt;p&gt;Our data shows that more urban and more highly educated people tend to be more concerned about climate change. Concern also rises as the respondents get younger. (Our data source, the Eurobarometer survey, asks Europeans from the age of 15.)&lt;/p&gt;
&lt;p&gt;There are huge national differences in Europe: people in the countries that we defined as Nordic (Scandinavia and Finland) take climate change much more seriously than the rest of the continent. It also matters when the question was asked: between 2013 and 2019 anxiety over the climate grew rapidly, peaking in 2019.  In 2020, the Covid pandemic altered the problem map of the European population, with ‘infectious diseases’ overtaking other important global problems. But apart from the time and the place of asking the question, there are important patterns emerging all over Europe which are shared regardless of time and place.&lt;/p&gt;
&lt;td style=&#34;text-align: center;&#34;&gt;















&lt;figure  id=&#34;figure-our-classification-tree-model-shows-what-factors-play-an-important-role-in-determining-if-somebody-believes-that-climate-change-is-the-most-important-global-problem&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;Our classification tree model shows what factors play an important role in determining if somebody believes that climate change is the most important global problem.&#34; srcset=&#34;
               /media/img/blogposts_2021/CART_global_problem_1_climate_change_hu52e8dbc1c769947e5e070575639ef30f_15643_24ef626f078c444c7a44764722c56df9.webp 400w,
               /media/img/blogposts_2021/CART_global_problem_1_climate_change_hu52e8dbc1c769947e5e070575639ef30f_15643_69493fe6457b3ad3ac727ca24112cfe8.webp 760w,
               /media/img/blogposts_2021/CART_global_problem_1_climate_change_hu52e8dbc1c769947e5e070575639ef30f_15643_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/blogposts_2021/CART_global_problem_1_climate_change_hu52e8dbc1c769947e5e070575639ef30f_15643_24ef626f078c444c7a44764722c56df9.webp&#34;
               width=&#34;760&#34;
               height=&#34;597&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption&gt;
      Our classification tree model shows what factors play an important role in determining if somebody believes that climate change is the most important global problem.
    &lt;/figcaption&gt;&lt;/figure&gt;&lt;/td&gt;
&lt;p&gt;People with no formal education rarely think that climate change is the most important global problem. People with secondary school education care less than people with tertiary education, and people with tertiary education or a bachelor&amp;rsquo;s degree care less than people who have a university degree or who are committed to life-long learning. This effect is further emphasized by the level of urbanization: the more urbanized the respondents are, the more likely they are to think that climate change is the single most important problem facing humanity. (Urban people tend to have higher education levels, too.)&lt;/p&gt;
&lt;p&gt;Another important factor is age: the younger the respondent, the more likely they are to believe that climate change is the single most important problem.&lt;/p&gt;
&lt;p&gt;One takeaway is that generally, people&amp;rsquo;s climate awareness is rising: Europeans tend to be more urbanized and more educated, and this works in favor of recognizing this problem.  The coming younger generations are also more aware of climate change. Yet, as Covid-19 shows, a global trauma can alter the picture quickly.&lt;/p&gt;
&lt;p&gt;Using the &lt;a href=&#34;https://christophm.github.io/interpretable-ml-book/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;interpretable machine learning&lt;/a&gt; R software package of Christoph Molnar, we calculated the importance of various socio-demographic variables in predicting who will think that climate change is the most important problem facing us.&lt;/p&gt;
&lt;td style=&#34;text-align: center;&#34;&gt;















&lt;figure  id=&#34;figure-out-of-the-variables-we-investigated-time-spent-in-education-is-the-most-important-factor-contributing-to-climate-awareness-closely-followed-by-the-time-when-the-question-was-asked&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;Out of the variables we investigated, time spent in education is the most important factor contributing to climate awareness, closely followed by the time when the question was asked.&#34; srcset=&#34;
               /media/img/blogposts_2021/importance_global_problem_1_climate_change_hu29659d24aa62dce8b30a2e07c8a07ec1_11276_c79653e7ee7d989591bfcf532a407f54.webp 400w,
               /media/img/blogposts_2021/importance_global_problem_1_climate_change_hu29659d24aa62dce8b30a2e07c8a07ec1_11276_e6deba47b95b9e7497f154f668273f04.webp 760w,
               /media/img/blogposts_2021/importance_global_problem_1_climate_change_hu29659d24aa62dce8b30a2e07c8a07ec1_11276_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/blogposts_2021/importance_global_problem_1_climate_change_hu29659d24aa62dce8b30a2e07c8a07ec1_11276_c79653e7ee7d989591bfcf532a407f54.webp&#34;
               width=&#34;760&#34;
               height=&#34;597&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption&gt;
      Out of the variables we investigated, time spent in education is the most important factor contributing to climate awareness, closely followed by the time when the question was asked.
    &lt;/figcaption&gt;&lt;/figure&gt;&lt;/td&gt;
&lt;p&gt;The importance of age, time, and even the time spent in education (age of leaving formal education) shows that there is a very significant change over time. Unfortunately, this change is not monotonic: until 2019, climate awareness was growing by this indicator, then it declined due to Covid.&lt;/p&gt;
&lt;p&gt;If you asked a European citizen about the most important global problem today, the following decision tree would help you guess whether she or he would reply &amp;ldquo;climate change&amp;rdquo;.&lt;/p&gt;
&lt;td style=&#34;text-align: center;&#34;&gt;















&lt;figure  id=&#34;figure-our-classification-tree-model-shows-what-factors-play-an-important-role-in-determining-if-somebody-believes-that-climate-change-is-the-most-important-global-problem&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;Our classification tree model shows what factors play an important role in determining if somebody believes that climate change is the most important global problem.&#34; srcset=&#34;
               /media/img/blogposts_2021/CART_global_problem_1_climate_change_hu52e8dbc1c769947e5e070575639ef30f_15643_24ef626f078c444c7a44764722c56df9.webp 400w,
               /media/img/blogposts_2021/CART_global_problem_1_climate_change_hu52e8dbc1c769947e5e070575639ef30f_15643_69493fe6457b3ad3ac727ca24112cfe8.webp 760w,
               /media/img/blogposts_2021/CART_global_problem_1_climate_change_hu52e8dbc1c769947e5e070575639ef30f_15643_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/blogposts_2021/CART_global_problem_1_climate_change_hu52e8dbc1c769947e5e070575639ef30f_15643_24ef626f078c444c7a44764722c56df9.webp&#34;
               width=&#34;760&#34;
               height=&#34;597&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption&gt;
      Our classification tree model shows what factors play an important role in determining if somebody believes that climate change is the most important global problem.
    &lt;/figcaption&gt;&lt;/figure&gt;&lt;/td&gt;
&lt;p&gt;Education level, age, and the time when the question was asked are very important variables, and so is whether the respondent has at least one child. Interestingly, parents are less likely to be concerned about climate change than other people. In other words, the children are more concerned than their parents.&lt;/p&gt;
&lt;h2 id=&#34;get-our-data&#34;&gt;Get our data&lt;/h2&gt;
&lt;p&gt;You can always rely on our API to import directly the latest, best data, but if you want to be sure, you can use our &lt;a href=&#34;https://zenodo.org/record/5711962#.YZ9fNHvMJRA&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;regular backups&lt;/a&gt; on Zenodo. Zenodo is an open science repository managed by CERN and supported by the European Union. On Zenodo, you can find an authoritative copy of our indicator (and its previous versions) with a digital object identifier, in this case, &lt;a href=&#34;https://doi.org/10.5281/zenodo.5711962&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;10.5281/zenodo.5711962&lt;/a&gt;. These datasets will be preserved for decades, and nobody can manipulate them. You cannot accidentally overwrite them, and we have no backdoor to modify them.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Are you a data user? Give us some feedback! Shall we do some further automatic data enhancements with our datasets? Document with different metadata? Link more information for business, policy, or academic use? Please give us any &lt;a href=&#34;https://reprex.nl/#contact&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;feedback&lt;/a&gt;!&lt;/em&gt;&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>How We Add Value to Public Data With Imputation and Forecasting?</title>
      <link>https://greendeal.dataobservatory.eu/post/2021-11-08-indicator_value_added/</link>
      <pubDate>Mon, 08 Nov 2021 10:00:00 +0100</pubDate>
      <guid>https://greendeal.dataobservatory.eu/post/2021-11-08-indicator_value_added/</guid>
      <description>&lt;p&gt;Public data sources are often plagued by missing values. Naively you may think that you can ignore them, but think twice: in most cases, missing data in a table is not missing information, but rather malformatted information. Ignoring or dropping missing values will not be feasible or robust when you want to make a beautiful visualization, or use data in a business forecasting model, a machine learning (AI) application, or a more complex scientific model. All of the above require complete datasets, and naively discarding missing data points amounts to an excessive waste of information. In this example, we continue with a not-so-easy-to-find public dataset.&lt;/p&gt;
&lt;td style=&#34;text-align: center;&#34;&gt;















&lt;figure  id=&#34;figure-in-the-previous-blogpost-we-explained-how-we-added-value-with-documenting-the-data-following-the-fair-principle-and-with-the-professional-curatorial-work-of-placing-the-data-in-context-and-linking-it-to-other-information-sources-that-are-not-depending-on-the-english-language-and-can-connect-our-radio-dataset-to-other-data-books-publications-regardless-if-they-are-described-in-english-or-in-german-or-slovak-photo-atmospheric-research-observatory-south-pole-antarctica-photo-noaahttpsunsplashcomphotoswwvd4wxrx38&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;In the previous blogpost we explained how we added value with documenting the data following the *FAIR* principle and with the professional curatorial work of placing the data in context, and linking it to other information sources that are not depending on the English language, and can connect our radio dataset to other data, books, publications, regardless if they are described in English, or in German, or Slovak. Photo: Atmospheric Research Observatory, South Pole, Antarctica Photo: [NOAA](https://unsplash.com/photos/WWVD4wXRX38).&#34; srcset=&#34;
               /media/img/blogposts_2021/noaa-WWVD4wXRX38-unsplash-edited_huc1de598e48bcf2ca9302064c36ee3048_2297404_13a19cc7308f7f90fb71ae2c524e8fe6.webp 400w,
               /media/img/blogposts_2021/noaa-WWVD4wXRX38-unsplash-edited_huc1de598e48bcf2ca9302064c36ee3048_2297404_4c70859ff3bfdb7160714dc07c4d5305.webp 760w,
               /media/img/blogposts_2021/noaa-WWVD4wXRX38-unsplash-edited_huc1de598e48bcf2ca9302064c36ee3048_2297404_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/blogposts_2021/noaa-WWVD4wXRX38-unsplash-edited_huc1de598e48bcf2ca9302064c36ee3048_2297404_13a19cc7308f7f90fb71ae2c524e8fe6.webp&#34;
               width=&#34;760&#34;
               height=&#34;504&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption&gt;
      In the previous blogpost we explained how we added value with documenting the data following the &lt;em&gt;FAIR&lt;/em&gt; principle and with the professional curatorial work of placing the data in context, and linking it to other information sources that are not depending on the English language, and can connect our radio dataset to other data, books, publications, regardless if they are described in English, or in German, or Slovak. Photo: Atmospheric Research Observatory, South Pole, Antarctica Photo: &lt;a href=&#34;https://unsplash.com/photos/WWVD4wXRX38&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;NOAA&lt;/a&gt;.
    &lt;/figcaption&gt;&lt;/figure&gt;&lt;/td&gt;
&lt;p&gt;Completing missing datapoints requires statistical production information (why might the data be missing?) and data science know-how (how to impute the missing value). If you do not have a good statistician or data scientist on your team, you will need high-quality, complete datasets. This is what our automated data observatories provide.&lt;/p&gt;
&lt;h2 id=&#34;why-is-data-missing&#34;&gt;Why is data missing?&lt;/h2&gt;
&lt;p&gt;International organizations offer many statistical products, but usually they are on an ‘as-is’ basis. For example, Eurostat is the world’s premier statistical agency, but it has no right to overrule whatever data the member states of the European Union, and some other cooperating European countries, give it. And it cannot force these countries to hand over data if they fail to do so. As a result, there will be many data points that are missing, and often data points that have wrong (obsolete) descriptions or geographical dimensions. We will show the geographical aspect of the problem in a separate blogpost; for now, we only focus on missing data.&lt;/p&gt;
&lt;p&gt;Some countries have only recently started providing data to the Eurostat umbrella organization, and it is likely that you will find few datapoints for North Macedonia or Bosnia-Herzegovina. Other countries provide data with some delay, and the last one or two years are missing. And there are gaps in some countries’ data, too.&lt;/p&gt;
&lt;td style=&#34;text-align: center;&#34;&gt;















&lt;figure  id=&#34;figure-see-the-authoritative-copy-of-the-datasethttpszenodoorgrecord4775787yyqevmdmliu&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;See the authoritative copy of the [dataset](https://zenodo.org/record/4775787#.YYqevmDMLIU).&#34; srcset=&#34;
               /media/img/blogposts_2021/gbard_environment_expenditure_plot_hu092519695c5c8c0c293bf2a5eeefe580_292114_a4f175ef26eb4fd64901b7fec564a2d4.webp 400w,
               /media/img/blogposts_2021/gbard_environment_expenditure_plot_hu092519695c5c8c0c293bf2a5eeefe580_292114_99b295653ecf8ec6dbf89153a188c1fa.webp 760w,
               /media/img/blogposts_2021/gbard_environment_expenditure_plot_hu092519695c5c8c0c293bf2a5eeefe580_292114_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/blogposts_2021/gbard_environment_expenditure_plot_hu092519695c5c8c0c293bf2a5eeefe580_292114_a4f175ef26eb4fd64901b7fec564a2d4.webp&#34;
               width=&#34;760&#34;
               height=&#34;507&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption&gt;
      See the authoritative copy of the &lt;a href=&#34;https://zenodo.org/record/4775787#.YYqevmDMLIU&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;dataset&lt;/a&gt;.
    &lt;/figcaption&gt;&lt;/figure&gt;&lt;/td&gt;
&lt;p&gt;This is a headache if you want to use the data in some machine learning application or in a multiple or panel regression model. You can, of course, discard countries or years where you do not have full data coverage, but this approach usually wastes too much information&amp;ndash;if you work with 12 years, and only one data point is missing, you would be discarding an entire country’s 11 years’ worth of data. Another option is to estimate the values, or otherwise impute the missing data, when this is possible with reasonable precision. This is where things get tricky, and you will likely need a statistician or a data scientist on board.&lt;/p&gt;
&lt;h2 id=&#34;what-can-we-improve&#34;&gt;What can we improve?&lt;/h2&gt;
&lt;p&gt;Suppose that the data for a particular country is missing for only one year, 2015. The naive solution would be to omit 2015 or the country at hand from the dataset. This is pretty destructive, because we know a lot about the R&amp;amp;D allocations in this country and in this year! But leaving 2015 blank will not look good on a chart, and will make your machine learning application or your regression model stop.&lt;/p&gt;
&lt;p&gt;A statistician or an innovation expert will tell you that you more or less know the missing information: the total allocation was most likely not zero in that year. With some statistical, innovation, or public finance knowledge, you can use the 2014 or the 2016 value, or a combination of the two, and keep the country and the year in the dataset.&lt;/p&gt;
&lt;p&gt;Our improved dataset adds backcasted (using the best time series model fitting the country&amp;rsquo;s actually present data), forecasted (again, using the best time series model), and approximated data (using linear approximation). In a few cases, we add the last or next known value. To give a few quantitative indicators about our work:&lt;/p&gt;
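&lt;p&gt;As a minimal sketch of these simple options (the previous year&amp;rsquo;s value, the next year&amp;rsquo;s value, or a combination of the two), the following Python snippet fills a single missing year from its two neighbours. The figures and the helper name &lt;code&gt;fill_single_gap&lt;/code&gt; are invented for illustration; this is not the procedure behind the published indicator.&lt;/p&gt;

```python
from typing import Optional


def fill_single_gap(series: dict[int, Optional[float]], year: int, method: str = "linear") -> float:
    """Estimate one missing year from its two neighbouring years (illustrative helper)."""
    previous_value = series[year - 1]
    next_value = series[year + 1]
    if previous_value is None or next_value is None:
        raise ValueError("both neighbouring years must be observed")
    if method == "last":
        # Carry the last known value forward.
        return previous_value
    if method == "next":
        # Carry the next known value backward.
        return next_value
    if method == "linear":
        # Linear approximation: the midpoint of the two neighbours.
        return (previous_value + next_value) / 2
    raise ValueError(f"unknown method: {method}")


# Invented allocations for one country; 2015 is missing.
allocations = {2014: 100.0, 2015: None, 2016: 110.0}
estimate = fill_single_gap(allocations, 2015)
print(estimate)
```

&lt;p&gt;With the invented values above, the default linear option returns the midpoint of the 2014 and 2016 values, so the country and the year can stay in the dataset.&lt;/p&gt;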
&lt;ul&gt;
&lt;li&gt;Increased number of observations: 29.2%&lt;/li&gt;
&lt;li&gt;Reduced missing values: -26.4%&lt;/li&gt;
&lt;li&gt;Increased non-missing subset for regression or AI: +64.7%&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;If your organization is working with panel (longitudinal multiple) regressions or various machine learning applications, then your team knows that not having the +66.67% gain would be a deal-breaker in the choice of models and the precision of estimates, KPIs, or other quantitative products. And that they would spend about 90% of their data resources on achieving this +66.67% gain in usability.&lt;/p&gt;
&lt;p&gt;If you happen to work in an NGO, a business unit, or a research institute that does not employ data scientists, then it is likely that you can never achieve this improvement, and you have to give up on a number of quantitative tools or visualizations. If you have a data scientist on board, that professional can use our work as a starting point.&lt;/p&gt;
&lt;h2 id=&#34;can-you-trust-our-data&#34;&gt;Can you trust our data?&lt;/h2&gt;
&lt;p&gt;We believe that you can trust our data more than the original public source. We use statistical expertise to find out why data may be missing. Often, it is present in the wrong location (for example, because the name of a region changed).&lt;/p&gt;
&lt;p&gt;If you are reluctant to use estimates, think about what it means to discard known actual data from your forecast or visualization because one data point is missing. How do you provide more accurate information: by hiding known actual data because one point is missing, or by using all known data and an estimate?&lt;/p&gt;
&lt;p&gt;Our codebooks and our API use the &lt;a href=&#34;https://sdmx.org/?page_id=3215/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Statistical Data and Metadata eXchange&lt;/a&gt; documentation standards to clearly indicate which data is observed, which is missing, which is estimated, and of course, also how it is estimated.
This highlights another important aspect of data trustworthiness: if you have a better idea, you can replace our estimates with better ones.&lt;/p&gt;
&lt;p&gt;Our indicators come with standardized codebooks that contain not only descriptive metadata, but also administrative metadata about the history of the indicator values. You will find very important information about the statistical method we used to fill in the data gaps, and even links to the reliable, peer-reviewed scientific and statistical software that made the calculations. For data scientists, we record plenty of information about the computing environment, too&amp;ndash;this can come in handy if your estimates need external authentication, or you suspect a bug.&lt;/p&gt;
&lt;h2 id=&#34;avoid-the-data-sisyphus&#34;&gt;Avoid the data Sisyphus&lt;/h2&gt;
&lt;p&gt;If you work in an academic institution, in an NGO or a consultancy, you can never be sure who downloaded the &lt;a href=&#34;http://appsso.eurostat.ec.europa.eu/nui/show.do?dataset=gba_nabsfin07&amp;amp;lang=en&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;GBARD by socioeconomic objectives (NABS 2007)&lt;/a&gt; Eurostat folder from Eurostat. Did they modify the dataset? Did they already make corrections with the missing data? What method did they use? To prevent many potential problems, you will likely download it again, and again, and again&amp;hellip;&lt;/p&gt;
&lt;td style=&#34;text-align: center;&#34;&gt;















&lt;figure  id=&#34;figure-see-our-the-data-sisyphushttpsreprexnlpost2021-07-08-data-sisyphus-blogpost&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;See our [The Data Sisyphus](https://reprex.nl/post/2021-07-08-data-sisyphus/) blogpost.&#34; srcset=&#34;
               /media/img/blogposts_2021/Sisyphus_Bodleian_Library_hu99f0c1d6c82963b9538437670b4d339d_1662894_cd48a6c374c9ff68a08abe79a6abf2f4.webp 400w,
               /media/img/blogposts_2021/Sisyphus_Bodleian_Library_hu99f0c1d6c82963b9538437670b4d339d_1662894_a6eb1b13ff33a5c73aba34550964ff52.webp 760w,
               /media/img/blogposts_2021/Sisyphus_Bodleian_Library_hu99f0c1d6c82963b9538437670b4d339d_1662894_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/blogposts_2021/Sisyphus_Bodleian_Library_hu99f0c1d6c82963b9538437670b4d339d_1662894_cd48a6c374c9ff68a08abe79a6abf2f4.webp&#34;
               width=&#34;760&#34;
               height=&#34;507&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption&gt;
      See our &lt;a href=&#34;https://reprex.nl/post/2021-07-08-data-sisyphus/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;The Data Sisyphus&lt;/a&gt; blogpost.
    &lt;/figcaption&gt;&lt;/figure&gt;&lt;/td&gt;
&lt;p&gt;We have a better solution. You can always rely on our API to import directly the latest, best data, but if you want to be sure, you can use our &lt;a href=&#34;https://zenodo.org/record/5652118#.YYhGOGDMLIU&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;regular backups&lt;/a&gt; on Zenodo. Zenodo is an open science repository managed by CERN and supported by the European Union. On Zenodo, you can find an authoritative copy of our indicator (and its previous versions) with a digital object identifier, in this case, &lt;a href=&#34;https://doi.org/10.5281/zenodo.5661169&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;10.5281/zenodo.5661169&lt;/a&gt;. These datasets will be preserved for decades, and nobody can manipulate them. You cannot accidentally overwrite them, and we have no backdoor to modify them.&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://doi.org/10.5281/zenodo.5661169&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;















&lt;figure  &gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img src=&#34;https://zenodo.org/badge/DOI/10.5281/zenodo.5661169.svg&#34; alt=&#34;DOI&#34; loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;/figure&gt;
&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Are you a data user? Give us some feedback! Shall we do some further automatic data enhancements with our datasets? Document with different metadata? Link more information for business, policy, or academic use? Please give us any &lt;a href=&#34;https://reprex.nl/#contact&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;feedback&lt;/a&gt;!&lt;/em&gt;&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>How We Add Value to Public Data With Better Curation And Documentation?</title>
      <link>https://greendeal.dataobservatory.eu/post/2021-11-08-indicator_findable/</link>
      <pubDate>Mon, 08 Nov 2021 09:00:00 +0100</pubDate>
      <guid>https://greendeal.dataobservatory.eu/post/2021-11-08-indicator_findable/</guid>
      <description>&lt;p&gt;In this example, we show a simple indicator: the &lt;em&gt;Government Budget Allocations for R&amp;amp;D in Environment&lt;/em&gt; in many European countries. (In our &lt;em&gt;Digital Music Observatory&lt;/em&gt; we give a more relevant &lt;a href=&#34;https://music.dataobservatory.eu/post/2021-11-08-indicator_findable/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;example&lt;/a&gt; about the turnover of the radio industry in Europe.)&lt;/p&gt;
&lt;p&gt;This dataset comes from a public datasource, the data warehouse of the
European statistical agency, Eurostat. Yet it is not trivial to use:
unless you are familiar with the &lt;em&gt;nomenclature for the analysis and comparison of scientific programmes and budgets&lt;/em&gt; or the &lt;a href=&#34;https://www.oecd.org/sti/frascati-manual-2015-9789264239012-en.htm&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Frascati Manual&lt;/a&gt;, you will probably not find &lt;a href=&#34;http://appsso.eurostat.ec.europa.eu/nui/show.do?dataset=gba_nabsfin07&amp;amp;lang=en&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;this dataset&lt;/a&gt; on the Eurostat website.&lt;/p&gt;
&lt;td style=&#34;text-align: center;&#34;&gt;















&lt;figure  id=&#34;figure-the-raw-data-can-be-retrieved-gbard-by-socioeconomic-objectives-nabs-2007gba_nabsfin07-eurostat-folder-if-you-find-it&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;The raw data can be retrieved GBARD by socioeconomic objectives (NABS 2007)[gba_nabsfin07] Eurostat folder (if you find it.)&#34; srcset=&#34;
               /media/img/blogposts_2021/gbard_environment_expenditure_plot_hu092519695c5c8c0c293bf2a5eeefe580_292114_a4f175ef26eb4fd64901b7fec564a2d4.webp 400w,
               /media/img/blogposts_2021/gbard_environment_expenditure_plot_hu092519695c5c8c0c293bf2a5eeefe580_292114_99b295653ecf8ec6dbf89153a188c1fa.webp 760w,
               /media/img/blogposts_2021/gbard_environment_expenditure_plot_hu092519695c5c8c0c293bf2a5eeefe580_292114_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/blogposts_2021/gbard_environment_expenditure_plot_hu092519695c5c8c0c293bf2a5eeefe580_292114_a4f175ef26eb4fd64901b7fec564a2d4.webp&#34;
               width=&#34;760&#34;
               height=&#34;507&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption&gt;
      The raw data can be retrieved from the GBARD by socioeconomic objectives (NABS 2007) [gba_nabsfin07] Eurostat folder (if you find it).
    &lt;/figcaption&gt;&lt;/figure&gt;&lt;/td&gt;
&lt;p&gt;Our version of this statistical indicator is documented following the &lt;a href=&#34;https://www.go-fair.org/fair-principles/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;FAIR principles&lt;/a&gt;: our data assets
are findable, accessible, interoperable, and reusable. While the
Eurostat data warehouse partly fulfills these important data quality
expectations, we can improve them significantly. And we can
improve the dataset itself, too, as we will show in the &lt;a href=&#34;https://greendeal.dataobservatory.eu/post/2021-11-08-indicator_value_added/&#34;&gt;next blogpost&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&#34;findable-data&#34;&gt;Findable Data&lt;/h2&gt;
&lt;p&gt;Our data observatories add value by curating the data&amp;ndash;we bring this
indicator to light with a more descriptive name, and we place it in
context with our &lt;a href=&#34;https://greendeal.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Green Deal Data Observatory&lt;/a&gt;.
While many people in environmental policy organizations and NGOs, as well as science journalists and researchers, may need this dataset, most of them have no training in the nomenclatures of scientific and R&amp;amp;D spending or public budget accounts. Our curated data observatories bring together much of the
available data around important domains. Our &lt;em&gt;Green Deal Data Observatory&lt;/em&gt;, for example, aims to form an ecosystem of climate policy and climate change mitigation data users and producers.&lt;/p&gt;
&lt;td style=&#34;text-align: center;&#34;&gt;















&lt;figure  id=&#34;figure-we-added-descriptive-metadatahttpszenodoorgrecord5658849yyqicwdmliu-that-help-you-find-our-data-and-match-it-with-other-relevant-data-sources&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;We [added descriptive metadata](https://zenodo.org/record/5658849#.YYqicWDMLIU) that help you find our data and match it with other relevant data sources.&#34; srcset=&#34;
               /media/img/blogposts_2021/zenodo_gbard_environment_expenditure_metadata_hu466af5eda667e61c992cbc3770f1c27b_194619_94393f82400c1139d76477a52a1af13a.webp 400w,
               /media/img/blogposts_2021/zenodo_gbard_environment_expenditure_metadata_hu466af5eda667e61c992cbc3770f1c27b_194619_2b0d6d8f077aaaeca31f7fc768a35e03.webp 760w,
               /media/img/blogposts_2021/zenodo_gbard_environment_expenditure_metadata_hu466af5eda667e61c992cbc3770f1c27b_194619_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/blogposts_2021/zenodo_gbard_environment_expenditure_metadata_hu466af5eda667e61c992cbc3770f1c27b_194619_94393f82400c1139d76477a52a1af13a.webp&#34;
               width=&#34;760&#34;
               height=&#34;428&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption&gt;
      We &lt;a href=&#34;https://zenodo.org/record/5658849#.YYqicWDMLIU&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;added descriptive metadata&lt;/a&gt; that help you find our data and match it with other relevant data sources.
    &lt;/figcaption&gt;&lt;/figure&gt;&lt;/td&gt;
&lt;p&gt;We added descriptive metadata that help you find our data and match it
with other relevant data sources. For example, we add keywords and
standardized metadata identifiers from the Library of Congress Linked
Data Services, probably the world’s largest standardized knowledge
library description. This makes sure that you can find relevant data
about the same concept (&lt;a href=&#34;https://id.loc.gov/authorities/subjects/sh85044203.html&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;environmental protection&lt;/a&gt;)
besides our budget allocation data. This helps unambiguously connect our dataset
with other information sources that use the same concept, but maybe
different keywords, such as &lt;em&gt;Protection of environment&lt;/em&gt;, or maybe &lt;em&gt;Umweltschutz&lt;/em&gt; in German, or &lt;em&gt;Ochrana životného prostredia&lt;/em&gt; in Slovak. It also helps avoid confusion with &lt;em&gt;Human environment&lt;/em&gt;.&lt;/p&gt;
&lt;h2 id=&#34;accessible-data&#34;&gt;Accessible Data&lt;/h2&gt;
&lt;p&gt;Our data is accessible in two forms: in &lt;code&gt;csv&lt;/code&gt; tabular format (which can be
read with Excel, OpenOffice, Numbers, SPSS and many similar spreadsheet
or statistical applications) and in &lt;code&gt;JSON&lt;/code&gt; for automated importing into
your databases. We can also provide our users with SQLite databases,
which are fully functional, single user relational databases.&lt;/p&gt;
&lt;p&gt;Tidy datasets are easy to manipulate, model and visualize, and have a
specific structure: each variable is a column, each observation is a
row, and each type of observational unit is a table. This makes the data
easier to clean, and far easier to use in a much wider range of
applications than the original data we used. In theory, this is a simple objective,
yet we find that even governmental statistical agencies&amp;ndash;and even scientific
publications&amp;ndash;often publish untidy data. This poses a significant problem that implies
productivity losses: tidying data requires long hours of investment, and if
a reproducible workflow is not used, data integrity can also be compromised:
chances are that the process of tidying will overwrite, delete, or omit a data point or a label.&lt;/p&gt;
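&lt;p&gt;To make the tidy structure concrete, here is a small Python sketch that reshapes a wide table (one column per year) into tidy records with one row per observation. The table contents, the column names, and the helper &lt;code&gt;to_tidy&lt;/code&gt; are assumptions made for illustration only, not our production workflow.&lt;/p&gt;

```python
# Invented wide table: one row per country, one column per year.
wide_rows = [
    {"geo": "NL", "2014": 1.0, "2015": None, "2016": 1.2},
    {"geo": "SK", "2014": 0.4, "2015": 0.5, "2016": 0.6},
]


def to_tidy(rows, id_column="geo"):
    """Return one record per (country, year) observation."""
    tidy = []
    for row in rows:
        for column, value in row.items():
            if column == id_column:
                continue
            # Each year column becomes a value of the 'year' variable.
            tidy.append({id_column: row[id_column], "year": int(column), "value": value})
    return tidy


tidy_rows = to_tidy(wide_rows)
print(len(tidy_rows))
print(tidy_rows[1])
```

&lt;p&gt;Note that the missing value stays in the tidy table as an explicit record, so it is not silently dropped and can later be flagged or estimated.&lt;/p&gt;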
&lt;td style=&#34;text-align: center;&#34;&gt;















&lt;figure  id=&#34;figure-tidy-datasetshttpsr4dshadconztidy-datahtml-are-easy-to-manipulate-model-and-visualize-and-have-a-specific-structure-each-variable-is-a-column-each-observation-is-a-row-and-each-type-of-observational-unit-is-a-table&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;[Tidy datasets](https://r4ds.had.co.nz/tidy-data.html) are easy to manipulate, model and visualize, and have a specific structure: each variable is a column, each observation is a row, and each type of observational unit is a table.&#34; srcset=&#34;
               /media/img/blogposts_2021/tidy-8_hub5468e0441f3c23e1be9aa13622e5d1a_299553_840d5597bab1e4d7c2b314453bf83608.webp 400w,
               /media/img/blogposts_2021/tidy-8_hub5468e0441f3c23e1be9aa13622e5d1a_299553_f01845e0e6967cc9a3a2b53cf12edd0a.webp 760w,
               /media/img/blogposts_2021/tidy-8_hub5468e0441f3c23e1be9aa13622e5d1a_299553_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/blogposts_2021/tidy-8_hub5468e0441f3c23e1be9aa13622e5d1a_299553_840d5597bab1e4d7c2b314453bf83608.webp&#34;
               width=&#34;760&#34;
               height=&#34;355&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption&gt;
      &lt;a href=&#34;https://r4ds.had.co.nz/tidy-data.html&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Tidy datasets&lt;/a&gt; are easy to manipulate, model and visualize, and have a specific structure: each variable is a column, each observation is a row, and each type of observational unit is a table.
    &lt;/figcaption&gt;&lt;/figure&gt;&lt;/td&gt;
&lt;p&gt;While the original data source, the Eurostat data warehouse, is
accessible too, we added value by bringing the data into a &lt;a href=&#34;https://www.jstatsoft.org/article/view/v059i10&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;tidy
format&lt;/a&gt;. Tidy data can
immediately be imported into a statistical application like SPSS or
Stata, or into your own database. It is immediately available for
plotting in Excel, OpenOffice or Numbers.&lt;/p&gt;
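&lt;p&gt;A minimal sketch of what tidying means in practice, assuming a hypothetical wide table with one column per year (the country codes and values are invented for illustration and are not taken from Eurostat). Each output record holds exactly one observation:&lt;/p&gt;

```python
# Hypothetical wide table: one row per country, one column per year.
wide_rows = [
    {"geo": "NL", "2019": 10.5, "2020": 11.0},
    {"geo": "HU", "2019": 7.2, "2020": None},
]


def to_tidy(rows, id_column, value_name):
    """Return one record per (id, year) pair: each variable is a column,
    each observation is a row."""
    tidy = []
    for row in rows:
        for key, value in row.items():
            if key == id_column:
                continue  # the identifier is repeated on every tidy record
            tidy.append({id_column: row[id_column], "year": int(key), value_name: value})
    return tidy


tidy_rows = to_tidy(wide_rows, id_column="geo", value_name="value")
for record in tidy_rows:
    print(record)
```

&lt;p&gt;The missing 2020 value for the second country stays an explicit empty observation instead of silently disappearing, which is one reason the long layout is easier to validate.&lt;/p&gt;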
&lt;h2 id=&#34;interoperability&#34;&gt;Interoperability&lt;/h2&gt;
&lt;p&gt;Our data can be easily imported together with, or joined with, data from other internal or external sources.&lt;/p&gt;
&lt;td style=&#34;text-align: center;&#34;&gt;















&lt;figure  id=&#34;figure-all-our-indicators-come-with-standardized-descriptive-metadata-and-statistical-processing-metadata-see-our-apihttpsapigreendealdataobservatoryeudatabasemetadata&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;All our indicators come with standardized descriptive metadata, and statistical (processing) metadata. See our [API](https://api.greendeal.dataobservatory.eu/database/metadata/) &#34; srcset=&#34;
               /media/img/observatory_screenshots/GDO_API_metadata_table_hu31b494a33d5ae09272643545372dbd1d_100491_225afcd2a785db051b89c7c36fdc28b9.webp 400w,
               /media/img/observatory_screenshots/GDO_API_metadata_table_hu31b494a33d5ae09272643545372dbd1d_100491_5807feecbd17bee02fd8c68fad87b1d7.webp 760w,
               /media/img/observatory_screenshots/GDO_API_metadata_table_hu31b494a33d5ae09272643545372dbd1d_100491_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/observatory_screenshots/GDO_API_metadata_table_hu31b494a33d5ae09272643545372dbd1d_100491_225afcd2a785db051b89c7c36fdc28b9.webp&#34;
               width=&#34;760&#34;
               height=&#34;428&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption&gt;
      All our indicators come with standardized descriptive metadata, and statistical (processing) metadata. See our &lt;a href=&#34;https://api.greendeal.dataobservatory.eu/database/metadata/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;API&lt;/a&gt;
    &lt;/figcaption&gt;&lt;/figure&gt;&lt;/td&gt;
&lt;p&gt;All our indicators come with standardized descriptive metadata,
following two important standards, &lt;a href=&#34;https://dublincore.org/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Dublin Core&lt;/a&gt; and
&lt;a href=&#34;https://datacite.org/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;DataCite&lt;/a&gt;, and implementing not only the mandatory
but also the recommended descriptions. This makes it far easier to
connect the data with other data sources, e.g. turnover with the number of radio broadcasting enterprises or radio stations within specific territories.&lt;/p&gt;
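&lt;p&gt;As an illustration only, the following sketch checks a metadata record for the six mandatory properties of the DataCite Metadata Schema (version 4). The record, its placeholder identifier, and the helper name &lt;code&gt;missing_mandatory&lt;/code&gt; are assumptions made for this example; it does not show how our API stores metadata.&lt;/p&gt;

```python
# Mandatory properties of the DataCite Metadata Schema (version 4).
DATACITE_MANDATORY = [
    "Identifier",
    "Creator",
    "Title",
    "Publisher",
    "PublicationYear",
    "ResourceType",
]


def missing_mandatory(record):
    """List the mandatory properties that are absent or empty in a record."""
    return [name for name in DATACITE_MANDATORY if not record.get(name)]


# Hypothetical, incomplete record; the identifier is a placeholder, not a real DOI.
record = {
    "Identifier": "10.1234/placeholder",
    "Title": "Example indicator",
    "Creator": "Example Observatory",
    "PublicationYear": 2021,
}

print(missing_mandatory(record))
```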
&lt;p&gt;Our passion for documentation standards and best practices goes much further: our data uses &lt;a href=&#34;https://sdmx.org/?page_id=3215/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Statistical Data and Metadata eXchange&lt;/a&gt; standardized codebooks, unit descriptions and other statistical and administrative metadata.&lt;/p&gt;
&lt;td style=&#34;text-align: center;&#34;&gt;















&lt;figure  id=&#34;figure-we-participate-in-scientific-workhttpsreprexnlpublicationeuropean_visibilitiy_2021-related-to-data-interoperability&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;We participate in [scientific work](https://reprex.nl/publication/european_visibilitiy_2021/) related to data interoperability.&#34; srcset=&#34;
               /media/img/reports/european_visbility_publication_hu9fd9bf0ebbda97354d76a2e1b9589f6b_264884_25232c9bd0c86814e3e3337261110ea4.webp 400w,
               /media/img/reports/european_visbility_publication_hu9fd9bf0ebbda97354d76a2e1b9589f6b_264884_93fa43b83c3a299d78a1afed7bc4f820.webp 760w,
               /media/img/reports/european_visbility_publication_hu9fd9bf0ebbda97354d76a2e1b9589f6b_264884_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/reports/european_visbility_publication_hu9fd9bf0ebbda97354d76a2e1b9589f6b_264884_25232c9bd0c86814e3e3337261110ea4.webp&#34;
               width=&#34;760&#34;
               height=&#34;506&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption&gt;
      We participate in &lt;a href=&#34;https://reprex.nl/publication/european_visibilitiy_2021/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;scientific work&lt;/a&gt; related to data interoperability.
    &lt;/figcaption&gt;&lt;/figure&gt;&lt;/td&gt;
&lt;h2 id=&#34;reuse&#34;&gt;Reuse&lt;/h2&gt;
&lt;p&gt;All our datasets come with standardized information about reusability.
We add citation, attribution data, and licensing terms. Most of our
datasets can be used without commercial restriction after acknowledging
the source, but we sometimes work with less permissive data licenses.&lt;/p&gt;
&lt;p&gt;In the case presented here, we added further value to encourage re-use. In addition to tidying, we
significantly increased the usability of public data by handling
missing cases. This is the subject of our &lt;a href=&#34;https://greendeal.dataobservatory.eu/post/2021-11-06-indicator_value_added/&#34;&gt;next blogpost&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Are you a data user? Give us some feedback! Shall we do some further
automatic data enhancements with our datasets? Document with different
metadata? Link more information for business, policy, or academic use? Please
give us any &lt;a href=&#34;https://reprex.nl/#contact&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;feedback&lt;/a&gt;!&lt;/em&gt;&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>The Data Sisyphus</title>
      <link>https://greendeal.dataobservatory.eu/post/2021-07-08-data-sisyphus/</link>
      <pubDate>Thu, 08 Jul 2021 09:00:00 +0000</pubDate>
      <guid>https://greendeal.dataobservatory.eu/post/2021-07-08-data-sisyphus/</guid>
      <description>&lt;td style=&#34;text-align: center;&#34;&gt;















&lt;figure  id=&#34;figure-sisyphus-was-punished-by-being-forced-to-roll-an-immense-boulder-up-a-hill-only-for-it-to-roll-down-every-time-it-neared-the-top-repeating-this-action-for-eternity--this-is-the-price-that-project-managers-and-analysts-pay-for-the-inadequate-documentation-of-their-data-assets&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;Sisyphus was punished by being forced to roll an immense boulder up a hill only for it to roll down every time it neared the top, repeating this action for eternity.  This is the price that project managers and analysts pay for the inadequate documentation of their data assets.&#34; srcset=&#34;
               /media/img/blogposts_2021/Sisyphus_Bodleian_Library_hu99f0c1d6c82963b9538437670b4d339d_1662894_cd48a6c374c9ff68a08abe79a6abf2f4.webp 400w,
               /media/img/blogposts_2021/Sisyphus_Bodleian_Library_hu99f0c1d6c82963b9538437670b4d339d_1662894_a6eb1b13ff33a5c73aba34550964ff52.webp 760w,
               /media/img/blogposts_2021/Sisyphus_Bodleian_Library_hu99f0c1d6c82963b9538437670b4d339d_1662894_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/blogposts_2021/Sisyphus_Bodleian_Library_hu99f0c1d6c82963b9538437670b4d339d_1662894_cd48a6c374c9ff68a08abe79a6abf2f4.webp&#34;
               width=&#34;760&#34;
               height=&#34;507&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption&gt;
      Sisyphus was punished by being forced to roll an immense boulder up a hill only for it to roll down every time it neared the top, repeating this action for eternity.  This is the price that project managers and analysts pay for the inadequate documentation of their data assets.
    &lt;/figcaption&gt;&lt;/figure&gt;&lt;/td&gt;
&lt;p&gt;&lt;em&gt;When was a file downloaded from the internet?  What has happened to it since?  Are there updates?  Was a bibliographical reference made for quotations?  Were missing values imputed?  Were currencies translated?  Who knows about it – who created a dataset, who contributed to it?  Which is an intermediate version of a spreadsheet file, and which is the final one, checked and approved by a senior manager?&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;Big data creates inequality and injustice. One aspect of this inequality is the cost of data processing and documentation – a greatly underestimated and usually unreported cost item. In small organizations, where there are no separate data science and data engineering roles, data is usually supposed to be processed and documented by (junior) analysts or researchers.  This is a very important source of the gap between Big Tech and smaller organizations: their data usually ends up very expensive, ill-formatted, and unreadable by computers that use machine learning and AI. Usually the documentation steps are omitted completely.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;“Data is potential information, analogous to potential energy: work is required to release it.” &amp;ndash; Jeffrey Pomerantz&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Metadata, which is information about the history of the data and about how it can be technically and legally reused, has a hidden cost. Cheap or low-quality external data comes with poor or no metadata, and small organizations lack the resources to add high-quality metadata to their datasets. However, this only perpetuates the problem.&lt;/p&gt;
&lt;h2 id=&#34;metadata-unbillable-hours&#34;&gt;The hidden cost item behind the unbillable hours&lt;/h2&gt;
&lt;p&gt;As we have shown with our research partners, such metadata problems are not unique to data analysis.  Independent artists and small labels are suffering on music or book sales platforms, because their copyrighted content is not well documented.  If you automatically document tens of thousands of songs or datasets, the documentation cost per item is very small. If you do it manually, the cost may be higher than the expected revenue from the song, or the total cost of the dataset itself. (See our research consortium&amp;rsquo;s preprint paper: &lt;a href=&#34;https://dataandlyrics.com/publication/european_visibilitiy_2021/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Ensuring the Visibility and Accessibility of European Creative Content on the World Market: The Need for Copyright Data Improvement in the Light of New Technologies&lt;/a&gt;)&lt;/p&gt;
&lt;p&gt;In the short run, small consultancies, NGOs, or as a matter of fact, musicians, seem to logically give up on high-quality documentation and logging.  In the long run, this has two devastating consequences. First, computers, for example those running machine learning algorithms, cannot read their documents, data, and songs.  Second, as memory fades, the ill-documented resources need to be re-created, re-checked, and reformatted.  Often, they are even hard to find on your internal server or laptop archive.&lt;/p&gt;
&lt;p&gt;Metadata is a hidden destroyer of the competitiveness of corporate or academic research, or independent content management.   It is never quoted on external data vendor invoices, and it is not planned as a cost item, because metadata, the description of a dataset, a document, a presentation, or a song, is meaningless without the resource that it describes. You never buy metadata.  But if your dataset comes without proper metadata documentation, you are bound, like Sisyphus, to search for it, to re-arrange it, to check its currency units, its digits, its formatting.  Data analysts are reported to spend about 80% of their working hours on data processing and not data analysis &amp;ndash; partly because data processing is a very laborious task that computers can do far more cheaply at scale, and partly because they do not know if the person who sat before them at the same desk has already performed these tasks, or if the person responsible for quality control checked for errors.&lt;/p&gt;
&lt;td style=&#34;text-align: center;&#34;&gt;















&lt;figure  id=&#34;figure-uncut-diamonds-need-to-be-cut-polished-and-you-have-to-make-sure-that-they-come-from-a-legal-source-data-is-similar-it-needs-to-be-tidied-up-checked-and-documented-before-use-photo-dave-fischer&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;Uncut diamonds need to be cut, polished, and you have to make sure that they come from a legal source. Data is similar: it needs to be tidied up, checked and documented before use. Photo: Dave Fischer.&#34; srcset=&#34;
               /media/img/gems/Uncut-diamond_Edit_hu4573f19f53e1306ad88770fc5e491871_409761_0317c281e0aba727eb8e1a81805de459.webp 400w,
               /media/img/gems/Uncut-diamond_Edit_hu4573f19f53e1306ad88770fc5e491871_409761_1470967ea871e5c3f6f247c839f6d52a.webp 760w,
               /media/img/gems/Uncut-diamond_Edit_hu4573f19f53e1306ad88770fc5e491871_409761_1200x1200_fit_q75_h2_lanczos.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/gems/Uncut-diamond_Edit_hu4573f19f53e1306ad88770fc5e491871_409761_0317c281e0aba727eb8e1a81805de459.webp&#34;
               width=&#34;760&#34;
               height=&#34;506&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption&gt;
      Uncut diamonds need to be cut, polished, and you have to make sure that they come from a legal source. Data is similar: it needs to be tidied up, checked and documented before use. Photo: Dave Fischer.
    &lt;/figcaption&gt;&lt;/figure&gt;&lt;/td&gt;
&lt;p&gt;Undocumented data is hardly informative – it may be a page in a book, a file in an obsolete file format on a governmental server, an Excel sheet that you do not remember having checked for updates.  Most data is useless, because we do not know how it can inform us, or we do not know if we can trust it.  The processing can be a daunting task, not to mention the boring and often neglected documentation duties that remain after the dataset is final and pronounced error-free by the person in charge of quality control.&lt;/p&gt;
&lt;h2 id=&#34;observatory-metadata-services&#34;&gt;Our observatory automatically processes and documents the data&lt;/h2&gt;
&lt;p&gt;The good news about documentation and data validation costs is that they can be shared.  If many users need GDP/capita data from all over the world in euros, then it is enough if only one entity, a data observatory, collects all GDP and population data expressed in dollars, korunas, and euros, makes sure that the latest data is correctly translated to euros, and then correctly divides it by the latest population figures. These tasks are error-prone, and should not be repeated by every data journalist, NGO employee, PhD student or junior analyst.  This is one of the services of our data observatory.&lt;/p&gt;
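&lt;p&gt;The GDP/capita example can be sketched as follows. All figures, country codes, and exchange rates below are invented for illustration (rates are expressed as units of national currency per euro), and the function name is hypothetical. The sketch only shows the order of the steps: convert to euros first, then divide by population, and refuse to guess when an input is missing.&lt;/p&gt;

```python
# Hypothetical inputs: GDP in national currency and population per country.
records = [
    {"geo": "AA", "currency": "USD", "gdp": 1250.0, "population": 100},
    {"geo": "BB", "currency": "CZK", "gdp": 50000.0, "population": 200},
    {"geo": "CC", "currency": "EUR", "gdp": 900.0, "population": 0},
]

# Hypothetical exchange rates: units of national currency per one euro.
per_euro = {"EUR": 1.0, "USD": 1.25, "CZK": 25.0}


def gdp_per_capita_eur(record, rates):
    """Convert GDP to euros, then divide by population.

    Returns None when the exchange rate or the population is unavailable,
    so that gaps stay visible instead of producing a misleading number.
    """
    rate = rates.get(record["currency"])
    if rate is None or not record["population"]:
        return None
    return record["gdp"] / rate / record["population"]


results = {r["geo"]: gdp_per_capita_eur(r, per_euro) for r in records}
print(results)
```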
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; The tidy data format means that the data has a uniform and clear data structure and semantics, therefore it can be automatically validated for many common errors and can be automatically documented by either our software or any other professional data science application. It is not as strict as the schema for a relational database, but it is strict enough to make, among other things, importing into a database easy.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; The descriptive metadata contains information on how to find the data, access the data, join it with other data (interoperability) and use it, and reuse it, even years from now. Among others, it contains file format information and intellectual property rights information.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; The processing metadata makes the data usable in strictly regulated professional environments, such as in public administration, law firms, investment consultancies, or in scientific research. We give you the entire processing history of the data, which makes peer-review or external audit much easier and cheaper.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; The authoritative copy is held at an independent repository; it has a globally unique identifier that protects you from accidental data loss and from mixing it up with unfinished and untested versions.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;td style=&#34;text-align: center;&#34;&gt;















&lt;figure  id=&#34;figure-cutting-the-dataset-to-a-format-with-clear-semantics-and-documenting-it-with-the-fair-metadata-concep-exponentially-increases-the-value-of-data-it-can-be-publisehd-or-sold-at-a-premium-photo-andere-andrehttpscommonswikimediaorgwindexphpcurid4770037&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;Cutting the dataset to a format with clear semantics and documenting it with the FAIR metadata concept exponentially increases the value of data. It can be published or sold at a premium. Photo: [Andere Andre](https://commons.wikimedia.org/w/index.php?curid=4770037).&#34; srcset=&#34;
               /media/img/gems/Diamond_Polisher_hu2b5ca0e8d1290dc6b290d6b4669a6259_449722_27278366bdb30735ec3edb5dd68ce37b.webp 400w,
               /media/img/gems/Diamond_Polisher_hu2b5ca0e8d1290dc6b290d6b4669a6259_449722_2022c9c74076769b68c8f788b6835f99.webp 760w,
               /media/img/gems/Diamond_Polisher_hu2b5ca0e8d1290dc6b290d6b4669a6259_449722_1200x1200_fit_q75_h2_lanczos.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/gems/Diamond_Polisher_hu2b5ca0e8d1290dc6b290d6b4669a6259_449722_27278366bdb30735ec3edb5dd68ce37b.webp&#34;
               width=&#34;760&#34;
               height=&#34;506&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption&gt;
      Cutting the dataset to a format with clear semantics and documenting it with the FAIR metadata concept exponentially increases the value of data. It can be published or sold at a premium. Photo: &lt;a href=&#34;https://commons.wikimedia.org/w/index.php?curid=4770037&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Andere Andre&lt;/a&gt;.
    &lt;/figcaption&gt;&lt;/figure&gt;&lt;/td&gt;
&lt;p&gt;While humans are much better at analysing the information and human agency is required for trustworthy AI, computers are much better at processing and documenting data.  We apply three important concepts to our data service: we always process the data to the tidy format, we create an authoritative copy, and we always automatically add descriptive and processing metadata.&lt;/p&gt;
&lt;h2 id=&#34;value-of-metadata&#34;&gt;The value of metadata&lt;/h2&gt;
&lt;p&gt;Metadata is often more valuable and more costly to make than the data itself, yet it remains an elusive concept for senior or financial management.  Metadata is information about how to correctly use the data and has no value without the data itself.  Data acquisition, such as buying from a data vendor, or paying an opinion polling company, or external data consultants appears among the material costs, but metadata is never sold alone, and you do not see its cost.&lt;/p&gt;
&lt;p&gt;In most cases, the reason why &lt;a href=&#34;https://dataandlyrics.com/post/2021-06-18-gold-without-rush/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;there is no gold rush for open data&lt;/a&gt; is the fact that, while the EU member states annually release billions of euros&amp;rsquo; worth of data for free or at very low cost, it comes without proper metadata.&lt;/p&gt;
&lt;td style=&#34;text-align: center;&#34;&gt;















&lt;figure  id=&#34;figure-data-as-serviceservicesdata-as-servicereusable-legal-easy-to-import-interoperable-always-fresh-data-in-tidy-formats-with-a-modern-api-photo-edgar-sotohttpsunsplashcomphotosgb0bzgae1nk&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;[Data-as-Service](/services/data-as-service/)Reusable, legal, easy-to-import, interoperable, always fresh data in tidy formats with a modern API. Photo: [Edgar Soto](https://unsplash.com/photos/gb0BZGae1Nk).&#34; srcset=&#34;
               /media/img/gems/edgar-soto-gb0BZGae1Nk-unsplash_hu885793c483f74753314f6c800c67a06f_204775_81b97d34c1ccb0eb3994b312d0747e63.webp 400w,
               /media/img/gems/edgar-soto-gb0BZGae1Nk-unsplash_hu885793c483f74753314f6c800c67a06f_204775_b3ddf8e86873a66ce16e8636fadc3357.webp 760w,
               /media/img/gems/edgar-soto-gb0BZGae1Nk-unsplash_hu885793c483f74753314f6c800c67a06f_204775_1200x1200_fit_q75_h2_lanczos.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/gems/edgar-soto-gb0BZGae1Nk-unsplash_hu885793c483f74753314f6c800c67a06f_204775_81b97d34c1ccb0eb3994b312d0747e63.webp&#34;
               width=&#34;760&#34;
               height=&#34;506&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption&gt;
      &lt;a href=&#34;https://greendeal.dataobservatory.eu/services/data-as-service/&#34;&gt;Data-as-Service&lt;/a&gt;&lt;/br&gt;&lt;/br&gt;Reusable, legal, easy-to-import, interoperable, always fresh data in tidy formats with a modern API. Photo: &lt;a href=&#34;https://unsplash.com/photos/gb0BZGae1Nk&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Edgar Soto&lt;/a&gt;.
    &lt;/figcaption&gt;&lt;/figure&gt;&lt;/td&gt;
&lt;p&gt;If the data source is cheap or of low quality, you do not even get the metadata.  If you do not have it, it will show up as a human resource cost in research (when your analysts or junior researchers spend countless hours finding out the missing metadata information on the correct use of the data) or in sales costs (when you try to reuse a research, consulting or legal product and you have to comb through your archive and retest elements again and again).&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;input checked=&#34;&#34; disabled=&#34;&#34; type=&#34;checkbox&#34;&gt; The data, together with the descriptive and administrative metadata, and links to the use license and the authoritative copy can be found in our API. Try it out!&lt;/li&gt;
&lt;/ul&gt;
</description>
    </item>
    
    <item>
      <title>Including Indicators from Arab Barometer in Our Observatory</title>
      <link>https://greendeal.dataobservatory.eu/post/2021-06-28-arabbarometer/</link>
      <pubDate>Mon, 28 Jun 2021 09:00:00 +0000</pubDate>
      <guid>https://greendeal.dataobservatory.eu/post/2021-06-28-arabbarometer/</guid>
      <description>&lt;p&gt;&lt;em&gt;A new version of the retroharmonize R package – which supports the retrospective, ex post harmonization of survey data – was released yesterday on CRAN after peer-review. It allows us to compare opinion polling data from the Arab Barometer with the Eurobarometer and Afrobarometer. This is the first version that is released in the rOpenGov community, a community of R package developers on open government data analytics and related topics.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;Surveys are the most important data sources in social and economic
statistics – they ask people about their lives, their attitudes and
self-reported actions, or record data from companies and NGOs. Survey
harmonization makes survey data comparable across time and countries. It
is very important, because often we do not know without comparison if an
indicator value is &lt;em&gt;low&lt;/em&gt; or &lt;em&gt;high&lt;/em&gt;. If 40% of the people think that
&lt;em&gt;climate change is a very serious problem&lt;/em&gt;, it does not really tell us
much without knowing what percentage of the people answered this
question similarly a year ago, or in other parts of the world.&lt;/p&gt;
&lt;p&gt;With the help of Ahmed Shabani and Yousef Ibrahim, we created a third
case study after the
&lt;a href=&#34;https://retroharmonize.dataobservatory.eu/articles/eurobarometer.html&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Eurobarometer&lt;/a&gt;,
and
&lt;a href=&#34;https://retroharmonize.dataobservatory.eu/articles/afrobarometer.html&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Afrobarometer&lt;/a&gt;,
about working with the &lt;a href=&#34;https://retroharmonize.dataobservatory.eu/articles/arabbarometer.html&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Arab
Barometer&lt;/a&gt;
harmonized survey data files.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Ex ante&lt;/em&gt; survey harmonization means that researchers design
questionnaires that ask the same questions with the same survey
methodology at repeated, distinct times (waves), or across different
countries with carefully harmonized question translations. &lt;em&gt;Ex post&lt;/em&gt;
harmonization means that the resulting data has the same variable
names and the same variable coding, and can be joined into a tidy data frame
for joint statistical analysis. While seemingly a simple task, it
involves plenty of metadata adjustments, because established survey
programs like Eurobarometer, Afrobarometer or Arab Barometer have
several decades of history, and several decades of coding practices and
file formatting legacy.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;&lt;em&gt;Variable harmonization&lt;/em&gt; means that if the same question is called
&lt;code&gt;Q108&lt;/code&gt; in one microdata source and &lt;code&gt;eval-parl-elections&lt;/code&gt; in the other,
then we make sure that they get a harmonized, machine-readable
name without spaces and special characters.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Variable label harmonization&lt;/em&gt; means that the same questionnaire
items get the same numeric coding and same categorical labels.&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Missing case harmonization&lt;/em&gt; means that various forms of missingness
are treated the same way.&lt;/li&gt;
&lt;/ul&gt;
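&lt;p&gt;Variable name harmonization from the list above can be illustrated with a short Python sketch. It is not the retroharmonize implementation (which is an R package); the helper &lt;code&gt;harmonize_name&lt;/code&gt; and the crosswalk dictionary are assumptions made for this example.&lt;/p&gt;

```python
import re


def harmonize_name(raw_name):
    """Lower-case a variable name and replace every run of characters that
    are not ASCII letters or digits with a single underscore."""
    cleaned = re.sub(r"[^0-9a-z]+", "_", raw_name.strip().lower())
    return cleaned.strip("_")


# Hypothetical crosswalk: the name used in each source file, mapped to the
# concept it measures. Both sources end up with the same harmonized name.
crosswalk = {
    "Q108": "eval-parl-elections",
    "eval-parl-elections": "eval-parl-elections",
}

harmonized = {source: harmonize_name(concept) for source, concept in crosswalk.items()}
print(harmonized)
```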
















&lt;figure  id=&#34;figure-for-the-climate-awareness-dataset-get-the-country-averages-and-aggregates-from-zenodohttpsdoiorg105281zenodo5035562-and-the-plot-in-jpg-or-png-from-figsharehttpsdoiorg106084m9figshare14854359&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;For the climate awareness dataset get the country averages and aggregates from [Zenodo](https://doi.org/10.5281/zenodo.5035562), and the plot in `jpg` or `png` from [figshare](https://doi.org/10.6084/m9.figshare.14854359).&#34; srcset=&#34;
               /media/img/blogposts_2021/arab_barometer_5_climate_change_by_country_hu8dd9da8add5270829a1e50ead6a6a120_38791_1bab40489e5820c07250b277ffe362e0.webp 400w,
               /media/img/blogposts_2021/arab_barometer_5_climate_change_by_country_hu8dd9da8add5270829a1e50ead6a6a120_38791_fd825f05348e751021206419bd01c763.webp 760w,
               /media/img/blogposts_2021/arab_barometer_5_climate_change_by_country_hu8dd9da8add5270829a1e50ead6a6a120_38791_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/blogposts_2021/arab_barometer_5_climate_change_by_country_hu8dd9da8add5270829a1e50ead6a6a120_38791_1bab40489e5820c07250b277ffe362e0.webp&#34;
               width=&#34;760&#34;
               height=&#34;570&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption data-pre=&#34;Figure&amp;nbsp;&#34; data-post=&#34;:&amp;nbsp;&#34; class=&#34;numbered&#34;&gt;
      For the climate awareness dataset get the country averages and aggregates from &lt;a href=&#34;https://doi.org/10.5281/zenodo.5035562&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Zenodo&lt;/a&gt;, and the plot in &lt;code&gt;jpg&lt;/code&gt; or &lt;code&gt;png&lt;/code&gt; from &lt;a href=&#34;https://doi.org/10.6084/m9.figshare.14854359&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;figshare&lt;/a&gt;.
    &lt;/figcaption&gt;&lt;/figure&gt;
&lt;p&gt;In our new &lt;a href=&#34;https://retroharmonize.dataobservatory.eu/articles/arabbarometer.html&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Arab Barometer case
study&lt;/a&gt;,
the evaluation of parliamentary elections appears with the labels listed in the table below. We
code them consistently as &lt;code&gt;1:  free_and_fair&lt;/code&gt;, &lt;code&gt;2:  some_minor_problems&lt;/code&gt;,
&lt;code&gt;3:  some_major_problems&lt;/code&gt; and &lt;code&gt;4:  not_free&lt;/code&gt;.&lt;/p&gt;
&lt;table&gt;
&lt;colgroup&gt;
&lt;col style=&#34;width: 50%&#34; /&gt;
&lt;col style=&#34;width: 50%&#34; /&gt;
&lt;/colgroup&gt;
&lt;tbody&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td style=&#34;text-align: left;&#34;&gt;“0. missing”&lt;/td&gt;
&lt;td style=&#34;text-align: left;&#34;&gt;“1. they were completely free and fair”&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td style=&#34;text-align: left;&#34;&gt;“2. they were free and fair, with some minor problems”&lt;/td&gt;
&lt;td style=&#34;text-align: left;&#34;&gt;“3. they were free and fair, with some major problems”&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td style=&#34;text-align: left;&#34;&gt;“4. they were not free and fair”&lt;/td&gt;
&lt;td style=&#34;text-align: left;&#34;&gt;“8. i don’t know”&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td style=&#34;text-align: left;&#34;&gt;“9. declined to answer”&lt;/td&gt;
&lt;td style=&#34;text-align: left;&#34;&gt;“Missing”&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td style=&#34;text-align: left;&#34;&gt;“They were completely free and fair”&lt;/td&gt;
&lt;td style=&#34;text-align: left;&#34;&gt;“They were free and fair, with some minor breaches”&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td style=&#34;text-align: left;&#34;&gt;“They were free and fair, with some major breaches”&lt;/td&gt;
&lt;td style=&#34;text-align: left;&#34;&gt;“They were not free and fair”&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td style=&#34;text-align: left;&#34;&gt;“Don’t know”&lt;/td&gt;
&lt;td style=&#34;text-align: left;&#34;&gt;“Refuse”&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td style=&#34;text-align: left;&#34;&gt;“Completely free and fair”&lt;/td&gt;
&lt;td style=&#34;text-align: left;&#34;&gt;“Free and fair, but with minor problems”&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;odd&#34;&gt;
&lt;td style=&#34;text-align: left;&#34;&gt;“Free and fair, with major problems”&lt;/td&gt;
&lt;td style=&#34;text-align: left;&#34;&gt;“Not free or fair”&lt;/td&gt;
&lt;/tr&gt;
&lt;tr class=&#34;even&#34;&gt;
&lt;td style=&#34;text-align: left;&#34;&gt;“Don’t know (Do not read)”&lt;/td&gt;
&lt;td style=&#34;text-align: left;&#34;&gt;“Decline to answer (Do not read)”&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;
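&lt;p&gt;A rough sketch of how the raw labels in the table could be mapped to the four harmonized codes. The keyword rules and the names used for the missing categories (&lt;code&gt;do_not_know&lt;/code&gt;, &lt;code&gt;declined&lt;/code&gt;, &lt;code&gt;missing&lt;/code&gt;) are assumptions made for this illustration; the actual case study uses the retroharmonize R package.&lt;/p&gt;

```python
import re


def harmonize_label(raw_label):
    """Return a (code, label) pair for one raw answer label.

    Valid answers get the codes 1 to 4; every form of missingness gets
    the code None and a label that says why the answer is missing.
    """
    # Drop a leading numeric prefix such as "2. " and ignore letter case.
    text = re.sub(r"^\d+\.\s*", "", raw_label.strip().lower())
    if "not free" in text:
        return 4, "not_free"
    if "minor" in text:
        return 2, "some_minor_problems"
    if "major" in text:
        return 3, "some_major_problems"
    if "completely free" in text:
        return 1, "free_and_fair"
    if "know" in text:
        return None, "do_not_know"
    if "declin" in text or "refuse" in text:
        return None, "declined"
    return None, "missing"


raw_labels = [
    "1. they were completely free and fair",
    "They were free and fair, with some minor breaches",
    "Free and fair, with major problems",
    "Not free or fair",
    "Don’t know (Do not read)",
    "9. declined to answer",
    "0. missing",
]
for label in raw_labels:
    print(label, harmonize_label(label))
```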
&lt;p&gt;Of course, this harmonization is essential to get clean results like this:&lt;/p&gt;
















&lt;figure  id=&#34;figure-for-evaluation-or-reuse-of-parliamentary-elections-dataset-get-the-replication-data-and-the-code-from-the-zenodohhttpsdoiorg105281zenodo5034759-open-repository&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;For evaluation or reuse of the parliamentary elections dataset, get the replication data and the code from the [Zenodo](https://doi.org/10.5281/zenodo.5034759) open repository.&#34; srcset=&#34;
               /media/img/blogposts_2021/arabb-comparison-country-chart_hu876e56138097bf35e9ab80c0a7351314_159521_30b9d9bccbe8f347c912dbe10ef5159c.webp 400w,
               /media/img/blogposts_2021/arabb-comparison-country-chart_hu876e56138097bf35e9ab80c0a7351314_159521_f7e62366b8310160e9cdd16714a5ac44.webp 760w,
               /media/img/blogposts_2021/arabb-comparison-country-chart_hu876e56138097bf35e9ab80c0a7351314_159521_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/blogposts_2021/arabb-comparison-country-chart_hu876e56138097bf35e9ab80c0a7351314_159521_30b9d9bccbe8f347c912dbe10ef5159c.webp&#34;
               width=&#34;506&#34;
               height=&#34;760&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption data-pre=&#34;Figure&amp;nbsp;&#34; data-post=&#34;:&amp;nbsp;&#34; class=&#34;numbered&#34;&gt;
      For evaluation or reuse of the parliamentary elections dataset, get the replication data and the code from the &lt;a href=&#34;https://doi.org/10.5281/zenodo.5034759&#34;&gt;Zenodo&lt;/a&gt; open repository.
    &lt;/figcaption&gt;&lt;/figure&gt;
&lt;p&gt;In our case study, we had three forms of missingness: the respondent
&lt;em&gt;did not know&lt;/em&gt; the answer, the respondent &lt;em&gt;did not want&lt;/em&gt; to answer, and
finally, in some cases the &lt;em&gt;respondent was not asked&lt;/em&gt;, because the
country held no parliamentary elections. In numerical processing, for
example when calculating averages, all these answers must be left out,
but in a more detailed, categorical analysis they represent very
different cases. A high level of refusal to answer may in itself be an
indicator that democratic opinion forming is being suppressed.&lt;/p&gt;
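&lt;p&gt;As a minimal base R sketch of this distinction (the answers and labels below are made up for illustration, and the code does not use the &lt;code&gt;retroharmonize&lt;/code&gt; API), the three forms of missingness can stay separate categories in a tabulation while all of them are left out of a numeric average:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Hypothetical answers, for illustration only (not Arab Barometer data)
answers &amp;lt;- c(&amp;quot;free_and_fair&amp;quot;, &amp;quot;not_free_and_fair&amp;quot;, &amp;quot;do_not_know&amp;quot;,
             &amp;quot;declined&amp;quot;, &amp;quot;inapplicable&amp;quot;, &amp;quot;free_and_fair&amp;quot;)

# Categorical view: each form of missingness remains its own category
table(answers)

# Numeric view: only valid answers get a value; every missing form becomes NA
valid_codes &amp;lt;- c(free_and_fair = 1, not_free_and_fair = 0)
numeric_answers &amp;lt;- unname(valid_codes[answers])
mean(numeric_answers, na.rm = TRUE)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Keeping the labels until the last step means the same responses can feed both the averages and a categorical breakdown of why answers are missing.&lt;/p&gt;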
&lt;p&gt;Survey harmonization with many countries entails tens of thousands of
small data management tasks, which, unless automatically documented,
logged, and created with reproducible code, make for a hopelessly
error-prone process. We believe that our open-source software will bring
to light much new statistical information which, while legally
open, was never processed because of the large investment needed.&lt;/p&gt;
&lt;p&gt;We also started building experimental APIs that serve data produced by running
&lt;a href=&#34;https://retroharmonize.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;retroharmonize&lt;/a&gt; regularly.
We will place cultural access and participation data in the &lt;a href=&#34;https://music.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Digital
Music Observatory&lt;/a&gt;, climate
awareness, policy support and self-reported mitigation strategies into
the &lt;a href=&#34;https://greendeal.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Green Deal Data
Observatory&lt;/a&gt;, and economy and
well-being data into our &lt;a href=&#34;https://economy.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Economy Data
Observatory&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&#34;further-plans&#34;&gt;Further plans&lt;/h2&gt;
&lt;p&gt;Retrospective survey harmonization is a far more complex task than this
blogpost suggests, because established survey programs have gathered decades of legacy data in legacy coding schemes and legacy file formats. Putting the data right, and especially putting the invaluable descriptive and administrative (processing) metadata right, is a huge undertaking. We are releasing example code, datasets and charts
for researchers to compare our harmonized results with theirs, and to
improve our software.&lt;/p&gt;
&lt;h3 id=&#34;use-our-software&#34;&gt;Use our software&lt;/h3&gt;
&lt;p&gt;The &lt;code&gt;retroharmonize&lt;/code&gt; R package can be freely used, modified and
distributed under the GPL-3 license. For the main developer and
contributors, see the
&lt;a href=&#34;https://retroharmonize.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;package&lt;/a&gt; homepage. If you
use it for your work, please kindly cite it as:&lt;/p&gt;
&lt;p&gt;Daniel Antal (2021). retroharmonize: Ex Post Survey Data Harmonization.
R package version 0.1.17. &lt;a href=&#34;https://doi.org/10.5281/zenodo.5034752&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;https://doi.org/10.5281/zenodo.5034752&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Download the &lt;a href=&#34;https://greendeal.dataobservatory.eu/media/bibliography/cite-retroharmonize.bib&#34; target=&#34;_blank&#34;&gt;BibLaTeX entry&lt;/a&gt;.&lt;/p&gt;
&lt;h3 id=&#34;tutorial-to-work-with-the-arab-barometer-survey-data&#34;&gt;Tutorial to work with the Arab Barometer survey data&lt;/h3&gt;
&lt;p&gt;Daniel Antal, &amp;amp; Ahmed Shaibani. (2021, June 26). Case Study: Working
With Arab Barometer Surveys for the retroharmonize R package (Version
0.1.6). Zenodo. &lt;a href=&#34;https://doi.org/10.5281/zenodo.5034759&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;https://doi.org/10.5281/zenodo.5034759&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Use the replication data to report potential
&lt;a href=&#34;https://github.com/rOpenGov/retroharmonize/issues&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;issues&lt;/a&gt; and
to suggest improvements to the code:&lt;/p&gt;
&lt;p&gt;Daniel Antal, &amp;amp; Ahmed Shaibani. (2021). Replication Data for the
retroharmonize R Package Case Study: Working With Arab Barometer Surveys
(Version 0.1.6) [Data set]. Zenodo.
&lt;a href=&#34;https://doi.org/10.5281/zenodo.5034741&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;https://doi.org/10.5281/zenodo.5034741&lt;/a&gt;&lt;/p&gt;
&lt;h3 id=&#34;experimental-api&#34;&gt;Experimental API&lt;/h3&gt;
&lt;p&gt;We are also experimenting with the automated placement of authoritative
and citeable figures and datasets in open repositories. For the climate
awareness dataset get the country averages and aggregates from
&lt;a href=&#34;https://doi.org/10.5281/zenodo.5035562&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Zenodo&lt;/a&gt;, and the plot in &lt;code&gt;jpg&lt;/code&gt;
or &lt;code&gt;png&lt;/code&gt; from &lt;a href=&#34;https://doi.org/10.6084/m9.figshare.14854359&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;figshare&lt;/a&gt;.
Our plan is to release open data in a modern API with rich descriptive
metadata meeting the &lt;em&gt;Dublin Core&lt;/em&gt; and &lt;em&gt;DataCite&lt;/em&gt; standards, and further
administrative metadata for correct coding, joining and further
manipulating our data, or for easy import into your database.&lt;/p&gt;
&lt;h3 id=&#34;join-our-open-source-effort&#34;&gt;Join our open source effort&lt;/h3&gt;
&lt;p&gt;Want to help us improve our open data service, or to include
&lt;a href=&#34;https://www.latinobarometro.org/lat.jsp&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Latinobarómetro&lt;/a&gt; and the
&lt;a href=&#34;https://caucasusbarometer.org/en/datasets/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Caucasus Barometer&lt;/a&gt; in our
offering? Join the rOpenGov community of R package developers, and our
open collaboration to create the automated data observatories. We are
not only looking for
&lt;a href=&#34;https://greendeal.dataobservatory.eu/authors/developer/&#34;&gt;developers&lt;/a&gt;,
but &lt;a href=&#34;https://greendeal.dataobservatory.eu/authors/curator/&#34;&gt;data
curators&lt;/a&gt; and
&lt;a href=&#34;https://greendeal.dataobservatory.eu/authors/team/&#34;&gt;service design
associates&lt;/a&gt;, too.&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Open Data - The New Gold Without the Rush</title>
      <link>https://greendeal.dataobservatory.eu/post/2021-06-18-gold-without-rush/</link>
      <pubDate>Fri, 18 Jun 2021 17:00:00 +0000</pubDate>
      <guid>https://greendeal.dataobservatory.eu/post/2021-06-18-gold-without-rush/</guid>
      <description>&lt;p&gt;&lt;em&gt;If open data is the new gold, why do even those who release it fail to reuse it? We created an open collaboration of data curators and open-source developers to dig into novel open data sources and/or increase the usability of existing ones. We transform reproducible research software into research-as-service.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;Every year, the EU announces that billions and billions of data are now “open” again, but this is not gold, at least not in the form of nicely minted gold coins: it is gold dust and nuggets found in the muddy banks of chilly rivers. There is no rush for it, because panning out its value requires a lot of hours of hard work. Our goal is to automate this work to make open data usable at scale, even in trustworthy AI solutions.&lt;/p&gt;
















&lt;figure  id=&#34;figure-there-is-no-rush-for-it-because-panning-out-its-value-requires-a-lot-of-hours-of-hard-work-our-goal-is-to-automate-this-work-to-make-open-data-usable-at-scale-even-in-trustworthy-ai-solutions&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;There is no rush for it, because panning out its value requires a lot of hours of hard work. Our goal is to automate this work to make open data usable at scale, even in trustworthy AI solutions.&#34; srcset=&#34;
               /media/img/slides/gold_panning_slide_notitle_hu8f7296f20da8c17f972a0534c44322c2_1382486_b042523dffe8143dea3d8c8c9c3262f4.webp 400w,
               /media/img/slides/gold_panning_slide_notitle_hu8f7296f20da8c17f972a0534c44322c2_1382486_faa00e96d3d0b700cfcf1daa513f3ad2.webp 760w,
               /media/img/slides/gold_panning_slide_notitle_hu8f7296f20da8c17f972a0534c44322c2_1382486_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/slides/gold_panning_slide_notitle_hu8f7296f20da8c17f972a0534c44322c2_1382486_b042523dffe8143dea3d8c8c9c3262f4.webp&#34;
               width=&#34;760&#34;
               height=&#34;428&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption data-pre=&#34;Figure&amp;nbsp;&#34; data-post=&#34;:&amp;nbsp;&#34; class=&#34;numbered&#34;&gt;
      There is no rush for it, because panning out its value requires a lot of hours of hard work. Our goal is to automate this work to make open data usable at scale, even in trustworthy AI solutions.
    &lt;/figcaption&gt;&lt;/figure&gt;
&lt;p&gt;Most open data is not public: it is not downloadable from the Internet. In EU parlance, “open” only means a legal entitlement to get access to it. And even in the rare cases when data is both open and public, it is often marred by data quality issues. We are working on the prototypes of a data-as-service and a research-as-service built with open-source statistical software that taps into various and often neglected open data sources.&lt;/p&gt;
&lt;p&gt;We are in the prototype phase in June, and we intend to have a well-functioning service by the time of the conference. Because we are working only with open-source software elements, our technological readiness level is already very high. The novelty of our process is that we are trying to further develop and integrate a few open-source technology items into technologically and financially sustainable data-as-service and even research-as-service solutions.&lt;/p&gt;
















&lt;figure  id=&#34;figure-our-review-of-about-80-eu-un-and-oecd-data-observatories-reveals-that-most-of-them-do-not-use-these-organizationss-open-data---instead-they-use-various-and-often-not-well-processed-proprietary-sources&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;Our review of about 80 EU, UN and OECD data observatories reveals that most of them do not use these organizations&amp;#39; open data - instead they use various, and often not well processed proprietary sources.&#34; srcset=&#34;
               /media/img/observatory_screenshots/observatory_collage_16x9_800_hu47f74f5cdae63c7248c2367b9d148671_353025_0079ea9844f6c5e52b52fd0e627467a2.webp 400w,
               /media/img/observatory_screenshots/observatory_collage_16x9_800_hu47f74f5cdae63c7248c2367b9d148671_353025_ecd6d08ba5e9bac19c8173546f036651.webp 760w,
               /media/img/observatory_screenshots/observatory_collage_16x9_800_hu47f74f5cdae63c7248c2367b9d148671_353025_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/observatory_screenshots/observatory_collage_16x9_800_hu47f74f5cdae63c7248c2367b9d148671_353025_0079ea9844f6c5e52b52fd0e627467a2.webp&#34;
               width=&#34;760&#34;
               height=&#34;428&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption data-pre=&#34;Figure&amp;nbsp;&#34; data-post=&#34;:&amp;nbsp;&#34; class=&#34;numbered&#34;&gt;
      Our review of about 80 EU, UN and OECD data observatories reveals that most of them do not use these organizations&amp;rsquo; open data - instead they use various, and often not well processed proprietary sources.
    &lt;/figcaption&gt;&lt;/figure&gt;
&lt;p&gt;We are taking a new and modern approach to the &lt;code&gt;data observatory&lt;/code&gt; concept, and modernizing it with the application of 21st century data and metadata standards, the new results of reproducible research and data science. Various UN and OECD bodies, and particularly the European Union, support or maintain more than 60 data observatories, or permanent data collection and dissemination points, but even these do not use the open data of these organizations and their members. We are building open-source data observatories, which run open-source statistical software that automatically processes and documents reusable public sector data (from public transport, meteorology, tax offices, taxpayer funded satellite systems, etc.) and reusable scientific data (from EU taxpayer funded research) into new, high quality statistical indicators.&lt;/p&gt;
















&lt;figure  id=&#34;figure-we-are-taking-a-new-and-modern-approach-to-the-data-observatory-concept-and-modernizing-it-with-the-application-of-21st-century-data-and-metadata-standards-the-new-results-of-reproducible-research-and-data-science&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;We are taking a new and modern approach to the ‘data observatory’ concept, and modernizing it with the application of 21st century data and metadata standards, the new results of reproducible research and data science&#34; srcset=&#34;
               /media/img/slides/automated_observatory_value_chain_huf9c0a6d9b150a8fdeb42cadf99abee90_616274_c18a97f00bbcac322614b6c2d55783f6.webp 400w,
               /media/img/slides/automated_observatory_value_chain_huf9c0a6d9b150a8fdeb42cadf99abee90_616274_8b655e803b41b817a8093a37ccd19689.webp 760w,
               /media/img/slides/automated_observatory_value_chain_huf9c0a6d9b150a8fdeb42cadf99abee90_616274_1200x1200_fit_q75_h2_lanczos.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/slides/automated_observatory_value_chain_huf9c0a6d9b150a8fdeb42cadf99abee90_616274_c18a97f00bbcac322614b6c2d55783f6.webp&#34;
               width=&#34;760&#34;
               height=&#34;428&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption data-pre=&#34;Figure&amp;nbsp;&#34; data-post=&#34;:&amp;nbsp;&#34; class=&#34;numbered&#34;&gt;
      We are taking a new and modern approach to the ‘data observatory’ concept, and modernizing it with the application of 21st century data and metadata standards, the new results of reproducible research and data science
    &lt;/figcaption&gt;&lt;/figure&gt;
&lt;ul&gt;
&lt;li&gt;We are building various open-source data collection tools in R and Python to bring up data from big data APIs and legally open, but not public, and not well served data sources. For example, we are working on capturing representative data from the Spotify API or creating harmonized datasets from the Eurobarometer and Afrobarometer survey programs.&lt;/li&gt;
&lt;li&gt;Open data is usually not public; whatever is legally accessible is usually not ready to use for commercial or scientific purposes. In Europe, almost all taxpayer funded data is legally open for reuse, but it is usually stored in heterogeneous formats, processed for an original governmental or scientific need, and documented to varying and often low standards. Our expert data curators are looking for new data sources that should be (re-)processed and re-documented to be usable for a wider community. We would like to introduce our service flow, which touches upon many important aspects of data scientist, data engineer and data curatorial work.&lt;/li&gt;
&lt;li&gt;We believe that even such generally trusted data sources as Eurostat often need to be reprocessed, because various legal and political constraints do not allow the common European statistical services to provide optimal quality data – for example, on the regional and city levels.&lt;/li&gt;
&lt;li&gt;With &lt;a href=&#34;https://greendeal.dataobservatory.eu/authors/ropengov/&#34;&gt;rOpenGov&lt;/a&gt; and other partners, we are creating open-source statistical software in R to re-process these heterogeneous and low-quality data into tidy statistical indicators, and to automatically validate and document them.&lt;/li&gt;
&lt;li&gt;We are carefully documenting and releasing administrative, processing, and descriptive metadata, following international metadata standards, to make our data easy to find and easy to use for data analysts.&lt;/li&gt;
&lt;li&gt;We are automatically creating depositions and authoritative copies marked with an individual digital object identifier (DOI) to maintain data integrity.&lt;/li&gt;
&lt;li&gt;We are building simple databases and supporting APIs that release the data without restrictions, in a tidy format that is easy to join with other data, or easy to join into databases, together with standardized metadata.&lt;/li&gt;
&lt;li&gt;We maintain observatory websites (see: &lt;a href=&#34;https://music.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Digital Music Observatory&lt;/a&gt;, &lt;a href=&#34;https://greendeal.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Green Deal Data Observatory&lt;/a&gt;, &lt;a href=&#34;https://economy.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Economy Data Observatory&lt;/a&gt;) where we not only make the data available, but also provide tutorials and use cases that make it easier to use. Our mission is to show a modern, 21st century reimagination of the data observatory concept developed and supported by the UN, EU and OECD, and we want to show that modern reproducible research and open data could make the existing 60 data observatories and the planned new ones grow faster into data ecosystems.&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;We build our work around the open collaboration concept, which is well-known in open source software development and reproducible science, but we try to make this agile project management methodology more inclusive by bringing data curators and various institutional partners into this approach. Based around our early-stage startup, Reprex, and the open-source developer community rOpenGov, we are working together with other developers, data scientists, and domain specific data experts in climate change and mitigation, antitrust and innovation policies, and various aspects of the music and film industry.&lt;/p&gt;
















&lt;figure  id=&#34;figure-our-open-collaboration-is-truly-open-new-data-curatorsauthorscuratordevelopersauthorsdeveloper-and-service-designersauthorsteam-even-volunteers-and-citizen-scientists-are-welcome-to-join&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;Our open collaboration is truly open: new [data curators](/authors/curator/),[developers](/authors/developer/) and [service designers](/authors/team/), even volunteers and citizen scientists are welcome to join.&#34; srcset=&#34;
               /media/img/observatory_screenshots/dmo_contributors_hua4f41ef7327b64bb97f169af135070bd_140729_a07a8e618fa7317f6f8256b9a334262e.webp 400w,
               /media/img/observatory_screenshots/dmo_contributors_hua4f41ef7327b64bb97f169af135070bd_140729_3a4ae7f72478fd880961b08e1f7075dd.webp 760w,
               /media/img/observatory_screenshots/dmo_contributors_hua4f41ef7327b64bb97f169af135070bd_140729_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/observatory_screenshots/dmo_contributors_hua4f41ef7327b64bb97f169af135070bd_140729_a07a8e618fa7317f6f8256b9a334262e.webp&#34;
               width=&#34;760&#34;
               height=&#34;427&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption data-pre=&#34;Figure&amp;nbsp;&#34; data-post=&#34;:&amp;nbsp;&#34; class=&#34;numbered&#34;&gt;
      Our open collaboration is truly open: new &lt;a href=&#34;https://greendeal.dataobservatory.eu/authors/curator/&#34;&gt;data curators&lt;/a&gt;, &lt;a href=&#34;https://greendeal.dataobservatory.eu/authors/developer/&#34;&gt;developers&lt;/a&gt; and &lt;a href=&#34;https://greendeal.dataobservatory.eu/authors/team/&#34;&gt;service designers&lt;/a&gt;, even volunteers and citizen scientists are welcome to join.
    &lt;/figcaption&gt;&lt;/figure&gt;
&lt;p&gt;Our open collaboration is truly open: new &lt;a href=&#34;https://greendeal.dataobservatory.eu/authors/curator/&#34;&gt;data curators&lt;/a&gt;, data scientists and data engineers are welcome to join. We develop open-source software in an agile way, so you can join in with intermediate programming skills to build unit tests or add new functionality, and if you are a beginner, you can start with documentation and testing our tutorials. For business, policy, and scientific data analysts, we provide unexploited, exciting new datasets. Advanced developers can &lt;a href=&#34;https://greendeal.dataobservatory.eu/authors/developer/&#34;&gt;join&lt;/a&gt; our development team: the statistical data creation is mainly done in the R language, and the service infrastructure is built with Python and Go components.&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Analyze Locally, Act Globally: New regions R Package Release</title>
      <link>https://greendeal.dataobservatory.eu/post/2021-06-16-regions-release/</link>
      <pubDate>Wed, 16 Jun 2021 12:00:00 +0000</pubDate>
      <guid>https://greendeal.dataobservatory.eu/post/2021-06-16-regions-release/</guid>
      <description>















&lt;figure  &gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;&#34; srcset=&#34;
               /media/img/package_screenshots/regions_017_169_hu4c6da2626fe9335e12d5da3506258dd2_123607_1aeab2d63a062640baf35ce7ffff4b52.webp 400w,
               /media/img/package_screenshots/regions_017_169_hu4c6da2626fe9335e12d5da3506258dd2_123607_340cd90381be5d85c6b08caba8072821.webp 760w,
               /media/img/package_screenshots/regions_017_169_hu4c6da2626fe9335e12d5da3506258dd2_123607_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/package_screenshots/regions_017_169_hu4c6da2626fe9335e12d5da3506258dd2_123607_1aeab2d63a062640baf35ce7ffff4b52.webp&#34;
               width=&#34;760&#34;
               height=&#34;427&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;/figure&gt;
&lt;p&gt;The new version of our &lt;a href=&#34;https://ropengov.org/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;rOpenGov&lt;/a&gt; R package
&lt;a href=&#34;https://regions.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;regions&lt;/a&gt; was released today on
CRAN. This package is one of the engines of our experimental open
data-as-service &lt;a href=&#34;https://greendeal.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Green Deal Data Observatory&lt;/a&gt;, &lt;a href=&#34;https://economy.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Economy Data Observatory&lt;/a&gt;, &lt;a href=&#34;https://music.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Digital Music Observatory&lt;/a&gt; prototypes, which aim to
place open data packages into open-source applications.&lt;/p&gt;
&lt;details class=&#34;spoiler &#34;  id=&#34;spoiler-1&#34;&gt;
  &lt;summary&gt;Click to expand table of contents of the post&lt;/summary&gt;
  &lt;p&gt;&lt;details class=&#34;toc-inpage d-print-none  &#34; open&gt;
  &lt;summary class=&#34;font-weight-bold&#34;&gt;Table of Contents&lt;/summary&gt;
  &lt;nav id=&#34;TableOfContents&#34;&gt;
  &lt;ul&gt;
    &lt;li&gt;&lt;a href=&#34;#get-the-package&#34;&gt;Get the Package&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&#34;#join-us&#34;&gt;Join us&lt;/a&gt;&lt;/li&gt;
  &lt;/ul&gt;
&lt;/nav&gt;
&lt;/details&gt;
&lt;/p&gt;
&lt;/details&gt;
&lt;p&gt;In international comparison, the use of nationally aggregated indicators
often has many disadvantages: they exhibit very different levels of
homogeneity, and the data is often very limited in number of observations
for a cross-sectional analysis. When comparing European countries, a few
missing cases can limit the cross-section of countries to around 20
cases, which rules out the use of many analytical methods. Working with
sub-national statistics has many advantages: the similarity of the
aggregation level and the high number of observations can allow more precise
control of model parameters and errors, and the number of observations
grows from 20 to 200-300.&lt;/p&gt;
















&lt;figure  id=&#34;figure-the-change-from-national-to-sub-national-level-comes-with-a-huge-data-processing-price-internal-administrative-boundaries-their-names-codes-codes-change-very-frequently&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;The change from national to sub-national level comes with a huge data processing price: internal administrative boundaries, their names and codes change very frequently.&#34; srcset=&#34;
               /media/img/blogposts_2021/indicator_with_map_hue9f606f6489f63a22f67aeb7e2b3402b_98843_df043b13fb62aa7b45aa15fad51f4229.webp 400w,
               /media/img/blogposts_2021/indicator_with_map_hue9f606f6489f63a22f67aeb7e2b3402b_98843_09a0d6124e334c5f1727420a059512a9.webp 760w,
               /media/img/blogposts_2021/indicator_with_map_hue9f606f6489f63a22f67aeb7e2b3402b_98843_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/blogposts_2021/indicator_with_map_hue9f606f6489f63a22f67aeb7e2b3402b_98843_df043b13fb62aa7b45aa15fad51f4229.webp&#34;
               width=&#34;760&#34;
               height=&#34;428&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption data-pre=&#34;Figure&amp;nbsp;&#34; data-post=&#34;:&amp;nbsp;&#34; class=&#34;numbered&#34;&gt;
      The change from national to sub-national level comes with a huge data processing price: internal administrative boundaries, their names and codes change very frequently.
    &lt;/figcaption&gt;&lt;/figure&gt;
&lt;p&gt;Yet the change from national to sub-national level comes with a huge
data processing price. National boundaries are relatively stable,
with only a handful of changes in each recent decade, because changing
national boundaries requires a more-or-less global consensus. But states
are free to change their internal administrative boundaries, and they do
so often. This means that the names, identification codes
and boundary definitions of sub-national regions change very frequently.
Joining data from different sources and different years can be very
difficult.&lt;/p&gt;
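&lt;p&gt;A minimal base R sketch of the joining problem, using made-up region codes and values (it illustrates the issue only and does not use the &lt;code&gt;regions&lt;/code&gt; package or any real correspondence table):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Made-up codes and values: region XX12 was re-coded to XXB1 between the two years
data_2015 &amp;lt;- data.frame(geo = c(&amp;quot;XX11&amp;quot;, &amp;quot;XX12&amp;quot;), value_2015 = c(10, 20),
                        stringsAsFactors = FALSE)
data_2020 &amp;lt;- data.frame(geo = c(&amp;quot;XX11&amp;quot;, &amp;quot;XXB1&amp;quot;), value_2020 = c(11, 22),
                        stringsAsFactors = FALSE)

# A naive join leaves the re-coded region split across two incomplete rows
naive &amp;lt;- merge(data_2015, data_2020, by = &amp;quot;geo&amp;quot;, all = TRUE)

# A correspondence table (hypothetical here) maps old codes to current ones
old_to_new &amp;lt;- c(XX11 = &amp;quot;XX11&amp;quot;, XX12 = &amp;quot;XXB1&amp;quot;)
data_2015$geo &amp;lt;- unname(old_to_new[data_2015$geo])
harmonized &amp;lt;- merge(data_2015, data_2020, by = &amp;quot;geo&amp;quot;)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In this toy case the naive join yields three rows with missing values, while the re-coded join yields two complete rows. Real boundary changes also include splits and mergers, which a simple one-to-one lookup like this cannot handle.&lt;/p&gt;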
















&lt;figure  id=&#34;figure-our-regions-r-packagehttpsregionsdataobservatoryeu-helps-the-data-processing-validation-and-imputation-of-sub-national-regional-datasets-and-their-coding&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;Our [regions R package](https://regions.dataobservatory.eu/) helps the data processing, validation and imputation of sub-national, regional datasets and their coding.&#34; srcset=&#34;
               /media/img/blogposts_2021/recoded_indicator_with_map_hubda8124fbfd6305eacfd3d4f0fcd06cc_71873_65df57cf4311bb2623535a1a5be044c0.webp 400w,
               /media/img/blogposts_2021/recoded_indicator_with_map_hubda8124fbfd6305eacfd3d4f0fcd06cc_71873_81a53fd42fac7f0c3fe4e1a89d5b7892.webp 760w,
               /media/img/blogposts_2021/recoded_indicator_with_map_hubda8124fbfd6305eacfd3d4f0fcd06cc_71873_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/blogposts_2021/recoded_indicator_with_map_hubda8124fbfd6305eacfd3d4f0fcd06cc_71873_65df57cf4311bb2623535a1a5be044c0.webp&#34;
               width=&#34;760&#34;
               height=&#34;428&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption data-pre=&#34;Figure&amp;nbsp;&#34; data-post=&#34;:&amp;nbsp;&#34; class=&#34;numbered&#34;&gt;
      Our &lt;a href=&#34;https://regions.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;regions R package&lt;/a&gt; helps the data processing, validation and imputation of sub-national, regional datasets and their coding.
    &lt;/figcaption&gt;&lt;/figure&gt;
&lt;p&gt;Switching from a national to a sub-national level of
analysis has numerous advantages, but it comes with a huge price in data
processing, validation and imputation, and the
&lt;a href=&#34;https://regions.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;regions&lt;/a&gt; package aims to help with this
process.&lt;/p&gt;
&lt;p&gt;You can review the problem, and the code that created the two map
comparisons, in the &lt;a href=&#34;https://regions.dataobservatory.eu/articles/maping.html&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Maping Regional Data, Maping Metadata
Problems&lt;/a&gt;
vignette article of the package. A more detailed problem description can
be found in &lt;a href=&#34;https://regions.dataobservatory.eu/articles/Regional_stats.html&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Working With Regional, Sub-National Statistical
Products&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;This package is an offspring of the
&lt;a href=&#34;https://ropengov.github.io/eurostat/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;eurostat&lt;/a&gt; package on
&lt;a href=&#34;https://ropengov.github.io/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;rOpenGov&lt;/a&gt;. It started as a tool to
validate and re-code regional Eurostat statistics, but it aims to be a
general solution for all sub-national statistics. It will be developed
in parallel with other rOpenGov packages.&lt;/p&gt;
&lt;h2 id=&#34;get-the-package&#34;&gt;Get the Package&lt;/h2&gt;
&lt;p&gt;You can install the development version from
&lt;a href=&#34;https://github.com/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;GitHub&lt;/a&gt; with:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;devtools::install_github(&amp;quot;rOpenGov/regions&amp;quot;)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;or the released version from CRAN:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;install.packages(&amp;quot;regions&amp;quot;)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You can review the complete package documentation on
&lt;a href=&#34;https://regions.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;regions.dataobservatory.eu&lt;/a&gt;. If
you find any problems with the code, please raise an issue on
&lt;a href=&#34;https://github.com/rOpenGov/regions&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;GitHub&lt;/a&gt;. Pull requests are welcome
if you agree with the &lt;a href=&#34;https://contributor-covenant.org/version/2/0/CODE_OF_CONDUCT.html&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Contributor Code of
Conduct&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;If you use &lt;code&gt;regions&lt;/code&gt; in your work, please cite the
package as:
Daniel Antal. (2021, June 16). regions (Version 0.1.7). CRAN. &lt;a href=&#34;https://doi.org/10.5281/zenodo.4965909&#34;&gt;https://doi.org/10.5281/zenodo.4965909&lt;/a&gt;&lt;/p&gt;
&lt;p&gt;Download the &lt;a href=&#34;https://greendeal.dataobservatory.eu/media/bibliography/cite-regions.bib&#34; target=&#34;_blank&#34;&gt;BibLaTeX entry&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://cran.r-project.org/package=regions&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;















&lt;figure  &gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img src=&#34;https://www.r-pkg.org/badges/version/regions&#34; alt=&#34;CRAN_Status_Badge&#34; loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;/figure&gt;
&lt;/a&gt;&lt;/p&gt;
&lt;h2 id=&#34;join-us&#34;&gt;Join us&lt;/h2&gt;
&lt;details class=&#34;spoiler &#34;  id=&#34;spoiler-5&#34;&gt;
  &lt;summary&gt;Join our Green Deal Data Observatory collaboration!&lt;/summary&gt;
  &lt;p&gt;&lt;em&gt;Join our open collaboration Green Deal Data Observatory team as a &lt;a href=&#34;https://greendeal.dataobservatory.eu/authors/curator&#34;&gt;data curator&lt;/a&gt;, &lt;a href=&#34;https://greendeal.dataobservatory.eu/authors/developer&#34;&gt;developer&lt;/a&gt; or &lt;a href=&#34;https://greendeal.dataobservatory.eu/authors/team&#34;&gt;business developer&lt;/a&gt;. More interested in economic policies, particularly computational antitrust, innovation and small enterprises? Check out our &lt;a href=&#34;https://economy.dataobservatory.eu/#contributors&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Economy Data Observatory&lt;/a&gt; team! Or does your interest lie more in data governance, trustworthy AI and other digital market problems? Check out our &lt;a href=&#34;https://music.dataobservatory.eu/#contributors&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Digital Music Observatory&lt;/a&gt; team!&lt;/em&gt;&lt;/p&gt;
&lt;/details&gt;
&lt;p&gt;&lt;a href=&#34;https://twitter.com/intent/follow?screen_name=GreenDealObs&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;















&lt;figure  &gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img src=&#34;https://img.shields.io/twitter/follow/GreenDealObs.svg?style=social&#34; alt=&#34;Follow GreenDealObs&#34; loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;/figure&gt;
&lt;/a&gt;&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Open Data is Like Gold in the Mud Below the Chilly Waves of Mountain Rivers</title>
      <link>https://greendeal.dataobservatory.eu/post/2021-06-10-founder-daniel-antal/</link>
      <pubDate>Thu, 10 Jun 2021 07:00:00 +0000</pubDate>
      <guid>https://greendeal.dataobservatory.eu/post/2021-06-10-founder-daniel-antal/</guid>
      <description>















&lt;figure  id=&#34;figure-open-data-is-like-gold-in-the-mud-below-the-chilly-waves-of-mountain-rivers-panning-it-out-requires-a-lot-of-patience-or-a-good-machine&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;Open data is like gold in the mud below the chilly waves of mountain rivers. Panning it out requires a lot of patience, or a good machine.&#34; srcset=&#34;
               /media/img/slides/gold_panning_slide_notitle_hu8f7296f20da8c17f972a0534c44322c2_1382486_b042523dffe8143dea3d8c8c9c3262f4.webp 400w,
               /media/img/slides/gold_panning_slide_notitle_hu8f7296f20da8c17f972a0534c44322c2_1382486_faa00e96d3d0b700cfcf1daa513f3ad2.webp 760w,
               /media/img/slides/gold_panning_slide_notitle_hu8f7296f20da8c17f972a0534c44322c2_1382486_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/slides/gold_panning_slide_notitle_hu8f7296f20da8c17f972a0534c44322c2_1382486_b042523dffe8143dea3d8c8c9c3262f4.webp&#34;
               width=&#34;760&#34;
               height=&#34;428&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption data-pre=&#34;Figure&amp;nbsp;&#34; data-post=&#34;:&amp;nbsp;&#34; class=&#34;numbered&#34;&gt;
      Open data is like gold in the mud below the chilly waves of mountain rivers. Panning it out requires a lot of patience, or a good machine.
    &lt;/figcaption&gt;&lt;/figure&gt;
&lt;p&gt;&lt;strong&gt;As the founder of the automated data observatories that are part of Reprex’s core activities, what type of data do you usually use in your day-to-day work?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The automated data observatories are the results of syndicated research, data pooling, and other creative solutions to the problem of missing or hard-to-find data. The music industry is a very fragmented industry, where market research budgets and data are scattered in tens of thousands of small organizations in Europe. Working for the music and film industry as a data analyst and economist was always a pain because most of the effort went into trying to find any data that could be analyzed. I spent most of the last 7-8 years trying to find any sort of information—from satellites to government archives—that could be formed into actionable data. I see three big sources of information: textual, numeric, and continuous recordings from on-site, off-site, and satellite sensors. I am much better with numbers than with natural language processing, and I am &lt;a href=&#34;https://greendeal.dataobservatory.eu/post/2021-06-06-tutorial-cds/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;improving with sensory sources&lt;/a&gt;. But technically, I can mint any systematic information—the text of an old book, a satellite image, or an opinion poll—into datasets.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;For you, what would be the ultimate dataset, or datasets that you would like to see in the Green Deal Data Observatory?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Our &lt;a href=&#34;https://retroharmonize.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;retroharmonize&lt;/a&gt; and &lt;a href=&#34;https://regions.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;regions&lt;/a&gt; packages can create regional statistics from &lt;a href=&#34;https://retroharmonize.dataobservatory.eu/articles/eurobarometer.html&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Eurobarometer&lt;/a&gt; and &lt;a href=&#34;https://retroharmonize.dataobservatory.eu/articles/afrobarometer.html&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Afrobarometer&lt;/a&gt; surveys on how people think locally about climate change. I would like to combine this with local information on observable climate change, such as drought, urban heat, and extreme weather conditions. Do people have to feel the pain of climate change to believe in the phenomenon? How do self-reported mitigation steps correlate with what people already feel in their local environment? Suzan is &lt;a href=&#34;https://greendeal.dataobservatory.eu/post/2021-06-07-introducing-suzan-sidal/&#34;&gt;talking&lt;/a&gt; about measuring mitigation and damage control, because she&amp;rsquo;s aware of the already present health risks in overheating urban environments. I am more interested in what people think.&lt;/p&gt;
















&lt;figure  id=&#34;figure-see-our-case-studyhttpsgreendealdataobservatoryeupost2021-04-23-belgium-flood-insurance-on-connecting-local-tax-revenues-climate-awareness-poll-data-and-drought-data-in-belgium---we-want-to-extend-this-to-europe-and-then-to-africa-we-also-published-the-code-how-to-do-it-with-tutorials-1post2021-03-05-retroharmonize-climate-2httpsrpubscomantaldanielregions-ood21-for-our-international-open-data-day-2021-eventhttpsgreendealnetlifyapptalkreprex-open-data-day-2021&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;See our [case study](https://greendeal.dataobservatory.eu/post/2021-04-23-belgium-flood-insurance/) on connecting local tax revenues, climate awareness poll data and drought data in Belgium - we want to extend this to Europe and then to Africa. We also published the code how to do it with tutorials [1](/post/2021-03-05-retroharmonize-climate/), [2](https://rpubs.com/antaldaniel/regions-OOD21) for our [International Open Data Day 2021 Event](https://greendeal.netlify.app/talk/reprex-open-data-day-2021/).&#34; srcset=&#34;
               /media/img/blogposts_2021/belgium_spei_2018_hu053711948486f3d03232ef0d63e51704_295716_85cb3a3e9d67ae93c4b48d13c76f103f.webp 400w,
               /media/img/blogposts_2021/belgium_spei_2018_hu053711948486f3d03232ef0d63e51704_295716_732c5a4fed2e5086cd4649603e01bc64.webp 760w,
               /media/img/blogposts_2021/belgium_spei_2018_hu053711948486f3d03232ef0d63e51704_295716_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/blogposts_2021/belgium_spei_2018_hu053711948486f3d03232ef0d63e51704_295716_85cb3a3e9d67ae93c4b48d13c76f103f.webp&#34;
               width=&#34;760&#34;
               height=&#34;760&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption data-pre=&#34;Figure&amp;nbsp;&#34; data-post=&#34;:&amp;nbsp;&#34; class=&#34;numbered&#34;&gt;
      See our &lt;a href=&#34;https://greendeal.dataobservatory.eu/post/2021-04-23-belgium-flood-insurance/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;case study&lt;/a&gt; on connecting local tax revenues, climate awareness poll data and drought data in Belgium - we want to extend this to Europe and then to Africa. We also published the code how to do it with tutorials &lt;a href=&#34;https://greendeal.dataobservatory.eu/post/2021-03-05-retroharmonize-climate/&#34;&gt;1&lt;/a&gt;, &lt;a href=&#34;https://rpubs.com/antaldaniel/regions-OOD21&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;2&lt;/a&gt; for our &lt;a href=&#34;https://greendeal.netlify.app/talk/reprex-open-data-day-2021/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;International Open Data Day 2021 Event&lt;/a&gt;.
    &lt;/figcaption&gt;&lt;/figure&gt;
&lt;p&gt;&lt;strong&gt;Is there a number or piece of information that recently surprised you? If so, what was it?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;There were a few numbers that surprised me, and some of them were brought up by our observatory teams. Karel is &lt;a href=&#34;post/2021-06-08-data-curator-karel-volckaert/&#34;&gt;talking&lt;/a&gt; about the fact that not all green energy is actually green: many hydropower stations contribute to the greenhouse effect rather than reducing it. Annette brought up the growing interest in the &lt;a href=&#34;https://greendeal.dataobservatory.eu/post/2021-06-09-team-annette-wong/&#34;&gt;Dalmatian breed&lt;/a&gt; after the Disney &lt;em&gt;101 Dalmatians&lt;/em&gt; movies, and it reminded me of the astonishing growth in interest in chess sets, chess tutorials, and platform subscriptions after the success of Netflix’s &lt;em&gt;The Queen’s Gambit&lt;/em&gt;.&lt;/p&gt;
















&lt;figure  id=&#34;figure-the-queens-gambit-chess-boom-moves-online-by-rachael-dottle-on-bloombergcomhttpswwwbloombergcomgraphics2020-chess-boom&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;*The Queen’s Gambit’ Chess Boom Moves Online By Rachael Dottle* on [bloomberg.com](https://www.bloomberg.com/graphics/2020-chess-boom/)&#34; srcset=&#34;
               /media/img/blogposts_2021/queens_gambit_bloomberg_hub50434a1789646b36daf41ad10e65b52_92708_4fc47acea402086dd3891772877289db.webp 400w,
               /media/img/blogposts_2021/queens_gambit_bloomberg_hub50434a1789646b36daf41ad10e65b52_92708_b60a154be5ab781fb70d16f62f39966c.webp 760w,
               /media/img/blogposts_2021/queens_gambit_bloomberg_hub50434a1789646b36daf41ad10e65b52_92708_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/blogposts_2021/queens_gambit_bloomberg_hub50434a1789646b36daf41ad10e65b52_92708_4fc47acea402086dd3891772877289db.webp&#34;
               width=&#34;760&#34;
               height=&#34;428&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption data-pre=&#34;Figure&amp;nbsp;&#34; data-post=&#34;:&amp;nbsp;&#34; class=&#34;numbered&#34;&gt;
      &lt;em&gt;‘The Queen’s Gambit’ Chess Boom Moves Online&lt;/em&gt; by Rachael Dottle on &lt;a href=&#34;https://www.bloomberg.com/graphics/2020-chess-boom/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;bloomberg.com&lt;/a&gt;
    &lt;/figcaption&gt;&lt;/figure&gt;
&lt;p&gt;Annette is talking about the importance of cultural influencers, and on that theme, what could be more exciting than the fact that &lt;a href=&#34;https://www.netflix.com/nl-en/title/80234304&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Netflix’s biggest success&lt;/a&gt; so far is not a detective series or a soap opera but a coming-of-age story of a female chess prodigy? Intelligence is sexy, and we are in the intelligence business.&lt;/p&gt;
&lt;p&gt;But to mention a more serious and more sobering number, I recently read with surprise that there are &lt;a href=&#34;https://www.theguardian.com/society/2021/may/27/number-of-smokers-has-reached-all-time-high-of-11-billion-study-finds&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;more people smoking cigarettes&lt;/a&gt; on Earth in 2021 than in 1990. Population growth in developing countries more than offset the shrinking number of smokers in developed countries. While I live in Europe, where smoking is strongly declining, it reminds me that Europe’s population is a small part of the world. We cannot take for granted that our home-grown experiences about the world are globally valid.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Do you have a good example of really good, or really bad use of data?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://fivethirtyeight.com/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;FiveThirtyEight.com&lt;/a&gt; had a wonderful podcast series, produced by Jody Avirgan, called &lt;em&gt;What’s the Point&lt;/em&gt;. It is exactly about good and bad uses of data, and each episode is super interesting. Maybe the most memorable is &lt;em&gt;Why the Bronx Really Burned&lt;/em&gt;. New York City tried to measure fire response times, identify redundancies in service, and close or re-allocate fire stations accordingly. What resulted, though, was a perfect storm of bad data: The methodology was flawed, the analysis was rife with biases, and the results were interpreted in a way that stacked the deck against poorer neighborhoods. It is similar to many of the stories that Catherine D’Ignazio and Lauren F. Klein tell so compellingly in their much-celebrated book, &lt;em&gt;Data Feminism&lt;/em&gt;. Usually, the bad use of data starts with a bad data collection practice. Data analysts in corporations, NGOs, public policy organizations and even in science usually analyze the data that is available.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;You can find these examples, together with many more that our contributors recommend, in the motivating examples of the &lt;a href=&#34;https://contributors.dataobservatory.eu/data-curators.html#create-new-datasets&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Create New Datasets&lt;/a&gt; and the &lt;a href=&#34;https://contributors.dataobservatory.eu/data-curators.html#critical-attitude&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Remain Critical&lt;/a&gt; parts of our onboarding material. We hope that more and more professionals and citizen scientists will help us to create high-quality and open data.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;The real power lies in designing a data collection program. A consistent data collection program usually requires an investment that only powerful organizations, such as government agencies, very large corporations, or the richest universities can afford. You cannot really analyze the data that is not collected and recorded; and usually what is not recorded is more interesting than what is. Our observatories want to democratize the data collection process and make it more available, more shared with research automation and pooling.&lt;/p&gt;
















&lt;figure  id=&#34;figure-you-cannot-really-analyze-the-data-that-is-not-collected-and-recorded-and-usually-what-is-not-recorded-is-more-interesting-than-what-is-our-observatories-want-to-democratize-the-data-collection-process-and-make-it-more-available-more-shared-with-research-automation-and-pooling&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;You cannot really analyze the data that is not collected and recorded; and usually what is not recorded is more interesting than what is. Our observatories want to democratize the data collection process and make it more available, more shared with research automation and pooling.&#34; srcset=&#34;
               /media/img/slides/value_added_from_automation_hu0cd38ea00fa26e2a5a435a4734d443af_246915_0c9aff1728ccce942df2d778c9b3c8f3.webp 400w,
               /media/img/slides/value_added_from_automation_hu0cd38ea00fa26e2a5a435a4734d443af_246915_140e32925c748c51631149098ba27aac.webp 760w,
               /media/img/slides/value_added_from_automation_hu0cd38ea00fa26e2a5a435a4734d443af_246915_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/slides/value_added_from_automation_hu0cd38ea00fa26e2a5a435a4734d443af_246915_0c9aff1728ccce942df2d778c9b3c8f3.webp&#34;
               width=&#34;760&#34;
               height=&#34;428&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption data-pre=&#34;Figure&amp;nbsp;&#34; data-post=&#34;:&amp;nbsp;&#34; class=&#34;numbered&#34;&gt;
      You cannot really analyze the data that is not collected and recorded; and usually what is not recorded is more interesting than what is. Our observatories want to democratize the data collection process and make it more available, more shared with research automation and pooling.
    &lt;/figcaption&gt;&lt;/figure&gt;
&lt;p&gt;&lt;strong&gt;From your perspective, what do you see being the greatest problem with open data in 2021?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;I have been involved with open data policies since 2004. The problem has not changed much: more and more data are available from governmental and scientific sources, but in a form that makes them useless. Data without clear description and clear processing information is useless for analytical purposes: it cannot be integrated with other data, and it cannot be trusted and verified. If researchers or government entities that fall under the &lt;a href=&#34;https://eur-lex.europa.eu/legal-content/EN/TXT/?uri=uriserv:OJ.L_.2019.172.01.0056.01.ENG&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Open Data Directive&lt;/a&gt; release data for reuse in a way that does not have descriptive or processing metadata, it is almost as if they did not release anything. You need this additional information to make valid analyses of the data, and reverse-engineering it may cost more than re-collecting the data in a properly documented process. Our developers, particularly &lt;a href=&#34;https://greendeal.dataobservatory.eu/post/2021-06-04-developer-leo-lahti/&#34;&gt;Leo&lt;/a&gt; and &lt;a href=&#34;post/2021-06-07-data-curator-pyry-kantanen/&#34;&gt;Pyry&lt;/a&gt;, are talking eloquently about why you have to be careful even with governmental statistical products, and constantly be on the lookout for data quality problems.&lt;/p&gt;
















&lt;figure  id=&#34;figure-our-apidata-is-not-only-publishing-descriptive-and-processing-metadata-alongside-with-our-data-but-we-also-make-all-critical-elements-of-our-processing-code-available-for-peer-review-on-ropengovauthorsropengov&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;Our [API](/#data) is not only publishing descriptive and processing metadata alongside with our data, but we also make all critical elements of our processing code available for peer-review on [rOpenGov](/authors/ropengov/)&#34; srcset=&#34;
               /media/img/observatory_screenshots/GDO_API_metadata_table_hu31b494a33d5ae09272643545372dbd1d_100491_225afcd2a785db051b89c7c36fdc28b9.webp 400w,
               /media/img/observatory_screenshots/GDO_API_metadata_table_hu31b494a33d5ae09272643545372dbd1d_100491_5807feecbd17bee02fd8c68fad87b1d7.webp 760w,
               /media/img/observatory_screenshots/GDO_API_metadata_table_hu31b494a33d5ae09272643545372dbd1d_100491_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/observatory_screenshots/GDO_API_metadata_table_hu31b494a33d5ae09272643545372dbd1d_100491_225afcd2a785db051b89c7c36fdc28b9.webp&#34;
               width=&#34;760&#34;
               height=&#34;428&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption data-pre=&#34;Figure&amp;nbsp;&#34; data-post=&#34;:&amp;nbsp;&#34; class=&#34;numbered&#34;&gt;
      Our &lt;a href=&#34;https://greendeal.dataobservatory.eu/#data&#34;&gt;API&lt;/a&gt; not only publishes descriptive and processing metadata alongside our data, but we also make all critical elements of our processing code available for peer review on &lt;a href=&#34;https://greendeal.dataobservatory.eu/authors/ropengov/&#34;&gt;rOpenGov&lt;/a&gt;.
    &lt;/figcaption&gt;&lt;/figure&gt;
&lt;p&gt;&lt;strong&gt;What do you think the Green Deal Data Observatory and our other automated observatories can do to make open data more credible in the European economic policy community and have it accepted as verified information?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Most of our work is in research automation, and a very large part of our effort is aimed at reverse-engineering missing descriptive and processing metadata. In a way, I like to compare our working method to that of the open-source intelligence platform &lt;a href=&#34;https://www.bellingcat.com&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Bellingcat&lt;/a&gt;. They were able to use publicly available, &lt;a href=&#34;https://www.bellingcat.com/category/resources/case-studies/?fwp_tags=mh17&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;scattered information from satellites and social media&lt;/a&gt; to identify each member of the Russian military company that illegally entered the territory of Ukraine and shot down Malaysia Airlines flight MH17 with 298 people, mainly Dutch civilians, on board.&lt;/p&gt;
















&lt;figure  id=&#34;figure-how-we-create-value-for-research-oriented-consultancies-public-policy-institutes-university-research-teams-journalists-or-ngos&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;How we create value for research-oriented consultancies, public policy institutes, university research teams, journalists or NGOs.&#34; srcset=&#34;
               /media/img/slides/automated_observatory_value_chain_huf9c0a6d9b150a8fdeb42cadf99abee90_616274_c18a97f00bbcac322614b6c2d55783f6.webp 400w,
               /media/img/slides/automated_observatory_value_chain_huf9c0a6d9b150a8fdeb42cadf99abee90_616274_8b655e803b41b817a8093a37ccd19689.webp 760w,
               /media/img/slides/automated_observatory_value_chain_huf9c0a6d9b150a8fdeb42cadf99abee90_616274_1200x1200_fit_q75_h2_lanczos.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/slides/automated_observatory_value_chain_huf9c0a6d9b150a8fdeb42cadf99abee90_616274_c18a97f00bbcac322614b6c2d55783f6.webp&#34;
               width=&#34;760&#34;
               height=&#34;428&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption data-pre=&#34;Figure&amp;nbsp;&#34; data-post=&#34;:&amp;nbsp;&#34; class=&#34;numbered&#34;&gt;
      How we create value for research-oriented consultancies, public policy institutes, university research teams, journalists or NGOs.
    &lt;/figcaption&gt;&lt;/figure&gt;
&lt;p&gt;We do not do such investigations, but we work very similarly in how we filter through many data sources and attempt to verify them when their descriptions and processing history are unknown. In the last few years, we were able to restore the metadata of many European and African open data surveys, economic impact and environmental impact data, and many other open datasets that had been lying around for many years without users.&lt;/p&gt;
&lt;p&gt;Open data is like gold in the mud below the chilly waves of mountain rivers. Panning it out requires a lot of patience, or a good machine. I think our findings will be as surprising and strong as Bellingcat’s, although we focus not on individual events and stories but on social and environmental processes and changes.&lt;/p&gt;
















&lt;figure  id=&#34;figure-join-our-open-collaboration-green-deal-data-observatory-team-as-a-data-curatorauthorscurator-developerauthorsdeveloper-or-business-developerauthorsteam-or-share-your-data-in-our-public-repository-green-deal-data-observatory-on-zenodohttpszenodoorgcommunitiesgreendeal_observatory&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;Join our open collaboration Green Deal Data Observatory team as a [data curator](/authors/curator), [developer](/authors/developer) or [business developer](/authors/team), or share your data in our public repository [Green Deal Data Observatory on Zenodo](https://zenodo.org/communities/greendeal_observatory/).&#34; srcset=&#34;
               /media/img/observatory_screenshots/greendeal_and_zenodo_huddcd7485e56cb33c97d3e664ae383275_281994_debfc54dcf2193c7c800dab0f36de429.webp 400w,
               /media/img/observatory_screenshots/greendeal_and_zenodo_huddcd7485e56cb33c97d3e664ae383275_281994_3b536090581f2795373e801d65371e20.webp 760w,
               /media/img/observatory_screenshots/greendeal_and_zenodo_huddcd7485e56cb33c97d3e664ae383275_281994_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/observatory_screenshots/greendeal_and_zenodo_huddcd7485e56cb33c97d3e664ae383275_281994_debfc54dcf2193c7c800dab0f36de429.webp&#34;
               width=&#34;760&#34;
               height=&#34;507&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption data-pre=&#34;Figure&amp;nbsp;&#34; data-post=&#34;:&amp;nbsp;&#34; class=&#34;numbered&#34;&gt;
      Join our open collaboration Green Deal Data Observatory team as a &lt;a href=&#34;https://greendeal.dataobservatory.eu/authors/curator&#34;&gt;data curator&lt;/a&gt;, &lt;a href=&#34;https://greendeal.dataobservatory.eu/authors/developer&#34;&gt;developer&lt;/a&gt; or &lt;a href=&#34;https://greendeal.dataobservatory.eu/authors/team&#34;&gt;business developer&lt;/a&gt;, or share your data in our public repository &lt;a href=&#34;https://zenodo.org/communities/greendeal_observatory/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Green Deal Data Observatory on Zenodo&lt;/a&gt;.
    &lt;/figcaption&gt;&lt;/figure&gt;
&lt;h2 id=&#34;join-us&#34;&gt;Join us&lt;/h2&gt;
&lt;p&gt;&lt;em&gt;Join our open collaboration Green Deal Data Observatory team as a &lt;a href=&#34;https://greendeal.dataobservatory.eu/authors/curator&#34;&gt;data curator&lt;/a&gt;, &lt;a href=&#34;https://greendeal.dataobservatory.eu/authors/developer&#34;&gt;developer&lt;/a&gt; or &lt;a href=&#34;https://greendeal.dataobservatory.eu/authors/team&#34;&gt;business developer&lt;/a&gt;. More interested in antitrust, innovation policy or economic impact analysis? Try our &lt;a href=&#34;https://economy.dataobservatory.eu/#contributors&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Economy Data Observatory&lt;/a&gt; team! Or does your interest lie more in data governance, trustworthy AI and other digital market problems? Check out our &lt;a href=&#34;https://music.dataobservatory.eu/#contributors&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Digital Music Observatory&lt;/a&gt; team!&lt;/em&gt;&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Educate and Train Data Admirers that Data is not Scary</title>
      <link>https://greendeal.dataobservatory.eu/post/2021-06-09-team-annette-wong/</link>
      <pubDate>Wed, 09 Jun 2021 12:00:00 +0000</pubDate>
      <guid>https://greendeal.dataobservatory.eu/post/2021-06-09-team-annette-wong/</guid>
      <description>&lt;p&gt;&lt;em&gt;Annette Wong is helping our service development from a digital strategy and marketing point of view.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Why is data important to the work that you do as a digital strategist at an agency?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;As a marketing and digital agency, we work with clients to produce and develop marketing campaigns that impact the bottom line. One of the ways to determine the Return-On-Investment (ROI) is through data. By analyzing the data, our team is able to help our clients predict audience behavior and ideally convert them into taking action (&lt;code&gt;$$$&lt;/code&gt;).&lt;/p&gt;
&lt;p&gt;Currently, I’m working on a music livestreaming platform, and every day we look at how our campaigns are performing (and measure their effectiveness). For example, if we’re running a paid campaign through Facebook and it’s not converting at the &lt;code&gt;%&lt;/code&gt; that we expect, it tells us that we need to change our approach. Data gives us the power and freedom to experiment (with minimal risk) and empowers us to make informed decisions quickly.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Why are you excited about the Digital Music Observatory and is there a reason you decided to participate in this initiative?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Seeing how the pandemic decimated the music industry, specifically in-person events, made me feel a lot of empathy for musicians and the economics of their situation, especially regarding how they make a living from their music. Open access to data promotes transparency and (ideally) fairer wages, and it levels the playing field for musicians of all sizes and popularity.&lt;/p&gt;
















&lt;figure  id=&#34;figure-our-retroharmonization-softwarehttpsretroharmonizedataobservatoryeu-helps-the-creation-of-objective-and-comparable-indicators-about-how-musicians-make-a-livinghttpsdatamusicdataobservatoryeumusic-economyhtmlsupply-or-how-people-think-about-climate-challengeshttpsgreendealdataobservatoryeupost2021-04-23-belgium-flood-insurance&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;Our [retroharmonization software](https://retroharmonize.dataobservatory.eu/) helps the creation of objective and comparable indicators about how musicians [make a living](https://data.music.dataobservatory.eu/music-economy.html#supply), or how people think about [climate challenges](https://greendeal.dataobservatory.eu/post/2021-04-23-belgium-flood-insurance/).&#34; srcset=&#34;
               /media/img/blogposts_2021/difficulty_bills_levels_hu78dfb92a43f00170e8390b0e5066e58e_221046_00525ad9e8cd67c5f65a3ddf0508cfcf.webp 400w,
               /media/img/blogposts_2021/difficulty_bills_levels_hu78dfb92a43f00170e8390b0e5066e58e_221046_71eabed8441e8ba3b2b17c3c8c9bdbc0.webp 760w,
               /media/img/blogposts_2021/difficulty_bills_levels_hu78dfb92a43f00170e8390b0e5066e58e_221046_1200x1200_fit_q75_h2_lanczos.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/blogposts_2021/difficulty_bills_levels_hu78dfb92a43f00170e8390b0e5066e58e_221046_00525ad9e8cd67c5f65a3ddf0508cfcf.webp&#34;
               width=&#34;760&#34;
               height=&#34;570&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption data-pre=&#34;Figure&amp;nbsp;&#34; data-post=&#34;:&amp;nbsp;&#34; class=&#34;numbered&#34;&gt;
      Our &lt;a href=&#34;https://retroharmonize.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;retroharmonization software&lt;/a&gt; helps the creation of objective and comparable indicators about how musicians &lt;a href=&#34;https://data.music.dataobservatory.eu/music-economy.html#supply&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;make a living&lt;/a&gt;, or how people think about &lt;a href=&#34;https://greendeal.dataobservatory.eu/post/2021-04-23-belgium-flood-insurance/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;climate challenges&lt;/a&gt;.
    &lt;/figcaption&gt;&lt;/figure&gt;
&lt;p&gt;I decided to participate in this challenge because I love how data is a secret weapon that anyone can use to re-balance the interests of creators, distributors, and consumers.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Is there a number that recently surprised you? What was it?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;This is a little silly, but very recently I watched the 101 Dalmatians movie. Afterwards, I was curious to see if there was a correlation between the release of the movie and the number of Dalmatians adopted in the following years. 101 Dalmatians was re-released in 1985 and 1991, which made thousands of families (in the U.S.) want to adopt one. The &lt;a href=&#34;https://www.akc.org/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;American Kennel Club&lt;/a&gt; reported that the annual number of Dalmatian puppies registered skyrocketed from 8,170 animals to 42,816.&lt;/p&gt;
















&lt;figure  id=&#34;figure-photo-john-o-groats-unsplash-licensehttpsunsplashcomlicense&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;Photo: John o&amp;#39; Groats, [Unsplash license](https://unsplash.com/license).&#34; srcset=&#34;
               /media/img/blogposts_2021/loan-7gG_OG9w4Ds-unsplash_hu1be397c6221516f8d6307e9cacc7505a_1835905_6b942432e48e3c6a2d3d82e4baa96f72.webp 400w,
               /media/img/blogposts_2021/loan-7gG_OG9w4Ds-unsplash_hu1be397c6221516f8d6307e9cacc7505a_1835905_31699597eb85d7d293a667b843af0111.webp 760w,
               /media/img/blogposts_2021/loan-7gG_OG9w4Ds-unsplash_hu1be397c6221516f8d6307e9cacc7505a_1835905_1200x1200_fit_q75_h2_lanczos.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/blogposts_2021/loan-7gG_OG9w4Ds-unsplash_hu1be397c6221516f8d6307e9cacc7505a_1835905_6b942432e48e3c6a2d3d82e4baa96f72.webp&#34;
               width=&#34;760&#34;
               height=&#34;475&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption data-pre=&#34;Figure&amp;nbsp;&#34; data-post=&#34;:&amp;nbsp;&#34; class=&#34;numbered&#34;&gt;
      Photo: John o&amp;rsquo; Groats, &lt;a href=&#34;https://unsplash.com/license&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Unsplash license&lt;/a&gt;.
    &lt;/figcaption&gt;&lt;/figure&gt;
&lt;p&gt;This information is interesting because it validates the idea that culture influences consumer behavior. I think it’s really cool that we can measure cultural collisions and how they impact the way we act, think, and respond.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;What can our automated data observatories do to make open data more credible in the European economic policy community, or more accepted in the music business community?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;I believe that people, in general, appreciate and understand the importance of data. But it can be overwhelming, sometimes scary, and intimidating to deal with (especially in large quantities).&lt;/p&gt;
&lt;p&gt;However, I feel more people are open to the idea of using data and understand the value of leveraging data to share objective truths. Our automated data observatories can provide more opportunities to educate and train data admirers, showing them that data is not scary, that it is accessible, and that it is here to help uncover insights that can’t be immediately seen.&lt;/p&gt;
















&lt;figure  id=&#34;figure-join-our-open-collaboration-green-deal-data-observatory-team-as-a-data-curatorauthorscurator-developerauthorsdeveloper-or-business-developerauthorsteam-or-share-your-data-in-our-public-repositorygreen-deal-data-observatory-on-zenodohttpszenodoorgcommunitiesgreendeal_observatory&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;Join our open collaboration Green Deal Data Observatory team as a [data curator](/authors/curator), [developer](/authors/developer) or [business developer](/authors/team), or share your data in our public repository [Green Deal Data Observatory on Zenodo](https://zenodo.org/communities/greendeal_observatory/).&#34; srcset=&#34;
               /media/img/observatory_screenshots/greendeal_and_zenodo_huddcd7485e56cb33c97d3e664ae383275_281994_debfc54dcf2193c7c800dab0f36de429.webp 400w,
               /media/img/observatory_screenshots/greendeal_and_zenodo_huddcd7485e56cb33c97d3e664ae383275_281994_3b536090581f2795373e801d65371e20.webp 760w,
               /media/img/observatory_screenshots/greendeal_and_zenodo_huddcd7485e56cb33c97d3e664ae383275_281994_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/observatory_screenshots/greendeal_and_zenodo_huddcd7485e56cb33c97d3e664ae383275_281994_debfc54dcf2193c7c800dab0f36de429.webp&#34;
               width=&#34;760&#34;
               height=&#34;507&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption data-pre=&#34;Figure&amp;nbsp;&#34; data-post=&#34;:&amp;nbsp;&#34; class=&#34;numbered&#34;&gt;
      Join our open collaboration Green Deal Data Observatory team as a &lt;a href=&#34;https://greendeal.dataobservatory.eu/authors/curator&#34;&gt;data curator&lt;/a&gt;, &lt;a href=&#34;https://greendeal.dataobservatory.eu/authors/developer&#34;&gt;developer&lt;/a&gt; or &lt;a href=&#34;https://greendeal.dataobservatory.eu/authors/team&#34;&gt;business developer&lt;/a&gt;, or share your data in our public repository &lt;a href=&#34;https://zenodo.org/communities/greendeal_observatory/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Green Deal Data Observatory on Zenodo&lt;/a&gt;.
    &lt;/figcaption&gt;&lt;/figure&gt;
&lt;h2 id=&#34;join-us&#34;&gt;Join us&lt;/h2&gt;
&lt;p&gt;&lt;em&gt;Join our open collaboration Green Deal Data Observatory team as a &lt;a href=&#34;https://greendeal.dataobservatory.eu/authors/curator&#34;&gt;data curator&lt;/a&gt;, &lt;a href=&#34;https://greendeal.dataobservatory.eu/authors/developer&#34;&gt;developer&lt;/a&gt; or &lt;a href=&#34;https://greendeal.dataobservatory.eu/authors/team&#34;&gt;business developer&lt;/a&gt;. More interested in antitrust, innovation policy or economic impact analysis? Try our &lt;a href=&#34;https://economy.dataobservatory.eu/#contributors&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Economy Data Observatory&lt;/a&gt; team! Or does your interest lie more in data governance, trustworthy AI and other digital market problems? Check out our &lt;a href=&#34;https://music.dataobservatory.eu/#contributors&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Digital Music Observatory&lt;/a&gt; team!&lt;/em&gt;&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Credibility is Enhanced Through Cross Links Between Different Data from Different Domains</title>
      <link>https://greendeal.dataobservatory.eu/post/2021-06-08-data-curator-karel-volckaert/</link>
      <pubDate>Tue, 08 Jun 2021 18:50:00 +0000</pubDate>
      <guid>https://greendeal.dataobservatory.eu/post/2021-06-08-data-curator-karel-volckaert/</guid>
      <description>&lt;p&gt;&lt;strong&gt;As a consultant, what type of data do you usually work with?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;I work at the intersection between strategy, finance and organisation. My usual dataset is quite broad - and sometimes unstructured. Oftentimes, the most decisive data are ones that cross domains: economic data coupled with environmental measurements, sociodemographic characteristics linked with online analytics.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;If you were able to pick, what would be the ultimate dataset, or datasets that you would like to see in the Green Deal Data Observatory? And the Economy Data Observatory?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;If I may venture that far, the interesting point is where these two data observatories meet. But high on my wishlist would be anything related to geospatial dispersion of environmental and climate data: land erosion, aerosols, solar incidence. From an economic perspective, my interest would go especially to - again - dispersion across regions or other geographical domains of, say, number of new enterprises, disposable income, tax incidence&amp;hellip;&lt;/p&gt;
















&lt;figure  id=&#34;figure-see-our-case-studyhttpsgreendealdataobservatoryeupost2021-04-23-belgium-flood-insurance-on-connecting-local-tax-revenues-climate-awareness-poll-data-and-drought-data-in-belgium&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;See our [case study](https://greendeal.dataobservatory.eu/post/2021-04-23-belgium-flood-insurance/) on connecting local tax revenues, climate awareness poll data and drought data in Belgium.&#34; srcset=&#34;
               /media/img/blogposts_2021/belgium_spei_2018_hu053711948486f3d03232ef0d63e51704_295716_85cb3a3e9d67ae93c4b48d13c76f103f.webp 400w,
               /media/img/blogposts_2021/belgium_spei_2018_hu053711948486f3d03232ef0d63e51704_295716_732c5a4fed2e5086cd4649603e01bc64.webp 760w,
               /media/img/blogposts_2021/belgium_spei_2018_hu053711948486f3d03232ef0d63e51704_295716_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/blogposts_2021/belgium_spei_2018_hu053711948486f3d03232ef0d63e51704_295716_85cb3a3e9d67ae93c4b48d13c76f103f.webp&#34;
               width=&#34;760&#34;
               height=&#34;760&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption data-pre=&#34;Figure&amp;nbsp;&#34; data-post=&#34;:&amp;nbsp;&#34; class=&#34;numbered&#34;&gt;
      See our &lt;a href=&#34;https://greendeal.dataobservatory.eu/post/2021-04-23-belgium-flood-insurance/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;case study&lt;/a&gt; on connecting local tax revenues, climate awareness poll data and drought data in Belgium.
    &lt;/figcaption&gt;&lt;/figure&gt;
&lt;p&gt;&lt;strong&gt;Why did you decide to join the challenge and why do you think that this would be a game changer for policymakers and for business leaders?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;There is, both from an ecological and a societal point of view, an urgent need for open-access, real-time, trustworthy data to base decisions on. Ever since Kydland &amp;amp; Prescott’s analyses of “rules rather than discretion” and even earlier analyses of investment under uncertainty, the dynamic rules for optimal decision-making (including investment) require fast-response reliable data.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Do you have a favorite, or most used open governmental or open science data source? What do you think about it?  Could it be improved?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Let me give one example: the &lt;a href=&#34;https://ec.europa.eu/info/business-economy-euro/indicators-statistics/economic-databases/macro-economic-database-ameco/ameco-database_en&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;AMECO annual macro-economic database&lt;/a&gt; is great for long-term historical analyses, but its components ought to be available in real time. As an anecdote: when I was a fund manager in emerging markets, we needed to anticipate macro-economic evolutions and, in particular, the manner in which capital markets anticipate these evolutions by adjusting foreign exchange rates or positioning themselves along yield curves. To some extent, we needed to predict what AMECO would tell us one year later by means of any real-time trustworthy assessments of the financial or economic situation. The latter data is what we would ideally have in an observatory.&lt;/p&gt;
















&lt;figure  id=&#34;figure-to-some-extent-we-needed-to-predict-what-amecohttpseceuropaeuinfobusiness-economy-euroindicators-statisticseconomic-databasesmacro-economic-database-amecoameco-database_en-would-tell-us-one-year-later-by-means-of-any-real-time-trustworthy-assessments-of-the-financial-or-economic-situation-the-latter-data-is-what-we-would-ideally-have-in-an-observatory&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;To some extent, we needed to predict what [AMECO](https://ec.europa.eu/info/business-economy-euro/indicators-statistics/economic-databases/macro-economic-database-ameco/ameco-database_en) would tell us one year later by means of any real-time trustworthy assessments of the financial or economic situation. The latter data is what we would ideally have in an observatory.&#34; srcset=&#34;
               /media/img/blogposts_2021/AMECO_screenshot_hu1e290340e7c6d0ed7c7e3cf6b9e1eac5_97374_7431fd5b697b9816895cd67d4ae6686d.webp 400w,
               /media/img/blogposts_2021/AMECO_screenshot_hu1e290340e7c6d0ed7c7e3cf6b9e1eac5_97374_be2785699e3b9b35616175a509dec218.webp 760w,
               /media/img/blogposts_2021/AMECO_screenshot_hu1e290340e7c6d0ed7c7e3cf6b9e1eac5_97374_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/blogposts_2021/AMECO_screenshot_hu1e290340e7c6d0ed7c7e3cf6b9e1eac5_97374_7431fd5b697b9816895cd67d4ae6686d.webp&#34;
               width=&#34;760&#34;
               height=&#34;427&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption data-pre=&#34;Figure&amp;nbsp;&#34; data-post=&#34;:&amp;nbsp;&#34; class=&#34;numbered&#34;&gt;
      To some extent, we needed to predict what &lt;a href=&#34;https://ec.europa.eu/info/business-economy-euro/indicators-statistics/economic-databases/macro-economic-database-ameco/ameco-database_en&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;AMECO&lt;/a&gt; would tell us one year later by means of any real-time trustworthy assessments of the financial or economic situation. The latter data is what we would ideally have in an observatory.
    &lt;/figcaption&gt;&lt;/figure&gt;
&lt;p&gt;&lt;strong&gt;Is there a piece of information that recently surprised you? What was it?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;I am currently working on water-related issues and came across a result reported in &lt;a href=&#34;https://www.nature.com/articles/s41560-021-00784-y&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Nature Energy&lt;/a&gt; earlier this year that in more than one in ten hydropower stations, the extra warming from the dark surface of the water reservoir was enough to outbalance its “green” electricity generation potential, leading to no net climate benefits.&lt;/p&gt;
&lt;blockquote&gt;
&lt;p&gt;The researchers found that almost half of the reservoirs they surveyed took just four years to reach a net climate benefit. Unfortunately, they also found that 19% of those surveyed took more than 40 years to do so, and approximately 12% of them took 80 years—the average lifetime of a hydroelectric plant. &lt;a href=&#34;https://techxplore.com/news/2021-03-albedo-climate-penalty-hydropower-reservoirs.html&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Calculating the albedo-climate penalty of hydropower dammed reservoirs&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;
&lt;p&gt;Again: spatial distribution matters&amp;hellip;&lt;/p&gt;
















&lt;figure  id=&#34;figure-photo-kees-streefkerk-unplash-licensehttpsunsplashcomlicense&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;Photo: Kees Streefkerk, [Unsplash License](https://unsplash.com/license)&#34; srcset=&#34;
               /media/img/blogposts_2021/photo-1503754163129-a02a0c097de0_hu3d03a01dcc18bc5be0e67db3d8d209a6_95667_a6bd16b069f533993e861f2040801744.webp 400w,
               /media/img/blogposts_2021/photo-1503754163129-a02a0c097de0_hu3d03a01dcc18bc5be0e67db3d8d209a6_95667_0044a4749a440d65ecfd5b0fa333e141.webp 760w,
               /media/img/blogposts_2021/photo-1503754163129-a02a0c097de0_hu3d03a01dcc18bc5be0e67db3d8d209a6_95667_1200x1200_fit_q75_h2_lanczos.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/blogposts_2021/photo-1503754163129-a02a0c097de0_hu3d03a01dcc18bc5be0e67db3d8d209a6_95667_a6bd16b069f533993e861f2040801744.webp&#34;
               width=&#34;668&#34;
               height=&#34;501&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption data-pre=&#34;Figure&amp;nbsp;&#34; data-post=&#34;:&amp;nbsp;&#34; class=&#34;numbered&#34;&gt;
      Photo: Kees Streefkerk, &lt;a href=&#34;https://unsplash.com/license&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Unsplash License&lt;/a&gt;
    &lt;/figcaption&gt;&lt;/figure&gt;
&lt;p&gt;&lt;strong&gt;From your experience, what do you think the greatest problem with open data in 2021 will be?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Trust. In a society where “value” and even “truth” are determined more by the number of (web) links to a particular “fact” than by its intrinsic characteristics, we need to be able to trust data: open data because it’s open and “closed” data because it’s closed.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;What can our automated data observatories do to make open data more credible in the European economic policy and climate change or mitigation community and be more accepted as verified information?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;If I may refer to the previous answer: credibility is enhanced through cross-links between different data from different domains that “do not disprove” one another or that are internally consistent. If, say, data on taxable income goes in one direction and taxes in another, it is the reasoned reconciliation of the - alleged or real - inconsistency that will validate the comprehensive data set. So I am a great believer in broad, real-time observatories where not only the data capture but also the data reconciliation is automated, sometimes by means of a simple comparative statics analysis, in other cases maybe through quite elaborate artificial intelligence.&lt;/p&gt;
















&lt;figure  id=&#34;figure-join-our-open-collaboration-green-deal-data-observatory-team-as-a-data-curatorauthorscurator-developerauthorsdeveloper-or-business-developerauthorsteam-or-share-your-data-in-our-public-repository-green-deal-data-observatory-on-zenodohttpszenodoorgcommunitiesgreendeal_observatory&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;Join our open collaboration Green Deal Data Observatory team as a [data curator](/authors/curator), [developer](/authors/developer) or [business developer](/authors/team), or share your data in our public repository [Green Deal Data Observatory on Zenodo](https://zenodo.org/communities/greendeal_observatory/).&#34; srcset=&#34;
               /media/img/observatory_screenshots/greendeal_and_zenodo_huddcd7485e56cb33c97d3e664ae383275_281994_debfc54dcf2193c7c800dab0f36de429.webp 400w,
               /media/img/observatory_screenshots/greendeal_and_zenodo_huddcd7485e56cb33c97d3e664ae383275_281994_3b536090581f2795373e801d65371e20.webp 760w,
               /media/img/observatory_screenshots/greendeal_and_zenodo_huddcd7485e56cb33c97d3e664ae383275_281994_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/observatory_screenshots/greendeal_and_zenodo_huddcd7485e56cb33c97d3e664ae383275_281994_debfc54dcf2193c7c800dab0f36de429.webp&#34;
               width=&#34;760&#34;
               height=&#34;507&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption data-pre=&#34;Figure&amp;nbsp;&#34; data-post=&#34;:&amp;nbsp;&#34; class=&#34;numbered&#34;&gt;
      Join our open collaboration Green Deal Data Observatory team as a &lt;a href=&#34;https://greendeal.dataobservatory.eu/authors/curator&#34;&gt;data curator&lt;/a&gt;, &lt;a href=&#34;https://greendeal.dataobservatory.eu/authors/developer&#34;&gt;developer&lt;/a&gt; or &lt;a href=&#34;https://greendeal.dataobservatory.eu/authors/team&#34;&gt;business developer&lt;/a&gt;, or share your data in our public repository &lt;a href=&#34;https://zenodo.org/communities/greendeal_observatory/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Green Deal Data Observatory on Zenodo&lt;/a&gt;.
    &lt;/figcaption&gt;&lt;/figure&gt;
&lt;h2 id=&#34;join-us&#34;&gt;Join us&lt;/h2&gt;
&lt;p&gt;&lt;em&gt;Join our open collaboration Green Deal Data Observatory team as a &lt;a href=&#34;https://greendeal.dataobservatory.eu/authors/curator&#34;&gt;data curator&lt;/a&gt;, &lt;a href=&#34;https://greendeal.dataobservatory.eu/authors/developer&#34;&gt;developer&lt;/a&gt; or &lt;a href=&#34;https://greendeal.dataobservatory.eu/authors/team&#34;&gt;business developer&lt;/a&gt;. More interested in antitrust, innovation policy or economic impact analysis? Try our &lt;a href=&#34;https://economy.dataobservatory.eu/#contributors&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Economy Data Observatory&lt;/a&gt; team! Or does your interest lie more in data governance, trustworthy AI and other digital market problems? Check out our &lt;a href=&#34;https://music.dataobservatory.eu/#contributors&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Digital Music Observatory&lt;/a&gt; team!&lt;/em&gt;&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Developing an Open API is the Right Direction</title>
      <link>https://greendeal.dataobservatory.eu/post/2021-06-08-developer-botond-vitos/</link>
      <pubDate>Mon, 07 Jun 2021 20:00:00 +0000</pubDate>
      <guid>https://greendeal.dataobservatory.eu/post/2021-06-08-developer-botond-vitos/</guid>
      <description>&lt;p&gt;&lt;em&gt;Botond Vitos, PhD is responsible for maintaining our &lt;a href=&#34;https://greendeal.dataobservatory.eu/data/api/&#34;&gt;API&lt;/a&gt;. He first started collaborating with our &lt;a href=&#34;https://music.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Digital Music Observatory&lt;/a&gt; and its trustworthy AI project.&lt;/em&gt;&lt;/p&gt;
&lt;h2 id=&#34;as-data-engineer-what-type-of-data-do-you-usually-use-in-your-projects&#34;&gt;As a data engineer, what type of data do you usually use in your projects?&lt;/h2&gt;
&lt;p&gt;Coming from a cultural studies background, my main research interest has been grassroots music scenes and festival cultures, which I hope to extend to my current projects as a data engineer and data scientist. The scope of my prior research was mainly qualitative and focused on the inside views and stories of scene participants and stakeholders, which was invaluable for understanding specialized stylistic vocabularies. At the same time, I was interested in the “bigger picture,” which can be approximated through algorithmic approaches and data analysis. Combining both interests, I shifted towards data science and engineering.&lt;/p&gt;
















&lt;figure  id=&#34;figure-see-our-trustworthy-ai-driven-music-export-case-study-for-slovakiahttpsmusicdataobservatoryeupublicationlisten_local_2020&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;See our trustworthy AI-driven music export case study for [Slovakia](https://music.dataobservatory.eu/publication/listen_local_2020/)&#34; srcset=&#34;
               /media/img/streaming/listen_local_SK_EN_hue3bbdd36723034473d5308625670dcc8_550932_8e1b9f713792380fd59264a40e5b9362.webp 400w,
               /media/img/streaming/listen_local_SK_EN_hue3bbdd36723034473d5308625670dcc8_550932_990e882f700e82da59356785ef840ceb.webp 760w,
               /media/img/streaming/listen_local_SK_EN_hue3bbdd36723034473d5308625670dcc8_550932_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/streaming/listen_local_SK_EN_hue3bbdd36723034473d5308625670dcc8_550932_8e1b9f713792380fd59264a40e5b9362.webp&#34;
               width=&#34;760&#34;
               height=&#34;507&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption data-pre=&#34;Figure&amp;nbsp;&#34; data-post=&#34;:&amp;nbsp;&#34; class=&#34;numbered&#34;&gt;
      See our trustworthy AI-driven music export case study for &lt;a href=&#34;https://music.dataobservatory.eu/publication/listen_local_2020/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Slovakia&lt;/a&gt;
    &lt;/figcaption&gt;&lt;/figure&gt;
&lt;p&gt;I was recently involved with the development of a classification algorithm that detected stylistic directions within the music genres of electronic dance music labels found on Bandcamp. The &lt;a href=&#34;https://medium.com/data-lyrics/how-to-speak-about-music-in-the-digital-age-from-taxonomies-to-folksonomies-ac2d25ed29f7&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Bandcamp Librarian&lt;/a&gt; project makes use of the genre taxonomy offered by the industry website Beatport, which is a very top-down approach on electronic dance music genres, often resisted by the artists themselves (many of the more niche subgenres don’t even appear on the Beatport site). Accordingly, the project defined genre clusters within each Bandcamp label, which show up as combinations of Beatport subgenres. Also, it indicated some of the folksonomies (bottom-up stylistic definitions and tags) propagated by the musicians themselves.&lt;/p&gt;
















&lt;figure  id=&#34;figure-screenshot-of-the-first-verison-of-the-demo-app&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;Screenshot of the first version of the demo app.&#34; srcset=&#34;
               /media/img/streaming/listen_local_app_1_hu098db0e3c2b2943b540798ab81deb1b0_117013_98cf3836f56fdd9aae930cde9bb5a3e5.webp 400w,
               /media/img/streaming/listen_local_app_1_hu098db0e3c2b2943b540798ab81deb1b0_117013_50e29da19d86792d96fd18dc07a23aa1.webp 760w,
               /media/img/streaming/listen_local_app_1_hu098db0e3c2b2943b540798ab81deb1b0_117013_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/streaming/listen_local_app_1_hu098db0e3c2b2943b540798ab81deb1b0_117013_98cf3836f56fdd9aae930cde9bb5a3e5.webp&#34;
               width=&#34;760&#34;
               height=&#34;309&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption data-pre=&#34;Figure&amp;nbsp;&#34; data-post=&#34;:&amp;nbsp;&#34; class=&#34;numbered&#34;&gt;
      Screenshot of the first version of the demo app.
    &lt;/figcaption&gt;&lt;/figure&gt;
&lt;p&gt;In addition, working with Reprex, I became involved in the development of the Listen Local initiative. The system aims to protect the rights of small local artists by offering recommendation algorithms that prioritize local talent and help users find it. The current playlist recommendations of streaming industry giants, such as Spotify, prioritize big labels and big names, blocking access to the output of smaller, local musicians. Naturally, I looked at this project as a possible continuation of my previous work, and we are currently &lt;a href=&#34;https://bvitos.medium.com/bandcamp-librarian-part-ii-57adc160d13f&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;extending the scope of the Bandcamp Librarian&lt;/a&gt; to fit this initiative.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;In an ideal data world, what would be the ultimate dataset or datasets that you would like to see in the Digital Music Observatory?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;As my answer to the previous question suggests, my main concern is the development of a trustworthy AI framework. Acknowledging the national and cultural diversity of the European Union, it is essential to enable access to data that takes into account such diversities and the priorities of smaller stakeholders as well. This type of data needs to be comprehensive and well-maintained, and I believe that with curators&amp;rsquo; priorities and the development of an easily accessible, open API, we are moving in the right direction.&lt;/p&gt;
















&lt;figure  id=&#34;figure-our-apihttpsapigreendealdataobservatoryeu-contains-rich-processing-and-descriptive-metadata-besides-our-high-quality-indicators&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;Our [API](https://api.greendeal.dataobservatory.eu/) contains rich processing and descriptive metadata besides our high-quality indicators.&#34; srcset=&#34;
               /media/img/observatory_screenshots/GDO_API_metadata_table_hu31b494a33d5ae09272643545372dbd1d_100491_225afcd2a785db051b89c7c36fdc28b9.webp 400w,
               /media/img/observatory_screenshots/GDO_API_metadata_table_hu31b494a33d5ae09272643545372dbd1d_100491_5807feecbd17bee02fd8c68fad87b1d7.webp 760w,
               /media/img/observatory_screenshots/GDO_API_metadata_table_hu31b494a33d5ae09272643545372dbd1d_100491_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/observatory_screenshots/GDO_API_metadata_table_hu31b494a33d5ae09272643545372dbd1d_100491_225afcd2a785db051b89c7c36fdc28b9.webp&#34;
               width=&#34;760&#34;
               height=&#34;428&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption data-pre=&#34;Figure&amp;nbsp;&#34; data-post=&#34;:&amp;nbsp;&#34; class=&#34;numbered&#34;&gt;
      Our &lt;a href=&#34;https://api.greendeal.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;API&lt;/a&gt; contains rich processing and descriptive metadata besides our high-quality indicators.
    &lt;/figcaption&gt;&lt;/figure&gt;
&lt;h2 id=&#34;read-more-on-data--lyrics&#34;&gt;Read More on Data &amp;amp; Lyrics&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;&lt;a href=&#34;https://dataandlyrics.com/post/2021-05-16-recommendation-outcomes/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Recommendation Systems: What can Go Wrong with the Algorithm?&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;join-us&#34;&gt;Join us&lt;/h2&gt;
&lt;p&gt;&lt;em&gt;Join our open collaboration Green Deal Data Observatory team as a &lt;a href=&#34;https://greendeal.dataobservatory.eu/authors/curator&#34;&gt;data curator&lt;/a&gt;, &lt;a href=&#34;https://greendeal.dataobservatory.eu/authors/developer&#34;&gt;developer&lt;/a&gt; or &lt;a href=&#34;https://greendeal.dataobservatory.eu/authors/team&#34;&gt;business developer&lt;/a&gt;. More interested in antitrust, innovation policy or economic impact analysis? Try our &lt;a href=&#34;https://economy.dataobservatory.eu/#contributors&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Economy Data Observatory&lt;/a&gt; team! Or does your interest lie more in data governance, trustworthy AI and other digital market problems? Check out our &lt;a href=&#34;https://music.dataobservatory.eu/#contributors&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Digital Music Observatory&lt;/a&gt; team!&lt;/em&gt;&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>We Need More Reliable Datasets on the Urban Heat Resilience and Disaster Risk Reduction</title>
      <link>https://greendeal.dataobservatory.eu/post/2021-06-07-introducing-suzan-sidal/</link>
      <pubDate>Mon, 07 Jun 2021 20:00:00 +0000</pubDate>
      <guid>https://greendeal.dataobservatory.eu/post/2021-06-07-introducing-suzan-sidal/</guid>
      <description>&lt;p&gt;&lt;em&gt;Suzan Sidal is working in the service design team on validating user needs and building a sustainable business model for our observatory.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;As a consultant, what type of data do you usually use in your work at ECORYS?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;We work with a great variety of data &amp;ndash; both from qualitative and quantitative sources &amp;ndash; that we retrieve from publicly available sources or get through our clients. Since we are a public policy consultancy, most of the datasets are related to government reports, policies, statistics or surveys that we analyse and assess within a specific timeframe. Often we gather open data that is non-textual or numeric, such as maps and satellite images; so-called &amp;ldquo;raw data,&amp;rdquo; such as weather, geospatial and environmental data; or research-generated data such as genomes, medical data, and mathematical and scientific formulas.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;If you were able to pick, what would be the ultimate dataset, or datasets that you would like to see in the Green Deal Data Observatory?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;I would like to see more data on the consequences and impact of increasing drought and urban heat in our cities in the Green Deal Data Observatory. Because of the complexity of rapidly developing metropolitan regions and the uncertainty associated with climate change, we need to explore more climate change adaptation and mitigation activities, or disaster risk reduction, not only climate change itself.&lt;/p&gt;
















&lt;figure  id=&#34;figure-see-our-drought-case-studyhttpsgreendealdataobservatoryeupost2021-04-23-belgium-flood-insurance-on-how-we-combine-very-different-data-in-our-observatory&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;See our [drought case study](https://greendeal.dataobservatory.eu/post/2021-04-23-belgium-flood-insurance/) on how we combine very different data in our observatory&#34; srcset=&#34;
               /media/img/blogposts_2021/belgium_spei_2018_hu053711948486f3d03232ef0d63e51704_295716_85cb3a3e9d67ae93c4b48d13c76f103f.webp 400w,
               /media/img/blogposts_2021/belgium_spei_2018_hu053711948486f3d03232ef0d63e51704_295716_732c5a4fed2e5086cd4649603e01bc64.webp 760w,
               /media/img/blogposts_2021/belgium_spei_2018_hu053711948486f3d03232ef0d63e51704_295716_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/blogposts_2021/belgium_spei_2018_hu053711948486f3d03232ef0d63e51704_295716_85cb3a3e9d67ae93c4b48d13c76f103f.webp&#34;
               width=&#34;760&#34;
               height=&#34;760&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption data-pre=&#34;Figure&amp;nbsp;&#34; data-post=&#34;:&amp;nbsp;&#34; class=&#34;numbered&#34;&gt;
      See our &lt;a href=&#34;https://greendeal.dataobservatory.eu/post/2021-04-23-belgium-flood-insurance/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;drought case study&lt;/a&gt; on how we combine very different data in our observatory
    &lt;/figcaption&gt;&lt;/figure&gt;
&lt;p&gt;We need more reliable datasets on the effect of global warming on urban resilience and more indicators to inform stakeholders on disaster risk reduction. The Green Deal Data Observatory could build indexes for public and private entities once we have all the relevant data at hand. With this project, we could explore many possibilities to actually utilise open data for the common good, working towards a great social cause.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Why did you decide to join the challenge and why do you think that this would be a game changer for policymakers and for business?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;As a consultant for many socially relevant projects, every day I see the importance of high-quality and diverse datasets. I joined the challenge to contribute to significant causes enabled through the &lt;a href=&#34;https://greendeal.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Green Deal Data Observatory&lt;/a&gt; and &lt;a href=&#34;https://economy.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Economy Data Observatory&lt;/a&gt;. We can all benefit from the use of open data, which is, in my opinion, a prerequisite for open government partnerships.&lt;/p&gt;
&lt;p&gt;I believe that through our work and through open data collaborations, we set a good example of a cultural change in the relationship between citizens and the state, which can contribute to more transparency, more participation and more intensive cooperation.&lt;/p&gt;
&lt;p&gt;Access to and analysis of open data by the general public would make political action more transparent and more comprehensible. This can lead to greater accountability and a sense of duty on the part of public officials to the general public, which in turn can lead to greater acceptance of government action and strengthen the public&amp;rsquo;s trust in their government and administration.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Is there a number that recently surprised you? What was it?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Climate change is increasing people&amp;rsquo;s exposure to heat. Extreme temperature events have been documented to be rising in frequency, duration, and magnitude around the world. The number of people exposed to heatwaves grew by roughly 125 million between 2000 and 2016.&lt;/p&gt;
















&lt;figure  id=&#34;figure-sydney-by-marek-piwnicki-unplash-licensehttpsunsplashcomlicense&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;Sydney by Marek Piwnicki [Unsplash License](https://unsplash.com/license)&#34; srcset=&#34;
               /media/img/blogposts_2021/photo-1618677064524-58aa3077d724_hu3d03a01dcc18bc5be0e67db3d8d209a6_67858_7810271986d56226671366766d741afa.webp 400w,
               /media/img/blogposts_2021/photo-1618677064524-58aa3077d724_hu3d03a01dcc18bc5be0e67db3d8d209a6_67858_c974837c7f886b97f02b8e31e1adfc2d.webp 760w,
               /media/img/blogposts_2021/photo-1618677064524-58aa3077d724_hu3d03a01dcc18bc5be0e67db3d8d209a6_67858_1200x1200_fit_q75_h2_lanczos.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/blogposts_2021/photo-1618677064524-58aa3077d724_hu3d03a01dcc18bc5be0e67db3d8d209a6_67858_7810271986d56226671366766d741afa.webp&#34;
               width=&#34;760&#34;
               height=&#34;475&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption data-pre=&#34;Figure&amp;nbsp;&#34; data-post=&#34;:&amp;nbsp;&#34; class=&#34;numbered&#34;&gt;
      Sydney by Marek Piwnicki &lt;a href=&#34;https://unsplash.com/license&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Unsplash License&lt;/a&gt;
    &lt;/figcaption&gt;&lt;/figure&gt;
&lt;p&gt;&lt;strong&gt;From your experience, what do you think the greatest problem with open data in 2021 will be?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;I see two great problems with the use of open data. The first one is the low level of exploitation.  The other is the lack of transparency in data processing.&lt;/p&gt;
&lt;p&gt;The use of open data should be transparent and meet high quality standards. If we want to enable communities to use it for solving local problems, we must do two things. First, data must be made easy to use (or actionable), and second, we have to increase public awareness and offer training for use. Furthermore, governments should release data in usable formats that follow open data guidelines. Currently, there is very little effort made at the community level to encourage the reuse of public data for the public good.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;What can our automated data observatories do to make open data more credible in the European economic policy community and be more accepted as verified information?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Almost nothing is being done to help communities build the capability to analyze and implement open data without relying on technology.&lt;/p&gt;
















&lt;figure  id=&#34;figure-our-api-contains-rich-processing-and-descriptive-metadata-besides-our-high-quality-indicators&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;Our API contains rich processing and descriptive metadata besides our high-quality indicators.&#34; srcset=&#34;
               /media/img/observatory_screenshots/GDO_API_metadata_table_hu31b494a33d5ae09272643545372dbd1d_100491_225afcd2a785db051b89c7c36fdc28b9.webp 400w,
               /media/img/observatory_screenshots/GDO_API_metadata_table_hu31b494a33d5ae09272643545372dbd1d_100491_5807feecbd17bee02fd8c68fad87b1d7.webp 760w,
               /media/img/observatory_screenshots/GDO_API_metadata_table_hu31b494a33d5ae09272643545372dbd1d_100491_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/observatory_screenshots/GDO_API_metadata_table_hu31b494a33d5ae09272643545372dbd1d_100491_225afcd2a785db051b89c7c36fdc28b9.webp&#34;
               width=&#34;760&#34;
               height=&#34;428&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption data-pre=&#34;Figure&amp;nbsp;&#34; data-post=&#34;:&amp;nbsp;&#34; class=&#34;numbered&#34;&gt;
      Our API contains rich processing and descriptive metadata besides our high-quality indicators.
    &lt;/figcaption&gt;&lt;/figure&gt;
&lt;p&gt;This is a critical task that our fledgling data observatories, the &lt;a href=&#34;https://music.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Digital Music Observatory&lt;/a&gt;, &lt;a href=&#34;https://greendeal.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Green Deal Data Observatory&lt;/a&gt; and &lt;a href=&#34;https://economy.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Economy Data Observatory&lt;/a&gt;, may be able to help with. Facilitating private-public partnerships is one step to encourage the data community to work with valuable open data. However, transparency and a high level of quality assurance must be guaranteed. In a joint collaboration with data curators, developers, technical specialists and academics, the datasets should be retrieved, cleaned and assessed in order to deliver efficient, relevant and credible information. Constant monitoring and regulation, as well as compliance with data security guidelines, are indispensable.&lt;/p&gt;
















&lt;figure  id=&#34;figure-join-our-open-collaboration-green-deal-data-observatory-team-as-a-data-curatorauthorscurator-developerauthorsdeveloper-or-business-developerauthorsteam-or-share-your-data-in-our-public-repositorygreen-deal-data-observatory-on-zenodohttpszenodoorgcommunitiesgreendeal_observatory&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;Join our open collaboration Green Deal Data Observatory team as a [data curator](/authors/curator), [developer](/authors/developer) or [business developer](/authors/team), or share your data in our public repository[Green Deal Data Observatory on Zenodo](https://zenodo.org/communities/greendeal_observatory/)&#34; srcset=&#34;
               /media/img/observatory_screenshots/greendeal_and_zenodo_huddcd7485e56cb33c97d3e664ae383275_281994_debfc54dcf2193c7c800dab0f36de429.webp 400w,
               /media/img/observatory_screenshots/greendeal_and_zenodo_huddcd7485e56cb33c97d3e664ae383275_281994_3b536090581f2795373e801d65371e20.webp 760w,
               /media/img/observatory_screenshots/greendeal_and_zenodo_huddcd7485e56cb33c97d3e664ae383275_281994_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/observatory_screenshots/greendeal_and_zenodo_huddcd7485e56cb33c97d3e664ae383275_281994_debfc54dcf2193c7c800dab0f36de429.webp&#34;
               width=&#34;760&#34;
               height=&#34;507&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption data-pre=&#34;Figure&amp;nbsp;&#34; data-post=&#34;:&amp;nbsp;&#34; class=&#34;numbered&#34;&gt;
      Join our open collaboration Green Deal Data Observatory team as a &lt;a href=&#34;https://greendeal.dataobservatory.eu/authors/curator&#34;&gt;data curator&lt;/a&gt;, &lt;a href=&#34;https://greendeal.dataobservatory.eu/authors/developer&#34;&gt;developer&lt;/a&gt; or &lt;a href=&#34;https://greendeal.dataobservatory.eu/authors/team&#34;&gt;business developer&lt;/a&gt;, or share your data in our public repository, &lt;a href=&#34;https://zenodo.org/communities/greendeal_observatory/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Green Deal Data Observatory on Zenodo&lt;/a&gt;
    &lt;/figcaption&gt;&lt;/figure&gt;
&lt;h2 id=&#34;join-us&#34;&gt;Join us&lt;/h2&gt;
&lt;p&gt;&lt;em&gt;Join our open collaboration Green Deal Data Observatory team as a &lt;a href=&#34;https://greendeal.dataobservatory.eu/authors/curator&#34;&gt;data curator&lt;/a&gt;, &lt;a href=&#34;https://greendeal.dataobservatory.eu/authors/developer&#34;&gt;developer&lt;/a&gt; or &lt;a href=&#34;https://greendeal.dataobservatory.eu/authors/team&#34;&gt;business developer&lt;/a&gt;. More interested in antitrust, innovation policy or economic impact analysis? Try our &lt;a href=&#34;https://economy.dataobservatory.eu/#contributors&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Economy Data Observatory&lt;/a&gt; team! Or does your interest lie more in data governance, trustworthy AI and other digital market problems? Check out our &lt;a href=&#34;https://music.dataobservatory.eu/#contributors&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Digital Music Observatory&lt;/a&gt; team!&lt;/em&gt;&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Comparing Data to Oil is a Cliché: Crude Oil Has to Go Through a Number of Steps and Pipes Before it Becomes Useful</title>
      <link>https://greendeal.dataobservatory.eu/post/2021-06-07-data-curator-pyry-kantanen/</link>
      <pubDate>Mon, 07 Jun 2021 10:00:00 +0000</pubDate>
      <guid>https://greendeal.dataobservatory.eu/post/2021-06-07-data-curator-pyry-kantanen/</guid>
      <description>&lt;p&gt;&lt;strong&gt;As a developer at rOpenGov, and as an economic sociologist, what type of data do you usually use in your work?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Generally speaking, people&amp;rsquo;s access to (or inequalities in accessing) different types of resources and their ability to transform these resources into other types of resources is what interests me. The data I usually work with is the kind of data that is actually nicely covered by existing &lt;a href=&#34;http://ropengov.org/projects/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;rOpenGov tools&lt;/a&gt;: data about population demographics and administrative units from Statistics Finland, statistical information on welfare and health from Sotkanet and also data from Eurostat. Aside from these, a lot of information of course comes from surveys and texts scraped from the internet.&lt;/p&gt;
















&lt;figure  id=&#34;figure-we-are-placing-the-growing-number-of-ropengov-toolshttpropengovorgprojects-in-a-modern-application-with-a-user-friendly-service-and-a-modern-data-api&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;We are placing the growing number of [rOpenGov tools](http://ropengov.org/projects/) in a modern application with a user-friendly service and a modern data API.&#34; srcset=&#34;
               /media/img/partners/rOpenGov-intro_hubd4fef93bdda18dae35145b86090eaef_399543_15755b0682ab231bcd4f2ccab28e7c33.webp 400w,
               /media/img/partners/rOpenGov-intro_hubd4fef93bdda18dae35145b86090eaef_399543_3250accecb68b0ec9716afed72d0f77e.webp 760w,
               /media/img/partners/rOpenGov-intro_hubd4fef93bdda18dae35145b86090eaef_399543_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/partners/rOpenGov-intro_hubd4fef93bdda18dae35145b86090eaef_399543_15755b0682ab231bcd4f2ccab28e7c33.webp&#34;
               width=&#34;760&#34;
               height=&#34;428&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption data-pre=&#34;Figure&amp;nbsp;&#34; data-post=&#34;:&amp;nbsp;&#34; class=&#34;numbered&#34;&gt;
      We are placing the growing number of &lt;a href=&#34;http://ropengov.org/projects/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;rOpenGov tools&lt;/a&gt; in a modern application with a user-friendly service and a modern data API.
    &lt;/figcaption&gt;&lt;/figure&gt;
&lt;p&gt;&lt;em&gt;In your ideal data world, what would be the ultimate dataset, or datasets that you would like to see in the Music Data Observatory?&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;Late spring and early summer time is, at least for me, defined by the Eurovision Song Contest. Every year watching the contest makes me ponder the state of the music industry in my home country Finland as well as in Europe. Was the song produced by homegrown talent or was it imported? Was it better received by the professional jury or the public? How well does the domestic appeal of an artist translate to the international stage? Many interesting phenomena are difficult to quantify in a meaningful way and writing a catchy song with international appeal is probably more an art than a science. Nevertheless that should not deter us from trying as music, too, is bound by certain rules and regularities that can be researched.&lt;/p&gt;
















&lt;figure  id=&#34;figure-music-too-is-bound-by-certain-rules-and-regularities-that-can-be-researched-our-digital-music-observatory-and-its-listen-localhttpslistenlocalcommunity-experimental-app-does-this-exactly-and-we-would-love-to-create-eurovision-musicology-datasets-photo-eurovision-song-contest-2021-press-photo-by-jordy-brada&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;Music, too, is bound by certain rules and regularities that can be researched. Our Digital Music Observatory and its [Listen Local](https://listenlocal.community/) experimental App does this exactly, and we would love to create Eurovision musicology datasets. Photo: Eurovision Song Contest 2021 press photo by Jordy Brada&#34; srcset=&#34;
               /media/img/developers/eurovision_2021_huf9815e7cf4b1c9b3f684b59f4bffe562_174893_128e4603e1cc31d89be889f39db80a2b.webp 400w,
               /media/img/developers/eurovision_2021_huf9815e7cf4b1c9b3f684b59f4bffe562_174893_2a432aace03316af1742caebd211be99.webp 760w,
               /media/img/developers/eurovision_2021_huf9815e7cf4b1c9b3f684b59f4bffe562_174893_1200x1200_fit_q75_h2_lanczos.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/developers/eurovision_2021_huf9815e7cf4b1c9b3f684b59f4bffe562_174893_128e4603e1cc31d89be889f39db80a2b.webp&#34;
               width=&#34;760&#34;
               height=&#34;505&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption data-pre=&#34;Figure&amp;nbsp;&#34; data-post=&#34;:&amp;nbsp;&#34; class=&#34;numbered&#34;&gt;
      Music, too, is bound by certain rules and regularities that can be researched. Our Digital Music Observatory and its &lt;a href=&#34;https://listenlocal.community/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Listen Local&lt;/a&gt; experimental App does this exactly, and we would love to create Eurovision musicology datasets. Photo: Eurovision Song Contest 2021 press photo by Jordy Brada
    &lt;/figcaption&gt;&lt;/figure&gt;
&lt;p&gt;&lt;em&gt;Why did you decide to join the EU Datathon challenge team and why do you think that this would be a game changer for researchers and policymakers?&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;The challenge has, in my opinion, great potential to lead by example when it comes to open data access and reproducible research. Comparing data to oil is a common phrase, but it is fitting in the sense that crude oil has to go through a number of steps and pipes before it becomes useful. Most users and especially policymakers appreciate the ease of use of the finished product, but the quality of the product and the process must also be guaranteed somehow. Openness and peer-review practices are the best guarantors in the field of data, just as industrial standards and regulations are in the oil industry.&lt;/p&gt;
















&lt;figure  id=&#34;figure-we-provide-many-layers-of-fully-transparent-quality-control-about-the-data-we-are-placing-in-our-data-apis-and-provide-for-our-end-users&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;We provide many layers of fully transparent quality control about the data we are placing in our data APIs and provide for our end-users.&#34; srcset=&#34;
               /media/img/observatory_screenshots/GDO_API_metadata_table_hu31b494a33d5ae09272643545372dbd1d_100491_225afcd2a785db051b89c7c36fdc28b9.webp 400w,
               /media/img/observatory_screenshots/GDO_API_metadata_table_hu31b494a33d5ae09272643545372dbd1d_100491_5807feecbd17bee02fd8c68fad87b1d7.webp 760w,
               /media/img/observatory_screenshots/GDO_API_metadata_table_hu31b494a33d5ae09272643545372dbd1d_100491_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/observatory_screenshots/GDO_API_metadata_table_hu31b494a33d5ae09272643545372dbd1d_100491_225afcd2a785db051b89c7c36fdc28b9.webp&#34;
               width=&#34;760&#34;
               height=&#34;428&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption data-pre=&#34;Figure&amp;nbsp;&#34; data-post=&#34;:&amp;nbsp;&#34; class=&#34;numbered&#34;&gt;
      We provide many layers of fully transparent quality control about the data we are placing in our data APIs and provide for our end-users.
    &lt;/figcaption&gt;&lt;/figure&gt;
&lt;h2 id=&#34;join-us&#34;&gt;Join us&lt;/h2&gt;
&lt;p&gt;&lt;em&gt;Join our open collaboration Music Data Observatory team as a &lt;a href=&#34;https://greendeal.dataobservatory.eu/authors/curator&#34;&gt;data curator&lt;/a&gt;, &lt;a href=&#34;https://greendeal.dataobservatory.eu/authors/developer&#34;&gt;developer&lt;/a&gt; or &lt;a href=&#34;https://greendeal.dataobservatory.eu/authors/team&#34;&gt;business developer&lt;/a&gt;. More interested in antitrust, innovation policy or economic impact analysis? Try our &lt;a href=&#34;https://economy.dataobservatory.eu/#contributors&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Economy Data Observatory&lt;/a&gt; team! Or does your interest lie more in climate change, mitigation or climate action? Check out our &lt;a href=&#34;https://greendeal.dataobservatory.eu/#contributors&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Green Deal Data Observatory&lt;/a&gt; team!&lt;/em&gt;&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Join Copernicus Climate Data Store Data with Socio-Economic and Opinion Poll Data</title>
      <link>https://greendeal.dataobservatory.eu/post/2021-06-06-tutorial-cds/</link>
      <pubDate>Sun, 06 Jun 2021 10:00:00 +0000</pubDate>
      <guid>https://greendeal.dataobservatory.eu/post/2021-06-06-tutorial-cds/</guid>
      <description>&lt;p&gt;In this series of blogposts we will show how to collect environmental
data from the EU’s &lt;a href=&#34;https://cds.climate.copernicus.eu/#!/home&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Copernicus Climate Data
Store&lt;/a&gt;, and bring it to a
data format that you can join with Eurostat’s socio-economic and
environmental data. We have shown in &lt;a href=&#34;https://greendeal.dataobservatory.eu/post/2021-04-23-belgium-flood-insurance/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;a previous
blogpost&lt;/a&gt;
how to connect this to survey (opinion poll) and tax data, and a real
policy problem in Belgium. We will now create follow-up tutorials to do
more!&lt;/p&gt;
&lt;p&gt;But first, why are we doing this? Since 2003, the European Union and its
member states have been releasing more and more data for open re-use every
year, yet these data are often not used in the EU’s data dissemination
projects (the observatories) or in EU-funded research. We believe that
there are &lt;a href=&#34;https://greendeal.dataobservatory.eu/project/eu-datathon_2021/#problem-statement&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;many
reasons&lt;/a&gt;
behind this. While more and more people can conduct business,
scientific or policy analysis programmatically or with statistical
software, knowing how to systematically collect data from the
exponentially growing supply is not everybody’s specialty. The
lack of documentation and the heavy re-processing and validation that
open data requires are further drawbacks.&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;http://ropengov.org/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;rOpenGov&lt;/a&gt; has long been producing high-quality,
peer-reviewed R packages to work with open data, but their use is not
for all. In an open collaboration, where you can join, too, rOpenGov
&lt;a href=&#34;https://greendeal.dataobservatory.eu/#contributors&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;teamed up&lt;/a&gt; with
open source developers, knowledgeable data curators, and a service
developer team led by the Dutch reproducible research start-up
&lt;a href=&#34;https://reprex.nl/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Reprex&lt;/a&gt; to create a sustainable infrastructure that
is permanently collecting, processing, documenting and visualizing open
data. What we do is that we access open data (that is not always
available for direct download) and re-process it to usable data that is
&lt;a href=&#34;https://cran.r-project.org/web/packages/tidyr/vignettes/tidy-data.html&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;tidy&lt;/a&gt;
to be integrated with your existing data or databases. We are competing
for the &lt;a href=&#34;https://greendeal.dataobservatory.eu/project/eu-datathon_2021/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;EU
Datathon&lt;/a&gt;
Challenge 1: supporting a European Green Deal agenda with open data as a
service, and research as a service, and you are more than welcome to
join our effort as a developer, a data curator, or as an occasional
contributor to open government packages.&lt;/p&gt;
















&lt;figure  &gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;&#34; srcset=&#34;
               /media/img/partners/rOpenGov-intro_hubd4fef93bdda18dae35145b86090eaef_399543_15755b0682ab231bcd4f2ccab28e7c33.webp 400w,
               /media/img/partners/rOpenGov-intro_hubd4fef93bdda18dae35145b86090eaef_399543_3250accecb68b0ec9716afed72d0f77e.webp 760w,
               /media/img/partners/rOpenGov-intro_hubd4fef93bdda18dae35145b86090eaef_399543_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/partners/rOpenGov-intro_hubd4fef93bdda18dae35145b86090eaef_399543_15755b0682ab231bcd4f2ccab28e7c33.webp&#34;
               width=&#34;760&#34;
               height=&#34;428&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;/figure&gt;
&lt;h2 id=&#34;register-to-the-copernicus-climate-data-store&#34;&gt;Register to the Copernicus Climate Data Store&lt;/h2&gt;
&lt;p&gt;Koen Hufkens, Reto Stauffer and Elio Campitelli created the
&lt;a href=&#34;https://bluegreen-labs.github.io/ecmwfr/index.html&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;ecmwfr&lt;/a&gt; R package
for programmatically accessing the Copernicus Data Store service. Follow
the &lt;a href=&#34;https://bluegreen-labs.github.io/ecmwfr/articles/cds_vignette.html&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;CDS Functionality
vignette&lt;/a&gt;
to get started.&lt;/p&gt;
&lt;p&gt;You will need to &lt;a href=&#34;https://cds.climate.copernicus.eu/user/91923/edit&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;register for CDS
services&lt;/a&gt; after
accepting the &lt;a href=&#34;https://cds.climate.copernicus.eu/disclaimer-privacy&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Terms and
conditions&lt;/a&gt;.&lt;/p&gt;
















&lt;figure  &gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;&#34; srcset=&#34;
               /media/img/tutorials/register_to_cds_hub0b07c0de85c1c6f552b5959e300cde5_61323_bf70ade001619e999a885daf0f712a00.webp 400w,
               /media/img/tutorials/register_to_cds_hub0b07c0de85c1c6f552b5959e300cde5_61323_92f833ed7a49aa44d59ff98c399f97dd.webp 760w,
               /media/img/tutorials/register_to_cds_hub0b07c0de85c1c6f552b5959e300cde5_61323_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/tutorials/register_to_cds_hub0b07c0de85c1c6f552b5959e300cde5_61323_bf70ade001619e999a885daf0f712a00.webp&#34;
               width=&#34;760&#34;
               height=&#34;427&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;/figure&gt;
&lt;pre&gt;&lt;code&gt;library(ecmwfr)
# Store your CDS user ID and API key (the values below are placeholders)
wf_set_key(user = &amp;quot;12345&amp;quot;,
           key = &amp;quot;00000000-aaaa-b1b1-0000-a1a1a1a1a1a1&amp;quot;,
           service = &amp;quot;cds&amp;quot;)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;You can check if you were successful with:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;ecmwfr::wf_get_key(user = &amp;quot;12345&amp;quot;, service = &amp;quot;cds&amp;quot;)
&lt;/code&gt;&lt;/pre&gt;
&lt;h2 id=&#34;get-the-data&#34;&gt;Get the Data&lt;/h2&gt;
&lt;p&gt;Let us formulate our first request:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;request_lai_hv_2019_06 &amp;lt;- list(
  &amp;quot;dataset_short_name&amp;quot; = &amp;quot;reanalysis-era5-land-monthly-means&amp;quot;,
  &amp;quot;product_type&amp;quot;       = &amp;quot;monthly_averaged_reanalysis&amp;quot;,
  &amp;quot;variable&amp;quot;           = &amp;quot;leaf_area_index_high_vegetation&amp;quot;,
  &amp;quot;year&amp;quot;               = &amp;quot;2019&amp;quot;,
  &amp;quot;month&amp;quot;              = &amp;quot;06&amp;quot;,
  &amp;quot;time&amp;quot;               = &amp;quot;00:00&amp;quot;,
  &amp;quot;area&amp;quot;               = &amp;quot;70/-20/30/60&amp;quot;,  # bounding box: North/West/South/East
  &amp;quot;format&amp;quot;             = &amp;quot;netcdf&amp;quot;,
  &amp;quot;target&amp;quot;             = &amp;quot;demo_file.nc&amp;quot;)

# Replace &amp;lt;your_ID&amp;gt; with the user ID you stored with wf_set_key()
lai_hv_2019_06.nc &amp;lt;- wf_request(user = &amp;quot;&amp;lt;your_ID&amp;gt;&amp;quot;,
                     request = request_lai_hv_2019_06,
                     transfer = TRUE,
                     path = &amp;quot;data-raw&amp;quot;,
                     verbose = FALSE)
&lt;/code&gt;&lt;/pre&gt;
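&lt;p&gt;If you later need more than one month, one possible approach is to derive further requests from the first one. The sketch below is only an illustration in base R: &lt;code&gt;request_for_month()&lt;/code&gt; and the &lt;code&gt;lai_hv_&lt;/code&gt; file naming are hypothetical and not part of ecmwfr, and nothing is downloaded until one of the resulting lists is passed to &lt;code&gt;wf_request()&lt;/code&gt;.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Base request, copied from the tutorial above.
base_request &amp;lt;- list(
  &amp;quot;dataset_short_name&amp;quot; = &amp;quot;reanalysis-era5-land-monthly-means&amp;quot;,
  &amp;quot;product_type&amp;quot;       = &amp;quot;monthly_averaged_reanalysis&amp;quot;,
  &amp;quot;variable&amp;quot;           = &amp;quot;leaf_area_index_high_vegetation&amp;quot;,
  &amp;quot;year&amp;quot;               = &amp;quot;2019&amp;quot;,
  &amp;quot;month&amp;quot;              = &amp;quot;06&amp;quot;,
  &amp;quot;time&amp;quot;               = &amp;quot;00:00&amp;quot;,
  &amp;quot;area&amp;quot;               = &amp;quot;70/-20/30/60&amp;quot;,
  &amp;quot;format&amp;quot;             = &amp;quot;netcdf&amp;quot;,
  &amp;quot;target&amp;quot;             = &amp;quot;demo_file.nc&amp;quot;)

# Hypothetical helper: swap in another month and a matching file name.
request_for_month &amp;lt;- function(request, month) {
  month &amp;lt;- sprintf(&amp;quot;%02d&amp;quot;, as.integer(month))
  modifyList(request, list(
    month  = month,
    target = paste0(&amp;quot;lai_hv_&amp;quot;, request$year, &amp;quot;_&amp;quot;, month, &amp;quot;.nc&amp;quot;)))
}

# One request per summer month; each element is an ordinary request list.
summer_requests &amp;lt;- lapply(6:8, request_for_month, request = base_request)
vapply(summer_requests, function(x) x$target, character(1))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Keeping one base request and changing only the fields that differ makes it harder to mix up the area or product type between months.&lt;/p&gt;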
&lt;h2 id=&#34;effective-leaf-area-index&#34;&gt;Effective Leaf Area Index&lt;/h2&gt;
&lt;p&gt;You can find this data either as global raster images or as
re-processed monthly averages. Working with the raw data is not very
practical: in cloudy weather you have missing data, and the
files are very large for a personal computer. For the purposes of
our &lt;a href=&#34;https://greendeal.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Green Deal Data Observatory&lt;/a&gt;
the monthly averages, offered as the
&lt;code&gt;monthly_averaged_reanalysis&lt;/code&gt; product type, are far more practical.&lt;/p&gt;
&lt;p&gt;For compatibility with other R packages, convert the data with the
&lt;a href=&#34;https://rspatial.org/raster/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;raster&lt;/a&gt; package from
&lt;a href=&#34;https://rspatial.org&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;rSpatial.org&lt;/a&gt;.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;lai_file &amp;lt;- here::here( &amp;quot;data-raw&amp;quot;, &amp;quot;demo_file.nc&amp;quot;)
lai_raster &amp;lt;- raster::raster(lai_file)

## Loading required namespace: ncdf4
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Let us convert this to a &lt;code&gt;SpatialPointsDataFrame&lt;/code&gt; class, which is an
augmented data frame class with coordinates.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;LAI_df &amp;lt;- raster::rasterToPoints(lai_raster, fun=NULL, spatial=TRUE)
&lt;/code&gt;&lt;/pre&gt;
&lt;h2 id=&#34;get-the-map&#34;&gt;Get The Map&lt;/h2&gt;
&lt;p&gt;With the help of &lt;a href=&#34;http://ropengov.org/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;rOpenGov&lt;/a&gt;, we are creating
various R packages to programmatically access open data and put them
into the right format. The popular
&lt;a href=&#34;http://ropengov.github.io/eurostat/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;eurostat&lt;/a&gt; package is not only
useful to download data from Eurostat, but also to map it.&lt;/p&gt;
&lt;p&gt;In this case, we want to create regional maps. Europe has five levels of
geographical regions: &lt;code&gt;NUTS0&lt;/code&gt; for countries, &lt;code&gt;NUTS1&lt;/code&gt; for larger areas
like states or provinces, &lt;code&gt;NUTS2&lt;/code&gt; for smaller areas like counties, and
&lt;code&gt;NUTS3&lt;/code&gt; for even smaller areas. The &lt;code&gt;LAU&lt;/code&gt; level contains settlements and
their surrounding areas.&lt;/p&gt;
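&lt;p&gt;NUTS codes are hierarchical: the first two characters identify the country, and each further character adds one level. As a small base R sketch, using three NUTS2 codes that appear in the sample output of this tutorial, the parent regions can be read off a code by truncating it. The object and column names are our own choice for illustration.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;nuts2_codes &amp;lt;- c(&amp;quot;DE73&amp;quot;, &amp;quot;NL12&amp;quot;, &amp;quot;RO42&amp;quot;)

nuts_hierarchy &amp;lt;- data.frame(
  nuts0 = substr(nuts2_codes, 1, 2),   # country
  nuts1 = substr(nuts2_codes, 1, 3),   # larger region that contains the NUTS2 region
  nuts2 = nuts2_codes,
  level = nchar(nuts2_codes) - 2,      # characters after the country code
  stringsAsFactors = FALSE
)
nuts_hierarchy
&lt;/code&gt;&lt;/pre&gt;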
&lt;p&gt;Country borders change sometimes (think about the unification of
Germany, or the breakup of Czechoslovakia and Yugoslavia), but they are
relatively stable entities. Sub-national regional borders change
very frequently: since 2000 there have been many thousands of changes in
Europe. This means that you must choose one regional boundary
definition. The latest edition is &lt;code&gt;NUTS2021&lt;/code&gt; but most of the data
available is still in the &lt;code&gt;NUTS2016&lt;/code&gt; format, and often you will find
&lt;code&gt;NUTS2013&lt;/code&gt; or even &lt;code&gt;NUTS2010&lt;/code&gt; data around. Our &lt;a href=&#34;https://greendeal.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Green Deal Data
Observatory&lt;/a&gt; uses the &lt;code&gt;NUTS2016&lt;/code&gt;
definition, because it is by far the most used in 2021. An offspring of the
&lt;a href=&#34;http://ropengov.github.io/eurostat/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;eurostat&lt;/a&gt; package,
&lt;a href=&#34;https://regions.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;regions&lt;/a&gt; helps you take care of
NUTS changes when you work, and can convert your data to &lt;code&gt;NUTS2021&lt;/code&gt; if
you later need it.&lt;/p&gt;
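&lt;pre&gt;&lt;code&gt;# Reconstructed call (the original chunk was not shown); year = &amp;quot;2016&amp;quot; is an
# assumption that follows the NUTS2016 choice explained above.
map_nuts_2 &amp;lt;- eurostat::get_eurostat_geospatial(
  resolution = &amp;quot;60&amp;quot;, nuts_level = &amp;quot;2&amp;quot;, year = &amp;quot;2016&amp;quot;)

## sf at resolution 1:60 read from local file

## Warning in eurostat::get_eurostat_geospatial(resolution = &amp;quot;60&amp;quot;, nuts_level =
## &amp;quot;2&amp;quot;, : Default of &#39;make_valid&#39; for &#39;output_class=&amp;quot;sf&amp;quot;&#39; will be changed in the
## future (see function details).

plot(map_nuts_2)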
&lt;/code&gt;&lt;/pre&gt;
















&lt;figure  &gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;&#34; srcset=&#34;
               /media/img/tutorials/cds_tutorial_plot_1_hue23442eb5edee4c705b69c6160645e77_6309_00bf66866999e071c262a0963b7726e5.webp 400w,
               /media/img/tutorials/cds_tutorial_plot_1_hue23442eb5edee4c705b69c6160645e77_6309_28265a8228e87ca8ef84824993690bcf.webp 760w,
               /media/img/tutorials/cds_tutorial_plot_1_hue23442eb5edee4c705b69c6160645e77_6309_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/tutorials/cds_tutorial_plot_1_hue23442eb5edee4c705b69c6160645e77_6309_00bf66866999e071c262a0963b7726e5.webp&#34;
               width=&#34;672&#34;
               height=&#34;480&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;/figure&gt;
&lt;p&gt;Our measurement of the average Effective Leaf Area Index is raster
data: it is given for many points on Europe’s map. What we need to do is
to overlay this raster information on the statistical map of Europe. We
use the excellent &lt;a href=&#34;https://github.com/edzer/sp&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;sp: R Classes and Methods for Spatial
Data&lt;/a&gt; package for this purpose. The
&lt;code&gt;sp::over()&lt;/code&gt; function decides whether a point of Leaf Area Index measurement
falls into the polygon (shape) of a particular NUTS2 region, for
example, Zuid-Holland (South Holland) in the Netherlands, or Saarland
in Germany. Then it averages with the &lt;code&gt;mean()&lt;/code&gt; function those
measurements falling in the area.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;LAI_nuts_2 &amp;lt;- sp::over(sp::geometry(
  as(map_nuts_2, &#39;Spatial&#39;)), 
  LAI_df,
  fn=mean)
&lt;/code&gt;&lt;/pre&gt;
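&lt;p&gt;To see what this overlay does without downloading anything, here is a toy sketch with two unit squares standing in for regions and four points with made-up LAI values. The helper &lt;code&gt;unit_square()&lt;/code&gt;, the region IDs and all numbers are hypothetical; only the &lt;code&gt;sp&lt;/code&gt; constructors and &lt;code&gt;over()&lt;/code&gt; come from the package.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;library(sp)

# Hypothetical helper: a unit square starting at x0, standing in for a region.
unit_square &amp;lt;- function(x0, id) {
  ring &amp;lt;- cbind(x = c(x0, x0, x0 + 1, x0 + 1, x0),
                y = c(0, 1, 1, 0, 0))            # clockwise, closed ring
  Polygons(list(Polygon(ring, hole = FALSE)), ID = id)
}
regions &amp;lt;- SpatialPolygons(list(unit_square(0, &amp;quot;left&amp;quot;),
                                unit_square(1, &amp;quot;right&amp;quot;)))

# Four points with made-up LAI values; the last one lies outside both squares.
lai_points &amp;lt;- SpatialPointsDataFrame(
  coords = cbind(x = c(0.2, 0.7, 1.5, 5), y = c(0.5, 0.5, 0.5, 0.5)),
  data   = data.frame(lai = c(1, 3, 4, 9))
)

# One row per region: the mean of the points that fall inside it.
# &amp;quot;left&amp;quot; averages the values 1 and 3, &amp;quot;right&amp;quot; contains only the value 4.
over(regions, lai_points, fn = mean)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The point outside both squares does not contribute to either average, which is the same behaviour we rely on for measurements that fall outside the NUTS2 polygons.&lt;/p&gt;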
&lt;p&gt;Let’s call the average LAI index &lt;code&gt;lai&lt;/code&gt;, and bind it to the Eurostat map:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;library(dplyr)  # bind_cols(), select(), sample_n() and the pipe

names(LAI_nuts_2)[1] &amp;lt;- &amp;quot;lai&amp;quot;
LAI_sfdf &amp;lt;- bind_cols(map_nuts_2, LAI_nuts_2)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If you want to work with the data in a numeric context, you do not need
the geographical information, and you can “downgrade” the
spatial (&lt;code&gt;sf&lt;/code&gt;) data frame to a simple data frame.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;set.seed(2019) #to always see the same sample
LAI_sfdf %&amp;gt;%
  as.data.frame() %&amp;gt;%
  select ( all_of(c(&amp;quot;NUTS_NAME&amp;quot;, &amp;quot;NUTS_ID&amp;quot;, &amp;quot;lai&amp;quot;)) ) %&amp;gt;%
  sample_n(10)

##                      NUTS_NAME NUTS_ID lai
## 281                       Vest    RO42  NA
## 125                     Kassel    DE73  NA
## 69              Friesland (NL)    NL12  NA
## 237 Agri, Kars, Igdir, Ardahan    TRA2  NA
## 273                East Anglia    UKH1  NA
## 119                Prov. Liège    BE33  NA
## 61                   Bourgogne    FRC1  NA
## 275                      Essex    UKH3  NA
## 282                   Istanbul    TR10  NA
## 174                    Leipzig    DED5  NA
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We’ll plot the map with &lt;a href=&#34;https://ggplot2.tidyverse.org/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;ggplot2&lt;/a&gt;.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;library(ggplot2)
library(sf)
ggplot(data=LAI_sfdf) + 
  geom_sf(aes(fill=lai),
          color=&amp;quot;dim grey&amp;quot;, size=.1) + 
  scale_fill_gradient(low = &amp;quot;#FAE000&amp;quot;, high = &amp;quot;#00843A&amp;quot;) +
  guides(fill = guide_legend(reverse = TRUE, title = &amp;quot;LAI&amp;quot;)) +
  labs(title=&amp;quot;Leaf Area Index&amp;quot;,
       subtitle = &amp;quot;High vegetation half, NUTS2 regional average values&amp;quot;,
       caption=&amp;quot;\ua9 EuroGeographics for the administrative boundaries 
                \ua9 Copernicus Data Service, June 2019 average values
                Tutorial and ready-to-use data on greendeal.dataobservatory.eu&amp;quot;) +
  theme_light() + theme(legend.position=c(.88,.78)) +
  coord_sf(xlim=c(-22,48), ylim=c(34,70))
&lt;/code&gt;&lt;/pre&gt;
















&lt;figure  &gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;&#34; srcset=&#34;
               /media/img/tutorials/LAI_plot_demo_hu4d370a736e40349b168ee924157b9365_71580_e36c601565f21c35efd1c5c8858ec5e9.webp 400w,
               /media/img/tutorials/LAI_plot_demo_hu4d370a736e40349b168ee924157b9365_71580_d6621addc530408eab0e7f4bdd6783aa.webp 760w,
               /media/img/tutorials/LAI_plot_demo_hu4d370a736e40349b168ee924157b9365_71580_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/tutorials/LAI_plot_demo_hu4d370a736e40349b168ee924157b9365_71580_e36c601565f21c35efd1c5c8858ec5e9.webp&#34;
               width=&#34;760&#34;
               height=&#34;507&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;/figure&gt;
&lt;h2 id=&#34;data-integrity&#34;&gt;Data Integrity&lt;/h2&gt;
&lt;p&gt;Our &lt;a href=&#34;https://greendeal.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Green Deal Data Observatory&lt;/a&gt;
has a data API where we place the new data with metadata for
programmatic download in CSV, JSON or even with SQL queries. For data
integrity purposes, we are placing an authoritative copy on &lt;a href=&#34;https://zenodo.org/communities/greendeal_observatory/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Zenodo
(Green Deal Data Observatory
Community)&lt;/a&gt;. You
can use this for scientific citations. We are also happy if you place
your own climate-policy-related research data here, so that we can
include it in our observatory. In our subsequent tutorials, we will show
how to do this programmatically in R. This particular dataset (covering
more than the month of June, which we selected to streamline the tutorial) is
available &lt;a href=&#34;https://zenodo.org/record/4903940#.YLyYrqgzbIU&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;here&lt;/a&gt; with
the digital object identifier
&lt;a href=&#34;http://doi.org/10.5281/zenodo.4903940&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;doi.org/10.5281/zenodo.4903940&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&#34;join-us&#34;&gt;Join us&lt;/h2&gt;
&lt;p&gt;&lt;em&gt;Join our open collaboration Green Deal Data Observatory team as a &lt;a href=&#34;https://greendeal.dataobservatory.eu/authors/curator&#34;&gt;data curator&lt;/a&gt;, &lt;a href=&#34;https://greendeal.dataobservatory.eu/authors/developer&#34;&gt;developer&lt;/a&gt; or &lt;a href=&#34;https://greendeal.dataobservatory.eu/authors/team&#34;&gt;business developer&lt;/a&gt;. More interested in antitrust, innovation policy or economic impact analysis? Try our &lt;a href=&#34;https://economy.dataobservatory.eu/#contributors&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Economy Data Observatory&lt;/a&gt; team! Or does your interest lie more in data governance, trustworthy AI and other digital market problems? Check out our &lt;a href=&#34;https://music.dataobservatory.eu/#contributors&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Digital Music Observatory&lt;/a&gt; team!&lt;/em&gt;&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Creating Algorithmic Tools to Interpret and Communicate Open Data Efficiently</title>
      <link>https://greendeal.dataobservatory.eu/post/2021-06-04-developer-leo-lahti/</link>
      <pubDate>Fri, 04 Jun 2021 10:00:00 +0000</pubDate>
      <guid>https://greendeal.dataobservatory.eu/post/2021-06-04-developer-leo-lahti/</guid>
      <description>&lt;p&gt;&lt;strong&gt;As a developer at rOpenGov, what type of data do you usually use in your work?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;As an academic data scientist whose research focuses on the development of general-purpose algorithmic methods, I work with a range of applications from life sciences to humanities. Population studies play a big role in our research, and often the information that we can draw from public sources - geospatial, demographic, environmental - provides invaluable support. We typically use open data in combination with sensitive research data but some of the research questions can be readily addressed based on open data from statistical authorities such as Statistics Finland or Eurostat.&lt;/p&gt;
















&lt;figure  &gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;&#34; srcset=&#34;
               /media/img/partners/rOpenGov-intro_hubd4fef93bdda18dae35145b86090eaef_399543_15755b0682ab231bcd4f2ccab28e7c33.webp 400w,
               /media/img/partners/rOpenGov-intro_hubd4fef93bdda18dae35145b86090eaef_399543_3250accecb68b0ec9716afed72d0f77e.webp 760w,
               /media/img/partners/rOpenGov-intro_hubd4fef93bdda18dae35145b86090eaef_399543_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/partners/rOpenGov-intro_hubd4fef93bdda18dae35145b86090eaef_399543_15755b0682ab231bcd4f2ccab28e7c33.webp&#34;
               width=&#34;760&#34;
               height=&#34;428&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;/figure&gt;
&lt;p&gt;&lt;strong&gt;In your ideal data world, what would be the ultimate dataset, or datasets that you would like to see in the Music Data Observatory?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;One line of our research analyses the historical trends and spread of knowledge production, in particular book printing based on large-scale metadata collections. It would be interesting to extend this research to music, to understand the contemporary trends as well as the broader historical developments. Gaining access to a large systematic collection of music and composition data from different countries across long periods of time would make this possible.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Why did you decide to join the challenge and why do you think that this would be a game changer for researchers and policymakers?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Joining the challenge was a natural development based on our overall activities in this area; &lt;a href=&#34;http://ropengov.org/community/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;the rOpenGov project&lt;/a&gt; has been around for a decade now, since the early days of the broader open data movement. This has also created an active international developer network and we felt well equipped for picking up the challenge. The game changer for researchers is that the project highlights the importance of data quality, even when dealing with official statistics, and provides new methods to solve these issues efficiently through the open collaboration model. For policymakers, this provides access to new high-quality curated data and case studies that can support evidence-based decision-making.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Do you have a favorite, or most used open governmental or open science data source? What do you think about it?  Could it be improved?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Regarding open government data, one of my favorites is not a single data source but a data representation standard. The &lt;a href=&#34;https://www.scb.se/en/services/statistical-programs-for-px-files/#:~:text=PX%20is%20a%20standard%20format,and%20data.&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;px format&lt;/a&gt; is widely used by statistical authorities in various countries, and this has allowed us to create R tools that allow the retrieval and analysis of official statistics from many countries across Europe, spanning dozens of statistical institutions. Standardization of open data formats allows us to build robust algorithmic tools for downstream data analysis and visualization.  Open government data is still too often shared in obscure, non-standard or closed-source file formats and this is creating significant bottlenecks for the development of scalable and interoperable AI and machine learning methods that can harness the full potential of open data.&lt;/p&gt;
















&lt;figure  id=&#34;figure-regarding-open-government-data-one-of-my-favorites-is-not-a-single-data-source-but-a-data-representation-standard-the-px-format&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;Regarding open government data, one of my favorites is not a single data source but a data representation standard, the Px format.&#34; srcset=&#34;
               /media/img/developers/PxWeb_hu1855a7f442346dd4157ad8b8bb51b6dc_124293_ee6a50b05be5954c8a175be0348fba8c.webp 400w,
               /media/img/developers/PxWeb_hu1855a7f442346dd4157ad8b8bb51b6dc_124293_dc404336590b3bc74e63f364832e2877.webp 760w,
               /media/img/developers/PxWeb_hu1855a7f442346dd4157ad8b8bb51b6dc_124293_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/developers/PxWeb_hu1855a7f442346dd4157ad8b8bb51b6dc_124293_ee6a50b05be5954c8a175be0348fba8c.webp&#34;
               width=&#34;760&#34;
               height=&#34;428&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption data-pre=&#34;Figure&amp;nbsp;&#34; data-post=&#34;:&amp;nbsp;&#34; class=&#34;numbered&#34;&gt;
      Regarding open government data, one of my favorites is not a single data source but a data representation standard, the Px format.
    &lt;/figcaption&gt;&lt;/figure&gt;
&lt;p&gt;&lt;strong&gt;From your perspective, what do you see being the greatest problem with open data in 2021?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;Although there are a variety of open data sources available (and the numbers continue to increase), the availability of open algorithmic tools to interpret and communicate open data efficiently is lagging behind. One of the greatest challenges for open data in 2021 is to demonstrate how we can maximize the potential of open data by designing smart tools for open data analytics.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;What can our automated data observatories do to make open data more credible in the European economic policy community and be accepted as verified information?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The role of the professional network backing up the project, and the possibility of getting critical feedback and later adoption by the academic communities will support the efforts. Transparency of the data harmonization operations is the key to credibility, and will be further supported by concrete benchmarks that highlight the critical differences in drawing conclusions based on original sources versus the harmonized high-quality data sets.&lt;/p&gt;
















&lt;figure  id=&#34;figure-we-need-to-get-critical-feedback-and-later-adoption-by-the-academic-communities&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;We need to get critical feedback and later adoption by the academic communities.&#34; srcset=&#34;
               /media/img/observatory_screenshots/greendeal_and_zenodo_huddcd7485e56cb33c97d3e664ae383275_281994_debfc54dcf2193c7c800dab0f36de429.webp 400w,
               /media/img/observatory_screenshots/greendeal_and_zenodo_huddcd7485e56cb33c97d3e664ae383275_281994_3b536090581f2795373e801d65371e20.webp 760w,
               /media/img/observatory_screenshots/greendeal_and_zenodo_huddcd7485e56cb33c97d3e664ae383275_281994_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/observatory_screenshots/greendeal_and_zenodo_huddcd7485e56cb33c97d3e664ae383275_281994_debfc54dcf2193c7c800dab0f36de429.webp&#34;
               width=&#34;760&#34;
               height=&#34;507&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption data-pre=&#34;Figure&amp;nbsp;&#34; data-post=&#34;:&amp;nbsp;&#34; class=&#34;numbered&#34;&gt;
      We need to get critical feedback and later adoption by the academic communities.
    &lt;/figcaption&gt;&lt;/figure&gt;
&lt;p&gt;&lt;strong&gt;How can we ensure the long-term sustainability of the efforts?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;The extent of open data space is such that no single individual or institution can address all the emerging needs in this area. The open developer networks play a huge role in the development of algorithmic methods, and strong communities have developed around specific open data analytical environments such as R, Python, and Julia. These communities support networked collaboration and provide services such as software peer review. The long-term sustainability will depend on the support that such developer communities can receive, both from individual contributors as well as from institutions and governments.&lt;/p&gt;
&lt;h2 id=&#34;join-us&#34;&gt;Join us&lt;/h2&gt;
&lt;p&gt;&lt;em&gt;Join our open collaboration Green Deal Data Observatory team as a &lt;a href=&#34;https://greendeal.dataobservatory.eu/authors/curator&#34;&gt;data curator&lt;/a&gt;, &lt;a href=&#34;https://greendeal.dataobservatory.eu/authors/developer&#34;&gt;developer&lt;/a&gt; or &lt;a href=&#34;https://greendeal.dataobservatory.eu/authors/team&#34;&gt;business developer&lt;/a&gt;. More interested in antitrust, innovation policy or economic impact analysis? Try our &lt;a href=&#34;https://economy.dataobservatory.eu/#contributors&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Economy Data Observatory&lt;/a&gt; team! Or does your interest lie more in data governance, trustworthy AI and other digital market problems? Check out our &lt;a href=&#34;https://music.dataobservatory.eu/#contributors&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Digital Music Observatory&lt;/a&gt; team!&lt;/em&gt;&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Economic and Environment Impact Analysis, Automated for Data-as-Service</title>
      <link>https://greendeal.dataobservatory.eu/post/2021-06-03-iotables-release/</link>
      <pubDate>Thu, 03 Jun 2021 16:00:00 +0000</pubDate>
      <guid>https://greendeal.dataobservatory.eu/post/2021-06-03-iotables-release/</guid>
      <description>&lt;p&gt;We have released a new version of
&lt;a href=&#34;https://iotables.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;iotables&lt;/a&gt; as part of the
&lt;a href=&#34;http://ropengov.org/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;rOpenGov&lt;/a&gt; project. The package, as the name
suggests, works with European symmetric input-output tables (SIOTs).
SIOTs are among the most complex governmental statistical products. They
show how each country’s 64 agricultural, industrial, service, and
sometimes household sectors relate to each other. They are estimated
from various components of the GDP, tax collection, at least every five
years.&lt;/p&gt;
&lt;div class=&#34;alert alert-note&#34;&gt;
  &lt;div&gt;
    This code tutorial is not outdated, but the &lt;a href=&#34;https://iotables.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;iotables&lt;/a&gt; R package has a new release with more environmental impact analysis features.
  &lt;/div&gt;
&lt;/div&gt;
&lt;details class=&#34;spoiler &#34;  id=&#34;spoiler-1&#34;&gt;
  &lt;summary&gt;Click to expand table of contents of the post&lt;/summary&gt;
  &lt;p&gt;&lt;details class=&#34;toc-inpage d-print-none  &#34; open&gt;
  &lt;summary class=&#34;font-weight-bold&#34;&gt;Table of Contents&lt;/summary&gt;
  &lt;nav id=&#34;TableOfContents&#34;&gt;
  &lt;ul&gt;
    &lt;li&gt;&lt;a href=&#34;#accessing-and-tidying-the-data-programmatically&#34;&gt;Accessing and tidying the data programmatically&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&#34;#example&#34;&gt;Example&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&#34;#vignettes&#34;&gt;Vignettes&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&#34;#environmental-impact-analysis&#34;&gt;Environmental Impact Analysis&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&#34;#ropengov-and-the-eu-datathon-challenges&#34;&gt;rOpenGov and the EU Datathon Challenges&lt;/a&gt;&lt;/li&gt;
  &lt;/ul&gt;
&lt;/nav&gt;
&lt;/details&gt;
&lt;/p&gt;
&lt;/details&gt;
&lt;p&gt;SIOTs offer great value to policy-makers and analysts to make more than
educated guesses on how a million euros, pounds or Czech korunas spent
on a certain sector will impact other sectors of the economy, employment
or GDP. What happens when a bank starts to give new loans and advertise
them? How is an increase in economic activity going to affect the amount
of wages paid, and where will consumers most likely spend their
wages? As the national economies begin to reopen after the COVID-19 pandemic
lockdowns, SIOTs can be used to calculate direct and indirect
employment effects or value added effects of government grant programs
to sectors such as cultural and creative industries or actors such as
venues for performing arts, movie theaters, bars and restaurants.&lt;/p&gt;
&lt;p&gt;Making such calculations requires a bit of matrix algebra, and
understanding of input-output economics, direct, indirect effects, and
multipliers. Economists, grant designers, policy makers have those
skills, but until now, such calculations were either made in cumbersome
Excel sheets, or proprietary software, as the key to these calculations
is to keep vectors and matrices, which have at least one dimension of
64, perfectly aligned. We made this process reproducible with
&lt;a href=&#34;https://iotables.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;iotables&lt;/a&gt; and
&lt;a href=&#34;https://CRAN.R-project.org/package=eurostat&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;eurostat&lt;/a&gt; on
&lt;a href=&#34;http://ropengov.org/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;rOpenGov&lt;/a&gt;.&lt;/p&gt;
















&lt;figure  id=&#34;figure-our-iotables-package-creates-direct-indirect-effects-and-multipliers-programatically-our-observatory-will-make-those-indicators-available-for-all-european-countries&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;Our iotables package creates direct, indirect effects and multipliers programmatically. Our observatory will make those indicators available for all European countries.&#34; srcset=&#34;
               /media/img/package_screenshots/iotables_045_hu2a20ce082ac1035f2e18bbe9f771b917_198414_1ff902b174dec383d32d5245b4103fff.webp 400w,
               /media/img/package_screenshots/iotables_045_hu2a20ce082ac1035f2e18bbe9f771b917_198414_af398c7bd5f3c936ea4ad7a030c0a82e.webp 760w,
               /media/img/package_screenshots/iotables_045_hu2a20ce082ac1035f2e18bbe9f771b917_198414_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/package_screenshots/iotables_045_hu2a20ce082ac1035f2e18bbe9f771b917_198414_1ff902b174dec383d32d5245b4103fff.webp&#34;
               width=&#34;760&#34;
               height=&#34;428&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption data-pre=&#34;Figure&amp;nbsp;&#34; data-post=&#34;:&amp;nbsp;&#34; class=&#34;numbered&#34;&gt;
      Our iotables package creates direct, indirect effects and multipliers programmatically. Our observatory will make those indicators available for all European countries.
    &lt;/figcaption&gt;&lt;/figure&gt;
&lt;h2 id=&#34;accessing-and-tidying-the-data-programmatically&#34;&gt;Accessing and tidying the data programmatically&lt;/h2&gt;
&lt;p&gt;The iotables package is in a way an extension to the &lt;em&gt;eurostat&lt;/em&gt; R
package, which provides programmatic access to the
&lt;a href=&#34;https://ec.europa.eu/eurostat&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Eurostat&lt;/a&gt; data warehouse. The reason for
releasing a new package is that working with SIOTs requires plenty of
meticulous data wrangling based on various &lt;em&gt;metadata&lt;/em&gt; sources, apart
from actually accessing the &lt;em&gt;data&lt;/em&gt; itself. When working with matrix
equations, the bar is higher than with tidy data. Not only must your rows and
columns match, but their ordering must strictly conform to the
quadrants of the matrix system, including the connecting trade or tax
matrices.&lt;/p&gt;
&lt;p&gt;When you download a country’s SIOT table, you receive a long-form data
frame, a very long one, which contains the matrix values and their
labels like this:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;## Table naio_10_cp1700 cached at C:\Users\...\Temp\RtmpGQF4gr/eurostat/naio_10_cp1700_date_code_FF.rds

# we save it for further reference here 
saveRDS(naio_10_cp1700, &amp;quot;not_included/naio_10_cp1700_date_code_FF.rds&amp;quot;)

# should you need to retrieve the large tempfiles, they are in 
dir (file.path(tempdir(), &amp;quot;eurostat&amp;quot;))

dplyr::slice_head(naio_10_cp1700, n = 5)

## # A tibble: 5 x 7
##   unit    stk_flow induse  prod_na geo       time        values
##   &amp;lt;chr&amp;gt;   &amp;lt;chr&amp;gt;    &amp;lt;chr&amp;gt;   &amp;lt;chr&amp;gt;   &amp;lt;chr&amp;gt;     &amp;lt;date&amp;gt;       &amp;lt;dbl&amp;gt;
## 1 MIO_EUR DOM      CPA_A01 B1G     EA19      2019-01-01 141873.
## 2 MIO_EUR DOM      CPA_A01 B1G     EU27_2020 2019-01-01 174976.
## 3 MIO_EUR DOM      CPA_A01 B1G     EU28      2019-01-01 187814.
## 4 MIO_EUR DOM      CPA_A01 B2A3G   EA19      2019-01-01      0 
## 5 MIO_EUR DOM      CPA_A01 B2A3G   EU27_2020 2019-01-01      0
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The metadata reads like this: the units are in millions of euros, we are
analyzing domestic flows, and the national account items &lt;code&gt;B1-B2&lt;/code&gt; for the
industry &lt;code&gt;A01&lt;/code&gt;. The information of a 64x64 matrix (the SIOT) and its
connecting matrices, such as taxes, employment, or CO&lt;sub&gt;2&lt;/sub&gt;
emissions, must be placed in exactly one correct ordering of columns and
rows. Every single data wrangling error will usually lead either to an error
(the matrix equation has no solution) or, what is worse, to an algebraic
error that is very difficult to trace. Our package not only labels this
data meaningfully, but also creates tidy data frames that contain each
necessary matrix or vector with a key column.&lt;/p&gt;
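&lt;p&gt;To make the ordering problem concrete, here is a minimal base R sketch (not the package’s own implementation) that places a long-form table into a matrix with one fixed row and column order. The data frame &lt;code&gt;long_io&lt;/code&gt;, its values, and the code &lt;code&gt;CPA_B&lt;/code&gt; are hypothetical toy inputs; only the column names follow the Eurostat output shown above.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Hypothetical toy data in long form (the values are made up)
long_io &amp;lt;- data.frame(
  prod_na = c(&amp;quot;CPA_B&amp;quot;, &amp;quot;CPA_A01&amp;quot;, &amp;quot;CPA_B&amp;quot;, &amp;quot;CPA_A01&amp;quot;),
  induse  = c(&amp;quot;CPA_A01&amp;quot;, &amp;quot;CPA_A01&amp;quot;, &amp;quot;CPA_B&amp;quot;, &amp;quot;CPA_B&amp;quot;),
  values  = c(30, 10, 5, 20),
  stringsAsFactors = FALSE
)

# Fix the ordering once, and use it for both rows and columns
codes &amp;lt;- c(&amp;quot;CPA_A01&amp;quot;, &amp;quot;CPA_B&amp;quot;)
siot  &amp;lt;- matrix(0, nrow = length(codes), ncol = length(codes),
                dimnames = list(codes, codes))

# Fill each cell by its row (prod_na) and column (induse) label,
# so the row order of the long table does not matter
siot[cbind(long_io$prod_na, long_io$induse)] &amp;lt;- long_io$values
siot
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Because the cells are addressed by label rather than by position, a shuffled long table still lands in the same matrix.&lt;/p&gt;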
&lt;p&gt;The iotables package contains the abbreviations and human-readable
labels of three statistical vocabularies: the so-called
&lt;code&gt;COICOP&lt;/code&gt; product codes, the &lt;code&gt;NACE&lt;/code&gt; industry codes, and the vocabulary of
the &lt;code&gt;ESA2010&lt;/code&gt; definition of national accounts (which is the government
equivalent of corporate accounting).&lt;/p&gt;
&lt;p&gt;Our package currently solves all equations for direct, indirect effects,
multipliers and inter-industry linkages. Backward linkages show what
happens with the suppliers of an industry, such as catering or
advertising in the case of music festivals, if the festivals reopen. The
forward linkages show how much extra demand this creates for connecting
services that treat festivals as a ‘supplier’, such as cultural tourism.&lt;/p&gt;
&lt;h2 id=&#34;example&#34;&gt;Example&lt;/h2&gt;
&lt;pre&gt;&lt;code&gt;## Downloading employment data from the Eurostat database.

## Table lfsq_egan22d cached at C:\Users\...\Temp\RtmpGQF4gr/eurostat/lfsq_egan22d_date_code_FF.rds
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We match the employment data with the latest structural information from the
&lt;a href=&#34;http://appsso.eurostat.ec.europa.eu/nui/show.do?wai=true&amp;amp;dataset=naio_10_cp1700&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Symmetric input-output table at basic prices (product by
product)&lt;/a&gt;
Eurostat product. A quick look at the Eurostat website already shows
that there is a lot of work ahead to make the data look like an actual
Symmetric input-output table. Download it with &lt;code&gt;iotable_get()&lt;/code&gt; which
does basic labelling and preprocessing on the raw Eurostat files.
Because of the size of the unfiltered dataset on Eurostat, the following
code may take several minutes to run.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;sk_io &amp;lt;-  iotable_get ( labelled_io_data = NULL, 
                        source = &amp;quot;naio_10_cp1700&amp;quot;, geo = &amp;quot;SK&amp;quot;, 
                        year = 2015, unit = &amp;quot;MIO_EUR&amp;quot;, 
                        stk_flow = &amp;quot;TOTAL&amp;quot;,
                        labelling = &amp;quot;iotables&amp;quot; )

## Reading cache file C:\Users\..\Temp\RtmpGQF4gr/eurostat/naio_10_cp1700_date_code_FF.rds

## Table  naio_10_cp1700  read from cache file:  C:\Users\..\Temp\RtmpGQF4gr/eurostat/naio_10_cp1700_date_code_FF.rds

## Saving 808 input-output tables into the temporary directory
## C:\Users\...\Temp\RtmpGQF4gr

## Saved the raw data of this table type in temporary directory C:\Users\...\Temp\RtmpGQF4gr/naio_10_cp1700.rds.
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &lt;code&gt;input_coefficient_matrix_create()&lt;/code&gt; creates the input coefficient
matrix, which is used for most of the analytical functions.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;a&lt;/em&gt;&lt;sub&gt;&lt;em&gt;ij&lt;/em&gt;&lt;/sub&gt; = &lt;em&gt;X&lt;/em&gt;&lt;sub&gt;&lt;em&gt;ij&lt;/em&gt;&lt;/sub&gt; / &lt;em&gt;x&lt;/em&gt;&lt;sub&gt;&lt;em&gt;j&lt;/em&gt;&lt;/sub&gt;&lt;/p&gt;
&lt;p&gt;It checks the correct ordering of columns, and furthermore it replaces 0
values with 0.000001 to avoid division by zero.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;input_coeff_matrix_sk &amp;lt;- input_coefficient_matrix_create(
  data_table = sk_io
)

## Columns and rows of real_estate_imputed_a, extraterriorial_organizations are all zeros and will be removed.
&lt;/code&gt;&lt;/pre&gt;
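&lt;p&gt;As a rough illustration of the formula above (not the package’s internal code), the coefficients can be computed on a toy table in base R. The 3x3 flow matrix &lt;code&gt;X&lt;/code&gt;, the output vector &lt;code&gt;x&lt;/code&gt; and the industry names are hypothetical numbers chosen for this sketch.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Hypothetical inter-industry flows X (rows: suppliers, columns: users)
X &amp;lt;- matrix(c(10, 20,  0,
              30,  5, 15,
               0, 25, 10),
            nrow = 3, byrow = TRUE,
            dimnames = list(c(&amp;quot;agri&amp;quot;, &amp;quot;manuf&amp;quot;, &amp;quot;serv&amp;quot;),
                            c(&amp;quot;agri&amp;quot;, &amp;quot;manuf&amp;quot;, &amp;quot;serv&amp;quot;)))

# Hypothetical total output of each industry, in the same column order
x &amp;lt;- c(agri = 100, manuf = 200, serv = 50)

# Mirror the zero replacement described above
X[X == 0] &amp;lt;- 0.000001

# a_ij = X_ij / x_j: divide every column j by the output of industry j
A &amp;lt;- sweep(X, MARGIN = 2, STATS = x, FUN = &amp;quot;/&amp;quot;)
round(A, 3)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The division is column-wise, which is why the order of &lt;code&gt;x&lt;/code&gt; must match the column order of &lt;code&gt;X&lt;/code&gt; exactly.&lt;/p&gt;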
&lt;p&gt;Then you can create the Leontief inverse, which contains all the
structural information about the relationships of the 64x64 sectors of the
chosen country, in this case Slovakia, ready for the main equations of
input-output economics.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;I_sk &amp;lt;- leontieff_inverse_create(input_coeff_matrix_sk)
&lt;/code&gt;&lt;/pre&gt;
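&lt;p&gt;For intuition, the textbook Leontief inverse is (I - A)&lt;sup&gt;-1&lt;/sup&gt;, where A is the input coefficient matrix. The base R sketch below uses a hypothetical 2x2 matrix &lt;code&gt;A_toy&lt;/code&gt; and a made-up final demand vector; it illustrates the algebra only and does not reproduce the labelled data frames that the package functions return.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Hypothetical input coefficient matrix of a two-sector economy
A_toy &amp;lt;- matrix(c(0.1, 0.1,
                  0.3, 0.2), nrow = 2, byrow = TRUE)

# Leontief inverse: solve (I - A) for its inverse
L_toy &amp;lt;- solve(diag(nrow(A_toy)) - A_toy)

# Output needed in both sectors to satisfy a made-up final demand
final_demand &amp;lt;- c(100, 0)
required_output &amp;lt;- L_toy %*% final_demand
required_output
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Even though the extra demand targets only the first sector, the second sector must also produce more, which is the indirect effect that the inverse captures.&lt;/p&gt;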
&lt;p&gt;And take out the primary inputs:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;primary_inputs_sk &amp;lt;- coefficient_matrix_create(
  data_table = sk_io, 
  total = &#39;output&#39;, 
  return = &#39;primary_inputs&#39;)

## Columns and rows of real_estate_imputed_a, extraterriorial_organizations are all zeros and will be removed.
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now let’s see what happens if the government tries to stimulate the economy in
three sectors, agriculture, car manufacturing, and R&amp;amp;D, with a billion
euros. Direct effects measure the initial, direct impact of the change
in demand and supply for a product. When production goes up, it will
create demand in all supply industries (backward linkages) and create
opportunities in the industries that use the product themselves (forward
linkages).&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;direct_effects_create( primary_inputs_sk, I_sk ) %&amp;gt;%
  select ( all_of(c(&amp;quot;iotables_row&amp;quot;, &amp;quot;agriculture&amp;quot;,
                    &amp;quot;motor_vechicles&amp;quot;, &amp;quot;research_development&amp;quot;))) %&amp;gt;%
  filter (.data$iotables_row %in% c(&amp;quot;gva_effect&amp;quot;, &amp;quot;wages_salaries_effect&amp;quot;, 
                                    &amp;quot;imports_effect&amp;quot;, &amp;quot;output_effect&amp;quot;))

##            iotables_row agriculture motor_vechicles research_development
## 1        imports_effect   1.3684350       2.3028203            0.9764921
## 2 wages_salaries_effect   0.2713804       0.3183523            0.3828014
## 3            gva_effect   0.9669621       0.9790771            0.9669467
## 4         output_effect   2.2876287       3.9840251            2.2579634
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Car manufacturing requires many imported components, so each extra
unit of demand will create a large importing activity. R&amp;amp;D will create the
most local wages (and support the most jobs) because research is
job-intensive. As we can see, the effects on imports, wages, gross value
added (which will end up in the GDP) and output are very
different in these three sectors.&lt;/p&gt;
&lt;p&gt;This is not the total effect, because some of the increased production
will translate into income, which in turn will be used to create further
demand in all parts of the domestic economy. The total effect is
characterized by multipliers.&lt;/p&gt;
&lt;p&gt;Then solve for the multipliers:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;multipliers_sk &amp;lt;- input_multipliers_create( 
  primary_inputs_sk %&amp;gt;%
    filter (.data$iotables_row == &amp;quot;gva&amp;quot;), I_sk ) 
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And select a few industries:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;set.seed(12)
multipliers_sk %&amp;gt;% 
  tidyr::pivot_longer ( -all_of(&amp;quot;iotables_row&amp;quot;), 
                        names_to = &amp;quot;industry&amp;quot;, 
                        values_to = &amp;quot;GVA_multiplier&amp;quot;) %&amp;gt;%
  select (-all_of(&amp;quot;iotables_row&amp;quot;)) %&amp;gt;%
  arrange( -.data$GVA_multiplier) %&amp;gt;%
  dplyr::sample_n(8)

## # A tibble: 8 x 2
##   industry               GVA_multiplier
##   &amp;lt;chr&amp;gt;                           &amp;lt;dbl&amp;gt;
## 1 motor_vechicles                  7.81
## 2 wood_products                    2.27
## 3 mineral_products                 2.83
## 4 human_health                     1.53
## 5 post_courier                     2.23
## 6 sewage                           1.82
## 7 basic_metals                     4.16
## 8 real_estate_services_b           1.48
&lt;/code&gt;&lt;/pre&gt;
&lt;h2 id=&#34;vignettes&#34;&gt;Vignettes&lt;/h2&gt;
&lt;p&gt;The &lt;a href=&#34;https://iotables.dataobservatory.eu/articles/germany_1990.html&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Germany
1990&lt;/a&gt;
vignette provides an introduction to input-output economics and re-creates the
examples of the &lt;a href=&#34;https://iotables.dataobservatory.eu/articles/germany_1990.html&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Eurostat Manual of Supply, Use and Input-Output
Tables&lt;/a&gt;,
by Jörg Beutel (Eurostat Manual).&lt;/p&gt;
&lt;p&gt;The &lt;a href=&#34;https://iotables.dataobservatory.eu/articles/united_kingdom_2010.html&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;United Kingdom Input-Output Analytical Tables Daniel Antal, based
on the work edited by Richard
Wild&lt;/a&gt;
is a use case on how to correctly import data from outside Eurostat
(i.e., not with &lt;code&gt;eurostat::get_eurostat()&lt;/code&gt;) and join it properly to a
SIOT. We also used this example to create unit tests of our functions
from a published, official government statistical release.&lt;/p&gt;
&lt;p&gt;Finally, &lt;a href=&#34;https://iotables.dataobservatory.eu/articles/working_with_eurostat.html&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Working With Eurostat
Data&lt;/a&gt;
is a detailed use case of working with all the current functionalities
of the package by comparing two economies, Czechia and Slovakia, and
guides you through a lot more examples than this short blog post.&lt;/p&gt;
&lt;p&gt;Our package was originally developed to calculate GVA and employment
effects for the Slovak music industry, and similar calculations for the
Hungarian film tax shelter. We can now programmatically create
reproducible multipliers for all European economies in the &lt;a href=&#34;https://music.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Digital
Music Observatory&lt;/a&gt;, and create
further indicators for economic policy making in the &lt;a href=&#34;https://economy.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Economy Data
Observatory&lt;/a&gt;.&lt;/p&gt;
&lt;h2 id=&#34;environmental-impact-analysis&#34;&gt;Environmental Impact Analysis&lt;/h2&gt;
&lt;p&gt;Our package allows the calculation of various economic policy scenarios,
such as changing the VAT on meat or effects of re-opening music
festivals on aggregate demand, GDP, tax revenues, or employment. But
what about the CO&lt;sub&gt;2&lt;/sub&gt;, methane and other greenhouse gas
effects of reopening festivals or of increasing meat prices?&lt;/p&gt;
&lt;p&gt;Technically our package can already calculate such effects, but to do
so, you have to carefully match further statistical vocabulary items
used by the European Environment Agency about air pollutants and
greenhouse gases.&lt;/p&gt;
&lt;p&gt;The last released version of &lt;em&gt;iotables&lt;/em&gt; is Importing and Manipulating
Symmetric Input-Output Tables (Version 0.4.4). Zenodo.
&lt;a href=&#34;https://zenodo.org/record/4897472&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;https://doi.org/10.5281/zenodo.4897472&lt;/a&gt;,
but we are already working on a new major release. (Download the &lt;a href=&#34;https://greendeal.dataobservatory.eu/media/bibliography/cite-iotables.bib&#34; target=&#34;_blank&#34;&gt;BibLaTeX entry&lt;/a&gt;.) In that release, we
are planning to build the necessary vocabulary into the metadata
functions to increase the functionality of the package, and create new
indicators for our &lt;a href=&#34;https://greendeal.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Green Deal Data Observatory&lt;/a&gt;. This experimental
data observatory is creating new, high quality statistical indicators
from open governmental and open science data sources that have not seen
daylight yet.&lt;/p&gt;
&lt;h2 id=&#34;ropengov-and-the-eu-datathon-challenges&#34;&gt;rOpenGov and the EU Datathon Challenges&lt;/h2&gt;
&lt;figure  id=&#34;figure-ropengov-reprex-and-other-open-collaboration-partners-teamed-up-to-build-on-our-expertise-of-open-source-statistical-software-development-further-we-want-to-create-a-technologically-and-financially-feasible-data-as-service-to-put-our-reproducible-research-products-into-wider-user-for-the-business-analyst-scientific-researcher-and-evidence-based-policy-design-communities&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;rOpenGov, Reprex, and other open collaboration partners teamed up to build on our expertise of open source statistical software development further: we want to create a technologically and financially feasible data-as-service to put our reproducible research products into wider use for the business analyst, scientific researcher and evidence-based policy design communities.&#34; srcset=&#34;
               /media/img/partners/rOpenGov-intro_hubd4fef93bdda18dae35145b86090eaef_399543_15755b0682ab231bcd4f2ccab28e7c33.webp 400w,
               /media/img/partners/rOpenGov-intro_hubd4fef93bdda18dae35145b86090eaef_399543_3250accecb68b0ec9716afed72d0f77e.webp 760w,
               /media/img/partners/rOpenGov-intro_hubd4fef93bdda18dae35145b86090eaef_399543_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/partners/rOpenGov-intro_hubd4fef93bdda18dae35145b86090eaef_399543_15755b0682ab231bcd4f2ccab28e7c33.webp&#34;
               width=&#34;760&#34;
               height=&#34;428&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption data-pre=&#34;Figure&amp;nbsp;&#34; data-post=&#34;:&amp;nbsp;&#34; class=&#34;numbered&#34;&gt;
      rOpenGov, Reprex, and other open collaboration partners teamed up to build on our expertise of open source statistical software development further: we want to create a technologically and financially feasible data-as-service to put our reproducible research products into wider use for the business analyst, scientific researcher and evidence-based policy design communities.
    &lt;/figcaption&gt;&lt;/figure&gt;
&lt;p&gt;&lt;a href=&#34;http://ropengov.org/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;rOpenGov&lt;/a&gt; is a community of open governmental
data and statistics developers with many packages that make programmatic
access and work with open data possible in the R language.
&lt;a href=&#34;https://reprex.nl/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Reprex&lt;/a&gt; is a Dutch startup that teamed up with
rOpenGov and other open collaboration partners to create a
technologically and financially feasible service to exploit reproducible
research products for the wider business, scientific and evidence-based
policy design community. Open data is a legal concept: it means that
you have the right to reuse the data, but often the reuse requires
significant programming and statistical know-how. We entered into the
annual &lt;a href=&#34;https://reprex.nl/project/eu-datathon_2021/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;EU Datathon&lt;/a&gt;
competition in all three challenges with our applications to not only
provide open-source software, but also daily updated, validated, documented,
high-quality statistical indicators as open data in an open database.
Our &lt;a href=&#34;https://iotables.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;iotables&lt;/a&gt; package is one of
our many open-source building blocks to make open data more accessible
to all.&lt;/p&gt;
&lt;!---
recruitment
--&gt;
&lt;details class=&#34;spoiler &#34;  id=&#34;spoiler-5&#34;&gt;
  &lt;summary&gt;Join our Green Deal Data Observatory collaboration!&lt;/summary&gt;
  &lt;p&gt;&lt;em&gt;Join our open collaboration Green Deal Data Observatory team as a &lt;a href=&#34;https://greendeal.dataobservatory.eu/authors/curator&#34;&gt;data curator&lt;/a&gt;, &lt;a href=&#34;https://greendeal.dataobservatory.eu/authors/developer&#34;&gt;developer&lt;/a&gt; or &lt;a href=&#34;https://greendeal.dataobservatory.eu/authors/team&#34;&gt;business developer&lt;/a&gt;. More interested in economic policies, particularly computational antitrust, innovation and small enterprises? Check out our &lt;a href=&#34;https://economy.dataobservatory.eu/#contributors&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Economy Data Observatory&lt;/a&gt; team! Or does your interest lie more in data governance, trustworthy AI and other digital market problems? Check out our &lt;a href=&#34;https://music.dataobservatory.eu/#contributors&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Digital Music Observatory&lt;/a&gt; team!&lt;/em&gt;&lt;/p&gt;
&lt;/details&gt;
</description>
    </item>
    
    <item>
      <title>The Green Deal Data Observatory is Contesting the EU Datathon 2021 Prize</title>
      <link>https://greendeal.dataobservatory.eu/post/2021-05-21-eu-datathon-2021/</link>
      <pubDate>Fri, 21 May 2021 20:00:00 +0000</pubDate>
      <guid>https://greendeal.dataobservatory.eu/post/2021-05-21-eu-datathon-2021/</guid>
      <description>&lt;p&gt;Reprex, a Dutch start-up enterprise formed to utilize open source software and open data, is looking for partners in an agile, open collaboration to win at least one of the three EU Datathon Prizes. We are looking for policy partners, academic partners and a consultancy partner. Our project is based on agile, open collaboration with three types of contributors.&lt;/p&gt;
&lt;p&gt;With our competing prototypes we want to show that we have a research automation technology that can find open data, process it and validate it into high-quality business, policy or scientific indicators, and release it with daily refreshments in a modern API.&lt;/p&gt;
&lt;p&gt;We are looking for institutions to challenge us with their data problems, and sponsors to increase our capacity. Over the next 5 months, we need to find a sustainable business model for a high-quality and open alternative to other public data programs.&lt;/p&gt;
&lt;h2 id=&#34;the-eu-datathon-2021-challenge&#34;&gt;The EU Datathon 2021 Challenge&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;em&gt;To take part, you should propose the development of an application that links and uses open datasets.&lt;/em&gt; - our &lt;a href=&#34;https://music.dataobservatory.eu/#contributors&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;data curator team&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;em&gt;Your application &amp;hellip; is also expected to find suitable new approaches and solutions to help Europe achieve important goals set by the European Commission through the use of open data.&lt;/em&gt; - this application is developed by our &lt;a href=&#34;https://greendeal.dataobservatory.eu/#contributors&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;technology contributors&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;em&gt;Your application should showcase opportunities for concrete business models or social enterprises.&lt;/em&gt; - our &lt;a href=&#34;https://economy.dataobservatory.eu/#contributors&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;service development team&lt;/a&gt; is working to make this happen!&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;We use open source software and open data. The applications are hosted on the cloud resources of &lt;a href=&#34;#reprex&#34;&gt;Reprex&lt;/a&gt;, an early-stage technology startup currently building a viable, open-source, open-data business model to create reproducible research products.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;We are working together with experts in the domain as curators (check out our guidelines if you want to join: &lt;a href=&#34;https://curators.dataobservatory.eu/data-curators.html&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Data Curators: Get Inspired!&lt;/a&gt;).&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Our development team works on an open collaboration basis. Our indicator R packages, and our services are developed together with &lt;a href=&#34;https://music.dataobservatory.eu/author/ropengov/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;rOpenGov&lt;/a&gt;.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;mission-statement&#34;&gt;Mission statement&lt;/h2&gt;
&lt;p&gt;We want to win an &lt;a href=&#34;https://op.europa.eu/en/web/eudatathon&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;EU Datathon prize&lt;/a&gt; by processing the vast, already-available governmental and scientific open data and making it usable for policy-makers, scientific researchers, and business researchers.&lt;/p&gt;
&lt;p&gt;“&lt;em&gt;To take part, you should propose the development of an application that links and uses open datasets. Your application should showcase opportunities for concrete business models or social enterprises. It is also expected to find suitable new approaches and solutions to help Europe achieve important goals set by the European Commission through the use of open data.&lt;/em&gt;”&lt;/p&gt;
&lt;p&gt;We aim to win at least one first prize in the EU Datathon 2021. We are contesting &lt;strong&gt;all three&lt;/strong&gt; challenges, which are related to the EU’s official strategic policies for the coming decade.&lt;/p&gt;
&lt;h2 id=&#34;challenge-1-a-european-grean-deel&#34;&gt;Challenge 1: A European Green Deal&lt;/h2&gt;
&lt;figure  id=&#34;figure-our-green-deal-data-observatory-connects-socio-economic-and-environmental-data-to-help-understanding-and-combating-climate-change&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;Our Green Deal Data Observatory connects socio-economic and environmental data to help understand and combat climate change.&#34; srcset=&#34;
               /media/img/observatory_screenshots/GD_Observatory_opening_page_hucc13bb64069da5f1e36667b8db70b016_264112_57c130dac4ded4a481b6bb652578a723.webp 400w,
               /media/img/observatory_screenshots/GD_Observatory_opening_page_hucc13bb64069da5f1e36667b8db70b016_264112_47561f8fdb4837762153f21c3e070e9a.webp 760w,
               /media/img/observatory_screenshots/GD_Observatory_opening_page_hucc13bb64069da5f1e36667b8db70b016_264112_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/observatory_screenshots/GD_Observatory_opening_page_hucc13bb64069da5f1e36667b8db70b016_264112_57c130dac4ded4a481b6bb652578a723.webp&#34;
               width=&#34;760&#34;
               height=&#34;350&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption data-pre=&#34;Figure&amp;nbsp;&#34; data-post=&#34;:&amp;nbsp;&#34; class=&#34;numbered&#34;&gt;
      Our Green Deal Data Observatory connects socio-economic and environmental data to help understand and combat climate change.
    &lt;/figcaption&gt;&lt;/figure&gt;
&lt;p&gt;Challenge 1: &lt;a href=&#34;https://ec.europa.eu/info/strategy/priorities-2019-2024/european-green-deal_en&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;A European Green Deal&lt;/a&gt;, with a particular focus on the &lt;a href=&#34;https://ec.europa.eu/commission/presscorner/detail/en/ip_20_2323&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;European Climate Pact&lt;/a&gt;, the &lt;a href=&#34;https://ec.europa.eu/info/food-farming-fisheries/farming/organic-farming/organic-action-plan_en&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Organic Action Plan&lt;/a&gt;, and the &lt;a href=&#34;https://ec.europa.eu/commission/presscorner/detail/en/IP_21_111&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;New European Bauhaus&lt;/a&gt;, i.e., mitigation strategies.&lt;/p&gt;
&lt;p&gt;Climate change and environmental degradation are an existential threat to Europe and the world. To overcome these challenges, the European Union created the European Green Deal strategic plan, which aims to make the EU’s economy sustainable by turning climate and environmental challenges into opportunities and making the transition just and inclusive for all.&lt;/p&gt;
&lt;p&gt;Our &lt;a href=&#34;http://greendeal.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Green Deal Data Observatory&lt;/a&gt; is a modern reimagination of existing ‘data observatories’; currently, there are over 70 permanent international data collection and dissemination points. One of our objectives is to understand why the dozens of the EU’s observatories do not use open data and reproducible research. We want to show that open governmental data, open science, and reproducible research can lead to a higher quality and faster data ecosystem that fosters growth for policy, business, and academic data users.&lt;/p&gt;
&lt;p&gt;We provide high quality, tidy data through a modern API which enables data flows between public and proprietary databases. We believe that introducing Open Policy Analysis standards with open data, open-source software, and research automation can help the Green Deal policymaking process. Our collaboration is open to individuals, citizen scientists, research institutes, NGOs, and companies.&lt;/p&gt;
&lt;h2 id=&#34;other-challenges&#34;&gt;Other Challenges&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Challenge 2: &lt;a href=&#34;https://ec.europa.eu/info/strategy/priorities-2019-2024/economy-works-people_en#:~:text=Individuals%20and%20businesses%20in%20the,needs%20of%20the%20EU%27s%20citizens.&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;An economy that works for people&lt;/a&gt;, with a particular focus on the &lt;a href=&#34;https://ec.europa.eu/info/strategy/priorities-2019-2024/economy-works-people/internal-market_en&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Single market strategy&lt;/a&gt;. Big data and automation create new inequalities and injustices and have the potential to create a jobless growth economy. Our &lt;a href=&#34;https://economy.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Economy Data Observatory&lt;/a&gt; is a fully automated, open source, open data observatory that produces new indicators from open data sources and experimental big data sources, with authoritative copies and a modern API.&lt;/p&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Challenge 3: &lt;a href=&#34;https://ec.europa.eu/info/strategy/priorities-2019-2024/europe-fit-digital-age_en&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;A Europe fit for the digital age&lt;/a&gt;, with a particular focus on &lt;a href=&#34;https://ec.europa.eu/info/strategy/priorities-2019-2024/europe-fit-digital-age/excellence-trust-artificial-intelligence_en&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Artificial Intelligence&lt;/a&gt;, the &lt;a href=&#34;https://ec.europa.eu/info/strategy/priorities-2019-2024/europe-fit-digital-age/european-data-strategy_en&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;European Data Strategy&lt;/a&gt;, the &lt;a href=&#34;https://ec.europa.eu/info/strategy/priorities-2019-2024/europe-fit-digital-age/digital-services-act-ensuring-safe-and-accountable-online-environment_en&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Digital Services Act&lt;/a&gt;, &lt;a href=&#34;https://digital-strategy.ec.europa.eu/en/policies/digital-skills-and-jobs&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Digital Skills&lt;/a&gt; and &lt;a href=&#34;https://digital-strategy.ec.europa.eu/en/policies/connectivity&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Connectivity&lt;/a&gt;. The &lt;a href=&#34;https://music.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Digital Music Observatory&lt;/a&gt; (DMO) is a fully automated, open source, open data observatory that creates public datasets to provide a comprehensive view of the European music industry. It provides high-quality and timely indicators in all four pillars of the planned official European Music Observatory as a modern, open source and largely open data-based, automated, API-supported alternative solution for this planned observatory. 
The insight and methodologies we are refining in the DMO are applicable and transferable to about 60 other data observatories funded by the EU which do not currently employ governmental or scientific open data.&lt;/p&gt;
&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;Our Product/Market Fit was validated in the world’s 2nd ranked university-backed incubator program, the &lt;a href=&#34;https://music.dataobservatory.eu/post/2020-09-25-yesdelft-validation/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Yes!Delft AI Validation Lab&lt;/a&gt;. We are currently developing this project with the help of the &lt;a href=&#34;https://www.jumpmusic.eu/fellow2021/automated-music-observatory/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;JUMP European Music Market Accelerator&lt;/a&gt; program.&lt;/p&gt;
&lt;h2 id=&#34;problem-statement&#34;&gt;Problem Statement&lt;/h2&gt;
&lt;p&gt;The EU has an 18-year-old open data regime and it makes public taxpayer-funded data worth tens of billions of euros per year; the Eurostat program alone handles 20,000 international data products, including at least 5,000 pan-European environmental indicators.&lt;/p&gt;
&lt;p&gt;As open science principles gain increased acceptance, scientific researchers are making hundreds of thousands of valuable datasets public and available for replication every year.&lt;/p&gt;
&lt;p&gt;The EU, the OECD, and UN institutions run around 100 data collection programs, so-called ‘data observatories’ that more or less avoid touching this data, and buy proprietary data instead. Annually, each observatory spends between 50 thousand and 3 million EUR on collecting untidy and proprietary data of inconsistent quality, while never even considering open data.&lt;/p&gt;
&lt;figure  id=&#34;figure-our-automated-data-observatories-are-modern-reimaginations-of-the-existing-observatories-that-do-not-use-open-data-and-research-automation&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;Our automated data observatories are modern reimaginations of the existing observatories that do not use open data and research automation.&#34; srcset=&#34;
               /media/img/observatory_screenshots/observatory_collage_16x9_800_hu47f74f5cdae63c7248c2367b9d148671_353025_0079ea9844f6c5e52b52fd0e627467a2.webp 400w,
               /media/img/observatory_screenshots/observatory_collage_16x9_800_hu47f74f5cdae63c7248c2367b9d148671_353025_ecd6d08ba5e9bac19c8173546f036651.webp 760w,
               /media/img/observatory_screenshots/observatory_collage_16x9_800_hu47f74f5cdae63c7248c2367b9d148671_353025_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/observatory_screenshots/observatory_collage_16x9_800_hu47f74f5cdae63c7248c2367b9d148671_353025_0079ea9844f6c5e52b52fd0e627467a2.webp&#34;
               width=&#34;760&#34;
               height=&#34;428&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption data-pre=&#34;Figure&amp;nbsp;&#34; data-post=&#34;:&amp;nbsp;&#34; class=&#34;numbered&#34;&gt;
      Our automated data observatories are modern reimaginations of the existing observatories that do not use open data and research automation.
    &lt;/figcaption&gt;&lt;/figure&gt;
&lt;p&gt;The problem with the current EU data strategy is that while it produces enormous quantities of valuable open data, in the absence of common basic data science and documentation principles, it often seems cheaper to create new data than to put the existing open data into shape.&lt;/p&gt;
&lt;p&gt;This is an absolute waste of resources and efforts. With a few R packages and our deep understanding of advanced data science techniques, we can create valuable datasets from unprocessed open data. In most domains, we are able to repurpose data originally created for other purposes at a historical cost of several billions of euros, converting these unused data assets into valuable datasets that can replace tens of millions’ worth of proprietary data.&lt;/p&gt;
&lt;p&gt;What we want to achieve with this project – and we believe such an accomplishment would merit one of the first prizes – is to add value to a significant portion of pre-existing EU open data (for example, the data available on &lt;a href=&#34;https://data.europa.eu/data/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;data.europa.eu/data&lt;/a&gt;) by re-processing and integrating them into a modern, tidy database with API access, and to find a business model that emphasises a triangular use of data in (1) business, (2) science and (3) policy-making. Our mission is to modernize the concept of &lt;code&gt;data observatories.&lt;/code&gt;&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Is Drought Risk Uninsurable?</title>
      <link>https://greendeal.dataobservatory.eu/post/2021-04-23-belgium-flood-insurance/</link>
      <pubDate>Fri, 23 Apr 2021 00:00:00 +0000</pubDate>
      <guid>https://greendeal.dataobservatory.eu/post/2021-04-23-belgium-flood-insurance/</guid>
      <description>&lt;p&gt;Climate change is real and it is everywhere. Whereas island nations in
the Pacific are threatened with rising sea levels, Europe suffers from
ever more frequent scorching summers and resulting drought. Take the
case of Belgium, where heat waves in 2018 or 2020 have exacerbated an
already fragile drought risk profile. An all too tangible effect is that
houses built in areas where groundwater reservoirs are dwindling start
to rupture. What adds insult to injury is that insurers appear unwilling
to pay for damages: these climate-related risks simply did not feature
in insurance policies made up decades ago. The public and the media have
called upon the secretary of state responsible for consumer protection
to come up with a solution. (Download this
document in &lt;a href=&#34;https://greendeal.dataobservatory.eu/documents/Belgium-flood-risk-open-data.pdf&#34; target=&#34;_blank&#34;&gt;pdf&lt;/a&gt;.)&lt;/p&gt;
&lt;p&gt;The Belgian insurance sector and government are currently investigating
how to address the ecological and financial issue. Should the risk
premium be raised on all insurance policies in an effort to spread risk,
or should only policy holders in designated risk areas be subject to a
raise in premia? Should urban planning initiatives and real estate
projects be required to assess these new types of risk beforehand?&lt;/p&gt;
&lt;p&gt;Driven by the Open Data Directive, we went in search of data at
government websites such as &lt;a href=&#34;http://waterinfo.be/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;waterinfo.be&lt;/a&gt;. That
proved harder than you would want, with quite a number of technological
barriers to cross. We explored the matter independently and
came up with this: a dynamic map that pictures the spatial distribution
of drought risk, as measured by a climate indicator known as the
&lt;a href=&#34;http://sac.csic.es/spei/map/maps.html&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;standardised precipitation-evapotranspiration
index&lt;/a&gt;.&lt;/p&gt;
















&lt;figure  id=&#34;figure-actual-drying-soil&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;Actual drying soil.&#34; srcset=&#34;
               /media/img/blogposts_2021/belgium_spei_2018_hu053711948486f3d03232ef0d63e51704_295716_85cb3a3e9d67ae93c4b48d13c76f103f.webp 400w,
               /media/img/blogposts_2021/belgium_spei_2018_hu053711948486f3d03232ef0d63e51704_295716_732c5a4fed2e5086cd4649603e01bc64.webp 760w,
               /media/img/blogposts_2021/belgium_spei_2018_hu053711948486f3d03232ef0d63e51704_295716_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/blogposts_2021/belgium_spei_2018_hu053711948486f3d03232ef0d63e51704_295716_85cb3a3e9d67ae93c4b48d13c76f103f.webp&#34;
               width=&#34;760&#34;
               height=&#34;760&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption data-pre=&#34;Figure&amp;nbsp;&#34; data-post=&#34;:&amp;nbsp;&#34; class=&#34;numbered&#34;&gt;
      Actual drying soil.
    &lt;/figcaption&gt;&lt;/figure&gt;
&lt;p&gt;The SPEI, measured as a standardized variate, shows the
deviations of the current climatic balance (precipitation minus
potential evapotranspiration) from its long-run average and is presented on a
monthly basis. As the SPEI in this form is more predictive for flood
risk, we simply inverted the index to suggest a measure of drought
risk&lt;sup id=&#34;fnref:1&#34;&gt;&lt;a href=&#34;#fn:1&#34; class=&#34;footnote-ref&#34; role=&#34;doc-noteref&#34;&gt;1&lt;/a&gt;&lt;/sup&gt;.&lt;/p&gt;
&lt;p&gt;Readers familiar with the “Kingdom by the sea” will remark that Belgium
cannot possibly have a lack of precipitation. It rains more than the
average Belgian cares for in the country. As a result, the water
management system has historically been based on getting the water out
as quickly as possible to the sea, in particular through the Ijzer,
Schelde and Maas rivers. Add the abundance of concrete in the densely
populated country - and its grossly mismanaged urban planning - and the
capacity to hold water in surface and ground reservoirs is severely
impaired. With climate change in full swing, these historical practices
come back to haunt Belgium.&lt;/p&gt;
&lt;h2 id=&#34;are-belgians-aware-of-climate-risk&#34;&gt;Are Belgians aware of climate risk?&lt;/h2&gt;
&lt;p&gt;We projected the public opinion data from Eurobarometer 90.2 (fieldwork:
October-November 2018) on the municipal map of Belgium. We used the
answers to the multiple choice question
&lt;code&gt;QB1 Do you think that the following extreme weather events are due to climate change?&lt;/code&gt;
We highlighted areas where people find it more likely to be exposed to
&lt;code&gt;Droughts and wildfires.&lt;/code&gt; We used the GESIS datafile (European
Commission 2019) and the retroharmonize and regions packages (Antal
2021b, 2021a) to project the values to municipalities.&lt;/p&gt;
&lt;p&gt;We see a weak spatial correlation between awareness of drought risk and
actual drought risk. The least affected parts of Belgium appear least
concerned. Although the correlation is weak, authorities and insurers can at least
build their mitigation policies on a hypothesis of positive correlation.
Of note is that concern for climate change effects follows regional,
linguistic and other patterns. The map in particular suggests the
Belgian provinces as markers for awareness.&lt;/p&gt;
















&lt;figure  id=&#34;figure-perception-of-likely-drought&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;Perception of likely drought.&#34; srcset=&#34;
               /media/img/blogposts_2021/belgium_response_2018_hu47aa5cc947047eed9c66d363c68d4890_303964_e9673e15410217a2a4c7bb862bf2154f.webp 400w,
               /media/img/blogposts_2021/belgium_response_2018_hu47aa5cc947047eed9c66d363c68d4890_303964_d056770abb0309766c1c85bcf1ece158.webp 760w,
               /media/img/blogposts_2021/belgium_response_2018_hu47aa5cc947047eed9c66d363c68d4890_303964_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/blogposts_2021/belgium_response_2018_hu47aa5cc947047eed9c66d363c68d4890_303964_e9673e15410217a2a4c7bb862bf2154f.webp&#34;
               width=&#34;760&#34;
               height=&#34;760&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption data-pre=&#34;Figure&amp;nbsp;&#34; data-post=&#34;:&amp;nbsp;&#34; class=&#34;numbered&#34;&gt;
      Perception of likely drought.
    &lt;/figcaption&gt;&lt;/figure&gt;
&lt;h2 id=&#34;financial-capacity-to-pay-for-insurance&#34;&gt;Financial Capacity to Pay for Insurance&lt;/h2&gt;
&lt;p&gt;The next question we asked ourselves was whether the drought risk correlates
with the ability to pay as distributed among local communities. Whether
an insurance policy – or the regulation of insurance – attempts to
provide cover on an individual level (through increased premia), or
looks for local, regional or national mitigation strategies, the
income/tax base might be an appropriate benchmark to test for financial
capacity.&lt;/p&gt;
















&lt;figure  id=&#34;figure-financial-capacity-to-mitigate-drought-risk&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;Financial capacity to mitigate drought risk.&#34; srcset=&#34;
               /media/img/blogposts_2021/belgium_income_2018_hub7a174433f849aaff3387c877c96021f_300014_18df5ea027c9f2eaf3cd731691071414.webp 400w,
               /media/img/blogposts_2021/belgium_income_2018_hub7a174433f849aaff3387c877c96021f_300014_55fa0c944ecfb9f3d24ee0ca21993ed0.webp 760w,
               /media/img/blogposts_2021/belgium_income_2018_hub7a174433f849aaff3387c877c96021f_300014_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/blogposts_2021/belgium_income_2018_hub7a174433f849aaff3387c877c96021f_300014_18df5ea027c9f2eaf3cd731691071414.webp&#34;
               width=&#34;760&#34;
               height=&#34;760&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption data-pre=&#34;Figure&amp;nbsp;&#34; data-post=&#34;:&amp;nbsp;&#34; class=&#34;numbered&#34;&gt;
      Financial capacity to mitigate drought risk.
    &lt;/figcaption&gt;&lt;/figure&gt;
&lt;p&gt;The match between the (inverted) SPEI and total net income is less than
perfect. Some of the areas most at risk coincide with the highest-income
communities, but other threatened communities are low-income by Belgian
standards. The actual risk awareness and the financial capacity to solve
the problem are again only weakly correlated.&lt;/p&gt;
&lt;h2 id=&#34;correlation&#34;&gt;Correlation&lt;/h2&gt;
&lt;p&gt;Let’s have a look at the variables on &lt;code&gt;NUTS3&lt;/code&gt; level:&lt;/p&gt;
















&lt;figure  id=&#34;figure-correlation-of-the-variables&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;Correlation of the variables.&#34; srcset=&#34;
               /media/img/blogposts_2021/var-cor-1_hu93a5ac94c874461206dbc5e3ba932828_16200_898d1dcc10909f7ec231312858e7e566.webp 400w,
               /media/img/blogposts_2021/var-cor-1_hu93a5ac94c874461206dbc5e3ba932828_16200_ba2a45810a48e2158032082e40312816.webp 760w,
               /media/img/blogposts_2021/var-cor-1_hu93a5ac94c874461206dbc5e3ba932828_16200_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/blogposts_2021/var-cor-1_hu93a5ac94c874461206dbc5e3ba932828_16200_898d1dcc10909f7ec231312858e7e566.webp&#34;
               width=&#34;672&#34;
               height=&#34;480&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption data-pre=&#34;Figure&amp;nbsp;&#34; data-post=&#34;:&amp;nbsp;&#34; class=&#34;numbered&#34;&gt;
      Correlation of the variables.
    &lt;/figcaption&gt;&lt;/figure&gt;
&lt;ul&gt;
&lt;li&gt;&lt;code&gt;Average SPEI&lt;/code&gt;, which is a measure of increasing humidity, is
negatively correlated with &lt;code&gt;dry&lt;/code&gt;, which we defined as &lt;code&gt;-1 x avg_spei&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;Dry&lt;/code&gt; areas, which are losing water, tend to be less populous and
richer regions.&lt;/li&gt;
&lt;li&gt;&lt;code&gt;Dry_18&lt;/code&gt; is a version of &lt;code&gt;dry&lt;/code&gt; that only covers the 12 months before the
Eurobarometer survey about opinions on climate change effects, to
see if the recent memory of actual weather conditions has had an
effect on the perception of Belgians about these risks. It is
seemingly not correlated with worries about floods or droughts.&lt;/li&gt;
&lt;li&gt;The &lt;code&gt;dry_18&lt;/code&gt; and the &lt;code&gt;dry&lt;/code&gt; variables are strongly correlated. One
possible explanation is that the year before the survey was not an
unusual period: it fit very well with the 2016-2020 trend.&lt;/li&gt;
&lt;li&gt;Worries about extreme weather conditions are correlated with each
other – i.e., some part of the population (concentrated
geographically) is far more concerned with climate change than
others.&lt;/li&gt;
&lt;/ul&gt;
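&lt;p&gt;The construction of the &lt;code&gt;dry&lt;/code&gt; variable and of a correlation table like the one above can be sketched in a few lines of base R. This is only an illustration: the values are invented, and the column names &lt;code&gt;avg_spei&lt;/code&gt;, &lt;code&gt;population&lt;/code&gt; and &lt;code&gt;income&lt;/code&gt; are assumptions, not the contents of our actual dataset.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Hypothetical toy data: one row per NUTS3 region (all values are invented)
toy &amp;lt;- data.frame(
  avg_spei   = c(-0.8, -0.3, 0.1, 0.4, -0.5),
  population = c(120, 300, 450, 800, 200),
  income     = c(21, 19, 18, 17, 20)
)

# Invert the SPEI so that higher values mean drier conditions
toy$dry &amp;lt;- -1 * toy$avg_spei

# Pairwise Pearson correlations of all variables
round(cor(toy), 2)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;By construction, &lt;code&gt;avg_spei&lt;/code&gt; and &lt;code&gt;dry&lt;/code&gt; correlate at exactly -1, so the negative correlation in the first bullet is true by definition and carries no empirical information.&lt;/p&gt;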
&lt;p&gt;The same correlation matrix at the municipality (local administrative unit) level:&lt;/p&gt;
















&lt;figure  id=&#34;figure-correlation-on-the-level-of-municipalities&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;Correlation on the level of municipalities.&#34; srcset=&#34;
               /media/img/blogposts_2021/cor-lau-1_hub6a23b7ba8396d39074c5b9dfd2a35bd_15934_6836db390a9019eedcfbf0056ccae62f.webp 400w,
               /media/img/blogposts_2021/cor-lau-1_hub6a23b7ba8396d39074c5b9dfd2a35bd_15934_3b84b61d8e46e482d54e71828a3e14cc.webp 760w,
               /media/img/blogposts_2021/cor-lau-1_hub6a23b7ba8396d39074c5b9dfd2a35bd_15934_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/media/img/blogposts_2021/cor-lau-1_hub6a23b7ba8396d39074c5b9dfd2a35bd_15934_6836db390a9019eedcfbf0056ccae62f.webp&#34;
               width=&#34;672&#34;
               height=&#34;480&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption data-pre=&#34;Figure&amp;nbsp;&#34; data-post=&#34;:&amp;nbsp;&#34; class=&#34;numbered&#34;&gt;
      Correlation on the level of municipalities.
    &lt;/figcaption&gt;&lt;/figure&gt;
&lt;p&gt;The correlations with opinion polling data are a little bit distorted,
because the data is on &lt;code&gt;NUTS2&lt;/code&gt;, and to bring it down to &lt;code&gt;NUTS3&lt;/code&gt; or &lt;code&gt;LAU&lt;/code&gt;
level would be a complicated small area statistical estimation task. We
have also computed geospatial cross-correlation. Awareness of the
climate problem and the dryness in 2018 were positively correlated in
time – the drier the year was in an area, the more likely it was that
people are aware of the problem; and the poorer areas were more likely
to be afraid of this problem. The global spatial cross-correlation of
the drying and local income was very low. This is a neutral situation:
local income is not more concentrated to drying areas (which would be a
lucky coincidence) nor concentrated in the relatively stable areas.&lt;/p&gt;
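&lt;p&gt;To see why the opinion data distorts the correlations, consider the simplest way of bringing a &lt;code&gt;NUTS2&lt;/code&gt; value down to &lt;code&gt;NUTS3&lt;/code&gt;: copying the parent value to every child unit. The sketch below is a naive illustration and not the method of the regions package. It relies on the hierarchical structure of NUTS codes, where the first four characters of a &lt;code&gt;NUTS3&lt;/code&gt; code are the code of its parent &lt;code&gt;NUTS2&lt;/code&gt; region; the survey shares and the column name &lt;code&gt;worried_drought&lt;/code&gt; are invented.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Invented NUTS2-level survey shares (not real Eurobarometer results)
nuts2 &amp;lt;- data.frame(
  nuts2 = c(&amp;quot;BE21&amp;quot;, &amp;quot;BE23&amp;quot;),
  worried_drought = c(0.42, 0.37)
)

# Some NUTS3 units; the first four characters identify the parent NUTS2 region
nuts3 &amp;lt;- data.frame(nuts3 = c(&amp;quot;BE211&amp;quot;, &amp;quot;BE212&amp;quot;, &amp;quot;BE231&amp;quot;))
nuts3$nuts2 &amp;lt;- substr(nuts3$nuts3, 1, 4)

# Copy the parent value down to each NUTS3 unit
projected &amp;lt;- merge(nuts3, nuts2, by = &amp;quot;nuts2&amp;quot;)
projected
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Every &lt;code&gt;NUTS3&lt;/code&gt; unit within the same &lt;code&gt;NUTS2&lt;/code&gt; region receives an identical value, so any variation within the region is lost.&lt;/p&gt;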
&lt;p&gt;Generally, the problem map appears to be neutral to mildly favorable.
The financial capacity to solve the problem works neither for nor
against it, and awareness seems to be somewhat higher in
the more affected areas.&lt;/p&gt;
&lt;p&gt;The code is in &lt;code&gt;R/join_belgium_water_lau_dataset.R&lt;/code&gt; and
&lt;code&gt;R/join_belgium_water_nuts3_dataset.R&lt;/code&gt;.&lt;/p&gt;
&lt;h2 id=&#34;adverse-selection-and-climate-solidarity&#34;&gt;Adverse Selection and Climate Solidarity&lt;/h2&gt;
&lt;p&gt;In addition to these historical analyses that put the drought risk in
context, we are investigating whether climate data from integrated
climate models might be harnessed to predict medium- to longer-term risk
profiles on a spatially distributed basis. Urban planners, real estate
promoters, individual households and governments will need to rely on
such predictions to better adapt to climate change and reverse some of
the earlier policy choices we mentioned.&lt;/p&gt;
&lt;p&gt;To quote the Nobel Prize winning thoughts of Finn E. Kydland and Edward
C. Prescott (Kydland and Prescott 1977):&lt;/p&gt;
&lt;p&gt;&lt;em&gt;The issues are obvious in many well-known problems of public policy.
For example, suppose the socially desirable outcome is not to have
houses built in a particular flood plain but, given that they are there,
to take certain costly flood-control measures. If the government’s
policy were not to build the dams and levees needed for flood protection
and agents knew this was the case, even if houses were built there,
rational agents would not live in the flood plains. But the rational
agent knows that, if he and others build houses there, the government
will take the necessary flood-control measures. Consequently, in the
absence of a law prohibiting the construction of houses in the flood
plain, houses are built there, and the army corps of engineers
subsequently builds the dams and levees.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;Our initial explorations at least suggest that leaving the resolution
entirely to market forces, for example through increased property
insurance premia, may well lead to underinsurance in poorer areas that is
&lt;em&gt;dynamically inconsistent&lt;/em&gt; with government policy. In particular, if
severe drought bankrupts farmers in such areas, regional
or national government will eventually be forced to bail them out.&lt;/p&gt;
&lt;p&gt;The other extreme approach, i.e., leaving the climate-change related
damages entirely to the taxpayer, does not seem feasible
either, because climate awareness and the local income tax base correlate
only weakly with the drought patterns. In addition, drought of course
does not confine itself to municipal borders; the hydrological topology
of the issue inherently implies a coordination problem between local,
regional and federal entities passing the buck from one to another. One
can imagine some form of solidarity and redistribution will be required
to align interests and avoid adverse selection. To address these typical
market failures, government will need to step in to allow these risks,
that may be privately uninsurable, to be covered on a society-wide
basis.&lt;/p&gt;
&lt;p&gt;These problems are not unique to property damage. Similar problems arise
in many student loan systems in the world (where it is desirable that
the loan can be taken by arts students or future teachers, who may not
have as high earning potential as easy-to-credit future lawyers,
engineers, managers) or in many social security issues: a minimum level
of health insurance for the unemployed and poor is desirable not only on
the basis of humanity, but to avoid epidemic risks. Such special loan
systems and special insurance systems are balancing some social welfare
with individual welfare and individual risk considerations, and at the
same time they try to avoid adverse selection and free-riding. We believe
that our example can spark some ideas on how a desirable social outcome can
be aligned with the principles of insurance and personal responsibility.&lt;/p&gt;
&lt;p&gt;In this case, in the longer term, incentives to move
water-intensive industrial and agricultural activities away from the areas
most at risk could be called for, as well as better hydrological
management to safeguard water reserves. We invite the authorities and
relevant stakeholders to release, as open data, the data needed to assess
climate and drought evolution and to calculate risk premia scenarios and
solidarity mechanisms, verified for quality through unit tests
and peer review.&lt;/p&gt;
&lt;h2 id=&#34;references&#34;&gt;References&lt;/h2&gt;
&lt;p&gt;Antal, Daniel. 2021a. &lt;em&gt;Regions: Processing Regional Statistics&lt;/em&gt;.
&lt;a href=&#34;https://regions.danielantal.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;https://regions.danielantal.eu/&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;———. 2021b. &lt;em&gt;Retroharmonize: Ex Post Survey Data Harmonization&lt;/em&gt;.
&lt;a href=&#34;https://retroharmonize.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;https://retroharmonize.dataobservatory.eu/&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Beguería, Santiago, Sergio M Vicente-Serrano, Fergus Reig, and Borja
Latorre. 2014. “Standardized Precipitation Evapotranspiration Index
(SPEI) Revisited: Parameter Fitting, Evapotranspiration Models, Tools,
Datasets and Drought Monitoring.” &lt;em&gt;International Journal of Climatology&lt;/em&gt;
34 (10): 3001–23.&lt;/p&gt;
&lt;p&gt;European Commission. 2019. “Eurobarometer 90.2 (2018).” GESIS Data
Archive, Cologne. ZA7488 Data file Version 1.0.0,
&lt;a href=&#34;https://doi.org/10.4232/1.13289&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;https://doi.org/10.4232/1.13289&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Kydland, Finn E., and Edward C. Prescott. 1977. “Rules Rather Than
Discretion: The Inconsistency of Optimal Plans.” &lt;em&gt;Journal of Political
Economy&lt;/em&gt; 85 (3): 473–91. &lt;a href=&#34;http://www.jstor.org/stable/1830193&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;http://www.jstor.org/stable/1830193&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Statbel. 2020. “&lt;span class=&#34;nocase&#34;&gt;Fiscal statistics on
income&lt;/span&gt;.” Eurostat.
&lt;a href=&#34;https://statbel.fgov.be/en/open-data/fiscal-statistics-income&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;https://statbel.fgov.be/en/open-data/fiscal-statistics-income&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;Vicente-Serrano, Sergio M, Santiago Beguería, and Juan I López-Moreno.
2010. “A Multiscalar Drought Index Sensitive to Global Warming: The
Standardized Precipitation Evapotranspiration Index.” &lt;em&gt;Journal of
Climate&lt;/em&gt; 23 (7): 1696–1718.&lt;/p&gt;
&lt;div class=&#34;footnotes&#34; role=&#34;doc-endnotes&#34;&gt;
&lt;hr&gt;
&lt;ol&gt;
&lt;li id=&#34;fn:1&#34;&gt;
&lt;p&gt;As a standardized variate, SPEI can be compared across space and
time. The original calculation of SPEI is based on the FAO-56
Penman-Monteith method. Other relevant indicators might consider the
soil composition for example: clay and lime soils tend to be more
vulnerable to drought. We combined this ecological dimension with the
socio-economic dimension to suggest that insurance premia design might
be targeted to, say, income levels as well - or alternatively to real
estate prices. See (Beguería et al. 2014; Vicente-Serrano, Beguería, and
López-Moreno 2010)&amp;#160;&lt;a href=&#34;#fnref:1&#34; class=&#34;footnote-backref&#34; role=&#34;doc-backlink&#34;&gt;&amp;#x21a9;&amp;#xfe0e;&lt;/a&gt;&lt;/p&gt;
&lt;/li&gt;
&lt;/ol&gt;
&lt;/div&gt;
</description>
    </item>
    
    <item>
      <title>Connecting the Dots to Environmental Degradation Open Data</title>
      <link>https://greendeal.dataobservatory.eu/post/2021-03-11-environmental_data/</link>
      <pubDate>Thu, 11 Mar 2021 00:00:00 +0000</pubDate>
      <guid>https://greendeal.dataobservatory.eu/post/2021-03-11-environmental_data/</guid>
      <description>&lt;p&gt;If you live in a polluted area, does it mean that you take climate
change seriously? Following the &lt;a href=&#34;https://reprex.nl/talk/reprex-open-data-day-2021/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Reprex Open Data Day
2021&lt;/a&gt;, we embarked on
a quest to explore this question using a unique combination of
micro-level data from Eurobarometer surveys, Eurostat’s sub-national
socio-economic data and satellite imagery from &lt;a href=&#34;https://www.eea.europa.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;European Environmental
Agency&lt;/a&gt; (EEA) and NASA. Before venturing
forth into the forest of open data we, as all visual creatures out
there, first mapped the road ahead.&lt;/p&gt;
&lt;p&gt;We used three sensor data sources on pollution and deforestation, all of
which are closely related to environmental degradation, to create these
maps. In the first set of maps, we draw on EEA’s Air Quality
&lt;a href=&#34;https://www.eea.europa.eu/data-and-maps/data/aqereporting-8&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;e-Reporting
data&lt;/a&gt; on
environmental pollution (particulate matter 2.5 and 10) for the period
2014–2016. What makes these data complex is that they are organized at the
level of the reporting stations. We therefore first had to
find the nearest reporting station, by aerial distance, for every
local administrative unit (LAU), then assign the annual pollution levels to
every LAU and, finally, create our fine-grained map. Using this
approach, we are able to aggregate the data to any NUTS level and, with the
help of the &lt;a href=&#34;https://retroharmonize.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;retroharmonize&lt;/a&gt;
and &lt;a href=&#34;https://regions.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;regions&lt;/a&gt; R packages, work with
public opinion and sub-national data to tackle our initial question.&lt;/p&gt;
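&lt;p&gt;The station-to-LAU matching step can be sketched as follows. This is a simplified illustration, not our production code: the coordinates and pollution values are invented, each LAU is represented by a single centroid point, and the helper &lt;code&gt;haversine_km()&lt;/code&gt; is a hypothetical function written for this example.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Great-circle distance in km between points given in decimal degrees
haversine_km &amp;lt;- function(lat1, lon1, lat2, lon2) {
  to_rad &amp;lt;- pi / 180
  dlat &amp;lt;- (lat2 - lat1) * to_rad
  dlon &amp;lt;- (lon2 - lon1) * to_rad
  a &amp;lt;- sin(dlat / 2)^2 +
    cos(lat1 * to_rad) * cos(lat2 * to_rad) * sin(dlon / 2)^2
  2 * 6371 * asin(sqrt(a))
}

# Invented stations with annual mean PM2.5, and invented LAU centroids
stations &amp;lt;- data.frame(id = c(&amp;quot;S1&amp;quot;, &amp;quot;S2&amp;quot;), lat = c(50.85, 51.22),
                       lon = c(4.35, 4.40), pm25 = c(14.2, 12.7))
laus &amp;lt;- data.frame(lau = c(&amp;quot;A&amp;quot;, &amp;quot;B&amp;quot;, &amp;quot;C&amp;quot;), lat = c(50.90, 51.20, 50.80),
                   lon = c(4.40, 4.45, 4.30))

# For each LAU, find the row index of the nearest station
nearest &amp;lt;- sapply(seq_len(nrow(laus)), function(i) {
  which.min(haversine_km(laus$lat[i], laus$lon[i], stations$lat, stations$lon))
})

# Assign the nearest station&#39;s pollution value to the LAU
laus$station &amp;lt;- stations$id[nearest]
laus$pm25 &amp;lt;- stations$pm25[nearest]
laus
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Once every LAU carries a pollution value, aggregating to a NUTS level is an ordinary grouped summary over the LAU-to-NUTS correspondence.&lt;/p&gt;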
&lt;p&gt;Below you will notice that findings are constrained to countries for
which EEA commonly collects environmental data. Far from being
Euro-centric, our project is inclusive of other countries and continents
for which the pollution data is available – with the aforementioned
packages we could work with any nation’s or larger region’s data. In
fact, we would like to invite contributors with greater knowledge of
reliable data sources from all continents.&lt;/p&gt;
















&lt;figure  id=&#34;figure-our-joined-dataset-allows-hypothesis-testing-on-how-much-peoples-perception-and-attitudes-to-environmental-degradation-depends-on-the-quality-of-the-environment-that-surrounds-them&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img src=&#34;https://greendeal.dataobservatory.eu/img/belgium-flood-risk/blogpost_pm10_pm25_eur.png&#34; alt=&#34;Our joined dataset allows hypothesis testing on how much people&amp;#39;s perception and attitudes to environmental degradation depends on the quality of the environment that surrounds them.&#34; loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption data-pre=&#34;Figure&amp;nbsp;&#34; data-post=&#34;:&amp;nbsp;&#34; class=&#34;numbered&#34;&gt;
      Our joined dataset allows hypothesis testing on how much people&amp;rsquo;s perception and attitudes to environmental degradation depend on the quality of the environment that surrounds them.
    &lt;/figcaption&gt;&lt;/figure&gt;
&lt;p&gt;In the next map, we go beyond the EU/EEA/EU candidate focus to depict
light pollution for the whole European continent. We used the
&lt;a href=&#34;https://figshare.com/articles/dataset/Harmonization_of_DMSP_and_VIIRS_nighttime_light_data_from_1992-2018_at_the_global_scale/9828827?file=17626079&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Harmonized VIIRS nighttime light
data&lt;/a&gt;
for 2014–2018, which is a novel open data source with calibrated global
information on nightlight. This outstanding source offers an
unparalleled opportunity to measure the intensity of socioeconomic
activities and urbanization. We showcase this in our map of estimated
average size of urban areas for every LAU using DN values higher than
30. This is the tip of the iceberg, as our mapping capabilities may extend
to any available subnational data around the globe.&lt;/p&gt;
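&lt;p&gt;The thresholding rule can be illustrated on a toy pixel grid. The DN values below are invented; in a real analysis they would be read from the nighttime light raster and clipped to the boundary of each LAU.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Invented digital number (DN) values for the pixels inside one LAU
dn &amp;lt;- matrix(c( 5, 12, 33, 41,
                8, 31, 56, 62,
                3,  9, 28, 35), nrow = 3, byrow = TRUE)

# A pixel counts as urban when its DN value is higher than 30
urban &amp;lt;- dn &amp;gt; 30

# Number of urban pixels and their share of the LAU
sum(urban)
mean(urban)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Multiplying the number of urban pixels by the ground area of one pixel would turn this count into an area estimate.&lt;/p&gt;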
&lt;p&gt;The VIIRS nighttime light dataset excels particularly in countries and
regions where GDP estimation and disaggregation is patchy or
non-existent. We would like to find collaborators from Africa, the Arab
World, the Caucasus and Latin America, where we have harmonized,
individual level survey data and socio-econometric data, to join forces
with us to build relevant sub-national regional dictionaries for the
&lt;a href=&#34;https://regions.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;regions&lt;/a&gt; package, which can do the
rest of the work.&lt;/p&gt;
&lt;img src=&#34;urban_lights.png&#34; alt=&#34;Nighttime lights are accurate predictors of local income, energy use and contribution to carbon emissions.&#34; width=&#34;1200&#34; /&gt;
&lt;p class=&#34;caption&#34;&gt;
Nighttime lights are accurate predictors of local income, energy use and
contribution to carbon emissions.
&lt;/p&gt;
&lt;p&gt;In the final map, we use the &lt;a href=&#34;https://land.copernicus.eu/pan-european/high-resolution-layers/forests/tree-cover-density&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Copernicus Tree Cover
Density&lt;/a&gt;
dataset to compute how much deforestation has taken place on the LAU
level in Europe between 2015 and 2019. Using our regions package, these
data could easily be paired with public opinion and NUTS-level data to
analyze how deforestation influences individual attitudes on climate
change.&lt;/p&gt;
&lt;img src=&#34;forest_change_2015_2019_africa.png&#34; alt=&#34;Deforestation is a key factor in carbon emission, because trees store so much carbon. Any path to net zero carbon emission requires a vast reforestation of the Earth.&#34; width=&#34;1200&#34; /&gt;
&lt;p class=&#34;caption&#34;&gt;
Deforestation is a key factor in carbon emission, because trees store
so much carbon. Any path to net zero carbon emission requires a vast
reforestation of the Earth.
&lt;/p&gt;
&lt;p&gt;As we can see, in most of Europe deforestation is ongoing. This is
partly caused by effects of climate change, and it in turn further aggravates
the situation, as the fallen trees release previously captured CO2. For
example, in Slovakia the Tatra mountains lost many trees in a
devastating storm; such extreme weather conditions kill vulnerable tree
cover, leading to soil erosion. Again, the Copernicus tree cover data are
available for the entire Earth, and our
&lt;a href=&#34;https://regions.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;regions&lt;/a&gt; package only requires
local geocoding and geographical vocabulary additions to allow analysis
on almost all continents.&lt;/p&gt;
&lt;p&gt;All this artwork barely scratches the surface of the possibilities that
mapping sensor data could offer to NGOs, think-tanks, small
enterprises as well as academic institutions. Most importantly, this
powerful approach could help these actors effectively link patterns in
environmental change to individual attitudes and subnational
socio-economic data.&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Regional Geocoding Harmonization Case Study - Regional Climate Change Awareness Datasets</title>
      <link>https://greendeal.dataobservatory.eu/post/2021-03-06-regions-climate/</link>
      <pubDate>Sat, 06 Mar 2021 00:00:00 +0000</pubDate>
      <guid>https://greendeal.dataobservatory.eu/post/2021-03-06-regions-climate/</guid>
      <description>&lt;pre&gt;&lt;code&gt;library(regions)
library(lubridate)
library(dplyr)

if ( dir.exists(&#39;data-raw&#39;) ) {
  data_raw_dir &amp;lt;- &amp;quot;data-raw&amp;quot;
} else {
  data_raw_dir &amp;lt;- file.path(&amp;quot;..&amp;quot;, &amp;quot;..&amp;quot;, &amp;quot;data-raw&amp;quot;)
  }
&lt;/code&gt;&lt;/pre&gt;
&lt;h2 id=&#34;going-beyond-the-national-level&#34;&gt;Going beyond the national level&lt;/h2&gt;
&lt;p&gt;Let’s start with a quick and dirty average by sub-national unit. The &lt;code&gt;w1&lt;/code&gt;
weighting variable contains the post-stratification weight for the
national samples. The Eurobarometer samples represent nations (with the
exception of East and West Germany, Northern Ireland and Great Britain).
The average of the &lt;code&gt;w1&lt;/code&gt; variable is 1.00 for each sample, but it is not
necessarily 1 for smaller territorial units. If &lt;code&gt;sum(w)&amp;gt;1&lt;/code&gt; for, say,
&lt;code&gt;AT23&lt;/code&gt;, it only means that the &lt;code&gt;AT23&lt;/code&gt; region was undersampled relative
to the rest of Austria, and responses must be over-weighted in
post-stratification.&lt;/p&gt;
&lt;p&gt;There is no way to make the samples regionally representative,
and a correct post-stratification would require further data about the
sample design. But we can simply adjust for over/undersampling by making
sure that oversampled territorial averages are proportionally increased
and undersampled ones are decreased. [Another ‘dirty’ averaging would
be the use of an unweighted average, but our method is better, because
it more or less adjusts for gender and education level biases, although it leaves
intra-country regional biases in the sample.]&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;panel &amp;lt;- readRDS((file.path(data_raw_dir, &amp;quot;climate-panel.rds&amp;quot;)))

climate_data &amp;lt;-  panel %&amp;gt;%
  mutate ( year = lubridate::year(date_of_interview)) %&amp;gt;%
  select ( all_of(c(&amp;quot;isocntry&amp;quot;, &amp;quot;geo&amp;quot;, &amp;quot;w1&amp;quot;)), 
           contains(&amp;quot;problem&amp;quot;)
  )  %&amp;gt;%
  mutate ( 
    # use the post-stratification weights for national samples
    serious_world_problems_first = w1*serious_world_problems_first , 
    serious_world_problems_climate_change = w1*serious_world_problems_climate_change) %&amp;gt;%
  group_by (  .data$geo ) %&amp;gt;%
  summarise( serious_world_problems_first = mean(serious_world_problems_first, na.rm=TRUE),
             serious_world_problems_climate_change = mean (serious_world_problems_climate_change, na.rm=TRUE),
             mean_w1 = mean(w1)
             ) %&amp;gt;%
  mutate ( 
    # adjust for post-stratification weight bias due to regional over/undersampling
    climate_first = serious_world_problems_first / mean_w1, 
    climate_mentioned = serious_world_problems_climate_change / mean_w1
    ) 
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;So, we averaged, weighted and adjusted the mentioning of climate change
as the world’s most serious, or one of the most serious problems by NUTS
regions.&lt;/p&gt;
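&lt;p&gt;The adjustment above amounts to a weighted mean within each region:
dividing the mean of the weighted responses by the mean weight gives the
same number as &lt;code&gt;weighted.mean()&lt;/code&gt;. A minimal sketch with made-up
numbers (the vectors &lt;code&gt;x&lt;/code&gt; and &lt;code&gt;w&lt;/code&gt; are hypothetical and
do not come from the Eurobarometer files):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# hypothetical binary responses and post-stratification weights in one region
x &amp;lt;- c(1, 0, 1, 1)
w &amp;lt;- c(1.5, 0.5, 2.0, 1.0)

# the two-step adjustment: weight, average, then divide by the mean weight
adjusted &amp;lt;- mean(w * x) / mean(w)

# the same quantity computed directly
direct &amp;lt;- weighted.mean(x, w)

all.equal(adjusted, direct)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The identity is exact only when no response is missing. With
&lt;code&gt;na.rm=TRUE&lt;/code&gt; in the numerator and all weights in the denominator,
the two computations can differ slightly in regions with missing answers.&lt;/p&gt;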
&lt;h2 id=&#34;aggregation-level&#34;&gt;Aggregation level&lt;/h2&gt;
&lt;p&gt;The problem is that most statistical data is available for the NUTS
regional boundaries according to the &lt;code&gt;NUTS2016&lt;/code&gt; definition. However,
GESIS uses &lt;code&gt;NUTS2013&lt;/code&gt; regions, so 252 regional codes in the four survey
waves are invalid. Some data is available only on the national level, but it
can be projected to the regional level, because small countries like
Luxembourg have no regional divisions. Larger countries like Germany are
divided only on the state level (&lt;code&gt;NUTS1&lt;/code&gt;), while small countries are divided
on the &lt;code&gt;NUTS3&lt;/code&gt; level.&lt;/p&gt;
&lt;p&gt;This leads to various problems. Much of the data is available only on the &lt;code&gt;NUTS2&lt;/code&gt;
level, in which case &lt;code&gt;NUTS1&lt;/code&gt; data should be projected to its constituent
smaller &lt;code&gt;NUTS2&lt;/code&gt; regions, and &lt;code&gt;NUTS3&lt;/code&gt; level data must be aggregated up to
the larger, containing &lt;code&gt;NUTS2&lt;/code&gt; regions.&lt;/p&gt;
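&lt;p&gt;Because NUTS codes are hierarchical (two letters for the country, then
one extra character per level), the parent of a region can be read off its
code. A rough sketch of both directions, with invented values and an
unweighted mean standing in for a proper population-weighted aggregation:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# invented NUTS3 observations
nuts3 &amp;lt;- data.frame(
  geo   = c(&amp;quot;AT111&amp;quot;, &amp;quot;AT112&amp;quot;, &amp;quot;AT113&amp;quot;, &amp;quot;AT121&amp;quot;, &amp;quot;AT122&amp;quot;),
  value = c(10, 20, 30, 40, 60)
)

# aggregate up: the NUTS2 parent is the first four characters of the code
nuts3$nuts2 &amp;lt;- substr(nuts3$geo, 1, 4)
nuts2_from_nuts3 &amp;lt;- aggregate(value ~ nuts2, data = nuts3, FUN = mean)

# project down: every NUTS2 region inherits the value of its NUTS1 parent
nuts1 &amp;lt;- data.frame(nuts1 = &amp;quot;AT1&amp;quot;, value = 25)
nuts2 &amp;lt;- data.frame(nuts2 = c(&amp;quot;AT11&amp;quot;, &amp;quot;AT12&amp;quot;, &amp;quot;AT13&amp;quot;))
nuts2$nuts1 &amp;lt;- substr(nuts2$nuts2, 1, 3)
nuts2_from_nuts1 &amp;lt;- merge(nuts2, nuts1, by = &amp;quot;nuts1&amp;quot;)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The prefix rule only holds within one NUTS version, which is why the
choice of boundary definition has to be made first.&lt;/p&gt;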
&lt;p&gt;Of course, we also must choose whether we use &lt;code&gt;NUTS2013&lt;/code&gt; or &lt;code&gt;NUTS2016&lt;/code&gt;
boundaries. Sub-national boundaries have changed many thousands of times in
the EU27 countries alone since 1999.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;## # A tibble: 5 x 2
##   validate         n
##   &amp;lt;chr&amp;gt;        &amp;lt;int&amp;gt;
## 1 country         15
## 2 invalid        252
## 3 nuts_level_1   132
## 4 nuts_level_2   452
## 5 nuts_level_3   141
&lt;/code&gt;&lt;/pre&gt;
&lt;h2 id=&#34;recoding-the-regions&#34;&gt;Recoding the Regions&lt;/h2&gt;
&lt;p&gt;Our regions package was designed to keep track of sub-national regional
boundary changes. It can validate regional data codes, and to some
extent carry out recoding, imputation or simple aggregation.&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Recoding means that the boundaries are unchanged, but the country
changed the names/codes of regions, because there were other
boundary changes which did not affect our observation unit.&lt;/li&gt;
&lt;li&gt;Imputation must not be done with the usual, general imputation tools,
because our data is regionally structured. However, some imputations
are very simple, because we can use equalities like &lt;code&gt;MT&lt;/code&gt; =
&lt;code&gt;MT0&lt;/code&gt; = &lt;code&gt;MT00&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;Often the boundary change is additive, and merged territorial units
can simply be aggregated for comparison with earlier data.&lt;/li&gt;
&lt;/ul&gt;
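&lt;p&gt;The equality-based imputation in the second bullet can be sketched in a
few lines. The helper &lt;code&gt;copy_to_lower_levels()&lt;/code&gt; is hypothetical (it
is not a function of the regions package), and the values are invented:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# invented national-level values for two small countries
national &amp;lt;- data.frame(geo = c(&amp;quot;MT&amp;quot;, &amp;quot;LU&amp;quot;), value = c(0.31, 0.27))

# hypothetical helper: where the NUTS1 and NUTS2 units coincide with the
# country, copy the national value to the zero-padded codes
copy_to_lower_levels &amp;lt;- function(df) {
  nuts1 &amp;lt;- transform(df, geo = paste0(geo, &amp;quot;0&amp;quot;))
  nuts2 &amp;lt;- transform(df, geo = paste0(geo, &amp;quot;00&amp;quot;))
  rbind(df, nuts1, nuts2)
}

imputed &amp;lt;- copy_to_lower_levels(national)
imputed[order(imputed$geo), ]
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This shortcut is only safe for countries where the lower-level units
really are identical to the country; for all others the copied value would be
a projection, not an equality.&lt;/p&gt;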
&lt;!-- --&gt;
&lt;pre&gt;&lt;code&gt;regional_coding_2016 &amp;lt;- panel %&amp;gt;%
  mutate ( year = lubridate::year(date_of_interview)) %&amp;gt;%
  select (  all_of(c(&amp;quot;isocntry&amp;quot;, &amp;quot;geo&amp;quot;, &amp;quot;region&amp;quot;, &amp;quot;year&amp;quot;) ) ) %&amp;gt;%
  distinct_all() %&amp;gt;%
  recode_nuts()

regional_coding_2013 &amp;lt;- panel %&amp;gt;%
  mutate ( year = lubridate::year(date_of_interview)) %&amp;gt;%
  select (  all_of(c(&amp;quot;isocntry&amp;quot;, &amp;quot;geo&amp;quot;, &amp;quot;region&amp;quot;, &amp;quot;year&amp;quot;) ) ) %&amp;gt;%
  distinct_all() %&amp;gt;%
  recode_nuts( nuts_year = 2013)

climate_data_recoded &amp;lt;- climate_data %&amp;gt;% 
  left_join ( regional_coding_2016, by = &#39;geo&#39; ) %&amp;gt;%
  left_join ( regional_coding_2013 %&amp;gt;% 
                select ( all_of(c(&amp;quot;geo&amp;quot;, &amp;quot;code_2013&amp;quot;))), 
              by = &amp;quot;geo&amp;quot;) %&amp;gt;%
  distinct_all()

saveRDS ( climate_data_recoded , file.path(tempdir(), &amp;quot;climate_panel_recoded_agr.rds&amp;quot;), version = 2)

# not evaluated
saveRDS( climate_data_recoded , file = file.path(&amp;quot;data-raw&amp;quot;, &amp;quot;climate_panel_recoded_agr.rds&amp;quot;))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;















&lt;figure  &gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img src=&#34;https://netzero.dataobservatory.eu/media/gif/eu_climate_change.gif&#34; alt=&#34;&#34; loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;/figure&gt;
&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Where Are People More Likely To Treat Climate Change as the Most Serious Global Problem?</title>
      <link>https://greendeal.dataobservatory.eu/post/2021-03-06-individual-join/</link>
      <pubDate>Sat, 06 Mar 2021 00:00:00 +0000</pubDate>
      <guid>https://greendeal.dataobservatory.eu/post/2021-03-06-individual-join/</guid>
      <description>&lt;pre&gt;&lt;code&gt;library(regions)
library(lubridate)
library(dplyr)

if ( dir.exists(&#39;data-raw&#39;) ) {
  data_raw_dir &amp;lt;- &amp;quot;data-raw&amp;quot;
} else {
  data_raw_dir &amp;lt;- file.path(&amp;quot;..&amp;quot;, &amp;quot;..&amp;quot;, &amp;quot;data-raw&amp;quot;)
  }
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The first results of our longitudinal table &lt;a href=&#34;post/2021-03-05-retroharmonize-climate/&#34;&gt;were difficult to
map&lt;/a&gt;, because the surveys used
an obsolete regional coding. We will correct the coding where
possible, and join the data with the European Environment Agency’s (EEA)
Air Quality e-Reporting (AQ e-Reporting) data on environmental
pollution. We recoded the annual level for every available reporting
station [&lt;em&gt;not shown here&lt;/em&gt;]; all values are in μg/m3. The period
under observation is 2014-2016. Data file:
&lt;a href=&#34;https://www.eea.europa.eu/data-and-maps/data/aqereporting-8&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;https://www.eea.europa.eu/data-and-maps/data/aqereporting-8&lt;/a&gt; (European
Environment Agency 2021).&lt;/p&gt;
&lt;h2 id=&#34;recoding-the-regions&#34;&gt;Recoding the Regions&lt;/h2&gt;
&lt;p&gt;Recoding means that the boundaries are unchanged, but the country
changed the names and codes of regions because there were other boundary
changes which did not affect our observation unit. We explain the
problem and the solution in greater detail in &lt;a href=&#34;http://netzero.dataobservatory.eu/post/2021-03-06-regions-climate/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;our
tutorial&lt;/a&gt;
that aggregates the data on regional levels.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;panel &amp;lt;- readRDS((file.path(data_raw_dir, &amp;quot;climate-panel.rds&amp;quot;)))

climate_data_geocode &amp;lt;-  panel %&amp;gt;%
  mutate ( year = lubridate::year(date_of_interview)) %&amp;gt;%
  recode_nuts()
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Let’s load the air pollution data and join it by the corrected geocodes:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;load(file.path(&amp;quot;data&amp;quot;, &amp;quot;air_pollutants.rda&amp;quot;)) ## good practice to use system-independent file.path

climate_awareness_air &amp;lt;- climate_data_geocode %&amp;gt;%
  rename ( region_nuts_codes = .data$code_2016) %&amp;gt;%
  left_join ( air_pollutants, by = &amp;quot;region_nuts_codes&amp;quot; ) %&amp;gt;%
  select ( -all_of(c(&amp;quot;w1&amp;quot;, &amp;quot;wex&amp;quot;, &amp;quot;date_of_interview&amp;quot;, 
                     &amp;quot;typology&amp;quot;, &amp;quot;typology_change&amp;quot;, &amp;quot;geo&amp;quot;, &amp;quot;region&amp;quot;))) %&amp;gt;%
  mutate (
    # remove special labels and create NA_numeric_ 
    age_education = retroharmonize::as_numeric(age_education)) %&amp;gt;%
  mutate_if ( is.character, as.factor) %&amp;gt;%
  mutate ( 
    # we only have responses from 4 years, and this should be treated as a categorical variable
    year = as.factor(year) 
    ) %&amp;gt;%
  filter ( complete.cases(.) ) 
&lt;/code&gt;&lt;/pre&gt;
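&lt;p&gt;A left join keeps every respondent even when a region code has no
counterpart in the pollution table, and the &lt;code&gt;complete.cases()&lt;/code&gt;
filter then drops those rows without warning. A small sketch for checking
which codes would be lost, using toy tables (&lt;code&gt;survey&lt;/code&gt; and
&lt;code&gt;pollution&lt;/code&gt; are invented stand-ins, not the real data):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;library(dplyr)

# toy stand-ins for the recoded survey and the pollution table
survey &amp;lt;- data.frame(
  rowid = 1:4,
  region_nuts_codes = c(&amp;quot;AT13&amp;quot;, &amp;quot;HU11&amp;quot;, &amp;quot;HU11&amp;quot;, &amp;quot;XX99&amp;quot;)
)
pollution &amp;lt;- data.frame(
  region_nuts_codes = c(&amp;quot;AT13&amp;quot;, &amp;quot;HU11&amp;quot;),
  pm10 = c(21.5, 30.2)
)

# respondents whose region code is missing from the pollution table
unmatched &amp;lt;- anti_join(survey, pollution, by = &amp;quot;region_nuts_codes&amp;quot;)

# the join itself, followed by the same complete-case filter
joined   &amp;lt;- left_join(survey, pollution, by = &amp;quot;region_nuts_codes&amp;quot;)
complete &amp;lt;- filter(joined, complete.cases(joined))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Comparing &lt;code&gt;nrow(joined)&lt;/code&gt; with &lt;code&gt;nrow(complete)&lt;/code&gt; shows
how many respondents are excluded from the later analysis.&lt;/p&gt;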
&lt;p&gt;The &lt;code&gt;climate_awareness_air&lt;/code&gt; data frame contains the answers of 75086
individual respondents. 17.07% thought that climate change was the most
serious world problem and 33.6% mentioned climate change as one of the
three most important global problems.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;summary ( climate_awareness_air  )

##                  rowid       serious_world_problems_first
##  ZA5877_v2-0-0_1    :    1   Min.   :0.0000              
##  ZA5877_v2-0-0_10   :    1   1st Qu.:0.0000              
##  ZA5877_v2-0-0_100  :    1   Median :0.0000              
##  ZA5877_v2-0-0_1000 :    1   Mean   :0.1707              
##  ZA5877_v2-0-0_10000:    1   3rd Qu.:0.0000              
##  ZA5877_v2-0-0_10001:    1   Max.   :1.0000              
##  (Other)            :75080                               
##  serious_world_problems_climate_change    isocntry    
##  Min.   :0.000                         BE     : 3028  
##  1st Qu.:0.000                         CZ     : 3023  
##  Median :0.000                         NL     : 3019  
##  Mean   :0.336                         SK     : 3000  
##  3rd Qu.:1.000                         SE     : 2980  
##  Max.   :1.000                         DE-W   : 2978  
##                                        (Other):57058  
##                                    marital_status         age_education  
##  (Re-)Married: without children           :13242   18            :15485  
##  (Re-)Married: children this marriage     :12696   19            : 7728  
##  Single: without children                 : 7650   16            : 5840  
##  (Re-)Married: w children of this marriage: 6520   still studying: 5098  
##  (Re-)Married: living without children    : 6225   17            : 5092  
##  Single: living without children          : 4102   15            : 4528  
##  (Other)                                  :24651   (Other)       :31315  
##    age_exact                      occupation_of_respondent
##  Min.   :15.0   Retired, unable to work       :22911      
##  1st Qu.:36.0   Skilled manual worker         : 6774      
##  Median :51.0   Employed position, at desk    : 6716      
##  Mean   :50.1   Employed position, service job: 5624      
##  3rd Qu.:65.0   Middle management, etc.       : 5252      
##  Max.   :99.0   Student                       : 5098      
##                 (Other)                       :22711      
##             occupation_of_respondent_recoded
##  Employed (10-18 in d15a)   :32763          
##  Not working (1-4 in d15a)  :37125          
##  Self-employed (5-9 in d15a): 5198          
##                                             
##                                             
##                                             
##                                             
##                        respondent_occupation_scale_c_14
##  Retired (4 in d15a)                   :22911          
##  Manual workers (15 to 18 in d15a)     :15269          
##  Other white collars (13 or 14 in d15a): 9203          
##  Managers (10 to 12 in d15a)           : 8291          
##  Self-employed (5 to 9 in d15a)        : 5198          
##  Students (2 in d15a)                  : 5098          
##  (Other)                               : 9116          
##                   type_of_community   is_student      no_education     
##  DK                        :   34   Min.   :0.0000   Min.   :0.000000  
##  Large town                :20939   1st Qu.:0.0000   1st Qu.:0.000000  
##  Rural area or village     :24686   Median :0.0000   Median :0.000000  
##  Small or middle sized town: 9850   Mean   :0.0679   Mean   :0.008151  
##  Small/middle town         :19577   3rd Qu.:0.0000   3rd Qu.:0.000000  
##                                     Max.   :1.0000   Max.   :1.000000  
##                                                                        
##    education       year       region_nuts_codes  country_code  
##  Min.   :14.00   2013:25103   LU     : 1432     DE     : 4531  
##  1st Qu.:17.00   2015:    0   MT     : 1398     GB     : 3538  
##  Median :18.00   2017:25053   CY     : 1192     BE     : 3028  
##  Mean   :19.61   2019:24930   SK02   : 1053     CZ     : 3023  
##  3rd Qu.:22.00                EL30   :  974     NL     : 3019  
##  Max.   :30.00                EE     :  973     SK     : 3000  
##                               (Other):68064     (Other):54947  
##      pm2_5             pm10               o3              BaP        
##  Min.   : 2.109   Min.   :  5.883   Min.   : 66.37   Min.   :0.0102  
##  1st Qu.: 9.374   1st Qu.: 28.326   1st Qu.: 90.89   1st Qu.:0.1779  
##  Median :11.866   Median : 33.673   Median :102.81   Median :0.4105  
##  Mean   :12.954   Mean   : 38.637   Mean   :101.49   Mean   :0.8759  
##  3rd Qu.:15.890   3rd Qu.: 49.488   3rd Qu.:110.73   3rd Qu.:1.0692  
##  Max.   :41.293   Max.   :123.239   Max.   :141.04   Max.   :7.8050  
##                                                                      
##       so2              ap_pc1            ap_pc2             ap_pc3       
##  Min.   : 0.0000   Min.   :-4.6669   Min.   :-2.21851   Min.   :-2.1007  
##  1st Qu.: 0.0000   1st Qu.:-0.4624   1st Qu.:-0.49130   1st Qu.:-0.5695  
##  Median : 0.0000   Median : 0.4263   Median : 0.02902   Median :-0.1113  
##  Mean   : 0.1032   Mean   : 0.1031   Mean   : 0.04166   Mean   :-0.1746  
##  3rd Qu.: 0.0000   3rd Qu.: 0.9748   3rd Qu.: 0.57416   3rd Qu.: 0.3309  
##  Max.   :42.5325   Max.   : 2.0344   Max.   : 3.25841   Max.   : 4.1615  
##                                                                          
##      ap_pc4            ap_pc5        
##  Min.   :-1.7387   Min.   :-2.75079  
##  1st Qu.:-0.1669   1st Qu.:-0.18748  
##  Median : 0.0371   Median : 0.01811  
##  Mean   : 0.1154   Mean   : 0.06797  
##  3rd Qu.: 0.3050   3rd Qu.: 0.34937  
##  Max.   : 3.2476   Max.   : 1.42816  
## 
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Let’s see a simple CART tree! We remove the regional codes, because
there are very serious differences among regional climate awareness.
These differences, together with education level, and the year we are
talking about, are the most important predictors of thinking about
climate change as the most important global problem in Europe.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Classification Tree with rpart
library(rpart)

# grow tree
fit &amp;lt;- rpart(as.factor(serious_world_problems_first) ~ .,
   method=&amp;quot;class&amp;quot;, data=climate_awareness_air %&amp;gt;%
     select ( - all_of(c(&amp;quot;rowid&amp;quot;, &amp;quot;region_nuts_codes&amp;quot;))), 
   control = rpart.control(cp = 0.005))

printcp(fit) # display the results

## 
## Classification tree:
## rpart(formula = as.factor(serious_world_problems_first) ~ ., 
##     data = climate_awareness_air %&amp;gt;% select(-all_of(c(&amp;quot;rowid&amp;quot;, 
##         &amp;quot;region_nuts_codes&amp;quot;))), method = &amp;quot;class&amp;quot;, control = rpart.control(cp = 0.005))
## 
## Variables actually used in tree construction:
## [1] age_education                         isocntry                             
## [3] serious_world_problems_climate_change year                                 
## 
## Root node error: 12817/75086 = 0.1707
## 
## n= 75086 
## 
##          CP nsplit rel error  xerror      xstd
## 1 0.0240566      0   1.00000 1.00000 0.0080438
## 2 0.0082703      3   0.92783 0.92783 0.0078055
## 3 0.0050000      5   0.91129 0.91425 0.0077588

plotcp(fit) # visualize cross-validation results
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;















&lt;figure  &gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;&amp;amp;ldquo;Visualize cross-validation results&amp;amp;rdquo;&#34; srcset=&#34;
               /post/2021-03-06-individual-join/rpart-1_hu9f1f775a32eec3a67a573c0d2df50ef4_4271_8ce48ac0f7ba6b1d3752385b96368cc3.webp 400w,
               /post/2021-03-06-individual-join/rpart-1_hu9f1f775a32eec3a67a573c0d2df50ef4_4271_b20e6dca7fcadd4576da216956498a35.webp 760w,
               /post/2021-03-06-individual-join/rpart-1_hu9f1f775a32eec3a67a573c0d2df50ef4_4271_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/post/2021-03-06-individual-join/rpart-1_hu9f1f775a32eec3a67a573c0d2df50ef4_4271_8ce48ac0f7ba6b1d3752385b96368cc3.webp&#34;
               width=&#34;672&#34;
               height=&#34;480&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;/figure&gt;
&lt;/p&gt;
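&lt;p&gt;The &lt;code&gt;rel error&lt;/code&gt; and &lt;code&gt;xerror&lt;/code&gt; columns printed by
&lt;code&gt;printcp()&lt;/code&gt; are scaled so that the root node equals 1. Multiplying
them by the root node error gives misclassification rates on the original
scale. A short sketch that re-types the printed numbers (nothing is refitted
here):&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# root node error as reported by printcp()
root_error &amp;lt;- 12817 / 75086

# the complexity table, copied from the printed output
cp_table &amp;lt;- data.frame(
  nsplit    = c(0, 3, 5),
  rel_error = c(1.00000, 0.92783, 0.91129),
  xerror    = c(1.00000, 0.92783, 0.91425)
)

# rescale to absolute misclassification rates
cp_table$train_error &amp;lt;- root_error * cp_table$rel_error
cp_table$cv_error    &amp;lt;- root_error * cp_table$xerror

round(cp_table, 4)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;With five splits the cross-validated error falls from about 17.1% to
about 15.6%, so the tree is only a modest improvement over always predicting
that climate change is not named as the most serious problem.&lt;/p&gt;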
&lt;pre&gt;&lt;code&gt;summary(fit) # detailed summary of splits

## Call:
## rpart(formula = as.factor(serious_world_problems_first) ~ ., 
##     data = climate_awareness_air %&amp;gt;% select(-all_of(c(&amp;quot;rowid&amp;quot;, 
##         &amp;quot;region_nuts_codes&amp;quot;))), method = &amp;quot;class&amp;quot;, control = rpart.control(cp = 0.005))
##   n= 75086 
## 
##            CP nsplit rel error    xerror        xstd
## 1 0.024056592      0 1.0000000 1.0000000 0.008043837
## 2 0.008270266      3 0.9278302 0.9278302 0.007805478
## 3 0.005000000      5 0.9112897 0.9142545 0.007758824
## 
## Variable importance
## serious_world_problems_climate_change                              isocntry 
##                                    31                                    26 
##                          country_code                                   BaP 
##                                    20                                     8 
##                                 pm2_5                                ap_pc1 
##                                     4                                     3 
##                         age_education                                  pm10 
##                                     2                                     2 
##                             education                                ap_pc2 
##                                     2                                     1 
##                                  year 
##                                     1 
## 
## Node number 1: 75086 observations,    complexity param=0.02405659
##   predicted class=0  expected loss=0.1706976  P(node): 1
##     class counts: 62269 12817
##    probabilities: 0.829 0.171 
##   left son=2 (25229 obs) right son=3 (49857 obs)
##   Primary splits:
##       serious_world_problems_climate_change &amp;lt; 0.5          to the right, improve=2214.2040, (0 missing)
##       isocntry                              splits as  RRLLLRRRLLRLRLLLLLLLLLLRRLLLRLL, improve= 728.0160, (0 missing)
##       country_code                          splits as  RRLLLRRLLRLLLLLLLLLLRRLLLRLL, improve= 673.3656, (0 missing)
##       BaP                                   &amp;lt; 0.4300347    to the right, improve= 310.6229, (0 missing)
##       pm2_5                                 &amp;lt; 13.38264     to the right, improve= 296.4013, (0 missing)
##   Surrogate splits:
##       age_education splits as  ----RRRRRR-RRRRRRRRRR-RRRRRRRRRR-RRRRRRRRRR-RRRRRRRRRR-RRRRRL-RRR-RRRRRRRRR--RRRLLR--R-R, agree=0.664, adj=0, (0 split)
##       pm10          &amp;lt; 7.491315     to the left,  agree=0.664, adj=0, (0 split)
## 
## Node number 2: 25229 observations
##   predicted class=0  expected loss=0  P(node): 0.3360014
##     class counts: 25229     0
##    probabilities: 1.000 0.000 
## 
## Node number 3: 49857 observations,    complexity param=0.02405659
##   predicted class=0  expected loss=0.2570752  P(node): 0.6639986
##     class counts: 37040 12817
##    probabilities: 0.743 0.257 
##   left son=6 (34631 obs) right son=7 (15226 obs)
##   Primary splits:
##       isocntry     splits as  RRLLLRRRLLRLRLLLLLLLLLLRRLLLRLL, improve=1454.9460, (0 missing)
##       country_code splits as  RRLLLRRLLRLLLLLLLLLLRRLLLRLL, improve=1359.7210, (0 missing)
##       BaP          &amp;lt; 0.4300347    to the right, improve= 629.8844, (0 missing)
##       pm2_5        &amp;lt; 13.38264     to the right, improve= 555.7484, (0 missing)
##       ap_pc1       &amp;lt; -0.005459537 to the left,  improve= 533.3579, (0 missing)
##   Surrogate splits:
##       country_code splits as  RRLLLRRLLRLLLLLLLLLLRRLLLRLL, agree=0.987, adj=0.957, (0 split)
##       BaP          &amp;lt; 0.1749425    to the right, agree=0.775, adj=0.264, (0 split)
##       pm2_5        &amp;lt; 5.206993     to the right, agree=0.737, adj=0.140, (0 split)
##       ap_pc1       &amp;lt; 1.405527     to the left,  agree=0.733, adj=0.126, (0 split)
##       pm10         &amp;lt; 25.31211     to the right, agree=0.718, adj=0.076, (0 split)
## 
## Node number 6: 34631 observations
##   predicted class=0  expected loss=0.1769802  P(node): 0.4612178
##     class counts: 28502  6129
##    probabilities: 0.823 0.177 
## 
## Node number 7: 15226 observations,    complexity param=0.02405659
##   predicted class=0  expected loss=0.4392487  P(node): 0.2027808
##     class counts:  8538  6688
##    probabilities: 0.561 0.439 
##   left son=14 (11607 obs) right son=15 (3619 obs)
##   Primary splits:
##       isocntry      splits as  LL---LLR--L-L----------LL---R--, improve=337.5462, (0 missing)
##       country_code  splits as  LL---LR--L-L--------LL---R--, improve=337.5462, (0 missing)
##       age_education splits as  ----LLLLLL-LLLRRRRRRR-RRRRRRRRRL-RRRRRRLLRR-RRRRLLRLRL-RRLRRR-RRR-LLLLRRR-----LR-----L-R, improve=294.0807, (0 missing)
##       education     &amp;lt; 22.5         to the left,  improve=262.3747, (0 missing)
##       BaP           &amp;lt; 0.053328     to the right, improve=232.7043, (0 missing)
##   Surrogate splits:
##       BaP           &amp;lt; 0.053328     to the right, agree=0.878, adj=0.485, (0 split)
##       pm2_5         &amp;lt; 4.810361     to the right, agree=0.827, adj=0.271, (0 split)
##       ap_pc2        &amp;lt; 0.8746175    to the left,  agree=0.792, adj=0.124, (0 split)
##       so2           &amp;lt; 0.3302972    to the left,  agree=0.781, adj=0.078, (0 split)
##       age_education splits as  ----LLLLLL-LLLLLLLRLR-LRRLRRRRRR-RRRRLLLLLR-LRLRLLRRLL-LLRLLR-LLR-RRLLLLL-----RR-----R-L, agree=0.779, adj=0.071, (0 split)
## 
## Node number 14: 11607 observations,    complexity param=0.008270266
##   predicted class=0  expected loss=0.3804601  P(node): 0.1545827
##     class counts:  7191  4416
##    probabilities: 0.620 0.380 
##   left son=28 (7462 obs) right son=29 (4145 obs)
##   Primary splits:
##       age_education                    splits as  ----LLLLLL-LRRRRRRRRR-RRLRRLRRLL-RRRRLRLLRR-RLRLLLRLRL-RR-RR--RRL-L-LLRRR------------L-R, improve=123.71070, (0 missing)
##       year                             splits as  R-LR, improve=107.79460, (0 missing)
##       education                        &amp;lt; 20.5         to the left,  improve= 90.28724, (0 missing)
##       occupation_of_respondent         splits as  LRRLRRRRRLRLLLRLLL, improve= 84.62865, (0 missing)
##       respondent_occupation_scale_c_14 splits as  LRLLLRRL, improve= 68.88653, (0 missing)
##   Surrogate splits:
##       education                        &amp;lt; 20.5         to the left,  agree=0.950, adj=0.861, (0 split)
##       occupation_of_respondent         splits as  LLLLRLLRRLRLLLRLLL, agree=0.738, adj=0.267, (0 split)
##       respondent_occupation_scale_c_14 splits as  LRLLLLRL, agree=0.733, adj=0.251, (0 split)
##       is_student                       &amp;lt; 0.5          to the left,  agree=0.709, adj=0.186, (0 split)
##       age_exact                        &amp;lt; 23.5         to the right, agree=0.676, adj=0.094, (0 split)
## 
## Node number 15: 3619 observations
##   predicted class=1  expected loss=0.3722023  P(node): 0.04819807
##     class counts:  1347  2272
##    probabilities: 0.372 0.628 
## 
## Node number 28: 7462 observations
##   predicted class=0  expected loss=0.326052  P(node): 0.09937938
##     class counts:  5029  2433
##    probabilities: 0.674 0.326 
## 
## Node number 29: 4145 observations,    complexity param=0.008270266
##   predicted class=0  expected loss=0.4784077  P(node): 0.05520337
##     class counts:  2162  1983
##    probabilities: 0.522 0.478 
##   left son=58 (2573 obs) right son=59 (1572 obs)
##   Primary splits:
##       year                     splits as  L-LR, improve=40.13885, (0 missing)
##       occupation_of_respondent splits as  LRLLRRRRRLRLLLRLLL, improve=18.33254, (0 missing)
##       marital_status           splits as  LRRRLRRRLRRLRLLRRRRRRLRLRLLRR, improve=17.86888, (0 missing)
##       type_of_community        splits as  LRLRL, improve=17.55254, (0 missing)
##       age_education            splits as  ------------LLRRRRRRR-RR-RL-RR---LRRR-R--LR-R-R---R-R--RR-RR--RR------RRR--------------R, improve=14.66121, (0 missing)
##   Surrogate splits:
##       type_of_community splits as  LLLRL, agree=0.777, adj=0.412, (0 split)
##       marital_status    splits as  RRLLLLLRLLLLLLLRRRLLLLLLRLRLL, agree=0.680, adj=0.155, (0 split)
##       isocntry          splits as  LL---LL---L-R----------LL------, agree=0.669, adj=0.127, (0 split)
##       country_code      splits as  LL---L---L-R--------LL------, agree=0.669, adj=0.127, (0 split)
##       o3                &amp;lt; 83.06345     to the right, agree=0.650, adj=0.076, (0 split)
## 
## Node number 58: 2573 observations
##   predicted class=0  expected loss=0.4240187  P(node): 0.03426737
##     class counts:  1482  1091
##    probabilities: 0.576 0.424 
## 
## Node number 59: 1572 observations
##   predicted class=1  expected loss=0.43257  P(node): 0.02093599
##     class counts:   680   892
##    probabilities: 0.433 0.567

# plot tree
plot(fit, uniform=TRUE,
   main=&amp;quot;Classification Tree: Climate Change Is The Most Serious Threat&amp;quot;)
text(fit, use.n=TRUE, all=TRUE, cex=.8)

## Warning in labels.rpart(x, minlength = minlength): more than 52 levels in a
## predicting factor, truncated for printout
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;















&lt;figure  &gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img alt=&#34;&amp;amp;ldquo;predicting factor, truncated for printout&amp;amp;rdquo;&#34; srcset=&#34;
               /post/2021-03-06-individual-join/rpart-2_hu8765078af843fd2a25e4b77d7cba4bfb_9882_0bdd94d7f6c1efcc2575c1adeb6917c8.webp 400w,
               /post/2021-03-06-individual-join/rpart-2_hu8765078af843fd2a25e4b77d7cba4bfb_9882_daf3b553e16b54a4b23a242bc9ef1e6b.webp 760w,
               /post/2021-03-06-individual-join/rpart-2_hu8765078af843fd2a25e4b77d7cba4bfb_9882_1200x1200_fit_q75_h2_lanczos_3.webp 1200w&#34;
               src=&#34;https://greendeal.dataobservatory.eu/post/2021-03-06-individual-join/rpart-2_hu8765078af843fd2a25e4b77d7cba4bfb_9882_0bdd94d7f6c1efcc2575c1adeb6917c8.webp&#34;
               width=&#34;672&#34;
               height=&#34;480&#34;
               loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;/figure&gt;
&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;saveRDS ( climate_awareness_air , file.path(tempdir(), &amp;quot;climate_panel_recoded.rds&amp;quot;), version = 2)

# not evaluated
saveRDS( climate_awareness_air, file = file.path(&amp;quot;data-raw&amp;quot;, &amp;quot;climate-panel_recoded.rds&amp;quot;))
&lt;/code&gt;&lt;/pre&gt;
</description>
    </item>
    
    <item>
      <title>Retrospective Survey Harmonization Case Study - Climate Awareness Change in Europe 2013-2019.</title>
      <link>https://greendeal.dataobservatory.eu/post/2021-03-05-retroharmonize-climate/</link>
      <pubDate>Fri, 05 Mar 2021 00:00:00 +0000</pubDate>
      <guid>https://greendeal.dataobservatory.eu/post/2021-03-05-retroharmonize-climate/</guid>
      <description>&lt;p&gt;Retrospective survey harmonization comes with many challenges, as we
have shown in the
&lt;a href=&#34;https://greendeal.dataobservatory.eu/post/2021-03-04_retroharmonize_intro/&#34;&gt;introduction&lt;/a&gt;
to this tutorial case study. In this example, we will work with
Eurobarometer’s data.&lt;/p&gt;
&lt;div class=&#34;alert alert-note&#34;&gt;
  &lt;div&gt;
    This code tutorial is not outdated, but the &lt;a href=&#34;https://retroharmonize.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;retroharmonize&lt;/a&gt; R package has a new (development) release with more features.
  &lt;/div&gt;
&lt;/div&gt;
&lt;details class=&#34;spoiler &#34;  id=&#34;spoiler-1&#34;&gt;
  &lt;summary&gt;Click to expand table of contents of the post&lt;/summary&gt;
  &lt;p&gt;&lt;details class=&#34;toc-inpage d-print-none  &#34; open&gt;
  &lt;summary class=&#34;font-weight-bold&#34;&gt;Table of Contents&lt;/summary&gt;
  &lt;nav id=&#34;TableOfContents&#34;&gt;
  &lt;ul&gt;
    &lt;li&gt;&lt;a href=&#34;#get-the-data&#34;&gt;Get the Data&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&#34;#metadata-analysis&#34;&gt;Metadata analysis&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&#34;#metadata-protocol-variables&#34;&gt;Metadata: Protocol Variables&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&#34;#metadata-geographical-information&#34;&gt;Metadata: Geographical information&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&#34;#socio-demography-and-weights&#34;&gt;Socio-demography and Weights&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&#34;#harmonizing-variable-labels&#34;&gt;Harmonizing Variable Labels&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&#34;#creating-the-longitudional-table&#34;&gt;Creating the Longitudional Table&lt;/a&gt;&lt;/li&gt;
    &lt;li&gt;&lt;a href=&#34;#putting-it-on-a-map&#34;&gt;Putting It on a Map&lt;/a&gt;&lt;/li&gt;
  &lt;/ul&gt;
&lt;/nav&gt;
&lt;/details&gt;
&lt;/p&gt;
&lt;/details&gt;
&lt;p&gt;Please use the development version of
&lt;a href=&#34;https://retroharmonize.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;retroharmonize&lt;/a&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;devtools::install_github(&amp;quot;antaldaniel/retroharmonize&amp;quot;)

library(retroharmonize)
library(dplyr)       # this is necessary for the example 
library(lubridate)   # easier date conversion

## Warning: package &#39;lubridate&#39; was built under R version 4.0.4

library(stringr)     # You can also use base R string processing functions 
&lt;/code&gt;&lt;/pre&gt;
&lt;h2 id=&#34;get-the-data&#34;&gt;Get the Data&lt;/h2&gt;
&lt;p&gt;&lt;code&gt;retroharmonize&lt;/code&gt; is not associated with Eurobarometer, or its creators,
Kantar, or its archivists, GESIS. We assume that you have acquired the
necessary files from GESIS after carefully reading their terms, and that you
have placed them in a directory whose path is stored in &lt;code&gt;gesis_dir&lt;/code&gt;. The precise documentation
of the data we use can be found in this supporting
&lt;a href=&#34;http://netzero.dataobservatory.eu/post/2021-03-04-eurobarometer_data/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;blogpost&lt;/a&gt;.
To reproduce this blogpost, you will need &lt;code&gt;ZA5877_v2-0-0.sav&lt;/code&gt;,
&lt;code&gt;ZA6595_v3-0-0.sav&lt;/code&gt;, &lt;code&gt;ZA6861_v1-2-0.sav&lt;/code&gt;, &lt;code&gt;ZA7488_v1-0-0.sav&lt;/code&gt;, and
&lt;code&gt;ZA7572_v1-0-0.sav&lt;/code&gt; in that directory.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;#Not run in the blogpost. In the repo we have a saved version.
climate_change_files &amp;lt;- c(&amp;quot;ZA5877_v2-0-0.sav&amp;quot;, &amp;quot;ZA6595_v3-0-0.sav&amp;quot;,  &amp;quot;ZA6861_v1-2-0.sav&amp;quot;, 
                          &amp;quot;ZA7488_v1-0-0.sav&amp;quot;, &amp;quot;ZA7572_v1-0-0.sav&amp;quot;)

eb_waves &amp;lt;- read_surveys(file.path(gesis_dir, climate_change_files), .f=&#39;read_spss&#39;)

if (dir.exists(&amp;quot;data-raw&amp;quot;)) {
  save ( eb_waves,  file = file.path(&amp;quot;data-raw&amp;quot;, &amp;quot;eb_climate_change_waves.rda&amp;quot;) )
}

if ( file.exists( file.path(&amp;quot;data-raw&amp;quot;, &amp;quot;eb_climate_change_waves.rda&amp;quot;) )) {
  load (file.path( &amp;quot;data-raw&amp;quot;, &amp;quot;eb_climate_change_waves.rda&amp;quot; ) )
} else {
  load (file.path(&amp;quot;..&amp;quot;, &amp;quot;..&amp;quot;,  &amp;quot;data-raw&amp;quot;, &amp;quot;eb_climate_change_waves.rda&amp;quot;) )
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &lt;code&gt;eb_waves&lt;/code&gt; nested list contains five surveys imported from SPSS to
the survey class of
&lt;a href=&#34;https://retroharmonize.dataobservatory.eu/articles/labelled_spss_survey.html&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;retroharmonize&lt;/a&gt;.
The survey class is a data.frame that retains important metadata for
further harmonization.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;document_waves (eb_waves)

## # A tibble: 5 x 5
##   id            filename           ncol  nrow object_size
##   &amp;lt;chr&amp;gt;         &amp;lt;chr&amp;gt;             &amp;lt;int&amp;gt; &amp;lt;int&amp;gt;       &amp;lt;dbl&amp;gt;
## 1 ZA5877_v2-0-0 ZA5877_v2-0-0.sav   604 27919   139352456
## 2 ZA6595_v3-0-0 ZA6595_v3-0-0.sav   519 27718   119370440
## 3 ZA6861_v1-2-0 ZA6861_v1-2-0.sav   657 27901   151397528
## 4 ZA7488_v1-0-0 ZA7488_v1-0-0.sav   752 27339   169465928
## 5 ZA7572_v1-0-0 ZA7572_v1-0-0.sav   348 27655    80562432
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Beware the object sizes. If you work with many surveys, memory-efficient
programming becomes imperative. We will be subsetting whenever possible.&lt;/p&gt;
&lt;h2 id=&#34;metadata-analysis&#34;&gt;Metadata analysis&lt;/h2&gt;
&lt;p&gt;As noted before, prepare to work with nested lists. Each imported survey
is nested as a data frame in the &lt;code&gt;eb_waves&lt;/code&gt; list.&lt;/p&gt;
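&lt;p&gt;The code in the following sections filters a metadata table called
&lt;code&gt;eb_climate_metadata&lt;/code&gt;, whose construction is not shown above. Assuming it holds
one row per variable per survey file (built, for instance, by applying retroharmonize’s
&lt;code&gt;metadata_create()&lt;/code&gt; to each element of &lt;code&gt;eb_waves&lt;/code&gt;, which is an
assumption here), the pattern can be sketched in base R with toy data. The
&lt;code&gt;toy_waves&lt;/code&gt; list and the &lt;code&gt;describe_survey()&lt;/code&gt; helper are hypothetical.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Toy stand-ins for two imported surveys (hypothetical data, not Eurobarometer files)
toy_waves &amp;lt;- list(
  wave_a = data.frame(rowid = c(&amp;quot;a_1&amp;quot;, &amp;quot;a_2&amp;quot;), wextra = c(1.2, 0.8),
                      stringsAsFactors = FALSE),
  wave_b = data.frame(rowid = c(&amp;quot;b_1&amp;quot;, &amp;quot;b_2&amp;quot;, &amp;quot;b_3&amp;quot;), wex = c(1.0, 0.9, 1.1),
                      nuts = c(&amp;quot;HU1&amp;quot;, &amp;quot;HU2&amp;quot;, &amp;quot;HU3&amp;quot;), stringsAsFactors = FALSE)
)

# Hypothetical helper: describe one survey with one row per variable
describe_survey &amp;lt;- function(survey, id) {
  data.frame(
    id            = id,
    var_name_orig = names(survey),
    class_orig    = vapply(survey, function(x) class(x)[1], character(1)),
    row.names     = NULL,
    stringsAsFactors = FALSE
  )
}

# Apply the helper to every list element, then stack the per-survey tables
toy_metadata &amp;lt;- do.call(
  rbind,
  unname(Map(describe_survey, toy_waves, names(toy_waves)))
)
rownames(toy_metadata) &amp;lt;- NULL
toy_metadata
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The resulting long table can then be filtered by variable name or label, one block of
variables at a time.&lt;/p&gt;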
&lt;h2 id=&#34;metadata-protocol-variables&#34;&gt;Metadata: Protocol Variables&lt;/h2&gt;
&lt;p&gt;Eurobarometer calls certain metadata elements, such as the interviewee’s
cooperation level or the date of the survey interview, protocol
variables. Let’s start here. This will be our template to harmonize more
and more aspects of the five surveys (each of which is, in fact, already
a harmonization of about 30 surveys conducted in a single ‘wave’ in
multiple countries).&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# select variables of interest from the metadata
eb_protocol_metadata &amp;lt;- eb_climate_metadata %&amp;gt;%
  filter ( .data$label_orig %in% c(&amp;quot;date of interview&amp;quot;) |
             .data$var_name_orig == &amp;quot;rowid&amp;quot;)  %&amp;gt;%
  suggest_var_names( survey_program = &amp;quot;eurobarometer&amp;quot; )

# subset and harmonize these variables in all nested list items of &#39;waves&#39; of surveys
interview_dates &amp;lt;- harmonize_var_names(eb_waves, 
                                       eb_protocol_metadata )

# apply similar data processing rules to same variables
interview_dates &amp;lt;- lapply (interview_dates, 
                      function (x) x %&amp;gt;% mutate ( date_of_interview = as_character(.data$date_of_interview) )
                      )

# join the individual survey tables into a single table 
interview_dates &amp;lt;- as_tibble ( Reduce (rbind, interview_dates) )

# Check the variable classes.

vapply(interview_dates, function(x) class(x)[1], character(1))

##             rowid date_of_interview 
##       &amp;quot;character&amp;quot;       &amp;quot;character&amp;quot;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;This is our sample workflow for each block of variables.&lt;/p&gt;
&lt;ol&gt;
&lt;li&gt;Get a unique identifier.&lt;/li&gt;
&lt;li&gt;Add other variables.&lt;/li&gt;
&lt;li&gt;Harmonize the variable names.&lt;/li&gt;
&lt;li&gt;Subset the data, leaving out anything that you do not harmonize in
this block.&lt;/li&gt;
&lt;li&gt;Apply some normalization in a nested list.&lt;/li&gt;
&lt;li&gt;Once the variables are harmonized to the same name and class, merge them
into a data.frame-like &lt;code&gt;tibble&lt;/code&gt; object.&lt;/li&gt;
&lt;/ol&gt;
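&lt;p&gt;The steps above can be sketched in base R with toy data. This is only an illustration of
the workflow, not the retroharmonize implementation: the &lt;code&gt;toy_surveys&lt;/code&gt; list, the
&lt;code&gt;crosswalk&lt;/code&gt; table, the original variable names and the
&lt;code&gt;harmonize_block()&lt;/code&gt; helper are all hypothetical.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Two toy surveys with inconsistent variable names (hypothetical data)
toy_surveys &amp;lt;- list(
  wave_a = data.frame(uniqid = 1:2,
                      p1 = c(&amp;quot;Monday, 1st April 2019&amp;quot;, &amp;quot;Tuesday, 2nd April 2019&amp;quot;),
                      qa1 = c(3, 1), stringsAsFactors = FALSE),
  wave_b = data.frame(caseid = 1:2,
                      int_date = c(&amp;quot;Wednesday, 3rd April 2019&amp;quot;, &amp;quot;Thursday, 4th April 2019&amp;quot;),
                      d11 = c(54, 33), stringsAsFactors = FALSE)
)

# Crosswalk: original name and harmonized name of each variable in each survey
crosswalk &amp;lt;- data.frame(
  id       = c(&amp;quot;wave_a&amp;quot;, &amp;quot;wave_a&amp;quot;, &amp;quot;wave_b&amp;quot;, &amp;quot;wave_b&amp;quot;),
  var_orig = c(&amp;quot;uniqid&amp;quot;, &amp;quot;p1&amp;quot;, &amp;quot;caseid&amp;quot;, &amp;quot;int_date&amp;quot;),
  var_new  = c(&amp;quot;rowid&amp;quot;, &amp;quot;date_of_interview&amp;quot;, &amp;quot;rowid&amp;quot;, &amp;quot;date_of_interview&amp;quot;),
  stringsAsFactors = FALSE
)

harmonize_block &amp;lt;- function(survey, id) {
  map &amp;lt;- crosswalk[crosswalk$id == id, ]
  block &amp;lt;- survey[, map$var_orig, drop = FALSE]     # steps 1, 2, 4: keep the ID and this block only
  names(block) &amp;lt;- map$var_new                       # step 3: harmonized names
  block$rowid &amp;lt;- paste(id, block$rowid, sep = &amp;quot;_&amp;quot;)  # make the ID unique across surveys
  block$date_of_interview &amp;lt;- as.character(block$date_of_interview)  # step 5: same class
  block
}

blocks &amp;lt;- Map(harmonize_block, toy_surveys, names(toy_surveys))
interview_block &amp;lt;- do.call(rbind, unname(blocks))   # step 6: one table
interview_block
&lt;/code&gt;&lt;/pre&gt;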
&lt;p&gt;Now finish the harmonization. &lt;code&gt;Wednesday, 31st October 2018&lt;/code&gt; should
become a Date type &lt;code&gt;2018-10-31&lt;/code&gt;.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;require(lubridate)
harmonize_date &amp;lt;- function(x) {
  x &amp;lt;- tolower(as.character(x))
  x &amp;lt;- gsub(&amp;quot;monday|tuesday|wednesday|thursday|friday|saturday|sunday|\\,|th|nd|rd|st&amp;quot;, &amp;quot;&amp;quot;, x)
  x &amp;lt;- gsub(&amp;quot;decemberber&amp;quot;, &amp;quot;december&amp;quot;, x) # all those annoying real-life data problems!
  x &amp;lt;- stringr::str_trim (x, &amp;quot;both&amp;quot;)
  x &amp;lt;- gsub(&amp;quot;^0&amp;quot;, &amp;quot;&amp;quot;, x )
  x &amp;lt;- gsub(&amp;quot;\\s+&amp;quot;, &amp;quot; &amp;quot;, x) # collapse repeated whitespace into a single space
  lubridate::dmy(x) 
}

interview_dates &amp;lt;- interview_dates %&amp;gt;%
  mutate ( date_of_interview = harmonize_date(.data$date_of_interview) )

vapply(interview_dates, function(x) class(x)[1], character(1))

##             rowid date_of_interview 
##       &amp;quot;character&amp;quot;            &amp;quot;Date&amp;quot;
&lt;/code&gt;&lt;/pre&gt;
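&lt;p&gt;The pattern in &lt;code&gt;harmonize_date()&lt;/code&gt; removes &lt;code&gt;st&lt;/code&gt;, &lt;code&gt;nd&lt;/code&gt;,
&lt;code&gt;rd&lt;/code&gt; and &lt;code&gt;th&lt;/code&gt; wherever they occur, so it would also clip a month
name such as &lt;code&gt;august&lt;/code&gt;. A more defensive variant, sketched here in base R under
the assumption that all dates follow the English &lt;code&gt;weekday, day month year&lt;/code&gt;
layout, removes ordinal suffixes only after a digit. The function name
&lt;code&gt;parse_interview_date()&lt;/code&gt; is hypothetical and not part of retroharmonize or
lubridate.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;parse_interview_date &amp;lt;- function(x) {
  x &amp;lt;- tolower(as.character(x))
  # drop weekday names and commas
  x &amp;lt;- gsub(&amp;quot;monday|tuesday|wednesday|thursday|friday|saturday|sunday|,&amp;quot;, &amp;quot;&amp;quot;, x)
  # drop ordinal suffixes only when they follow a digit, so month names stay intact
  x &amp;lt;- gsub(&amp;quot;(\\d)(st|nd|rd|th)\\b&amp;quot;, &amp;quot;\\1&amp;quot;, x)
  # collapse repeated whitespace and trim both ends
  x &amp;lt;- trimws(gsub(&amp;quot;\\s+&amp;quot;, &amp;quot; &amp;quot;, x))
  parts &amp;lt;- strsplit(x, &amp;quot; &amp;quot;, fixed = TRUE)
  iso &amp;lt;- vapply(parts, function(p) {
    if (length(p) != 3) return(NA_character_)
    # month.name is a base R constant with the English month names
    month &amp;lt;- match(p[2], tolower(month.name))
    day   &amp;lt;- suppressWarnings(as.integer(p[1]))
    if (is.na(month) || is.na(day)) return(NA_character_)
    sprintf(&amp;quot;%s-%02d-%02d&amp;quot;, p[3], month, day)
  }, character(1))
  as.Date(iso)
}

parse_interview_date(c(&amp;quot;Wednesday, 31st October 2018&amp;quot;, &amp;quot;Friday, 2nd August 2019&amp;quot;))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Strings that do not fit the assumed layout come back as &lt;code&gt;NA&lt;/code&gt;, which makes
unexpected formats easy to spot before merging.&lt;/p&gt;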
&lt;p&gt;Row IDs are not necessarily unique across &lt;em&gt;different&lt;/em&gt; surveys. To avoid
duplicates, we created a simple, sequential ID within each survey and
prefixed it with the ID of the original file.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;set.seed(2021)
sample_n(interview_dates, 6)

## # A tibble: 6 x 2
##   rowid               date_of_interview
##   &amp;lt;chr&amp;gt;               &amp;lt;date&amp;gt;           
## 1 ZA7488_v1-0-0_7016  2018-10-28       
## 2 ZA7488_v1-0-0_19187 2018-11-02       
## 3 ZA6861_v1-2-0_1218  2017-03-18       
## 4 ZA6861_v1-2-0_4142  2017-03-21       
## 5 ZA7572_v1-0-0_12363 2019-04-17       
## 6 ZA7572_v1-0-0_8071  2019-04-18
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;After this type-conversion problem, let’s look at a case where an original
SPSS variable has two meaningful R representations.&lt;/p&gt;
&lt;h2 id=&#34;metadata-geographical-information&#34;&gt;Metadata: Geographical information&lt;/h2&gt;
&lt;p&gt;Let’s continue with harmonizing geographical information in the files.
In this example, &lt;code&gt;var_name_suggested&lt;/code&gt; will contain the harmonized
variable name. You will most likely have to make this call yourself, after
carefully reading the original questionnaires and codebooks.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;eb_regional_metadata &amp;lt;- eb_climate_metadata %&amp;gt;%
  filter ( grepl( &amp;quot;rowid|isocntry|^nuts$&amp;quot;, .data$var_name_orig)) %&amp;gt;%
  suggest_var_names( survey_program = &amp;quot;eurobarometer&amp;quot; ) %&amp;gt;%
  mutate ( var_name_suggested = case_when ( 
    var_name_suggested == &amp;quot;region_nuts_codes&amp;quot;     ~ &amp;quot;geo&amp;quot;,
    TRUE ~ var_name_suggested ))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &lt;code&gt;harmonize_var_names()&lt;/code&gt; function takes all variables in the subsetted
geographical metadata table and renames them to the harmonized
&lt;code&gt;var_name_suggested&lt;/code&gt; name. The function also subsets the surveys, so that
non-harmonized variables are left out. All regional NUTS codes become
&lt;code&gt;geo&lt;/code&gt; in our case:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;geography &amp;lt;- harmonize_var_names(eb_waves, 
                                 eb_regional_metadata)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;If you are used to working with single survey files, you are likely to work
in a tabular format, which easily converts into a data.frame-like
object, in our example, tidyverse’s &lt;code&gt;tibble&lt;/code&gt;. However, when working
with longitudinal data, it is far simpler to work with nested lists,
because the tables usually have different dimensions (neither the rows
corresponding to observations nor the columns are the same across all
survey files).&lt;/p&gt;
&lt;p&gt;In the nested list, each list element is a single, tabular-format
survey. (In fact, the surveys are in retroharmonize’s
&lt;a href=&#34;https://retroharmonize.dataobservatory.eu/reference/survey.html&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;survey&lt;/a&gt;
class, which is a rich tibble that contains the metadata and the
processing history of the survey.)&lt;/p&gt;
&lt;p&gt;The regional information in the Eurobarometer files is contained in the
&lt;code&gt;nuts&lt;/code&gt; variable. We want to keep both the original labels and values.
The original values are the region’s codes, and the labels are the
names. The easiest and fastest solution is the base R &lt;code&gt;lapply&lt;/code&gt; loop.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;geography &amp;lt;- lapply ( geography, 
                      function (x) x %&amp;gt;% mutate ( region = as_character(geo), 
                                                  geo    = as.character(geo) )  
)
&lt;/code&gt;&lt;/pre&gt;
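&lt;p&gt;The two representations of a coded variable can be illustrated in base R with a plain
lookup vector. This is only a sketch of the idea, not the labelled class used by
retroharmonize; the names &lt;code&gt;nuts_values&lt;/code&gt;, &lt;code&gt;nuts_labels&lt;/code&gt; and
&lt;code&gt;geo_toy&lt;/code&gt; are hypothetical, and the three code and name pairs are taken from
the sample output of this post.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# The stored values are NUTS codes; the labels are the region names
nuts_values &amp;lt;- c(&amp;quot;SI012&amp;quot;, &amp;quot;PL63&amp;quot;, &amp;quot;DK02&amp;quot;, &amp;quot;PL63&amp;quot;)
nuts_labels &amp;lt;- c(SI012 = &amp;quot;Podravska&amp;quot;, PL63 = &amp;quot;Pomorskie&amp;quot;, DK02 = &amp;quot;Sjaelland&amp;quot;)

geo_toy &amp;lt;- data.frame(
  geo    = nuts_values,                       # representation 1: the code
  region = unname(nuts_labels[nuts_values]),  # representation 2: the label
  stringsAsFactors = FALSE
)
geo_toy
&lt;/code&gt;&lt;/pre&gt;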
&lt;p&gt;Because each table has exactly the same columns, we can simply use
&lt;code&gt;rbind()&lt;/code&gt; and reduce the list to a modern &lt;code&gt;data.frame&lt;/code&gt;, i.e. a &lt;code&gt;tibble&lt;/code&gt;.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;geography &amp;lt;- as_tibble ( Reduce (rbind, geography) )
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Let’s see a dozen cases:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;set.seed(2021)
sample_n(geography, 12)

## # A tibble: 12 x 4
##    rowid               isocntry geo   region              
##    &amp;lt;chr&amp;gt;               &amp;lt;chr&amp;gt;    &amp;lt;chr&amp;gt; &amp;lt;chr&amp;gt;               
##  1 ZA7488_v1-0-0_7016  SI       SI012 Podravska           
##  2 ZA7488_v1-0-0_19187 PL       PL63  Pomorskie           
##  3 ZA6861_v1-2-0_1218  DK       DK02  Sjaelland           
##  4 ZA6861_v1-2-0_4142  FI       FI1B  Helsinki-Uusimaa    
##  5 ZA7572_v1-0-0_12363 SE       SE12  Oestra Mellansverige
##  6 ZA7572_v1-0-0_8071  IT       ITH   Nord-Est [IT]       
##  7 ZA6861_v1-2-0_6145  IE       IE021 Dublin              
##  8 ZA6861_v1-2-0_24638 RO       RO31  South [RO]          
##  9 ZA7488_v1-0-0_11315 CY       CY    REPUBLIC OF CYPRUS  
## 10 ZA6595_v3-0-0_27568 HR       HR041 Grad Zagreb         
## 11 ZA7572_v1-0-0_17397 CZ       CZ06  Jihovychod          
## 12 ZA6861_v1-2-0_10993 PT       PT17  Lisboa
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The idea is that we do similar variable harmonization block by block,
and eventually we will join them together. Next step: socio-demography
and weights.&lt;/p&gt;
&lt;h2 id=&#34;socio-demography-and-weights&#34;&gt;Socio-demography and Weights&lt;/h2&gt;
&lt;p&gt;There are a few peculiar issues to look out for. This example shows that
survey harmonization requires plenty of expert judgment, and you cannot
fully automate the process.&lt;/p&gt;
&lt;p&gt;The Eurobarometer archives do not use all weight and demographic
variable names consistently. For example, the projected weight for the
country’s population aged 15 or older is sometimes called &lt;code&gt;wex&lt;/code&gt; and
sometimes &lt;code&gt;wextra&lt;/code&gt;. The individual survey’s
post-stratification weight is the &lt;code&gt;w1&lt;/code&gt; variable, but this is not
necessarily what you need to use.&lt;/p&gt;
&lt;p&gt;The &lt;code&gt;suggest_var_names()&lt;/code&gt; function has a
&lt;code&gt;survey_program = &amp;quot;eurobarometer&amp;quot;&lt;/code&gt; parameter, which normalizes the most frequently used
variable names. For example, all variations of wex and wextra will be normalized
to wex. You can also ignore this parameter and use your own names.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;eb_demography_metadata  &amp;lt;- eb_climate_metadata %&amp;gt;%
  filter ( grepl( &amp;quot;rowid|isocntry|^d8$|^d7$|^wex|^w1$|d25|^d15a|^d11$&amp;quot;, .data$var_name_orig) ) %&amp;gt;%
  suggest_var_names( survey_program = &amp;quot;eurobarometer&amp;quot;)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;As you can see, using the original labels would not help, because they
also vary from file to file.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;eb_demography_metadata %&amp;gt;%
  select ( filename, var_name_orig, label_orig, var_name_suggested ) %&amp;gt;%
  filter (var_name_orig %in% c(&amp;quot;wex&amp;quot;, &amp;quot;wextra&amp;quot;) )

##            filename var_name_orig                                  label_orig
## 1 ZA5877_v2-0-0.sav        wextra      weight extrapolated population 15 plus
## 2 ZA6595_v3-0-0.sav        wextra      weight extrapolated population 15 plus
## 3 ZA6861_v1-2-0.sav           wex weight extrapolated population aged 15 plus
## 4 ZA7488_v1-0-0.sav           wex weight extrapolated population aged 15 plus
## 5 ZA7572_v1-0-0.sav           wex weight extrapolated population aged 15 plus
##   var_name_suggested
## 1                wex
## 2                wex
## 3                wex
## 4                wex
## 5                wex

demography &amp;lt;- harmonize_var_names ( waves = eb_waves, 
                                    metadata = eb_demography_metadata ) 
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Socio-demographic variables like level of highest education or
occupation are rather country-specific. Eurobarometer uses standardized
occupation and marital status scales and, as a proxy for education level,
the age of leaving full-time education.&lt;/p&gt;
&lt;p&gt;This is a particularly tricky variable, because its coding in fact
mixes three different pieces of information: the school leaving age, a
marker for students, and a marker for people who did not finish their compulsory
primary school. And while school leaving age has been a good proxy since the
1970s, in an age when the EU promotes lifelong learning it becomes
less and less useful, as people stop and restart their education
throughout their lives.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;example &amp;lt;- demography[[1]] %&amp;gt;%
  mutate ( across ( -any_of(c(&amp;quot;rowid&amp;quot;, &amp;quot;w1&amp;quot;, &amp;quot;wex&amp;quot;)), as_character) ) %&amp;gt;%
  mutate ( across (any_of(c(&amp;quot;w1&amp;quot;, &amp;quot;wex&amp;quot;)), as_numeric) )
unique ( example$age_education )

##  [1] &amp;quot;22&amp;quot;                     &amp;quot;25&amp;quot;                     &amp;quot;17&amp;quot;                    
##  [4] &amp;quot;19&amp;quot;                     &amp;quot;12&amp;quot;                     &amp;quot;23&amp;quot;                    
##  [7] &amp;quot;18&amp;quot;                     &amp;quot;20&amp;quot;                     &amp;quot;21&amp;quot;                    
## [10] &amp;quot;14&amp;quot;                     &amp;quot;24&amp;quot;                     &amp;quot;16&amp;quot;                    
## [13] &amp;quot;26&amp;quot;                     &amp;quot;15&amp;quot;                     &amp;quot;Still studying&amp;quot;        
## [16] &amp;quot;DK&amp;quot;                     &amp;quot;31&amp;quot;                     &amp;quot;29&amp;quot;                    
## [19] &amp;quot;27&amp;quot;                     &amp;quot;13&amp;quot;                     &amp;quot;32&amp;quot;                    
## [22] &amp;quot;28&amp;quot;                     &amp;quot;30&amp;quot;                     &amp;quot;53&amp;quot;                    
## [25] &amp;quot;42&amp;quot;                     &amp;quot;62&amp;quot;                     &amp;quot;40&amp;quot;                    
## [28] &amp;quot;No full-time education&amp;quot; &amp;quot;Refusal&amp;quot;                &amp;quot;37&amp;quot;                    
## [31] &amp;quot;39&amp;quot;                     &amp;quot;34&amp;quot;                     &amp;quot;35&amp;quot;                    
## [34] &amp;quot;47&amp;quot;                     &amp;quot;36&amp;quot;                     &amp;quot;45&amp;quot;                    
## [37] &amp;quot;51&amp;quot;                     &amp;quot;33&amp;quot;                     &amp;quot;43&amp;quot;                    
## [40] &amp;quot;38&amp;quot;                     &amp;quot;49&amp;quot;                     &amp;quot;46&amp;quot;                    
## [43] &amp;quot;41&amp;quot;                     &amp;quot;57&amp;quot;                     &amp;quot;7&amp;quot;                     
## [46] &amp;quot;48&amp;quot;                     &amp;quot;44&amp;quot;                     &amp;quot;50&amp;quot;                    
## [49] &amp;quot;56&amp;quot;                     &amp;quot;8&amp;quot;                      &amp;quot;11&amp;quot;                    
## [52] &amp;quot;10&amp;quot;                     &amp;quot;9&amp;quot;                      &amp;quot;75 years&amp;quot;              
## [55] &amp;quot;6&amp;quot;                      &amp;quot;3&amp;quot;                      &amp;quot;54&amp;quot;                    
## [58] &amp;quot;55&amp;quot;                     &amp;quot;60&amp;quot;                     &amp;quot;64&amp;quot;                    
## [61] &amp;quot;2 years&amp;quot;                &amp;quot;58&amp;quot;                     &amp;quot;52&amp;quot;                    
## [64] &amp;quot;72&amp;quot;                     &amp;quot;61&amp;quot;                     &amp;quot;4&amp;quot;                     
## [67] &amp;quot;63&amp;quot;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The seemingly trivial &lt;code&gt;age_exact&lt;/code&gt; variable has its own issues, too:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;unique ( example$age_exact)

##  [1] &amp;quot;54&amp;quot;       &amp;quot;66&amp;quot;       &amp;quot;56&amp;quot;       &amp;quot;53&amp;quot;       &amp;quot;33&amp;quot;       &amp;quot;72&amp;quot;      
##  [7] &amp;quot;83&amp;quot;       &amp;quot;62&amp;quot;       &amp;quot;86&amp;quot;       &amp;quot;77&amp;quot;       &amp;quot;64&amp;quot;       &amp;quot;46&amp;quot;      
## [13] &amp;quot;44&amp;quot;       &amp;quot;59&amp;quot;       &amp;quot;60&amp;quot;       &amp;quot;67&amp;quot;       &amp;quot;63&amp;quot;       &amp;quot;20&amp;quot;      
## [19] &amp;quot;43&amp;quot;       &amp;quot;37&amp;quot;       &amp;quot;78&amp;quot;       &amp;quot;49&amp;quot;       &amp;quot;90&amp;quot;       &amp;quot;45&amp;quot;      
## [25] &amp;quot;28&amp;quot;       &amp;quot;29&amp;quot;       &amp;quot;30&amp;quot;       &amp;quot;39&amp;quot;       &amp;quot;51&amp;quot;       &amp;quot;38&amp;quot;      
## [31] &amp;quot;41&amp;quot;       &amp;quot;71&amp;quot;       &amp;quot;25&amp;quot;       &amp;quot;48&amp;quot;       &amp;quot;79&amp;quot;       &amp;quot;88&amp;quot;      
## [37] &amp;quot;61&amp;quot;       &amp;quot;85&amp;quot;       &amp;quot;70&amp;quot;       &amp;quot;35&amp;quot;       &amp;quot;81&amp;quot;       &amp;quot;52&amp;quot;      
## [43] &amp;quot;57&amp;quot;       &amp;quot;27&amp;quot;       &amp;quot;47&amp;quot;       &amp;quot;15 years&amp;quot; &amp;quot;21&amp;quot;       &amp;quot;42&amp;quot;      
## [49] &amp;quot;32&amp;quot;       &amp;quot;68&amp;quot;       &amp;quot;36&amp;quot;       &amp;quot;34&amp;quot;       &amp;quot;19&amp;quot;       &amp;quot;31&amp;quot;      
## [55] &amp;quot;26&amp;quot;       &amp;quot;23&amp;quot;       &amp;quot;24&amp;quot;       &amp;quot;22&amp;quot;       &amp;quot;16&amp;quot;       &amp;quot;84&amp;quot;      
## [61] &amp;quot;65&amp;quot;       &amp;quot;18&amp;quot;       &amp;quot;55&amp;quot;       &amp;quot;40&amp;quot;       &amp;quot;50&amp;quot;       &amp;quot;73&amp;quot;      
## [67] &amp;quot;69&amp;quot;       &amp;quot;87&amp;quot;       &amp;quot;89&amp;quot;       &amp;quot;74&amp;quot;       &amp;quot;75&amp;quot;       &amp;quot;98 years&amp;quot;
## [73] &amp;quot;76&amp;quot;       &amp;quot;80&amp;quot;       &amp;quot;58&amp;quot;       &amp;quot;82&amp;quot;       &amp;quot;17&amp;quot;       &amp;quot;93&amp;quot;      
## [79] &amp;quot;91&amp;quot;       &amp;quot;92&amp;quot;       &amp;quot;95&amp;quot;       &amp;quot;94&amp;quot;       &amp;quot;97&amp;quot;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Let’s see all the strange labels attached to &lt;code&gt;age&lt;/code&gt;-type variables:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;collect_val_labels(metadata = eb_demography_metadata %&amp;gt;%
                     filter ( var_name_suggested %in% c(&amp;quot;age_exact&amp;quot;, &amp;quot;age_education&amp;quot;)) )

##  [1] &amp;quot;2 years&amp;quot;                  &amp;quot;75 years&amp;quot;                
##  [3] &amp;quot;No full-time education&amp;quot;   &amp;quot;Still studying&amp;quot;          
##  [5] &amp;quot;15 years&amp;quot;                 &amp;quot;98 years&amp;quot;                
##  [7] &amp;quot;96 years&amp;quot;                 &amp;quot;[NOT CLEARLY DOCUMENTED]&amp;quot;
##  [9] &amp;quot;74 years&amp;quot;                 &amp;quot;99 and older&amp;quot;            
## [11] &amp;quot;Refusal&amp;quot;                  &amp;quot;87 years&amp;quot;                
## [13] &amp;quot;DK&amp;quot;                       &amp;quot;88 years&amp;quot;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;We must handle many exceptions, so we created a function for this
purpose:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;remove_years  &amp;lt;- function(x) { 
  x &amp;lt;- gsub(&amp;quot;years|and\\solder&amp;quot;, &amp;quot;&amp;quot;, tolower(x))
  stringr::str_trim (x, &amp;quot;both&amp;quot;)}

process_demography &amp;lt;- function (x) { 
  
  x %&amp;gt;% mutate ( across ( -any_of(c(&amp;quot;rowid&amp;quot;, &amp;quot;w1&amp;quot;, &amp;quot;wex&amp;quot;)), as_character) ) %&amp;gt;%
    mutate ( across (any_of(c(&amp;quot;w1&amp;quot;, &amp;quot;wex&amp;quot;)), as_numeric) ) %&amp;gt;%
    mutate ( across (contains(&amp;quot;age&amp;quot;), remove_years)) %&amp;gt;%
    mutate ( age_exact = as.numeric (age_exact)) %&amp;gt;%
    mutate ( is_student = ifelse ( tolower(age_education) == &amp;quot;still studying&amp;quot;, 
                                   1, 0), 
             no_education = ifelse ( tolower(age_education) == &amp;quot;no full-time education&amp;quot;, 1, 0)) %&amp;gt;%
    mutate ( education = case_when (
      grepl(&amp;quot;studying&amp;quot;, age_education) ~ age_exact, 
      grepl (&amp;quot;education&amp;quot;, age_education)  ~ 14, 
      grepl (&amp;quot;refus|document|dk&amp;quot;, tolower(age_education)) ~ NA_real_,
      TRUE ~ as.numeric(age_education)
    ))  %&amp;gt;%
    mutate ( education = case_when ( 
      education &amp;lt; 14 ~ NA_real_, 
      education &amp;gt; 30 ~ 30, 
      TRUE ~ education )) 
}

demography &amp;lt;- lapply ( demography, process_demography )

## Warning in eval_tidy(pair$rhs, env = default_env): NAs introduced by coercion

## Warning in mask$eval_all_mutate(quo): NAs introduced by coercion

## Warning in eval_tidy(pair$rhs, env = default_env): NAs introduced by coercion

## Warning in eval_tidy(pair$rhs, env = default_env): NAs introduced by coercion

## Warning in eval_tidy(pair$rhs, env = default_env): NAs introduced by coercion

## Warning in eval_tidy(pair$rhs, env = default_env): NAs introduced by coercion

## We&#39;ll use a full join rather than rbind, because we have different variables in different waves.
demography &amp;lt;- Reduce ( full_join, demography )

## Joining, by = c(&amp;quot;rowid&amp;quot;, &amp;quot;isocntry&amp;quot;, &amp;quot;w1&amp;quot;, &amp;quot;wex&amp;quot;, &amp;quot;marital_status&amp;quot;, &amp;quot;age_education&amp;quot;, &amp;quot;age_exact&amp;quot;, &amp;quot;occupation_of_respondent&amp;quot;, &amp;quot;occupation_of_respondent_recoded&amp;quot;, &amp;quot;respondent_occupation_scale_c_14&amp;quot;, &amp;quot;type_of_community&amp;quot;, &amp;quot;is_student&amp;quot;, &amp;quot;no_education&amp;quot;, &amp;quot;education&amp;quot;)
## Joining, by = c(&amp;quot;rowid&amp;quot;, &amp;quot;isocntry&amp;quot;, &amp;quot;w1&amp;quot;, &amp;quot;wex&amp;quot;, &amp;quot;marital_status&amp;quot;, &amp;quot;age_education&amp;quot;, &amp;quot;age_exact&amp;quot;, &amp;quot;occupation_of_respondent&amp;quot;, &amp;quot;occupation_of_respondent_recoded&amp;quot;, &amp;quot;respondent_occupation_scale_c_14&amp;quot;, &amp;quot;type_of_community&amp;quot;, &amp;quot;is_student&amp;quot;, &amp;quot;no_education&amp;quot;, &amp;quot;education&amp;quot;)
## Joining, by = c(&amp;quot;rowid&amp;quot;, &amp;quot;isocntry&amp;quot;, &amp;quot;w1&amp;quot;, &amp;quot;wex&amp;quot;, &amp;quot;marital_status&amp;quot;, &amp;quot;age_education&amp;quot;, &amp;quot;age_exact&amp;quot;, &amp;quot;occupation_of_respondent&amp;quot;, &amp;quot;occupation_of_respondent_recoded&amp;quot;, &amp;quot;respondent_occupation_scale_c_14&amp;quot;, &amp;quot;type_of_community&amp;quot;, &amp;quot;is_student&amp;quot;, &amp;quot;no_education&amp;quot;, &amp;quot;education&amp;quot;)
## Joining, by = c(&amp;quot;rowid&amp;quot;, &amp;quot;isocntry&amp;quot;, &amp;quot;w1&amp;quot;, &amp;quot;wex&amp;quot;, &amp;quot;marital_status&amp;quot;, &amp;quot;age_education&amp;quot;, &amp;quot;age_exact&amp;quot;, &amp;quot;occupation_of_respondent&amp;quot;, &amp;quot;occupation_of_respondent_recoded&amp;quot;, &amp;quot;respondent_occupation_scale_c_14&amp;quot;, &amp;quot;type_of_community&amp;quot;, &amp;quot;is_student&amp;quot;, &amp;quot;no_education&amp;quot;, &amp;quot;education&amp;quot;)
&lt;/code&gt;&lt;/pre&gt;
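&lt;p&gt;Because every &lt;code&gt;rowid&lt;/code&gt; is unique across the surveys, the full join above
should in effect stack the tables and fill the variables that a wave lacks with
&lt;code&gt;NA&lt;/code&gt;. The same idea can be sketched in base R; the two toy blocks, their
values and the &lt;code&gt;stack_with_na()&lt;/code&gt; helper below are hypothetical and only
illustrate why a plain &lt;code&gt;rbind()&lt;/code&gt; is not enough when the columns differ.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Two toy blocks with partly different columns (hypothetical values)
block_a &amp;lt;- data.frame(rowid = c(&amp;quot;a_1&amp;quot;, &amp;quot;a_2&amp;quot;), wex = c(1428, 32830),
                      age_exact = c(43, 64), stringsAsFactors = FALSE)
block_b &amp;lt;- data.frame(rowid = &amp;quot;b_1&amp;quot;, wex = 3100,
                      type_of_community = &amp;quot;Rural area or village&amp;quot;,
                      stringsAsFactors = FALSE)

# rbind(block_a, block_b) would stop with an error, because the column sets differ.
# Add the missing columns as NA first, then stack the rows.
stack_with_na &amp;lt;- function(x, y) {
  all_cols &amp;lt;- union(names(x), names(y))
  for (col in setdiff(all_cols, names(x))) x[[col]] &amp;lt;- NA
  for (col in setdiff(all_cols, names(y))) y[[col]] &amp;lt;- NA
  rbind(x[, all_cols, drop = FALSE], y[, all_cols, drop = FALSE])
}

stacked &amp;lt;- Reduce(stack_with_na, list(block_a, block_b))
stacked
&lt;/code&gt;&lt;/pre&gt;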
&lt;p&gt;Now let’s see what we have here:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;set.seed(2021)
sample_n(demography, 12)

## # A tibble: 12 x 14
##    rowid    isocntry    w1    wex marital_status        age_education  age_exact
##    &amp;lt;chr&amp;gt;    &amp;lt;chr&amp;gt;    &amp;lt;dbl&amp;gt;  &amp;lt;dbl&amp;gt; &amp;lt;chr&amp;gt;                 &amp;lt;chr&amp;gt;              &amp;lt;dbl&amp;gt;
##  1 ZA7488_~ SI       0.828  1428. (Re-)Married: withou~ 19                    43
##  2 ZA7488_~ PL       1.01  32830. (Re-)Married: withou~ 19                    64
##  3 ZA6861_~ DK       0.641  3100. (Re-)Married: withou~ 22                    78
##  4 ZA6861_~ FI       1.83   8601. (Re-)Married: childr~ 30                    38
##  5 ZA7572_~ SE       0.342  2645. (Re-)Married: withou~ 17                    68
##  6 ZA7572_~ IT       0.630 32287. (Re-)Married: childr~ 20                    40
##  7 ZA6861_~ IE       0.868  3054. (Re-)Married: childr~ 32                    42
##  8 ZA6861_~ RO       0.724 11805. (Re-)Married: withou~ 14                    59
##  9 ZA7488_~ CY       0.691  1013. (Re-)Married: childr~ 18                    67
## 10 ZA6595_~ HR       0.580  2098. Single living w part~ 27                    30
## 11 ZA7572_~ CZ       1.86  16908. Single: without chil~ still studying        20
## 12 ZA6861_~ PT       0.932  7448. Widow: with children  no full-time ~        84
## # ... with 7 more variables: occupation_of_respondent &amp;lt;chr&amp;gt;,
## #   occupation_of_respondent_recoded &amp;lt;chr&amp;gt;,
## #   respondent_occupation_scale_c_14 &amp;lt;chr&amp;gt;, type_of_community &amp;lt;chr&amp;gt;,
## #   is_student &amp;lt;dbl&amp;gt;, no_education &amp;lt;dbl&amp;gt;, education &amp;lt;dbl&amp;gt;
&lt;/code&gt;&lt;/pre&gt;
&lt;h2 id=&#34;harmonizing-variable-labels&#34;&gt;Harmonizing Variable Labels&lt;/h2&gt;
&lt;p&gt;So far we have been working with metadata, weights and socio-demography.
In other words, we have not even started the desired harmonization of
climate change awareness. The methodology is the same, but here we
really must look out for the answer options in the questionnaire. (Refer
to our data summary again
&lt;a href=&#34;http://netzero.dataobservatory.eu/post/2021-03-04-eurobarometer_data/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;here&lt;/a&gt;.)&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;climate_awareness_metadata &amp;lt;- eb_climate_metadata %&amp;gt;%
  suggest_var_names( survey_program = &amp;quot;eurobarometer&amp;quot; ) %&amp;gt;%
  filter ( .data$var_name_suggested  %in% c(&amp;quot;rowid&amp;quot;,
                                            &amp;quot;serious_world_problems_first&amp;quot;, 
                                             &amp;quot;serious_world_problems_climate_change&amp;quot;)
  ) 

hw &amp;lt;- harmonize_var_names ( waves = eb_waves, 
                            metadata = climate_awareness_metadata )
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &lt;code&gt;retroharmonize&lt;/code&gt; package comes with a generic
&lt;a href=&#34;https://retroharmonize.dataobservatory.eu/reference/harmonize_waves.html&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;harmonize_values()&lt;/a&gt;
function that will change the value labels of categorical variables
(including binary ones) to a unitary format. It will also take care of
various types of missing values.&lt;/p&gt;
&lt;p&gt;First, let’s go back to our metadata and collect all value labels that
will show up with
&lt;a href=&#34;https://retroharmonize.dataobservatory.eu/reference/collect_val_labels.html&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;collect_val_labels()&lt;/a&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;collect_val_labels(climate_awareness_metadata)

##  [1] &amp;quot;Climate change&amp;quot;                            
##  [2] &amp;quot;International terrorism&amp;quot;                   
##  [3] &amp;quot;Poverty, hunger and lack of drinking water&amp;quot;
##  [4] &amp;quot;Spread of infectious diseases&amp;quot;             
##  [5] &amp;quot;The economic situation&amp;quot;                    
##  [6] &amp;quot;Proliferation of nuclear weapons&amp;quot;          
##  [7] &amp;quot;Armed conflicts&amp;quot;                           
##  [8] &amp;quot;The increasing global population&amp;quot;          
##  [9] &amp;quot;Other (SPONTANEOUS)&amp;quot;                       
## [10] &amp;quot;None (SPONTANEOUS)&amp;quot;                        
## [11] &amp;quot;Not mentioned&amp;quot;                             
## [12] &amp;quot;Mentioned&amp;quot;                                 
## [13] &amp;quot;DK&amp;quot;
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In this case, we want to select &lt;code&gt;Climate change&lt;/code&gt; as the mentioned &lt;em&gt;most
serious problem&lt;/em&gt;, and &lt;code&gt;Climate change&lt;/code&gt; taken from a list of three
serious problems. The first question type is a single-choice one, where
&lt;code&gt;Climate change&lt;/code&gt; is either mentioned, or the alternative answer is
labeled as &lt;code&gt;Not mentioned&lt;/code&gt;. In the multiple choice case, the alternative
may be something else, for example, &lt;code&gt;Spread of infectious diseases&lt;/code&gt;, as
we all know well by 2021.&lt;/p&gt;
&lt;p&gt;We want to see who thought &lt;code&gt;Climate change&lt;/code&gt; was the most serious
problem, or one of the most serious problems, so we label each mentions
of &lt;code&gt;Climate change&lt;/code&gt; as &lt;code&gt;mentioned&lt;/code&gt; and we pair it with a numeric value
of &lt;code&gt;1&lt;/code&gt;. All other cases are labeled as &lt;code&gt;not_mentioned&lt;/code&gt;, with the
exceptions of various missing observations, which in these cases are
&lt;code&gt;Do not know&lt;/code&gt; answers, &lt;code&gt;Declined to answer&lt;/code&gt; cases, and &lt;code&gt;Inappropriate&lt;/code&gt;
cases. (The latter is Eurobarometer’s label for questions that, for one
reason or another, were not asked of a particular interviewee, for
example because the Turkish Cypriot community received a different
questionnaire.)&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# positive cases
label_1 &amp;lt;- c(&amp;quot;^Climate\\schange&amp;quot;, &amp;quot;^Mentioned&amp;quot;)
# missing cases 
na_labels &amp;lt;- collect_na_labels( climate_awareness_metadata)
na_labels

## [1] &amp;quot;DK&amp;quot;                             &amp;quot;Inap. (10 or 11 in qa1a)&amp;quot;      
## [3] &amp;quot;Inap. (coded 10 or 11 in qc1a)&amp;quot; &amp;quot;Inap. (coded 10 or 11 in qb1a)&amp;quot;

# negative cases
label_0 &amp;lt;- collect_val_labels( climate_awareness_metadata)
label_0 &amp;lt;- label_0[! label_0 %in% label_1 ]
&lt;/code&gt;&lt;/pre&gt;
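&lt;p&gt;Note that &lt;code&gt;label_1&lt;/code&gt; holds regular expressions rather than literal labels. As a minimal sketch, assuming you want to split a vector of labels by pattern rather than by exact match, base R’s &lt;code&gt;grepl()&lt;/code&gt; can do the separation. The &lt;code&gt;val_labels&lt;/code&gt; vector below is a shortened, illustrative copy of the labels printed earlier, and the object names are hypothetical.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Illustrative subset of the value labels printed above
val_labels &amp;lt;- c(&amp;quot;Climate change&amp;quot;, &amp;quot;International terrorism&amp;quot;,
                &amp;quot;Not mentioned&amp;quot;, &amp;quot;Mentioned&amp;quot;, &amp;quot;DK&amp;quot;)
positive_patterns &amp;lt;- c(&amp;quot;^Climate\\schange&amp;quot;, &amp;quot;^Mentioned&amp;quot;)

# Combine the patterns into one regular expression and test every label
is_positive &amp;lt;- grepl(paste(positive_patterns, collapse = &amp;quot;|&amp;quot;), val_labels, perl = TRUE)

positive_labels  &amp;lt;- val_labels[is_positive]   # labels matching one of the patterns
remaining_labels &amp;lt;- val_labels[!is_positive]  # everything else, including missing-value labels
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Because the patterns are anchored and case-sensitive, &lt;code&gt;Not mentioned&lt;/code&gt; does not match &lt;code&gt;^Mentioned&lt;/code&gt; and stays among the remaining labels.&lt;/p&gt;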
&lt;p&gt;The &lt;code&gt;harmonize_serious_problems()&lt;/code&gt; function harmonizes the labels within
the special labeled class of &lt;code&gt;retroharmonize&lt;/code&gt;. This class retains all
information to give categorical variables a character or numeric
representation, and various processing metadata for documentation
purposes. While this class is very rich (it contains whatever was
imported from SPSS’s proprietary data format and the history), it is not
suitable for statistical analysis. We could, of course, directly call
the
&lt;a href=&#34;https://retroharmonize.dataobservatory.eu/reference/harmonize_values.html&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;harmonize_values()&lt;/a&gt;
from the retroharmonize package, but the parameterization would be very
complicated even in a simple function call, not to mention a looped
call. Because this function is the heart of the
&lt;code&gt;retroharmonize&lt;/code&gt; package, it has &lt;a href=&#34;https://retroharmonize.dataobservatory.eu/articles/harmonize_labels.html&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;a tutorial
article&lt;/a&gt;
of its own.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;harmonize_serious_problems &amp;lt;- function(x) {
  label_list &amp;lt;- list(
    from = c(label_0, label_1, na_labels), 
    to = c( rep ( &amp;quot;not_mentioned&amp;quot;, length(label_0) ),   # use the same order as in from!
            rep ( &amp;quot;mentioned&amp;quot;, length(label_1) ),
            &amp;quot;do_not_know&amp;quot;, &amp;quot;inap&amp;quot;, &amp;quot;inap&amp;quot;, &amp;quot;inap&amp;quot;), 
    numeric_values = c(rep ( 0, length(label_0) ), # use the same order as in from!
                       rep ( 1, length(label_1) ),
                       99997,99999,99999,99999)
  )
  
  harmonize_values(x, 
                   harmonize_labels = label_list, 
                   na_values = c(&amp;quot;do_not_know&amp;quot;=99997,
                                 &amp;quot;declined&amp;quot;=99998,
                                 &amp;quot;inap&amp;quot;=99999), 
                   remove = &amp;quot;\\(|\\)|\\[|\\]|\\%&amp;quot;
  )
}
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Our objects are rather big in memory, so first, let’s remove the surveys
that do not contain these world problem variables. In these cases, the
subsetted and harmonized surveys in the nested list have only one
column, i.e. the &lt;code&gt;rowid&lt;/code&gt;.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;hw &amp;lt;- hw[unlist ( lapply ( hw, ncol)) &amp;gt; 1 ]
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Now we have a smaller problem to deal with. With many surveys, it is
easy to fill up your computer’s memory, so let’s start building up our
joined panel data from a smaller set of nested, subsetted surveys.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;hw &amp;lt;- lapply ( hw, function (x) x %&amp;gt;% mutate ( across ( contains(&amp;quot;problem&amp;quot;), harmonize_serious_problems) ) )
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Our &lt;code&gt;lapply&lt;/code&gt; loop calls an anonymous function which in turn calls the
&lt;code&gt;harmonize_serious_problems&lt;/code&gt; parameterized version of the
&lt;a href=&#34;https://retroharmonize.dataobservatory.eu/reference/harmonize_values.html&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;harmonize_values()&lt;/a&gt;
on all variables that have &lt;code&gt;problem&lt;/code&gt; in their names.&lt;/p&gt;
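&lt;p&gt;The same pattern can be tried on toy data. The sketch below assumes only that &lt;code&gt;dplyr&lt;/code&gt; is installed; &lt;code&gt;recode_binary()&lt;/code&gt; is a hypothetical stand-in for &lt;code&gt;harmonize_serious_problems()&lt;/code&gt;, and the two small tables are made up.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;library(dplyr)

# Hypothetical stand-in for the real harmonization function
recode_binary &amp;lt;- function(x) ifelse(x == &amp;quot;Mentioned&amp;quot;, 1, 0)

# Two made-up surveys with a different number of &amp;quot;problem&amp;quot; columns
toy_surveys &amp;lt;- list(
  tibble(rowid = &amp;quot;a_1&amp;quot;, problem_climate = &amp;quot;Mentioned&amp;quot;),
  tibble(rowid = &amp;quot;b_1&amp;quot;, problem_climate = &amp;quot;Not mentioned&amp;quot;, problem_first = &amp;quot;Mentioned&amp;quot;)
)

# Apply the recoding to every column whose name contains &amp;quot;problem&amp;quot;, in every survey
toy_surveys &amp;lt;- lapply(
  toy_surveys,
  function(x) mutate(x, across(contains(&amp;quot;problem&amp;quot;), recode_binary))
)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The &lt;code&gt;rowid&lt;/code&gt; column is left untouched because its name does not contain &lt;code&gt;problem&lt;/code&gt;.&lt;/p&gt;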
&lt;p&gt;Once we are done, our variables have harmonized names, harmonized
values, and harmonized labels, but they are stored in the complex
&lt;a href=&#34;https://retroharmonize.dataobservatory.eu/articles/harmonize_labels.html&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;retroharmonize_labelled_spss_survey&lt;/a&gt;
class, inherited from the &lt;code&gt;haven_labelled_spss&lt;/code&gt; in
&lt;a href=&#34;https://haven.tidyverse.org/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;haven&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;We reduced our single and multiple choice questions to binary choice
variables. We can now give them a numeric representation. Be mindful
that &lt;code&gt;retroharmonize&lt;/code&gt; has special methods for its special labeled class
that retains metadata from SPSS. This means that &lt;code&gt;as_character&lt;/code&gt; and
&lt;code&gt;as_numeric&lt;/code&gt; know how to handle various types of missing values,
whereas the base R &lt;code&gt;as.character&lt;/code&gt; and &lt;code&gt;as.numeric&lt;/code&gt; may coerce special
values to unwanted results. This is particularly dangerous with numeric
variables – and this is the reason why we introduced a new set of S3
objects and methods in the package.&lt;/p&gt;
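&lt;p&gt;To see why this matters, here is a minimal base R sketch. It is not the package’s implementation: the vector &lt;code&gt;coded&lt;/code&gt; and its values are made up for illustration, and only the missing-value codes are borrowed from the function above.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Made-up coded answers: 1 = mentioned, 0 = not mentioned,
# 99997 = do not know, 99999 = inappropriate (not asked)
coded &amp;lt;- c(1, 0, 0, 1, 99997, 99999)
na_codes &amp;lt;- c(99997, 99998, 99999)

# A naive numeric conversion treats the missing-value codes as real numbers
naive_share &amp;lt;- mean(coded)

# Recoding the special values to NA first gives a meaningful share
clean &amp;lt;- ifelse(coded %in% na_codes, NA_real_, coded)
clean_share &amp;lt;- mean(clean, na.rm = TRUE)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;In this toy case &lt;code&gt;naive_share&lt;/code&gt; is far above 1, which is meaningless for a binary variable, while &lt;code&gt;clean_share&lt;/code&gt; is 0.5.&lt;/p&gt;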
&lt;p&gt;We will ignore the differences between various forms of missingness,
i.e. the person said that she did not know, or did not want to answer,
or for some reason was not asked in the survey. In a more descriptive,
non-harmonized analysis you would probably want to explore them as
various ‘categories’ and use a character representation.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;hw &amp;lt;- lapply ( hw, function(x) x %&amp;gt;% mutate ( across ( contains(&amp;quot;problem&amp;quot;), as_numeric) ))

hw &amp;lt;- Reduce ( full_join, hw) # we must use joins instead of binds because the number of columns varies.
&lt;/code&gt;&lt;/pre&gt;
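&lt;p&gt;As a small illustration of how the join behaves when the column sets differ, the sketch below uses two made-up tables (the names and values are hypothetical) and assumes &lt;code&gt;dplyr&lt;/code&gt; is installed. &lt;code&gt;full_join()&lt;/code&gt; joins on the columns the tables share and fills the columns missing from one survey with &lt;code&gt;NA&lt;/code&gt;.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;library(dplyr)

# Made-up surveys: the second one has an extra column
toy_list &amp;lt;- list(
  tibble(rowid = c(&amp;quot;a_1&amp;quot;, &amp;quot;a_2&amp;quot;), problem_first = c(0, 1)),
  tibble(rowid = c(&amp;quot;b_1&amp;quot;, &amp;quot;b_2&amp;quot;), problem_first = c(1, 0), problem_climate = c(0, 1))
)

# Joins on the shared columns (rowid, problem_first); no rows match,
# so all rows are kept and the extra column is NA for the first survey
toy_joined &amp;lt;- Reduce(full_join, toy_list)
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;The result has four rows and three columns, which matches the &lt;code&gt;NA&lt;/code&gt; values you see for surveys that lack one of the two questions.&lt;/p&gt;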
&lt;p&gt;Let’s see what we have:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;set.seed(2021)
sample_n (hw, 12)

## # A tibble: 12 x 3
##    rowid             serious_world_problems_fi~ serious_world_problems_climate_~
##    &amp;lt;chr&amp;gt;                                  &amp;lt;dbl&amp;gt;                            &amp;lt;dbl&amp;gt;
##  1 ZA6595_v3-0-0_23~                          0                               NA
##  2 ZA7572_v1-0-0_70~                          0                                0
##  3 ZA6595_v3-0-0_18~                          0                               NA
##  4 ZA6861_v1-2-0_27~                          0                                0
##  5 ZA6595_v3-0-0_26~                          0                               NA
##  6 ZA7572_v1-0-0_19~                          0                                1
##  7 ZA5877_v2-0-0_16~                          0                                0
##  8 ZA6861_v1-2-0_12~                          0                                0
##  9 ZA7572_v1-0-0_17~                          0                                0
## 10 ZA5877_v2-0-0_17~                          0                                1
## 11 ZA6861_v1-2-0_41~                          0                                0
## 12 ZA6861_v1-2-0_61~                          0                                1
&lt;/code&gt;&lt;/pre&gt;
&lt;h2 id=&#34;creating-the-longitudional-table&#34;&gt;Creating the Longitudinal Table&lt;/h2&gt;
&lt;p&gt;Now we just need to join the partial tables together by &lt;code&gt;rowid&lt;/code&gt;:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;#start from the smallest (we removed the survey that had no relevant questionnaire item)
panel &amp;lt;- hw %&amp;gt;%
  left_join ( geography, by = &#39;rowid&#39; ) 

panel &amp;lt;- panel %&amp;gt;%
  left_join ( demography, by = c(&amp;quot;rowid&amp;quot;, &amp;quot;isocntry&amp;quot;) ) 

panel &amp;lt;- panel %&amp;gt;%
  left_join ( interview_dates, by = &#39;rowid&#39; )
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;And let’s see a small sample:&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;sample_n(panel, 12)

## # A tibble: 12 x 19
##    rowid  serious_world_pr~ serious_world_pr~ isocntry geo   region    w1    wex
##    &amp;lt;chr&amp;gt;              &amp;lt;dbl&amp;gt;             &amp;lt;dbl&amp;gt; &amp;lt;chr&amp;gt;    &amp;lt;chr&amp;gt; &amp;lt;chr&amp;gt;  &amp;lt;dbl&amp;gt;  &amp;lt;dbl&amp;gt;
##  1 ZA686~                 0                 0 ES       ES41  Casti~ 1.21  46787.
##  2 ZA686~                 0                 0 RO       RO31  South~ 0.724 11805.
##  3 ZA686~                 0                 0 SK       SK02  Zapad~ 0.774  3499.
##  4 ZA757~                 0                 1 PT       PT16  Centr~ 1.11   9336.
##  5 ZA659~                 1                NA HR       HR041 Grad ~ 0.580  2098.
##  6 ZA659~                 1                NA RO       RO21  North~ 1.21  20160.
##  7 ZA686~                 0                 0 PT       PT17  Lisboa 0.932  7448.
##  8 ZA659~                 0                NA GB-GBN   UKI   London 0.994 50133.
##  9 ZA757~                 0                 0 CY       CY    REPUB~ 0.594   874.
## 10 ZA686~                 0                 0 LT       LT003 Klaip~ 0.623  1564.
## 11 ZA757~                 0                 0 IE       IE013 West ~ 0.490  1651.
## 12 ZA659~                 0                NA LT       LT003 Klaip~ 1.16   2917.
## # ... with 11 more variables: marital_status &amp;lt;chr&amp;gt;, age_education &amp;lt;chr&amp;gt;,
## #   age_exact &amp;lt;dbl&amp;gt;, occupation_of_respondent &amp;lt;chr&amp;gt;,
## #   occupation_of_respondent_recoded &amp;lt;chr&amp;gt;,
## #   respondent_occupation_scale_c_14 &amp;lt;chr&amp;gt;, type_of_community &amp;lt;chr&amp;gt;,
## #   is_student &amp;lt;dbl&amp;gt;, no_education &amp;lt;dbl&amp;gt;, education &amp;lt;dbl&amp;gt;,
## #   date_of_interview &amp;lt;date&amp;gt;

saveRDS ( panel, file.path(tempdir(), &amp;quot;climate_panel.rds&amp;quot;), version = 2)

# not evaluated
saveRDS( panel, file = file.path(&amp;quot;data-raw&amp;quot;, &amp;quot;climate-panel.rds&amp;quot;), version=2)
&lt;/code&gt;&lt;/pre&gt;
&lt;h2 id=&#34;putting-it-on-a-map&#34;&gt;Putting It on a Map&lt;/h2&gt;
&lt;p&gt;This is not the end of the story. If you put all this on a map, the
results are a bit disappointing.&lt;/p&gt;
&lt;img src=&#34;featured.png&#34; width=&#34;660&#34; /&gt;
&lt;p&gt;Why? Because sub-national (provincial, state, county, district, parish)
borders are changing all the time - within the EU and everywhere. The
next step is to harmonize the geographical information. We have another
package, released on CRAN, to help you with this. See the next post: &lt;a href=&#34;https://rpubs.com/antaldaniel/regions-OOD21&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Regional
Climate Change Awareness
Dataset&lt;/a&gt;.&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>What is Retrospective Survey Harmonization?</title>
      <link>https://greendeal.dataobservatory.eu/post/2021-03-04_retroharmonize_intro/</link>
      <pubDate>Thu, 04 Mar 2021 00:00:00 +0000</pubDate>
      <guid>https://greendeal.dataobservatory.eu/post/2021-03-04_retroharmonize_intro/</guid>
      <description>&lt;h2 id=&#34;reproducible-ex-post-harmonization-of-survey-microdata&#34;&gt;Reproducible ex post harmonization of survey microdata&lt;/h2&gt;
&lt;p&gt;Retrospective survey harmonization allows the comparison of opinion poll
data collected in different countries or at different times. In this example we are
working with data from surveys that were ex ante harmonized to a certain
degree – in our tutorials we are choosing questions that were asked in
the same way in many natural languages. For example, you can compare
what percentage of the European people in various countries, provinces
and regions thought climate change was a serious world problem back in
2013, 2015, 2017 and 2019.&lt;/p&gt;
&lt;p&gt;We developed the
&lt;a href=&#34;https://retroharmonize.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;retroharmonize&lt;/a&gt; R package
to help this process. We have tested the package extensively with about
80 Eurobarometer and 5 Afrobarometer survey files, and to a lesser extent
with Arab Barometer files. This allows the comparison of various survey
answers in about 70 countries. These policy-oriented survey programs were
designed to be harmonized to a certain degree, but their ex post
harmonization is still necessary, challenging and error-prone.
Retrospective harmonization includes harmonizing the different
coding used for questions and answer options, harmonizing
post-stratification weights, and handling different file formats.&lt;/p&gt;
&lt;p&gt;&lt;a href=&#34;https://ec.europa.eu/commfrontoffice/publicopinion/index.cfm&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Eurobarometer&lt;/a&gt;,
&lt;a href=&#34;https://www.afrobarometer.org/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Afrobarometer&lt;/a&gt;, &lt;a href=&#34;https://www.arabbarometer.org/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Arab
Barometer&lt;/a&gt; and
&lt;a href=&#34;https://www.latinobarometro.org/lat.jsp&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Latinobarómetro&lt;/a&gt; make survey
files that are harmonized across countries available for research under
various terms. Our
&lt;a href=&#34;https://retroharmonize.dataobservatory.eu/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;retroharmonize&lt;/a&gt; package is not
affiliated with them, and to run our examples, you must visit their
websites, carefully read their terms, agree to them, and download their
data yourself. The value we add is that we help connect their
files across time (from different years) or across these programs.&lt;/p&gt;
&lt;p&gt;The survey programs mentioned above publish their data in the
proprietary SPSS format. This file format can be imported and translated
to R objects with the haven package; however, we needed to re-design
&lt;a href=&#34;https://haven.tidyverse.org/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;haven’s&lt;/a&gt;
&lt;a href=&#34;https://haven.tidyverse.org/reference/labelled_spss.html&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;labelled_spss&lt;/a&gt;
class to maintain far more metadata, which is, in turn, a modification of
the &lt;a href=&#34;&#34;&gt;labelled&lt;/a&gt; class. The haven package was designed and tested with
data stored in individual SPSS files.&lt;/p&gt;
&lt;p&gt;The author of labelled, Joseph Larmarange, describes two main approaches
to working with labelled data, such as SPSS’s method of storing categorical
data, in the &lt;a href=&#34;http://larmarange.github.io/labelled/articles/intro_labelled.html&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Introduction to
labelled&lt;/a&gt;.&lt;/p&gt;
















&lt;figure  id=&#34;figure-two-main-approaches-of-labelled-data-conversion&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img src=&#34;img/larmarange_approaches_to_labelled.png&#34; alt=&#34;Two main approaches of labelled data conversion.&#34; loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption data-pre=&#34;Figure&amp;nbsp;&#34; data-post=&#34;:&amp;nbsp;&#34; class=&#34;numbered&#34;&gt;
      Two main approaches of labelled data conversion.
    &lt;/figcaption&gt;&lt;/figure&gt;
&lt;p&gt;Our approach is a further extension of &lt;strong&gt;Approach B&lt;/strong&gt;. Survey
harmonization in our case always means joining data from several
SPSS files, which requires a consistent coding among several data
sources. This means that data cleaning and recoding must take place
before conversion to factors, character or numeric vectors. This is
particularly important with factor data (and their simple character
conversions) and numeric data that occasionally contains labels, for
example, to describe the reason why certain data is missing. Our
tutorial vignette
&lt;a href=&#34;https://retroharmonize.dataobservatory.eu/articles/labelled_spss_survey.html&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;labelled_spss_survey&lt;/a&gt;
gives you more information about this.&lt;/p&gt;
&lt;p&gt;In the next series of tutorials, we will deal with an array of problems.
These are not for the faint of heart: you need a solid
intermediate level of R to follow.&lt;/p&gt;
&lt;h2 id=&#34;tidy-joined-survey-data&#34;&gt;Tidy, joined survey data&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;The original files’ identifiers may not be unique, so we have to create
new, truly unique identifiers. Weighting may not be straightforward.&lt;/li&gt;
&lt;li&gt;Neither the number of observations nor the number of variables (which
represent the survey questions and their translation to coded data)
is the same across surveys. Certain data may be present in only one
survey and not the other. This means that you will likely run loops on
lists and not data.frames, but eventually you must carefully join them.&lt;/li&gt;
&lt;/ul&gt;
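&lt;p&gt;A minimal base R sketch of the identifier problem, under stated assumptions: the survey IDs &lt;code&gt;ZA0001&lt;/code&gt; and &lt;code&gt;ZA0002&lt;/code&gt;, the &lt;code&gt;uniqid&lt;/code&gt; column and the helper &lt;code&gt;add_rowid()&lt;/code&gt; are all hypothetical. The idea is simply to prefix each file’s own identifier with an identifier of the file.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Two made-up survey files that reuse the same respondent identifiers
toy_files &amp;lt;- list(
  ZA0001 = data.frame(uniqid = 1:3, answer = c(0, 1, 0)),
  ZA0002 = data.frame(uniqid = 1:3, answer = c(1, 1, 0))
)

# Hypothetical helper: build an identifier that is unique across files
add_rowid &amp;lt;- function(df, survey_id) {
  df$rowid &amp;lt;- paste(survey_id, df$uniqid, sep = &amp;quot;_&amp;quot;)
  df
}

# Loop over the list, passing each file together with its name
toy_files &amp;lt;- Map(add_rowid, toy_files, names(toy_files))

# Check that no identifier is repeated across the files
all_ids &amp;lt;- unlist(lapply(toy_files, function(df) df$rowid))
ids_are_unique &amp;lt;- anyDuplicated(all_ids) == 0
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Keeping the files in a list until the identifiers are unique makes the later join safe.&lt;/p&gt;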
&lt;h2 id=&#34;class-conversion&#34;&gt;Class conversion&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;Similar questions may be imported from a non-native R format, in our
case, from SPSS files, in an inconsistent manner. SPSS’s variable
formats cannot be translated unambiguously to R classes.
&lt;code&gt;retroharmonize&lt;/code&gt; introduced a new S3 class system that handles this
problem, but eventually you will have to choose if you want to see a
numeric or character coding of each categorical variable.&lt;/li&gt;
&lt;li&gt;The harmonized surveys, with harmonized variable names and
harmonized value labels, must be brought to consistent R
representations (most statistical functions will only work on
numeric, factor or character data) and carefully joined into a
single data table for analysis.&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;harmonization-of-variables-and-variable-labels&#34;&gt;Harmonization of variables and variable labels&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;The same variables may come with dissimilar variable names and variable
labels. It may be a challenge to match age with age. We need to
harmonize the names of variables.&lt;/li&gt;
&lt;li&gt;The harmonized variables may have different labeling. One survey may
label refused answers as &lt;code&gt;declined&lt;/code&gt; and another as &lt;code&gt;refusal&lt;/code&gt;. In a simple
choice, climate change may be &lt;code&gt;Climate change&lt;/code&gt; or
&lt;code&gt;Problem: Climate change&lt;/code&gt;. Binary choices may have survey-specific
coding conventions. Value labels must be harmonized. There are good
tools to do this in a single file - but we have to work with several
of them.&lt;/li&gt;
&lt;/ul&gt;
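&lt;p&gt;The label problem can be sketched in a few lines of base R. This is an illustration only, not how &lt;code&gt;retroharmonize&lt;/code&gt; does it: &lt;code&gt;harmonize_label()&lt;/code&gt; and the two label vectors are hypothetical, built from the examples in the list above.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Hypothetical helper that maps survey-specific labels to one convention
harmonize_label &amp;lt;- function(x) {
  x &amp;lt;- sub(&amp;quot;^Problem:\\s*&amp;quot;, &amp;quot;&amp;quot;, x, perl = TRUE)           # drop a survey-specific prefix
  x[x %in% c(&amp;quot;declined&amp;quot;, &amp;quot;refusal&amp;quot;)] &amp;lt;- &amp;quot;declined&amp;quot;  # one label for refused answers
  tolower(gsub(&amp;quot;\\s+&amp;quot;, &amp;quot;_&amp;quot;, x, perl = TRUE))              # lower-case labels with underscores
}

# Made-up labels from two surveys that mean the same thing
labels_a &amp;lt;- c(&amp;quot;Climate change&amp;quot;, &amp;quot;declined&amp;quot;)
labels_b &amp;lt;- c(&amp;quot;Problem: Climate change&amp;quot;, &amp;quot;refusal&amp;quot;)

labels_match &amp;lt;- identical(harmonize_label(labels_a), harmonize_label(labels_b))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;After harmonization both vectors read &lt;code&gt;climate_change&lt;/code&gt; and &lt;code&gt;declined&lt;/code&gt;, so the surveys can be compared label by label.&lt;/p&gt;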
&lt;h2 id=&#34;missing-value-harmonization&#34;&gt;Missing value harmonization&lt;/h2&gt;
&lt;ul&gt;
&lt;li&gt;There are likely to be various types of &lt;code&gt;missing values&lt;/code&gt;. Working
with missing values is probably where most human judgment is needed.
Why are some answers missing: was the question not asked in some
questionnaires? Is there a coding error? Did the respondent refuse
the question, or say that she did not have an answer?
&lt;code&gt;retroharmonize&lt;/code&gt; has a special labeled vector type that retains this
information from the raw data, if it is present, but you must make
the judgment yourself – in R, eventually you will either create a
missing category, or use &lt;code&gt;NA_character_&lt;/code&gt; or &lt;code&gt;NA_real_&lt;/code&gt;.&lt;/li&gt;
&lt;/ul&gt;
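&lt;p&gt;The two end states mentioned in the last point can be sketched in base R. The answer vector and the list of missing labels below are made up for illustration; which option is right remains your judgment.&lt;/p&gt;
&lt;pre&gt;&lt;code&gt;# Made-up harmonized answers, two of them missing for different reasons
answers &amp;lt;- c(&amp;quot;mentioned&amp;quot;, &amp;quot;not_mentioned&amp;quot;, &amp;quot;declined&amp;quot;, &amp;quot;inap&amp;quot;)
missing_labels &amp;lt;- c(&amp;quot;do_not_know&amp;quot;, &amp;quot;declined&amp;quot;, &amp;quot;inap&amp;quot;)
is_missing &amp;lt;- answers %in% missing_labels

# Option 1: character representation with one explicit missing category
as_category &amp;lt;- ifelse(is_missing, &amp;quot;missing&amp;quot;, answers)

# Option 2: numeric representation, where missing answers become NA_real_
as_number &amp;lt;- ifelse(is_missing, NA_real_, as.numeric(answers == &amp;quot;mentioned&amp;quot;))
&lt;/code&gt;&lt;/pre&gt;
&lt;p&gt;Option 1 keeps missingness visible in tabulations, while option 2 lets functions such as &lt;code&gt;mean()&lt;/code&gt; skip it with &lt;code&gt;na.rm = TRUE&lt;/code&gt;.&lt;/p&gt;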
&lt;p&gt;That’s a lot to put on your plate.&lt;/p&gt;
&lt;p&gt;It is unlikely that you will be able to work with completely unfamiliar
survey programs if you do not have a strong intermediate level of R. Our
package comes with tutorials for
&lt;a href=&#34;https://retroharmonize.dataobservatory.eu/articles/eurobarometer.html&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Eurobarometer&lt;/a&gt; and
&lt;a href=&#34;https://retroharmonize.dataobservatory.eu/articles/afrobarometer.html&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Afrobarometer&lt;/a&gt;,
and our development version already covers Arab Barometer. The tutorials
highlight some peculiar issues with these survey programs, which we hope
will give a head start to less experienced R users.&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Open Data Day Interview: Mapping Data with Milos Popovic</title>
      <link>https://greendeal.dataobservatory.eu/post/2021-03-03-ood_interview_maps/</link>
      <pubDate>Wed, 03 Mar 2021 22:23:00 +0200</pubDate>
      <guid>https://greendeal.dataobservatory.eu/post/2021-03-03-ood_interview_maps/</guid>
      <description>&lt;p&gt;&lt;em&gt;Milos Popovic is a researcher, a data scientist, Marie Curie postdoc &amp;amp; Top 10 dataviz &amp;amp; R contributor on Twitter according to NodeXL. He took part in policy debates about terrorism and military intervention and appeared on a number of TV channels including N1 (the CNN affiliate in the Western Balkans), Serbian National Television and Al-Jazeera Balkans. My research interests are at the intersection of civil war dynamics and postwar politics in the Balkans. He is going to join the Data &amp;amp; Lyrics team on International Open Data Day to help us put harmonized environmental degradation perception and environmental sensory data on maps. We asked him four questions about his passion, mapping data. Please join us 6 March 2021 9.30 EST / 15.30 CET for an informal digital coffee.&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;As a researcher, why are you so much drawn into maps? Is this connected to your interest in territorial conflicts, or you have some other inspiration?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;That’s a great question that really makes me pause and look back at the past 5 years. My mapping story started out of curiosity: I found interesting data on the post-WWII violence in Serbia and thought how cool it would be to make a map in R. I quickly made an unimpressive choropleth map and noticed some unexpected patterns. Then I realized just how much unused violence and census data sits out there while we have no clue about geographic patterns. So, it began. I started off with map-making but my curiosity took me to the world of georeferencing and geospatial analysis. In the process, I created over 300 maps hosted on my website as well as dozens of shapefiles from scratch.&lt;/p&gt;
&lt;p&gt;I used to think that my interest was linked to growing up in a war-torn country. But, as my map-making evolved, I discovered that my passion is to use maps as a way to democratize the data: to take the scores of unused and often buried datasets, place them on the map and share the dataviz with people.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;Can you show us an example of the best use of mapped data, and the best map that you have personally created? What is their distinctive value?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;I’m immensely proud of my work that required making the shapefiles from scratch. For instance, my shapefile of over 1500 Kosovo cadastral settlements came into being after I turned dozens of high-resolution raster files into a shapefile fully compatible with OpenStreetMap. After months of hard work, I managed to merge the shapefile with the 2011 Kosovo census and present several laser-focused demographic maps to my audience. Same goes for the settlement shapefile of &lt;em&gt;Republika Srpska&lt;/em&gt; [the Serb-speaking entity of Bosnia-Herzegovina — the editor], which I made out of a pdf file and merged with the 2013 census data. Whereas most existing maps take a bird’s eye view, my work offers a more fine-grained view of the local dynamics to stakeholders.&lt;/p&gt;
&lt;p&gt;Another similar undertaking was my transformation of the pre-WWII German military map of Yugoslavia into a unique shapefile of a few hundred Yugoslav municipalities. I combined this shapefile with the 1931 census data, 80 years after it was first published (better late than never!). It took me almost a year to complete this tremendous project but I enjoyed every bit of it. I have teamed up with &lt;a href=&#34;https://aleksandarpopovic.com/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;my brother&lt;/a&gt; who is a web developer and we even made &lt;a href=&#34;https://milosp.info/maps/interactive/census1931/index.html&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;an interactive map of Yugoslavia based on the 1931 census&lt;/a&gt;. [&lt;em&gt;The screenshot of this interactive map is the top image in the post &amp;ndash; the editor&lt;/em&gt;] We hope this project will serve not only scholars but also history enthusiasts to better understand the history of a country that is no more.&lt;/p&gt;
















&lt;figure  id=&#34;figure-check-out-miloss-beautiful-static-and-interactive-maps-on-httpsmilospinfohttpsmilospinfo&#34;&gt;
  &lt;div class=&#34;d-flex justify-content-center&#34;&gt;
    &lt;div class=&#34;w-100&#34; &gt;&lt;img src=&#34;img/milos_popovic_internet_never.png&#34; alt=&#34;Check out Milos’s beautiful static and interactive maps on https://milosp.info/&#34; loading=&#34;lazy&#34; data-zoomable /&gt;&lt;/div&gt;
  &lt;/div&gt;&lt;figcaption&gt;
      Check out Milos’s beautiful static and interactive maps on &lt;a href=&#34;https://milosp.info/&#34;&gt;https://milosp.info/&lt;/a&gt;
    &lt;/figcaption&gt;&lt;/figure&gt;
&lt;p&gt;&lt;strong&gt;What do you think about collaboration based on open data and open-source software that processes such data?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;It’s a fantastic opportunity for small teams to bypass traditional gatekeepers such as state institutions or big companies and use open source apps for the benefit of their local communities. For example, access to OpenStreetMap allows small teams to map pressing communal issues such as crime, diseases, or environmental degradation and come up with innovative solutions. In my work, too, OSM has helped me create several fine-grained maps that shed more light on local problems in Serbia such as pollution, car accidents or violence.&lt;/p&gt;
&lt;p&gt;&lt;strong&gt;We are hoping to bring together environmental sensory data and public attitude data on environmental issues. How can mapping help? What do you expect from this project?&lt;/strong&gt;&lt;/p&gt;
&lt;p&gt;More than ever, we are compelled to figure out how maladies spread locally. Without mapping the hotspots, our understanding of the consequences of, for example, viral transmission or pollution is shrouded in a lot of uncertainty. We might have no clue how environmental issues shape public attitudes in localities until we use the mapping to turn on the light. Mapping would help this project pin down geographic clusters that require immediate attention from the private and public stakeholders.&lt;/p&gt;
&lt;p&gt;&lt;em&gt;Please &lt;a href=&#34;https://reprex.nl/talk/reprex-open-data-day-2021/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;join us&lt;/a&gt; for a digital coffee, tea or beer on International Open Data Day - we will put never-before-seen data on maps, and discuss how to build successful open collaborations, with small, independent contributions to build large data observatories. Make sure you check out &lt;a href=&#34;https://milosp.info/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Milos&amp;rsquo; amazing website&lt;/a&gt;, too!&lt;/em&gt;&lt;/p&gt;
&lt;p&gt;&lt;em&gt;This blogpost was originally posted on our &lt;a href=&#34;https://dataandlyrics.com/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Data &amp;amp; Lyrics&lt;/a&gt; blog and its mutation on &lt;a href=&#34;https://medium.com/data-lyrics&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Medium&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;
</description>
    </item>
    
    <item>
      <title>Eurobarometer Surveys Used In Our Project</title>
      <link>https://greendeal.dataobservatory.eu/post/2021-03-04-eurobarometer_data/</link>
      <pubDate>Wed, 03 Mar 2021 00:00:00 +0000</pubDate>
      <guid>https://greendeal.dataobservatory.eu/post/2021-03-04-eurobarometer_data/</guid>
      <description>&lt;p&gt;In our &lt;a href=&#34;http://netzero.dataobservatory.eu/post/2021-03-04_retroharmonize_intro/&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;tutorial
series&lt;/a&gt;,
we are going to harmonize the following questionnaire items from five
Eurobarometer harmonized survey files. The Eurobarometer survey files
are harmonized across countries, but they are only partially harmonized
in time.&lt;/p&gt;
&lt;p&gt;All data must be downloaded from the
&lt;a href=&#34;https://www.gesis.org/en/home&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;GESIS&lt;/a&gt; Data Archive in Cologne. We are
not affiliated with GESIS and you must read and accept their terms to
use the data.&lt;/p&gt;
&lt;h2 id=&#34;eurobarometer-802-2013&#34;&gt;Eurobarometer 80.2 (2013)&lt;/h2&gt;
&lt;p&gt;GESIS Data Archive, Cologne. ZA5877 Data file Version 2.0.0,
&lt;a href=&#34;https://doi.org/10.4232/1.12792&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;https://doi.org/10.4232/1.12792&lt;/a&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Data file: &lt;a href=&#34;https://search.gesis.org/research_data/ZA5877&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;ZA5877&lt;/a&gt;
data file (European Commission 2017).&lt;/li&gt;
&lt;li&gt;Questionnaire: &lt;a href=&#34;https://dbk.gesis.org/dbksearch/download.asp?id=54036&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Eurobarometer 80.2 Basic Bilingual
Questionnaire&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Citation: &lt;a href=&#34;https://search.gesis.org/ajax/bibtex.php?type=research_data&amp;amp;docid=ZA5877&amp;amp;lang=en&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;ZA5877
Bibtex&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;code&gt;QA1a Which of the following do you consider to be the single most serious problem facing the world as a whole?&lt;/code&gt;
(single choice)&lt;/p&gt;
&lt;p&gt;&lt;code&gt;QA1b Which others do you consider to be serious problems?&lt;/code&gt; (multiple
choice)&lt;/p&gt;
&lt;p&gt;&lt;code&gt;QA2 And how serious a problem do you think climate change is at this moment? Please use a scale from 1 to 10, with &#39;1&#39; meaning it is &amp;quot;not at all a serious problem&lt;/code&gt;
(scale 1-10)&lt;/p&gt;
&lt;p&gt;&lt;code&gt;QA4 To what extent do you agree or disagree with each of the following statements? - Fighting climate change and using energy more efficiently can boost the economy and jobs in the EU&lt;/code&gt;
(agreement-disagreement 4-scale)&lt;/p&gt;
&lt;p&gt;&lt;code&gt;QA4 To what extent do you agree or disagree with each of the following statements? - Reducing fossil fuel imports from outside the EU could benefit the EU economically&lt;/code&gt;
(agreement-disagreement 4-scale)&lt;/p&gt;
&lt;p&gt;&lt;code&gt;QA5 Have you personally taken any action to fight climate change over the past six months?&lt;/code&gt;
(binary)&lt;/p&gt;
&lt;h2 id=&#34;eurobarometer-834-2015&#34;&gt;Eurobarometer 83.4 (2015)&lt;/h2&gt;
&lt;p&gt;European Commission, Brussels; Directorate General Communication
COMM.A.1 ´Strategy, Corporate Communication Actions and
Eurobarometer´ GESIS Data Archive, Cologne. ZA6595 Data file Version
3.0.0, &lt;a href=&#34;https://doi.org/10.4232/1.13146&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;https://doi.org/10.4232/1.13146&lt;/a&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Data file: &lt;a href=&#34;https://search.gesis.org/research_data/ZA6595&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;ZA6595&lt;/a&gt;
data file (European Commission 2018).&lt;/li&gt;
&lt;li&gt;Questionnaire: &lt;a href=&#34;https://dbk.gesis.org/dbksearch/download.asp?id=57940&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Eurobarometer 83.4 Basic Bilingual
Questionnaire&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Citation: &lt;a href=&#34;https://search.gesis.org/ajax/bibtex.php?type=research_data&amp;amp;docid=ZA6595&amp;amp;lang=en&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;ZA6595
Bibtex&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;h2 id=&#34;eurobarometer-871-2017&#34;&gt;Eurobarometer 87.1 (2017)&lt;/h2&gt;
&lt;p&gt;European Commission, Brussels; Directorate General Communication,
COMM.A.1 ‘Strategic Communication’; European Parliament,
Directorate-General for Communication, Public Opinion Monitoring Unit.
GESIS Data Archive, Cologne. ZA6861 Data file Version 1.2.0,
&lt;a href=&#34;https://doi.org/10.4232/1.12922&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;https://doi.org/10.4232/1.12922&lt;/a&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Data file: &lt;a href=&#34;https://search.gesis.org/research_data/ZA6861&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;ZA6861&lt;/a&gt;
data file.&lt;/li&gt;
&lt;li&gt;Questionnaire: &lt;a href=&#34;https://dbk.gesis.org/dbksearch/download.asp?id=65967&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Eurobarometer 87.1 Basic Bilingual
Questionnaire&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Citation: &lt;a href=&#34;https://search.gesis.org/ajax/bibtex.php?type=research_data&amp;amp;docid=ZA6861&amp;amp;lang=en&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;ZA6861
Bibtex&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;code&gt;QC1a Which of the following do you consider to be the single most serious problem facing the world as a whole?&lt;/code&gt;
(single choice)&lt;/p&gt;
&lt;p&gt;&lt;code&gt;QC1b Which others do you consider to be serious problems?&lt;/code&gt; (multiple
choice)&lt;/p&gt;
&lt;p&gt;&lt;code&gt;QC2 And how serious a problem do you think climate change is at this moment? Please use a scale from 1 to 10, with &#39;1&#39; meaning it is &amp;quot;not at all a serious problem&amp;quot;&lt;/code&gt;
(scale 1-10)&lt;/p&gt;
&lt;p&gt;&lt;code&gt;QC4 To what extent do you agree or disagree with each of the following statements? - Fighting climate change and using energy more efficiently can boost the economy and jobs in the EU&lt;/code&gt;
(agreement-disagreement 4-scale)&lt;/p&gt;
&lt;p&gt;&lt;code&gt;QC4 To what extent do you agree or disagree with each of the following statements? - Promoting EU expertise in new clean technologies to countries outside the EU can benefit the EU economically&lt;/code&gt;
(agreement-disagreement 4-scale)&lt;/p&gt;
&lt;p&gt;&lt;code&gt;QC4 To what extent do you agree or disagree with each of the following statements? - Reducing fossil fuel imports from outside the EU can benefit the EU economically&lt;/code&gt;
(agreement-disagreement 4-scale)&lt;/p&gt;
&lt;p&gt;&lt;code&gt;QC4 To what extent do you agree or disagree with each of the following statements? - Reducing fossil fuel imports from outside the EU can increase the security of EU energy supplies&lt;/code&gt;
(agreement-disagreement 4-scale)&lt;/p&gt;
&lt;p&gt;&lt;code&gt;QC4 To what extent do you agree or disagree with each of the following statements? - More public financial support should be given to the transition to clean energies even if it means subsidies to fossil fuels should be reduced.&lt;/code&gt;
(agreement-disagreement 4-scale)&lt;/p&gt;
&lt;p&gt;&lt;code&gt;QC5 Have you personally taken any action to fight climate change over the past six months?&lt;/code&gt;
(binary)&lt;/p&gt;
&lt;h2 id=&#34;eurobarometer-902-2018&#34;&gt;Eurobarometer 90.2 (2018)&lt;/h2&gt;
&lt;p&gt;European Commission, Brussels; Directorate General Communication,
COMM.A.3 ‘Media Monitoring and Eurobarometer’. GESIS Data Archive,
Cologne. ZA7488 Data file Version 1.0.0,
&lt;a href=&#34;https://doi.org/10.4232/1.13289&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;https://doi.org/10.4232/1.13289&lt;/a&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Data file:
&lt;a href=&#34;https://dbk.gesis.org/dbksearch/sdesc2.asp?db=e&amp;amp;no=7488&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;ZA7488&lt;/a&gt;
data file (European Commission 2019a).&lt;/li&gt;
&lt;li&gt;Questionnaire: &lt;a href=&#34;https://dbk.gesis.org/dbksearch/download.asp?id=65967&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Eurobarometer 90.2 Basic Bilingual
Questionnaire&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Citation: &lt;a href=&#34;https://search.gesis.org/ajax/bibtex.php?type=research_data&amp;amp;docid=ZA7488&amp;amp;lang=en&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;ZA7488
Bibtex&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;code&gt;QB5 To what extent do you agree or disagree with each of the following statements? - Fighting climate change and using energy more efficiently can boost the economy and jobs in the EU&lt;/code&gt;
(agreement-disagreement 4-scale)&lt;/p&gt;
&lt;p&gt;&lt;code&gt;QB5 To what extent do you agree or disagree with each of the following statements? - Promoting EU expertise in new clean technologies to countries outside the EU can benefit the EU economically&lt;/code&gt;
(agreement-disagreement 4-scale)&lt;/p&gt;
&lt;p&gt;&lt;code&gt;QB5 To what extent do you agree or disagree with each of the following statements? - Reducing fossil fuel imports from outside the EU can benefit the EU economically&lt;/code&gt;
(agreement-disagreement 4-scale)&lt;/p&gt;
&lt;p&gt;&lt;code&gt;QB5 To what extent do you agree or disagree with each of the following statements? - Reducing fossil fuel imports from outside the EU can increase the security of EU energy supplies&lt;/code&gt;
(agreement-disagreement 4-scale)&lt;/p&gt;
&lt;p&gt;&lt;code&gt;QB5 To what extent do you agree or disagree with each of the following statements? - More public financial support should be given to the transition to clean energies even if it means subsidies to fossil fuels should be reduced.&lt;/code&gt;
(agreement-disagreement 4-scale)&lt;/p&gt;
&lt;h2 id=&#34;eurobarometer-913-2019&#34;&gt;Eurobarometer 91.3 (2019)&lt;/h2&gt;
&lt;p&gt;European Commission, Brussels; Directorate General Communication,
COMM.A.3 ‘Media Monitoring and Eurobarometer’. GESIS Data Archive,
Cologne. ZA7572 Data file Version 1.0.0,
&lt;a href=&#34;https://doi.org/10.4232/1.13372&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;https://doi.org/10.4232/1.13372&lt;/a&gt;&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;Data file:
&lt;a href=&#34;https://dbk.gesis.org/dbksearch/sdesc2.asp?db=e&amp;amp;no=7572&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;ZA7572&lt;/a&gt;
data file (European Commission 2019b).&lt;/li&gt;
&lt;li&gt;Questionnaire: &lt;a href=&#34;https://dbk.gesis.org/dbksearch/download.asp?id=66774&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;Eurobarometer 91.3 Basic Bilingual
Questionnaire&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;Citation: &lt;a href=&#34;https://search.gesis.org/ajax/bibtex.php?type=research_data&amp;amp;docid=ZA7572&amp;amp;lang=en&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;ZA7572
Bibtex&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;
&lt;p&gt;&lt;code&gt;QB4 To what extent do you agree or disagree with each of the following statements? - Taking action on climate change will lead to innovation that will make EU companies more competitive (N)&lt;/code&gt;
(agreement-disagreement 4-scale)&lt;/p&gt;
&lt;p&gt;&lt;code&gt;QB4 To what extent do you agree or disagree with each of the following statements? - Promoting EU expertise in new clean technologies to countries outside the EU can benefit the EU economically&lt;/code&gt;
(agreement-disagreement 4-scale)&lt;/p&gt;
&lt;p&gt;&lt;code&gt;QB4 To what extent do you agree or disagree with each of the following statements? - Reducing fossil fuel imports from outside the EU can benefit the EU economically&lt;/code&gt;
(agreement-disagreement 4-scale)&lt;/p&gt;
&lt;p&gt;&lt;code&gt;QB4 To what extent do you agree or disagree with each of the following statements? - Adapting to the adverse impacts of climate change can have positive outcomes for citizens in the EU&lt;/code&gt;
(agreement-disagreement 4-scale)&lt;/p&gt;
&lt;p&gt;&lt;code&gt;QB5 Have you personally taken any action to fight climate change over the past six months?&lt;/code&gt;
(binary)&lt;/p&gt;
&lt;h2 id=&#34;references&#34;&gt;References&lt;/h2&gt;
&lt;p&gt;European Commission, Brussels. 2017. “Eurobarometer 80.2 (2013).” GESIS
Data Archive, Cologne. ZA5877 Data file Version 2.0.0,
&lt;a href=&#34;https://doi.org/10.4232/1.12792&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;https://doi.org/10.4232/1.12792&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;———. 2018. “Eurobarometer 83.4 (2015).” GESIS Data Archive, Cologne.
ZA6595 Data file Version 3.0.0, &lt;a href=&#34;https://doi.org/10.4232/1.13146&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;https://doi.org/10.4232/1.13146&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;———. 2019a. “Eurobarometer 90.2 (2018).” GESIS Data Archive, Cologne.
ZA7488 Data file Version 1.0.0, &lt;a href=&#34;https://doi.org/10.4232/1.13289&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;https://doi.org/10.4232/1.13289&lt;/a&gt;.&lt;/p&gt;
&lt;p&gt;———. 2019b. “Eurobarometer 91.3 (2019).” GESIS Data Archive, Cologne.
ZA7572 Data file Version 1.0.0, &lt;a href=&#34;https://doi.org/10.4232/1.13372&#34; target=&#34;_blank&#34; rel=&#34;noopener&#34;&gt;https://doi.org/10.4232/1.13372&lt;/a&gt;.&lt;/p&gt;
</description>
    </item>
    
  </channel>
</rss>
