08.10.2016 Views

Measuring the internet economy in The Netherlands: a big data analysis, 2016 | 14


As the Bean Review (2016) demonstrates, the internet has complicated our economic systems, but it is also a source of vast amounts of data with which the internet economy can be studied. With this in mind, a three-way partnership between Google, Statistics Netherlands and Dataprovider has been set up to study the internet economy. Our partner Dataprovider has extensive experience in crawling the internet to collect data on websites on a regular basis, in particular companies’ websites. In the current research project, their data are made available to Statistics Netherlands. This data source constitutes the innovative ‘big data’ aspect of the internet economy. In addition, Statistics Netherlands possesses a large amount of statistical data on the businesses within the economy. It remains then to link Dutch websites to Dutch businesses. Doing so facilitates a deeper analysis of the internet economy.

Another important reason for this research is that there is not yet a broadly accepted definition of the internet economy. This research contributes to this debate by constructing a pragmatic definition within the context of the available web-based source. Importantly, the definition was formulated in cooperation with stakeholders from the Dutch government, the business world, academia, Google and Dataprovider. The resulting definition classifies businesses with websites into various categories depending on how a business makes use of the internet. These categories are:

- A: Businesses without websites
- B: Businesses with a passive (category B1) or active online presence (B2)
- C: Online stores
- D: Online services
- E: Internet related ICT

Websites are allocated to these categories predominantly according to the information available from Dataprovider. Further, categories C, D and E together constitute the ‘core’ of the internet economy. The core consists of online stores, online services such as dating sites, price comparison sites, or online entertainment, and of internet related ICT such as app developers, web-hosting and internet marketing. Outside of the core we distinguish two further types of online presence for businesses: active and passive. Active online presence means that businesses provide a way to interact with them directly, such as making a reservation or ordering a brochure. Passive online presence means that businesses use the internet purely to provide information about their activities and to publicise their organisation.
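The category scheme described above can be illustrated with a minimal Python sketch. It is not the allocation procedure used in the research: the feature names (`is_online_store`, `allows_interaction`, and so on) are hypothetical stand-ins for whatever information Dataprovider supplies, and the order in which the core categories are checked is an assumption made only for illustration.

```python
from dataclasses import dataclass


@dataclass
class SiteFeatures:
    """Hypothetical per-business features; the field names are illustrative only."""
    has_website: bool = True
    is_online_store: bool = False
    is_online_service: bool = False
    is_internet_ict: bool = False
    allows_interaction: bool = False  # e.g. making a reservation, ordering a brochure


# Categories C, D and E together form the 'core' of the internet economy.
CORE_CATEGORIES = {"C", "D", "E"}


def classify(features: SiteFeatures) -> str:
    """Map features to a category label (A, B1, B2, C, D or E).

    The precedence C > D > E is an assumption of this sketch.
    """
    if not features.has_website:
        return "A"
    if features.is_online_store:
        return "C"
    if features.is_online_service:
        return "D"
    if features.is_internet_ict:
        return "E"
    # Outside the core: active (B2) versus passive (B1) online presence.
    return "B2" if features.allows_interaction else "B1"


def in_core(category: str) -> bool:
    """Return True if the category belongs to the core of the internet economy."""
    return category in CORE_CATEGORIES
```

For example, a restaurant whose site only accepts reservations would fall into B2 under this sketch, while a site that merely lists opening hours would fall into B1.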

To analyse the characteristics of the internet economy in a coherent way, the characteristics of the websites need to be linked to statistical information on the businesses behind the websites. This implies a nontrivial methodological challenge, which is dealt with using two key pieces of information. Firstly, Statistics Netherlands records the websites of businesses in its General Business Register (GBR). Secondly, businesses often report their Chamber of Commerce (CoC) number on their website. These identifiers provide the basis upon which websites can be linked to businesses. Successful linking to the GBR facilitates further links to a variety of Statistics Netherlands data sources. These data sources allow us to build an understanding of the characteristics of the internet economy from a variety of perspectives, including turnover, employment, and geography.
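The two linking keys described above (a website recorded in the register, and a CoC number reported on the site) can be sketched in Python as follows. This is an illustration under stated assumptions, not the method used by Statistics Netherlands: the record fields (`business_id`, `website`, `coc_number`), the function names, and the text pattern are all hypothetical, and the pattern assumes that a CoC number appears as eight digits shortly after a "KvK" or "CoC" label.

```python
import re
from urllib.parse import urlparse

# Assumed pattern: a "KvK" or "CoC" label followed within a few characters
# by an 8-digit number. Real pages may format the number differently.
COC_PATTERN = re.compile(r"\b(?:kvk|coc)\b\D{0,20}?(\d{8})\b", re.IGNORECASE)


def normalise_domain(url):
    """Reduce a URL to a lowercase host name without port or leading 'www.'."""
    if "//" not in url:
        url = "//" + url  # let urlparse treat a bare host as the network location
    host = urlparse(url).netloc.lower().split(":")[0]
    return host[4:] if host.startswith("www.") else host


def build_indexes(register):
    """Build lookup tables from hypothetical register records."""
    by_domain, by_coc = {}, {}
    for record in register:
        if record.get("website"):
            by_domain[normalise_domain(record["website"])] = record["business_id"]
        if record.get("coc_number"):
            by_coc[record["coc_number"]] = record["business_id"]
    return by_domain, by_coc


def link_website(url, page_text, by_domain, by_coc):
    """Return (business_id, method) for a crawled site, or (None, None)."""
    # Key 1: the website address recorded in the business register.
    domain = normalise_domain(url)
    if domain in by_domain:
        return by_domain[domain], "register_url"
    # Key 2: a CoC number found in the text of the website.
    match = COC_PATTERN.search(page_text)
    if match and match.group(1) in by_coc:
        return by_coc[match.group(1)], "coc_number"
    return None, None
```

Returning the matching method alongside the identifier makes it possible to report how many links rest on each key, which is useful when judging the quality of the linkage.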

Our analysis identifies circa 550,000 businesses which are in some way present on the internet. This constitutes 36% of all businesses. Of the businesses which do not have a website, 83% represent self-employed persons. Of all self-employed persons, we find that almost 70% do not have a website. The characteristics of the internet economy are

CBS | Discussion Paper, 2016 | 14
