Data gathering and analysis: recognising Personally Identifiable Information

Vanbrabant, Casper
Professional Bachelor in Applied Computer Science (Professionele bachelor in de toegepaste informatica)

Abstract :
The internship assignment consists of building and maintaining a web scraper. The goal of the assignment is to collect Big Data sets from social media websites. The Institute then uses these data sets to perform dialect analyses.

Many web scraping tools are available. They often differ in features, and some can be quite costly. First, a web scraping technology has to be selected according to the needs of the assignment. In this case the tool has to be able to scrape data, filter it and subsequently store it in a database. After a comparison of several web scraping tools, the best one is selected and implemented.
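
The scrape, filter and store requirement described above can be sketched roughly as follows. This is a minimal illustration using only the Python standard library, not the tool that was selected in the thesis. The sample HTML, the `post` class name, the `keep` filter rule and the `posts` table are all hypothetical.

```python
import sqlite3
from html.parser import HTMLParser

# Hypothetical page content; a real scraper would fetch this over HTTP.
SAMPLE_HTML = """
<div class="post">Goeiemorgen allemaal</div>
<div class="ad">Buy now</div>
<div class="post">  </div>
<div class="post">Wa zijde gij aan het doen?</div>
"""


class PostParser(HTMLParser):
    """Collects the text of every <div class="post"> element.

    Assumption: posts are flat, so nested <div> elements are not handled.
    """

    def __init__(self):
        super().__init__()
        self.posts = []
        self._in_post = False
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        if tag == "div" and ("class", "post") in attrs:
            self._in_post = True
            self._buffer = []

    def handle_data(self, data):
        if self._in_post:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == "div" and self._in_post:
            self.posts.append("".join(self._buffer))
            self._in_post = False


def scrape(html):
    """Step 1: extract the raw post texts from the page."""
    parser = PostParser()
    parser.feed(html)
    return parser.posts


def keep(text):
    """Step 2: hypothetical filter rule that drops empty posts."""
    return bool(text.strip())


def store(conn, posts):
    """Step 3: write the filtered posts to a SQLite table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS posts (id INTEGER PRIMARY KEY, body TEXT)"
    )
    conn.executemany("INSERT INTO posts (body) VALUES (?)", [(p,) for p in posts])
    conn.commit()


def run_pipeline(html, conn):
    posts = [p.strip() for p in scrape(html) if keep(p)]
    store(conn, posts)
    return posts
```

Calling `run_pipeline(SAMPLE_HTML, sqlite3.connect(":memory:"))` keeps the two non-empty posts and skips both the advertisement and the blank entry. Keeping the three steps as separate functions makes it possible to swap the filter rule or the database without touching the parser.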

The focus of the research assignment is on Personally Identifiable Information (PII). This type of information can be found almost everywhere on the World Wide Web, especially on social media. Most people do not understand the possible danger of their personal information falling into the wrong hands. A literature study explains the definition of Personally Identifiable Information and how it differs from Personal Data as defined by the General Data Protection Regulation, and demonstrates how criminals can (ab)use Personally Identifiable Information.
Furthermore, the thesis outlines a basic principle for training a model that could be used to recognise PII in the data sets collected by the web scraper.
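
As a rough illustration of what recognising PII in scraped text involves, the sketch below flags e-mail addresses and phone-like numbers with regular expressions. This is a rule-based baseline, not the trained model the abstract refers to. The patterns and the `find_pii` and `redact` helpers are assumptions made for illustration, and matches of this kind could at most serve as a starting point for labelling training data.

```python
import re

# Hypothetical, deliberately simple patterns; real PII covers far more
# categories (names, addresses, national identification numbers, ...).
PII_PATTERNS = {
    "EMAIL": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "PHONE": re.compile(r"\+?\d[\d /.-]{7,}\d"),
}


def find_pii(text):
    """Return (label, matched_text) pairs ordered by position in the text."""
    hits = []
    for label, pattern in PII_PATTERNS.items():
        for match in pattern.finditer(text):
            hits.append((match.start(), label, match.group()))
    return [(label, value) for _, label, value in sorted(hits)]


def redact(text):
    """Replace every match with a placeholder such as [EMAIL]."""
    for label, pattern in PII_PATTERNS.items():
        text = pattern.sub(f"[{label}]", text)
    return text
```

For example, `redact("Mail me at jan.peeters@example.com")` would replace the address with the placeholder `[EMAIL]`. Rules like these miss anything that does not follow a fixed format, which is one reason a trained model is attractive for this task.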

Full text:
Eindwerk_Vanbrabant_Casper_Definitief.pdf (PDF, 2 MB)

This thesis has been viewed 3857 times.

If you want to cite this thesis in your own thesis, paper, or report, use this format (APA):

Vanbrabant, C. (2019). Data gathering and analysis: recognising Personally Identifiable Information. Unpublished thesis, Hogeschool PXL, PXL-Digital.
Retrieved from http://doks.pxl.be/doks/do/record/Get?dispatch=view&recordId=SEtd8ab2a8216cd2dafb016cd2eb92f602a1.