Data Camp: Big data on big machines

| Michaela Nesvarova

Which roads are good for cycling, and how do tourists behave in the Netherlands? These are just two of the research questions currently being studied at UT’s Data Camp, a course focused on the use of big data.

The second edition of the Data Camp is taking place at the University of Twente from 6 to 9 December 2016. The Data Camp is a joint event organized by the Central Bureau for Statistics of the Netherlands (CBS), the University of Twente and the Netherlands School for Information and Knowledge Systems (SIKS).

Petabytes of information

‘In essence, the Data Camp is a hands-on course on analyzing large amounts of data to answer various research questions,’ explains Djoerd Hiemstra, an Associate Professor in Database and Search Engine Technology at the UT. ‘We have about 40 participants: PhD candidates, postdocs and other researchers, whom we are teaching how to analyze data on a big cluster of machines. The technology allows the participants to operate this big cluster with petabytes of information as if it were one big computer. This will enable them to scale up research that might previously have been done using only questionnaires or small administrative data sets.’

For example, the participants have access to all Dutch tweets from the last six years and other big data. They use these to answer research questions they define themselves. ‘We try to focus on research questions that directly benefit the participants in their work. They will also have access to the cluster with data after the Camp and can use it to work on their thesis or produce papers. Also, the event provides participants with new ways of thinking about statistics,’ continues Djoerd Hiemstra.

Mixed teams

Participants at the Data Camp have been divided into several groups. ‘We tried to form teams involving people from various faculties – EWI, BMS and ITC – to get a mix of people with different backgrounds, who would otherwise not meet, but now can learn about a new technology together and cooperate,’ says Hiemstra.

The groups were free to choose the research questions they will work on during the four-day course. Their results will be made available later and should cover topics such as DDoS attacks, which roads in the Netherlands are good for cycling, how tourists in the Netherlands behave and which Dutch companies make sustainable products.