d3hack2019 Review

 · 6 mins read

The 2019 Deep Learning Hackathon (d3hack2019) took place from Sep 9 to 13, 2019, at the Saxon State Library SLUB in Dresden. Many might wonder now: what is a hackathon and how does it differ from any other form of workshop? Here is a brief rundown of the conceptual idea.

Our Hackathon

Scientific teams of up to four members from any discipline were asked to apply for the hackathon. As I had organized two Deep Learning Bootcamps (2017 and 2018), we had an existing community to draw from.

To those accepted, we promised five consecutive days of consultation with an experienced Deep Learning practitioner. The goal of this mentoring was to help each team solve its scientific problem with Deep Learning. Mentors and teams typically do not know each other before the hackathon. Aside from helping the team get started with Deep Learning or improve existing solutions, the team is encouraged to publish a paper on their findings in collaboration with the mentor. This way, both sides gain something from this large time investment.

Preparations

We had space for eleven teams. By the end of June, we had received 32 applications, so the selection process was tough. Each application had to demonstrate the academic goal(s) of the project, the availability of enough (labelled) data, the prior education and experience of each team member, and the algorithmic motifs the team was interested in. All of this was based on experience gained with the GPU hackathons pioneered by Oak Ridge National Lab. For a condensed report based on several years of experience, see this IEEE publication.

Topics

Our teams’ topics covered a wide variety of fields. We had:

  • remote sensing (analysis of satellite images)
  • segmentation of historic maps (for urban and rural settlement analysis)
  • optical layout recognition (based on digitized historic books)
  • quality control of data from autonomous oceanographic sensors
  • domain adaptation of microscopy images of human lung tissue (biology/medicine)
  • multi-modal classification of microscopy images and RNA sequences from lung tumor patients
  • improved simulation of modern particle physics experiments (Belle2)
  • full event classification at Belle2
  • reconstruction of Grazing-incidence small-angle scattering imagery for material science
  • forecasting river floods of the Elbe river
  • improving the track detection of an autonomous driverless electric car

As you can see, the topics and teams couldn’t be more different, and yet all of them have a high impact on society and science. If you are interested, see Our Teams for more detail.

Hack away!

Our Hardware

Thanks to our sponsors AWS and the Centre for Information Services and High Performance Computing of TU Dresden, we were able to offer two compute resources to our teams.

AWS provided a large amount of credits to us. However, these had to be backed by the credit card of the event’s core organizer: me. To my surprise, many teams didn’t want to use cloud instances for this exact reason. So we mostly left the cloud where it is.

The majority of teams worked on the largest HPC cluster in Saxony, taurus. This was quite a learning curve for many. When it comes to data science, many people are used to working with Jupyter on their local laptop. However, some things, like file paths and local packages, are quite different on an HPC installation.

Moreover, the workhorse of our hackathon was an IBM Power9-based partition of taurus, which turned out to be a good choice from the hardware perspective. However, because Power9 is not binary compatible with x86, we encountered tremendous difficulties when using community packages from PyPI and Anaconda. This held many teams back.
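For context: pre-built binary packages on PyPI and Anaconda are compiled per CPU architecture, so a package that installs fine on an x86 laptop may have no Power9 (ppc64le) build at all. A quick way to check which architecture you are actually running on is Python’s standard `platform` module:

```python
import platform

# Wheels and conda packages are architecture-specific: if no build exists
# for your machine's architecture, pip/conda fall back to compiling from
# source (slow) or fail outright.
arch = platform.machine()
print(arch)  # e.g. 'x86_64' on most laptops, 'ppc64le' on a Power9 node
```

Running this on the login node before the event starts can save a team an entire morning of confusing install errors.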

Daily Work

Each team was largely left responsible for organizing itself. Each day had only two fixed appointments:

  • the daily scrum
    A stand-up style meeting just before lunch, where each team had three minutes to report what they were working on, what was holding them back, and what they planned to do until the next day.

  • a break-out session in the afternoon
    This session was used for impromptu presentations from mentors and/or teams on tricks they know to improve a network with regards to convergence speed, data ingest, runtime, etc.

Both of these activities proved to be valuable to keep track of progress and to foster knowledge exchange between the teams.

It is worth mentioning that the Klemperer-Saal at SLUB proved to be an excellent venue. The room was available from 8 am to 8 pm. There was enough space for the external caterer, Genussart, who served an excellent range of food and drinks to keep everyone happy. We would like to thank CASUS for sponsoring both.

Finals

The last day of the course concluded with the final presentations of the teams. It was awesome to see that many teams made big leaps forward or opened up promising avenues to continue on. It became clear that the focus was not only on project progress, but also on learning as much as possible from the mentors: which network architecture is best, which loss function works, which programming paradigm lends itself to rapid prototyping, and so on.

Last but not least, the mentors were asked to vote for the most accurate and the most creative teams. The first ended in a tie: the PARROTs team (quality control of oceanographic data) won for achieving an accuracy of over 98%. The DeepHydro team (predicting floods of the Elbe river) also won as most accurate, as they not only established point estimates with their networks, but also came up with a way to calculate uncertainties for these estimates. For more details on DeepHydro, see this blog post.
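As an illustration (not necessarily the method DeepHydro used), one simple way to turn point forecasts into forecasts with uncertainties is to train an ensemble of networks and report the spread of their predictions. All numbers below are made up for the sketch:

```python
import numpy as np

# Hypothetical river-level forecasts (arbitrary units) for one day,
# produced by five independently trained networks (an ensemble).
preds = np.array([3.1, 3.3, 3.0, 3.4, 3.2])

point_estimate = preds.mean()  # the forecast itself
uncertainty = preds.std()      # ensemble spread as an uncertainty proxy
print(f"{point_estimate:.2f} +/- {uncertainty:.2f}")  # 3.20 +/- 0.14
```

A wide spread signals that the networks disagree, which is exactly the kind of information a flood warning system should not hide behind a single number.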

The particle physics team (faster simulations in their field), named skimulators, won the prize for the most creative team for their way of expressing particle physics events as graphs. They could hence exploit graph-based convolutional networks.

In addition, two mentors were voted most valuable mentors: Jeffrey Kelling and Sebastian Starke from HZDR.

Summary

Our anonymous feedback survey after the course confirmed that our teams and mentors enjoyed the hackathon.

result of anonymous feedback survey

In addition, I was informed that 4 out of 11 teams have already started working on a publication or have already published the work they did at the hackathon. We hope this number will increase further, as many teams planned to add more data to their experiments.

I personally had a blast organizing the event. In the beginning, I was a bit sceptical that such a data science event would succeed. But our mentors and teams proved me wrong. Thank you for that! I hope we can continue next year and improve even more.

Written by Peter Steinbach (HZDR), organizer of #d3hack2019