
What is GSoC?

Google Summer of Code is designed to encourage student participation in open source development. Over the past 14 years, Google Summer of Code has brought together over 14,000 students and 651 open source projects to create over 35 million lines of code. Google will accept student applications from March 25 to April 9, 2019.


vitrivr at GSoC

In GSoC 2019, we want you to focus primarily on the content analysis capabilities of vitrivr. To do this, we plan to teach vitrivr to read text on screen, listen to people speaking, watch them gesturing and so on. This year again, we will collaborate with Red Hen Labs who will use vitrivr’s capabilities on a giant video dataset.

This is vitrivr’s third participation in Google Summer of Code. Visit our GSoC 2016 or GSoC 2018 pages to learn more about our past participation in GSoC.

Student Application

To apply for one of our projects, we ask you for the following information:

Problem Statement
State very briefly the problem you are trying to solve, the goal of the project and the expected outcome. Be specific and realistic about the outcome and about what you want to achieve in three months.
State the steps you think are required to solve the problem. Be specific; choose small, doable parts that you can oversee and for which you can give good time estimates.
Project Plan
We want to see a detailed weekly project plan with well-defined milestones and deliverables (and potential breaks or absences).

In addition, we ask you to submit code for one of the following pre-tasks. For each task, you are asked to write a piece of source code in your language of choice, in good style and no longer than 150 lines. Upload your code snippet anywhere and send us the link with the subject “GSoC 2019 - Task X” (where X denotes the task you have solved). In case you have not yet submitted your application via the Google Summer of Code website, please include a draft of your application (via link or attachment) in the email together with the solution of your chosen task.

It is important that you know vitrivr well and have an idea of what could be improved. Get in contact with the mentors to pick a good project that fits both vitrivr and you.


Below, you can find some of our ideas on the directions in which we could push vitrivr together. Please consider them as starting points for your proposal. Of course, if you have other ideas, we would be very happy to hear them. Be creative and send us proposals that extend the ideas below, or that advance the vitrivr multimedia retrieval stack in some other way. Feel free to contact us beforehand and get feedback on what you plan to do.

vitrivr reads (Cineast)
Text in a video often conveys information that is not easily expressed otherwise. This project deals with the integration of state-of-the-art scene-text transcription into vitrivr.
Technologies: Java, TensorFlow, DNNs
Mentors: Silvan, Ralph, Mahnaz

vitrivr listens (Cineast)
Spoken words are a very salient and easily remembered part of any video and are therefore interesting for search. The idea behind this project is to integrate state-of-the-art speech transcription methods such as DeepSpeech into vitrivr, making speech transcription an inherent part of the retrieval pipeline.
Technologies: Java, TensorFlow, DeepSpeech, DNNs
Mentors: Ralph, Silvan, Mahnaz

vitrivr meets the humanities (Cineast, Vitrivr-NG)
The International Image Interoperability Framework (IIIF) describes a standard for interacting with images that is being adopted by an increasing number of museums, archives, and other institutions storing large collections of digital images. The aim of this project is to make vitrivr compatible with the APIs specified by IIIF, so that it can easily interact with any IIIF-capable data source.
Technologies: Java, TypeScript
Mentors: Luca

vitrivr explores (Cineast, Vitrivr-NG)
While vitrivr works great if you know what you are looking for, its capacity for exploration is somewhat limited. The aim of this project is to expand the browsing and exploration capabilities of vitrivr, for instance by offering visualizations that leverage semantic topologies, possibly based on machine learning. This can be done in collaboration with the Red Hen organization, on the basis of the NewsScape video collections and their semantic annotations.
Technologies: Java, TypeScript
Mentors: Luca, Ralph

Gesture retrieval (Cineast)
Gestures are a common component of daily communication and can carry some of the weight of spoken language. Query-by-gesture can be used in different contexts to search for gestures that accompanied the spoken words. This can be done in collaboration with the Red Hen organization, on the basis of the NewsScape video collections and their semantic annotations.
Technologies: Java, TensorFlow
Mentors: Mahnaz, Ralph

vitrivr watches (Cineast, Vitrivr-NG)
One of the few search modalities vitrivr does not yet support is query-by-video. There are various ways to implement this, e.g. by using gestures detected in the video, semantic concepts (i.e. using machine learning), motion, or visual similarity. This requires a user-friendly UI that accepts both video and webcam input, streams it to Cineast for querying, displays the results to the user, and possibly accepts feedback from the user. The existing UI already supports some of this capability, so the project can either build upon it or design a new UI.
Technologies: Java, WebRTC
Mentors: Mahnaz, Luca

Temporal localization and gesture detection (Cineast)
An important step in gesture recognition is to detect gestures and localize them in the temporal dimension. The Temporal Segment Network (TSN) is one of the methods used for segmenting and localizing activities in videos. With little modification, this method can be used to localize and segment multiple gestures in one video. This can be done in collaboration with the Red Hen organization, on the basis of the NewsScape video collections and their semantic annotations.
Technologies: Java, TensorFlow
Mentors: Mahnaz

Real-time gesture recognition and labeling (Cineast)
Nowcasting gestures in videos is an important feature if gesture-based queries are to be performed in real time. In this case, the results are shown to the user in real time, and the wait time for retrieving the results (be it labeling or retrieval) is minimal. The aim of this project is to use machine-learning tools to integrate real-time classification of hand gestures into Cineast.
Technologies: Java
Mentors: Mahnaz
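To give a flavour of the “vitrivr meets the humanities” idea: the IIIF Image API addresses images through a fixed URL pattern, {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}. The sketch below (Python for brevity; an actual integration would live in Cineast’s Java code) builds such request URLs. The server name and identifier are made up for illustration.

```python
from urllib.parse import quote


def iiif_image_url(base, identifier, region="full", size="full",
                   rotation="0", quality="default", fmt="jpg"):
    """Build a IIIF Image API request URL of the form
    {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}.
    """
    # Identifiers may contain slashes etc. and must be URI-encoded;
    # region/size/rotation use the Image API's own path syntax.
    ident = quote(identifier, safe="")
    return f"{base}/{ident}/{region}/{size}/{rotation}/{quality}.{fmt}"


# Request a 120x140 crop, scaled to fit within 300x300 pixels:
url = iiif_image_url("https://example.org/iiif", "manuscript-042",
                     region="125,15,120,140", size="!300,300")
# → https://example.org/iiif/manuscript-042/125,15,120,140/!300,300/0/default.jpg
```

Because every parameter lives in the URL path, any IIIF-capable server can answer such requests without prior negotiation, which is what would make a generic vitrivr integration feasible.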
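On visual similarity for query-by-video: Cineast extracts far richer features, but the principle can be illustrated with one of the simplest possible signals, a quantized colour histogram compared by histogram intersection. The following sketch is purely illustrative and not taken from the vitrivr codebase.

```python
def color_histogram(pixels, bins=4):
    """Quantize (r, g, b) pixels into a normalized bins^3 color histogram."""
    hist = [0] * (bins ** 3)
    step = 256 // bins
    for r, g, b in pixels:
        hist[(r // step) * bins * bins + (g // step) * bins + (b // step)] += 1
    total = len(pixels)
    return [h / total for h in hist]


def histogram_intersection(h1, h2):
    """Similarity in [0, 1]; 1.0 means identical color distributions."""
    return sum(min(a, b) for a, b in zip(h1, h2))


# Two frames of the same dominant color are maximally similar:
red_frame = color_histogram([(250, 10, 10)] * 16)
blue_frame = color_histogram([(10, 10, 250)] * 16)
histogram_intersection(red_frame, red_frame)   # → 1.0
histogram_intersection(red_frame, blue_frame)  # → 0.0
```

In a query-by-video setting, each incoming webcam or video frame would be reduced to such a feature vector and matched against pre-extracted features of the collection.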
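On temporal localization: whichever network produces them (TSN or otherwise), per-frame gesture scores still need to be turned into discrete segments. A minimal post-processing sketch, assuming frame-level probabilities are already available; threshold and minimum length are hypothetical parameters:

```python
def localize_segments(scores, threshold=0.5, min_len=3):
    """Turn per-frame gesture probabilities into (start, end) segments.

    Frames scoring at or above `threshold` are kept; runs shorter than
    `min_len` frames are discarded as noise. End indices are exclusive.
    """
    segments, start = [], None
    for i, s in enumerate(scores):
        if s >= threshold and start is None:
            start = i                      # a candidate segment begins
        elif s < threshold and start is not None:
            if i - start >= min_len:       # keep only sufficiently long runs
                segments.append((start, i))
            start = None
    if start is not None and len(scores) - start >= min_len:
        segments.append((start, len(scores)))  # segment runs to the last frame
    return segments


scores = [0.1, 0.2, 0.8, 0.9, 0.7, 0.6, 0.1, 0.9, 0.8, 0.1]
localize_segments(scores)  # → [(2, 6)]  (the 2-frame run at 7-8 is dropped)
```

A real pipeline would additionally merge segments separated by short gaps and map frame indices back to timestamps, but the thresholding step above is the core of turning classifier output into localized gestures.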


This list has the contact information of some of our mentors. It will be updated over time.

Mentor Profile Languages
Mahnaz Amiri Parian https://dbis.dmi.unibas.ch/team/mahnaz-amiri-parian/ English
Ralph Gasser http://dbis.dmi.unibas.ch/team/ralph-gasser English, German
Silvan Heller http://dbis.dmi.unibas.ch/team/silvan-heller English, German
Luca Rossetto http://dbis.dmi.unibas.ch/team/luca-rossetto English, German
Heiko Schuldt http://dbis.dmi.unibas.ch/team/heiko-schuldt English, German
Francis Steen http://commstudies.ucla.edu/content/francis-steen-phd English


Please feel free to contact us if you have any questions. To make it easier for us to organize our emails, please add “GSoC 2019” in your subject line. You can contact us via our email address or you can contact the mentors directly.

Mailing List

We have set up a mailing list for the vitrivr project. You can use the mailing list for questions regarding the code, bugs, etc. Note that the mailing list is publicly visible.
