Google Summer of Code is designed to encourage student participation in open source development. Over the past 14 years, Google Summer of Code has brought together over 14,000 students and 651 open source projects to create over 35 million lines of code. Google will accept student applications from March 25 to April 9, 2019.
In GSoC 2019, we want you to focus primarily on the content analysis capabilities of vitrivr. To do this, we plan to teach vitrivr to read text on screen, listen to people speaking, watch them gesturing, and so on. This year again, we will collaborate with Red Hen Lab, which will use vitrivr’s capabilities on a giant video dataset.
To apply for one of our projects, we ask you for the following information:
State very briefly the problem you are trying to solve, the goal of the project and the expected outcome. Be specific and realistic about the outcome and about what you want to achieve in three months.
State the steps you think are required for solving the problem. Be specific: choose small, manageable parts that you can oversee and for which you can give a good time estimate.
We want to see a detailed weekly project plan with well-defined milestones and deliverables (and potential breaks or absences).
In addition, we ask you to submit code for one of the following pre-tasks. For each task, you are asked to write a piece of source code in your language of choice, in good style and no longer than 150 lines. Upload your code snippet anywhere and send us the link with the subject “GSoC 2019 - Task X” (where X denotes the task you have solved). If you have not yet submitted your application via the Google Summer of Code website, please include a draft of your application (via link or attachment) in the email together with the solution of your chosen task.
It is important that you are well acquainted with vitrivr and with potential areas for improvement. Get in contact with the mentors to pick a good project that fits both vitrivr and you.
Below, you can find some of our ideas on the directions in which we could push vitrivr together. Please consider them as starting points for your proposal. Of course, if you have other ideas, we would be very happy to hear them. Be creative and send us proposals that extend the ideas below – or that are advancing the vitrivr multimedia retrieval stack in another way. Feel free to contact us and get feedback on what you plan to do beforehand.
| Project | Component(s) | Description | Skills | Mentors |
|---|---|---|---|---|
| vitrivr reads | Cineast | Text in a video often conveys information that is not easily expressed otherwise. This project deals with the integration of state-of-the-art scene-text transcription into vitrivr. | Java, TensorFlow, DNNs | Silvan, Ralph, Mahnaz |
| vitrivr listens | Cineast | Spoken words are a very salient and easily remembered part of any video and therefore interesting for search. The idea behind this project is to integrate state-of-the-art speech transcription methods such as DeepSpeech into vitrivr to make video transcription an inherent part of the retrieval pipeline. | Java, TensorFlow, DeepSpeech, DNNs | Ralph, Silvan, Mahnaz |
| vitrivr meets the humanities | Cineast, Vitrivr-NG | The International Image Interoperability Framework (IIIF) describes a standard for interacting with images which is being adopted by an increasing number of museums, archives, and other institutions that store large collections of digital images. The aim of this project is to make vitrivr compatible with the APIs specified by IIIF, so that it can easily interact with any IIIF-capable data source. | Java, TypeScript | Luca |
| vitrivr explores | Cineast, Vitrivr-NG | While vitrivr works great if you know what you are looking for, its capacity for exploration is somewhat limited. The aim of this project is to expand the browsing and exploration capabilities of vitrivr, for instance by offering visualizations that leverage semantic topologies, possibly based on machine learning. This can be done in collaboration with the Red Hen organization, on the basis of the NewsScape video collections and their semantic annotations. | Java, TypeScript | Luca, Ralph |
| Gesture retrieval | Cineast | Gestures are a common component of daily communication and can carry some of the weight of spoken language. Query-by-gesture can be used in different contexts to search for gestures that accompanied the spoken words. This can be done in collaboration with the Red Hen organization, on the basis of the NewsScape video collections and their semantic annotations. | Java, TensorFlow | Mahnaz, Ralph |
| vitrivr watches | Cineast, Vitrivr-NG | One of the few search modalities vitrivr does not yet support is query-by-video. There are various ways to implement this, e.g. using gestures detected in video, semantic concepts (i.e. using machine learning), motion, or visual similarity. For this, a user-friendly UI is needed that accepts both video and webcam input, streams it to Cineast for querying, displays the results, and possibly accepts feedback from the user. The existing UI already supports some of this capability, so the project can either build upon it or design a new UI. | Java, WebRTC | Mahnaz, Luca |
| Temporal localization and gesture detection | Cineast | An important step in gesture recognition is to detect gestures and localize them in the temporal dimension. Temporal Segment Networks (TSN) are one of the methods used for segmenting and localizing activities in videos. With a little modification, this method can be used to localize and segment multiple gestures in one video. This can be done in collaboration with the Red Hen organization, on the basis of the NewsScape video collections and their semantic annotations. | Java, TensorFlow | Mahnaz |
| Real-time gesture recognition and labeling | Cineast | Nowcasting gestures in videos is an important feature if gesture-based queries are to be performed in real time. In this case, the results are shown to the user in real time, and the wait time for retrieving the results (be it labeling or retrieval) is minimal. The aim of this project is to integrate real-time classification of hand gestures into Cineast using machine learning tools. | Java | Mahnaz |
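To give a feel for the IIIF-related project (“vitrivr meets the humanities”): the IIIF Image API specifies a uniform URL syntax of the form `{base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}`. Below is a minimal sketch in Java of how such request URLs could be assembled; the class and method names are hypothetical illustrations, not part of vitrivr or any IIIF library.

```java
// Hypothetical helper illustrating the IIIF Image API request URL syntax:
// {base}/{identifier}/{region}/{size}/{rotation}/{quality}.{format}
// Class and method names are illustrative only.
class IiifImageRequest {

    private final String baseUrl;     // e.g. "https://example.org/iiif" (server + prefix)
    private final String identifier;  // identifier of the image on the IIIF server

    IiifImageRequest(String baseUrl, String identifier) {
        this.baseUrl = baseUrl;
        this.identifier = identifier;
    }

    /** Builds a request URL from the given region, size, rotation, quality, and format. */
    String url(String region, String size, int rotation, String quality, String format) {
        return String.format("%s/%s/%s/%s/%d/%s.%s",
                baseUrl, identifier, region, size, rotation, quality, format);
    }

    /** Convenience method: the full image at full resolution, unrotated, default quality. */
    String fullImageUrl() {
        return url("full", "full", 0, "default", "jpg");
    }
}
```

For example, `new IiifImageRequest("https://example.org/iiif", "abcd1234").fullImageUrl()` would yield `https://example.org/iiif/abcd1234/full/full/0/default.jpg`, which any IIIF-capable image server should understand.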
This list has the contact information of some of our mentors. It will be updated over time.
| Name | Website | Languages |
|---|---|---|
| Mahnaz Amiri Parian | https://dbis.dmi.unibas.ch/team/mahnaz-amiri-parian/ | English |
| Ralph Gasser | http://dbis.dmi.unibas.ch/team/ralph-gasser | English, German |
| Silvan Heller | http://dbis.dmi.unibas.ch/team/silvan-heller | English, German |
| Luca Rossetto | http://dbis.dmi.unibas.ch/team/luca-rossetto | English, German |
| Heiko Schuldt | http://dbis.dmi.unibas.ch/team/heiko-schuldt | English, German |
Please feel free to contact us if you have any questions. To make it easier for us to organize our emails, please add “GSoC 2019” in your subject line. You can contact us via our email address or you can contact the mentors directly.
We have set up a mailing list for the vitrivr project. You can use it for questions regarding the code, bug reports, etc. Note that the mailing list is publicly visible.