FAIR Workshop on Sequence & Streaming Data Analysis

The interdisciplinary research area FAIR organizes a two-day workshop on Sequence and Streaming Data Analysis.

The goal of this workshop is to obtain a basic understanding of similarity measures, classification and clustering algorithms for sequence data as well as streaming data analysis.

Our invited speakers are:

  • André Nusser (University of Copenhagen)
  • Chris Schwiegelshohn (Aarhus University)

When and where

Date and Time: November 22 and 23, 2022. 9:00-13:00 CET
Location: Otto-Hahn-Str. 14, Room E04 (Computer Science Building)
Virtual: Online in Zoom (a link will be sent only to registered participants by email) 

How to register

The number of on-site participants is strictly limited, while online participation is open to a much larger number of people. Registration is required in both cases.

To register for the workshop, please send an informal email to Amer Krivošija (amer.krivosija@tu-dortmund.de) by November 16, stating

  1. your name, title, and status (e.g. PostDoc),
  2. how you would prefer to participate (on-site or online),
  3. your institution (e.g. TU Dortmund), and faculty (e.g. Faculty of Statistics),
  4. whether you would like to participate in a workshop dinner (Nov. 22, 7pm),
  5. (optional) a brief statement of motivation.

Invited Speakers

© FAIR​/​TU Dortmund

André Nusser

Postdoc at Basic Algorithms Research Copenhagen (BARC), Department of Computer Science (DIKU), University of Copenhagen

André obtained his PhD at the Max Planck Institute for Informatics in Saarbrücken. He is now a postdoc at Basic Algorithms Research Copenhagen (BARC) at the University of Copenhagen. He is interested in algorithm design, fine-grained lower bounds, and algorithm engineering in computational geometry, in particular sequence and point set similarity measures.


An Overview of Geometric Sequence Similarity Measures

Abstract: Sequence data is ubiquitous in any area where quantitative measurements are taken in a specific order. To understand and analyze this data, we need a way to measure the similarity between sequences. As there are multiple natural measures for this task, this series of talks focuses on different sequence similarity measures, especially those based on a geometric view of sequences.

We analyze the advantages and disadvantages of the introduced sequence similarity measures and discuss the settings in which each measure would be the preferred choice. One main use of similarity measures is the classification and clustering of sequence data. To that end, we discuss different general clustering techniques and how they apply to sequence data.

The talks are aimed at people who have a very basic mathematical background and are new-ish to sequence similarity measures.

Chris Schwiegelshohn

Assistant professor for computer science and algorithm design, MADALGO, Department of Computer Science, Aarhus University

Chris is a home-grown researcher, having completed his PhD under the supervision of Christian Sohler at TU Dortmund. Subsequently, he joined Sapienza University of Rome, first as a postdoc hosted by Stefano Leonardi and then as a faculty member. In 2020, he joined Aarhus University as a tenure-track assistant professor. Chris' research focuses on algorithm design in general, with an emphasis on sketching, streaming and learning algorithms, as well as approximation and online algorithms.


A Painless Introduction to Coresets

Abstract: Coresets are arguably the most important paradigm used in the design and analysis of big data and data stream algorithms. Succinctly, a coreset compresses the input such that for any candidate query, the query evaluation on the coreset and the query evaluation on the original data are approximately the same. For clustering, this means that a coreset is a small weighted sample of the points such that for any set of centers, the cost on the original point set and the cost on the coreset are equal up to some small multiplicative distortion. In this talk, we will give an in-depth and yet also very simple and basic introduction to coreset algorithms and their analysis.

Location & approach

The campus of TU Dortmund University is located near the freeway junction Dortmund West, where the Sauerland line A45 crosses the Ruhr expressway B1/A40. The Dortmund-Eichlinghofen exit on the A45 leads to the South Campus, and the Dortmund-Dorstfeld exit on the A40 leads to the North Campus. The university is signposted at both exits. All FAIR PIs have offices on the North Campus. FAIR's main offices will be located a bit off the North Campus at Martin-Schmeisser-Weg 17 (parking behind the building).

The "Dortmund Universität" S-Bahn station is located directly on the North Campus. From there, the S-Bahn line S1 runs every 15 or 30 minutes to Dortmund main station and in the opposite direction to Düsseldorf main station via Bochum, Essen and Duisburg. In addition, the university can be reached by bus lines 445, 447 and 462. Timetable information can be found on the homepage of the Verkehrsverbund Rhein-Ruhr (VRR), and DSW21 also offers an interactive route network map.

From Dortmund Airport, the AirportExpress takes just over 20 minutes to Dortmund main station; from there, the S-Bahn runs to the university. A wider range of international flight connections is offered by Düsseldorf Airport, about 60 kilometers away, which can be reached directly by S-Bahn from the university's train station.

One of the landmarks of TU Dortmund is the H-Bahn. Line 1 runs every 10 minutes between Dortmund Eichlinghofen and the Technology Center via Campus South and Dortmund University S, while Line 2 shuttles every 5 minutes between Campus North and Campus South, covering the distance in two minutes.