Publications

Peer-Reviewed Conferences Journals Chapters Book Misc

Peer-Reviewed Conference Proceedings

[C13]

Invited for publication in a special issue of 'Social Network Analysis and Mining'

K.A. Zweig,

How to forget the second side of the story - A new method for the one-mode projection of bipartite graphs,

Proceedings of the second International Conference on Advances in Social Network Analysis and Mining (ASONAM'10), pp. 200-207, IEEE Computer Society, 2010



PDF version

Starting as a way to understand the Netflix prize data set, this project turned into a very interdisciplinary approach that combines classical data mining techniques in market basket analysis and network analysis. To understand which products are most often bought together, market basket analysis computes how often they are bought together. This number is then evaluated by various interestingness measures that basically compare it to the expectation of this number. We show that the classically used expectation model results in too many false-positive rules (so-called beer and diaper paradoxon). Our new approach replaces the old expectation model by one that is usually found in network analysis. We show on an example from the Netflix data set that the new method reduces the number of rules significantly by retaining the intuitively correct rules. In unpublished work we have applied the method to miRNA/siRNA-gene networks and got stunning results.

.
[C12]

Sudarshan Iyengar, K.A. Zweig, Noopta Gupta and C.E. Veni Madhavan,

Understanding Human Navigation with Network Analysis,

Workshop Proceedings of the 14th PAKDD2010 by Springer-Verlag (to be published), 2010

PDF version

A very nice project by Sudarshan to which I came rather late in the process. He wanted to understand how people learn to play the so-called wordmorph game. In this game you get two words with the same number of letters, say CAT and DOG, and you are asked to navigate from first to last by only changing a letter at a time and only using valid English words. His experiment showed very clearly that humans follow a very efficient strategy (don't want to spoil things here - just read the paper!). After he designed and conducted the experiments together with Noopta, I was able to help with the analysis of the findings.

.
[C11]

Carla Binucci, Ulrik Brandes, Giuseppe Di Battista, Walter Didimo, Marco Gaertler, Pietro Palladino, Maurizio Patrignani, Antonios Symvonis and Katharina Zweig ,

Drawing Trees in a Streaming Model,

Proceedings of the 76th International Symposium on Graph Drawing (GD'09), 2010

In a streaming model, the information about a data set comes bit by bit. In this paper we explored how best to draw a graph if the edge information is streamed. We explore various alternative situations (number of nodes known, edges come sorted by BFS/DFS, etc.) and enquire about the worst case area needed with respect to the optimal one. This article had its origin in one of the famous Bertinoro Workshops on Graph Drawing in 2008.
[C10]

K.A. Zweig and Karin Zimmermann,

Wanderer between the Worlds -- Self-Organized Network Stability in Attack and Random Failure Scenarios,

Proceedings of the Second IEEE International Conference on Self-Adaptive and Self-Organizing Systems (SASO'08), 2008

PDF version

It is an old finding by Albert et al. that random networks are quite robust in the case of random failures and directed attacks against their high degree vertices while scale--free networks are very stable against random failures - but at the same time very fragile in case of directed attacks. Thus, it would be best to have a random-like structure in the case of directed attacks and a scale-free-like structure in the case of random failure. Given a set of independent, decentrally organized agents that build a network, is it possible that the network structure switches according to the situation the network is in? Of course, the problem here is that the single agents are oblivious whether a neighbor is missing because of a directed attack or a random failure. We show that there is indeed a simple, local protocol that allows the agents to switch the network's structure according to the situation.

[C9]

K.A. Zweig,

To Cluster or not to Cluster - A Meta-Analytic Approach,

Proceedings of the 5th European Conference on Complex Systems (ECCS'08), 2008

PDF version

In the last few years, dozens of different clustering techniques were developed and published. Given a graph, a clustering algorithm will of course always produce some clustering, even if the input graph is a random graph which does not contain any reasonable clustering structure. An algorithm which always gives some answer but where the answer needs a context to be interpreted correctly, we call a contextual algorithm. In this article we discuss when the context of a graph is likely to give rise to a strong clustering structure.

[C8]

Johannes Putzke, Thomas Seehawer, K.A. Zweig and Kai Fischbach,

Patent Citation and Corporate Market Value -- A Study Using Social Network Analysis,

Proceedings of the 4th Conference on Applications of Social Networks Analysis (ASNA), 2007

Some colleagues in Cologne (Johannes Putzke, Thorsten Seehawer and Kai Fischbach) had access to some really interesting data, concerning patents that cite other patents and the economical success the companies have that hold them. Since these graphs are quite large, they are not easily analyzable by publicly available applications. Thus, I could help with our light-weight software classes to compute some classic centrality values and the guys could then correlate these values with the economic success of the companies.

[C7]

K.A. Lehmann and Stephan Kottler,

Visualizing Large and Complex Networks,

Proceedings of the 14th International Symposium on Graph Drawing (GD'06), 2007

PDF version

In this paper we describe a new layout algorithm for large and clustered networks that is based on the fact that real-world networks have a nice clustering structure that can be harnessed for a clear visualization. Just have a look at the picture in the appendix! :-)

[C6]

Guiseppe DiBattista, P. Francesco Cortese, Francesco Frati, Luca Grilli, K.A. Lehmann, Guiseppe Liotta, Ian Tollis,

On the Topologies of Local Minimum Spanning Trees,

Proceedings of the 3rd Workshop on Combinatorial and Algorithmic Aspects of Networking (CAAN'06), 2006

PDF version

Consider a sensor network with n sensors where every sensor can communicate with every other sensor in its radius of transmission. As we could show in our paper with Fekete and Kroeller, this network structure leads to a high clustering coefficient with a high redundance of edges. These edges can be significantly reduced by generating a localiced version of a minimal spanning tree (LMST), proposed by Li, Hua, and Sha in Design and analysis of an MST-based topology control algorithm. Here we show that there is no simple characterization of the resulting graphs, i.e., recognition is NP-hard, but that some simple graph classes are indeed embeddable such that the edges between the vertices represent an LMST.

[C5]

Michael Kaufmann, Jan Kratochvil, K.A. Lehmann, and Amarendran Subramanian,

Max-Tolerance Graphs as Intersection Graphs: Cliques, Cycles, and Recognition,

Proceedings of the 17th ACM Symposium on Discrete Algorithms (SODA'06), 2006

PDF version

Given a set of intervals and a set of tolerances, a pair of intervals is said to tolerate each other if the length of their overlap is at least as large as the maximal tolerance of the two intervals. Representing the intervals by vertices where two vertices are connected by an each if the corresponding intervals tolerate each other, yields the max-tolerance graph. The class of min-tolerance graphs, where two vertices are connected if at least one of the corresponding interval tolerates the other, i.e., the length of their overlap is greater than at least one of the tolerance, is very well characterized. Here, we present the first non-trivial characterizations of the class of max-tolerance graphs. The most important results are that the enumeration of all cliques is in P and the recognition of these graphs is NP-hard. The first result finds an application in Bioinformatics that is presented in the above paper.

[C4]

Best paper in track

K.A. Lehmann, Michael Kaufmann,

Evolutionary Algorithms for the Self-Organized Evolution of Networks,

Proceedings of the Genetic and Evolutionary Computation Conference (GECCO'05), 2005

PDF version

(Best paper in track) While the evolution of biological networks can be modeled sensefully as a series of mutation and selection, evolution of other networks such as the social network in a city or the network of streets in a country is not determined by selection since there is no alternative network with which these singular networks have to compete. Nonetheless, these singular networks do evolve due to dynamic changes of vertices and edges. In this article we present a formal, analyzable framework for the evolution of singular networks. We show that the careful design of adaptation rules can lead to the emergence of network topologies with satisfying performance in polynomial time while other adaptation rules yield exponential runtime. We further show by example how the framework could be applied to some ad-hoc communication scenarios.

Additionally, I have participated in the Graduate Student Workshop in the same conference with a paper that explains our philosophy behind the above given paper. You can find it here: Why simulating evolutionary processes is just as interesting as applying them, Lehmann, K.A., Workshop Proceedings of GECCO 2005, Washington

I gave another talk in the "Second Workshop on self-organization in representations for evolutionary algorithms: Building complexity from simplicity", organized by Ivan and Ozlem Garibay, Sanjeev Kumar and Harold Stringer. Have a look at my abstract for this talk here: , Lehmann, K.A., Workshop Proceedings of GECCO 2005, Washington

[C3]

Sandor Fekete, Michael Kaufmann, Alexander Kröller, K.A. Lehmann,

A New Approach for Boundary Recognition in Geometric Sensor Networks,

Proceedings of the 17th Canadian Conference on Computational Geometry (CCCG'05), 2005

PDF version

This article is concerned with the following question: Given a set of n uniformly distributed nodes in a unit square, connect any two nodes with a link if there geometric distance is smaller than some given threshold. How can a node determine whether it lies on the boundary of that network with only a small number of local queries? We propose a localiced centrality measure that is based on closeness centrality. We show experimentally that this centrality is well suited to determine the boundary vertices of such a network.

[C2]

Olaf Landsiedel, Klaus Wehrle, K.A. Lehmann,

T-DHT: Topology Based Distributed Hash-Tables (Poster),

Proceedings of Fifth International IEEE Conference on Peer-to-Peer-Computing, 2005

PDF version

The main idea came from Olaf and I am happy that I could contribute something to this paper. In this paper we introduce T-DHT, a topology based Distributed Hash Table, as an infra-structure for data-centric storage, information processing, and routing in ad-hoc and sensor networks. The special thing about this algorithm is that it does not rely on location information and works even in the presence of voids in the network.

[C1]

Michael Kaufmann, K.A. Lehmann, Andreas Gerasch,

Area-optimal HV-Like Tree Drawings with a fixed aspect ratio,

Proceedings of the 31st Conference on Current Trends in Theory and Practice of Computer Science, 2005

A new paradigma for drawing trees with a wanted aspect ratio.

Journals

[J8]

invited review

K.A. Zweig,

Good versus optimal: Why network analytic methods need more systematic evaluation,

Central European Journal of Computer Science 1, 137-153, 2011

PDF version

In this article we shortly discuss three main methods of network analysis: centrality measures, clusterings and block-models, and network motifs. The field has seen many methods being proposed in these areas - but as we show in the article, not many of them have been evaluated with respect to some ground truth. We thus propose that the network analysis community should agree on benchmark data sets and ground truth or gold standard solutions to show that the proposed algorithms can be tested with regard to their quality.

[J7]

If you cannot access the PDF, give me an email!

K.A. Zweig and Michael Kaufmann,

A systematic approach to the one-mode projection of bipartite graphs ,

Social Network Analysis and Mining 1, online first, 2011

PDF version

For some years we have worked on the bipartite network of films and customer ratings to find out what is the best one-mode projection on the side of the films. A 'good' one-mode projection in our view should be able to connect those films with each other that are most similar. This is in general difficult to evaluate but for the subset of series, it can be assumed that the most similar film to one season of a series is another season of the same series. We have thus defined a 'gold standard solution' and evaluate a one-mode-projection by its ability to sort films of the same series to each other. In this article we use the following general approach: for each pair of films count the number of users that liked both of them and correct by the EXPECTED number. The null model is of course crucial here. This basic approach has been proposed in the data mining literature since approximately 15 years, but it turns out that their null model is too simple for most real-world data because it assumes that the number of customer ratings follow a normal distribution. This is not true for many real-world networks. We thus introduce the usage of a bipartite random graph model with given degree sequences and show that the resulting one-mode projection is much more intuitive.

[J6]

K.A. Zweig, Gergely Palla and Tamas Vicsek,

What makes a phase transition?,

Physical Review A 389, 1501--1511, 2010

PDF version

In this article we discuss the difference between a 'sharp threshold phenomenon' and a genuine 'phase transition'. The first is any phenomenon that emerges in a very narrow range of a certain parameter value. Moreover, the range must go to zero with increasing system size. It has been shown that many phenomena in graphs are actually of this type. Phase transitions often include a sharp threshold phenomenon, e.g., water freezing at exactly 0 degree (under normal conditions) but we argue that a genuine phase transition should be more than just that. For SAT we could not convince ourselves that this is really the case or whether it is just a sharp threshold phenomenon. We came up with an even simpler model than random 3-SAT which also shows the same behavior as random 3-SAT, namely a sharp change in behavior around a certain parameter value. The article ends with the question whether this even more simple model also exhibits a phase transition in the eye of statistical physics.

[J5]

Got the Highly Accessed label

Laszlo A. Zahoránszky, Gyula Y. Katona, Peter Hari, Andras Malnasi-Csizmadia, Katharina A. Zweig, Gergely Zahoranszky-Koehalmi,

Breaking the Hierarchy - A New Cluster Selection Mechanism for Hierarchical Clustering Methods,

Algorithms For Molecular Biology 4, 12, 2009 (open access)

Based on an idea of how to compute overlapping cliques by Palla et al., you can get a hierarchy of clusterings (where a clustering is a set of clusters). The algorithm, called k-clustering algorithm, requires a parameter k to be set.Gergely started to ask himself, how one can choose the best level, i.e., the best k. A main thing of the k-clustering algorithm is that very large clusters can emerge in which some of the nodes can be arbitrarily far from each other. By introducing a measure for coherence which needs to fulfill some properties, we can show that it is possible to start at the bottom of a hierarchy and test for the wanted coherence. Whenever a cluster cannot fulfill the coherence criterion anymore, we stop and put out its child clusters. It can then be shown that this yields a partition which contains the largest clusters that still fulfill the coherence criterion. Interestingly, the method does not need to choose a level of the hierarchy but allows different branches of the hierarchy to choose their own cut-off level.

[J4]

Telikepalli Kavitha, Christian Liebchen, Kurt Mehlhorn, Dimitrios Michail, R. Rizzi, Thomas Ueckerdt and K.A. Zweig,

Cycle bases in graphs: characterization, algorithms, complexity, and applications,

Computer Science Review 3, 199-243, 2009

This review summarizes the latest results (plus some new results) on cycle bases. My own contribution is mainly in the 'length of cycle bases' chapter which I wrote together with Kurt Mehlhorn. I also contributed an application for cycle bases with a small length in graph drawing, which is described in my article with Stephan Kottler. If you would like to have a PDF of this article, please drop me a line by email.

[J3]

K.A. Lehmann, Michael Kaufmann, Stephan Steigele and Kay Nieselt,

On the maximal cliques in c-max-tolerance graphs and their application in clustering molecular sequences,

Algorithms For Molecular Biology 1, 9, 2006

PDF version

A more practical paper, describing the application of the ideas presented in our paper on Max-Tolerance Graphs. Given a set of gene sequences together with their alignment, we provide fast and easy to implement algorithms that find all cliques of sequence that share at least x percent of their sequences mutually.

[J2]

K.A. Lehmann, Hendrik Post, Michael Kaufmann,

Hybrid Graphs as a Framework for the Small-World Effect,

Physical Review E 73, 056108, 2006

PDF version

Based on the seminal paper of Watts and Strogatz about small-worlds we have tried to find a generalized network model. The basic assumption is that small-worlds are composed of two components: a local network structure that holds most of the edges, and a sparse random graph. It is known, that local networks in a d-dimensional setting will always have a diameter that scales with the d-th square of n, the number of vertices in the graph, and sparse random graphs will not even result in a connected graph. The work here is concerned with upper bounds on the diameter of the composition of the two components. We give detailed upper bounds of the diameter of these components in dependence of the sparsity of the random graph and the underlying local network structure.

A smaller version of this paper has also appeared in the Proceedings of the 2nd European Conference on Complex Systems and as a technical report. Please cite the journal version since we have made some substantial changes. These articles are based on the Studienarbeit and Diploma thesis of Hendrik.

[J1]

Eva Herker, H. Jungwirth, K.A. Lehmann K.A., Corinna Maldener, Kai-Uwe Fröhlich, Silke Wissing, Stephan Buttner, Markus Fehr, S. Sigrist, Francesco Madeo,

Chronological aging leads to apoptosis in yeast,

Journal of Cell Biology 164(4), 501-507, 2004

This article is a relict from my time as a biochemist: Apoptosis describes the voluntary suicide of mutated cells in multi-cellular organisms. Not long ago, Madeo et al. were able to show that also single cells like yeast have a simple cell death program. In multi-cellular organism the goal of apoptosis is clear: Die and hope that the rest of the organism will make it. What help is it to die if you are a single cell? The answer might lie in the fact that most of the surrounding cells are clones. If the whole family of cells is in a dangerous situation (e.g. starvation) it may help if some of the identical cells die to make room (or leave food) for the rest of them. This article shows that in these situations 'old' cells tend to die. An 'old' cell is defined as a cell that has given birth to many cloned daughter cells. Since these cells might already be damaged and mutated the ability to survive is on average increased if they die and hope for the rest to make it.

Book Chapters

[CB5]

Michael Kaufmann, K.A. Zweig,

Modeling and Designing Real-World Networks,

In: Algorithmics of Large and Complex Networks (J. Lerner, D. Wagner, K. Zweig (Eds.)), LNCS 5515, Springer Verlag Berlin, 2009

PDF-version

This chapter summarizes some ways to model the evolution of complex networks.

[CB4]

Dirk Koschützki, K.A. Lehmann, Leon Peeters,Stefan Richter, Dagmar Tenfelde-Podehl, Oliver Zlotowski,

Centrality Indices,

In: Network Analysis - LNCS Volume 3418, editors: Ulrik Brandes and Thomas Erlebach, 2005

PDFs of all three chapters + references (zip)

This chapter summarizes a large variety of different centrality indices, i.e., functions that assign a real number to vertices or edges in a network with the idea that the higher the number the more 'central' or 'important' is the vertex/edge for the network. Naturally, this simple idea lead to dozens of different measures since it depends on the function of a network which vertex is important for it.

[CB3]

Riko Jacob, Dirk Koschützki, K.A. Lehmann, Leon Peeters, Dagmar Tenfelde-Podehl,

Algorithms for Centrality Indices,

In: Network Analysis - LNCS Volume 3418, editors: Ulrik Brandes and Thomas Erlebach, 2005

PDFs of all three chapters + references (zip)

This chapter sketches the fastest algorithms for computing centrality indices.

[CB2]

Dirk Koschützki, K.A. Lehmann, Dagmar Tenfelde-Podehl, Oliver Zlotowski,

Advanced Centrality Concepts ,

In: Network Analysis - LNCS Volume 3418, editors: Ulrik Brandes and Thomas Erlebach, 2005

PDFs of all three chapters + references (zip)

This chapter highlights the relation between the function of a network and the centrality of its vertices and edges.

The part titled "5.1 Four Dimensions of a Centrality Index" was influenced by a draft by Stephen P. Borgatti. Unfortunately, the author explicitly required that it should not be cited because it was only a sketch. It was finally published in Social Networks soon after the book emerged. Here is the reference:

[1] Stephen P. Borgatti,
Centrality and network flow,
Social Networks 27(1), 55--71, Januar 2005.

[CB1]

K.A. Lehmann and Michael Kaufmann,

Random Graphs, Small Worlds, and Scale-Free Networks,

In: "Peer-to-Peer Systems and Applications - LNCS Volume 3485", editors: Ralf Steinmetz and Klaus Wehrle, 2005.

This chapter sketches some network models (mainly small-world and scale-free networks) and their relevance for designing peer-to-peer networks.

Book

[B1]

Jürgen Lerner, Dorothea Wagner, K.A. Zweig (Eds.),

Algorithmics of Large and Complex Networks,

LNCS 5515, Springer Verlag, Berlin, 2009.

A book that summarizes the results of the DFG SPP 1126 with the same title. You can find a PDF version of our chapter above.

Thesis

K.A. Zweig,

On local behavior and global structures in the evolution of complex networks,

Ph.D. thesis, University of Tübingen, 2007.

PDF version

In my thesis I have covered different topics from the 'small-world effect' to a method how to measure the number of random edges in a given real--world network to cycle bases and how to design local rules to build a self-organized networks with a wanted global structure.

Miscellaneous

Nicolas Butko, K.A. Lehmann, and Veronica Ramenzoni,

Ricochet Robots - A Case Study for Human Complex Problem Solving,

Project thesis from the Complex System Summer School, Santa Fe Institute, 2006.

PDF version

This article is a short description of the project Nick, Vero and I have worked on during the Complex Systems Summer School 2005 in Santa Fe. Our project explores different aspects of the complexity of one of my favourite games, Ricochet Robots.

K.A. Lehmann, Michael Kaufmann,

Decentralized algorithms for evaluating centrality in complex networks,

(Technical Report of the Wilhelm-Schickard-Institut, WSI-2003-10, ISSN 0946-3852, October 2003)

PDF version

Today one of our most important communication networks is organized completely decentrally. How can a decentral network be analyzed and subsequently optimized? Most static networks are analyzed by calculating so-called centrality indices which describe the position of a node in a network. One of the most complex measures is the betweenness centrality, first introduced by Freemann in 1979. It describes the expected number of communications on shortest paths over a node. In this paper we propose a general framework for decentrally working algorithms that calculate the four most prominent centrality indices, including betweenness centrality.

Our main goal is to provide all nodes within the network with their centrality index such that they can locally choose which connections should be conserved and which should be removed in order to optimize their position in the network.