Analysis of Konect datasets

Here, we provide an analysis of datasets taken from Konect, the Koblenz Network Collection. We analyzed all graphs available from Konect that are dynamic, i.e., whose edges carry timestamps and appear and/or disappear over time. Each graph is classified as directed (D), undirected (U), or bipartite (B).

For each dataset, we start with an empty graph (no nodes, no edges). Over time, edges are added or removed according to their timestamps, and edge weights may be changed depending on the edges' properties. The timestamp assigned to an edge is used as the time of the corresponding action.

We analyze the resulting dynamic graph at 100 equidistant points in time. At each point, we compute certain statistics as well as the following metrics: assortativity, degree distribution, and edge weights. Let start be the first and end the last timestamp of an edge in the dataset. The graph is then analyzed at the timestamps start - 1 + i * (end - start)/100 for i = 1..100.
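The snapshot formula above can be sketched as follows. This is a minimal illustration, not the project's implementation: the function name `snapshot_timestamps` is hypothetical, and the use of floating-point division is an assumption, since the text does not say whether the division is integer or real.

```python
def snapshot_timestamps(start, end, steps=100):
    """Return start - 1 + i * (end - start) / steps for i = 1..steps.

    Assumption: real-valued division; the original implementation may round.
    """
    return [start - 1 + i * (end - start) / steps for i in range(1, steps + 1)]


# Example with made-up timestamps (not taken from any Konect dataset).
points = snapshot_timestamps(1000, 2000)
print(len(points), points[0], points[-1])  # 100 1009.0 1999.0
```

Taken literally, the formula places the last snapshot at end - 1.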

We distinguish four types of edges / datasets: ADD, ADD_REMOVE, MULTI, and WEIGHTED. Their meaning is described in the following. For a detailed description of our implementation and processing of the original Konect datasets, we refer to the README below.

'ADD' datasets [<img src='img/timestamps.png' width='20' alt='timestamps' title='timestamps'/>
	<img src='img/ADD.png' width='20' alt='ADD' title='ADD'/>]

ADD datasets consist of timestamped edges that are added over time. The timestamp assigned to an edge denotes the time of its addition. Edges are never removed in this dataset type, so only node additions (NA) and edge additions (EA) occur.

'ADD_REMOVE' datasets [<img src='img/timestamps.png' width='20' alt='timestamps' title='timestamps'/>
	<img src='img/ADD_REMOVE.png' width='20' alt='ADD_REMOVE' title='ADD_REMOVE'/>]

ADD_REMOVE datasets consist of edges that are added or removed, indicated by a weight of 1 or -1, respectively. The timestamp assigned to an edge denotes the time of the respective operation. Hence, nodes can appear, and edges can appear or be removed (NA, EA, ER; ER denotes an edge removal). If the flag for removing nodes without edges is enabled, such nodes are removed as well (NR, node removal). In the preliminary analysis provided here, this flag was enabled.
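As a rough sketch of the ADD_REMOVE semantics described above, the following replays a list of events. The function name, the `(timestamp, u, v, weight)` tuple layout, and the treatment of edges as ordered pairs are assumptions made for illustration; they are not taken from the DNA.Konect code.

```python
def replay_add_remove(events, remove_isolated_nodes=True):
    """Replay (timestamp, u, v, weight) events: weight 1 adds, -1 removes."""
    nodes, edges = set(), set()
    for _, u, v, weight in sorted(events, key=lambda e: e[0]):
        edge = (u, v)
        if weight == 1:
            nodes.update(edge)   # NA for endpoints not seen before
            edges.add(edge)      # EA
        elif weight == -1:
            edges.discard(edge)  # ER
            if remove_isolated_nodes:
                # NR: drop endpoints that have no incident edge left
                for n in edge:
                    if not any(n in e for e in edges):
                        nodes.discard(n)
    return nodes, edges


# Made-up events: two additions, then the first edge is removed again.
events = [(1, "a", "b", 1), (2, "b", "c", 1), (3, "a", "b", -1)]
nodes, edges = replay_add_remove(events)
kept_nodes, _ = replay_add_remove(events, remove_isolated_nodes=False)
```

With the flag enabled, node "a" disappears together with its last edge; with the flag disabled, it stays in the graph as an isolated node.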

'MULTI' datasets [<img src='img/timestamps.png' width='20' alt='timestamps' title='timestamps'/>
	<img src='img/MULTI.png' width='20' alt='MULTI' title='MULTI'/>]

MULTI datasets consist of multiple unweighted edges that appear over time. The timestamp assigned to an edge denotes the time of its addition. Upon creation, we initialize each edge with a weight of '1'. Each time another multi-edge appears, the weight of this edge is increased by '1'. Hence, the weight of an edge denotes the number of multi-edges that have appeared so far (NA, EA, EW; EW denotes an edge weight update). As an optional parameter, the duration of an edge can be specified. If this duration is set, weights are decreased again after the specified time, and edges are removed once their weight reaches '0' (ER, NR). In the preliminary analysis provided here, this duration was not specified.
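The weight counting and the optional duration can be illustrated with a small sketch. Everything named here (`replay_multi`, the `until` parameter, the `(timestamp, u, v)` layout) is hypothetical. The sketch also assumes that each occurrence expires exactly `duration` time units after it appeared, which the text does not spell out, and it tracks edge weights only, not nodes.

```python
def replay_multi(events, until, duration=None):
    """Return {edge: weight} at time `until` for (timestamp, u, v) events."""
    changes = []
    for t, u, v in events:
        changes.append((t, +1, (u, v)))
        if duration is not None:
            # Assumption: the occurrence expires `duration` after it appeared.
            changes.append((t + duration, -1, (u, v)))
    weights = {}
    for t, delta, edge in sorted(changes, key=lambda c: c[0]):
        if t > until:
            break
        weights[edge] = weights.get(edge, 0) + delta  # EA or EW
        if weights[edge] == 0:
            del weights[edge]                         # ER
    return weights


# Made-up events: the edge (a, b) appears twice, (b, c) once.
events = [(1, "a", "b"), (2, "a", "b"), (5, "b", "c")]
without_duration = replay_multi(events, until=5)
with_duration = replay_multi(events, until=5, duration=3)
```

Without a duration, (a, b) ends with weight 2. With a duration of 3, both of its occurrences have expired by time 5 and only (b, c) remains.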

'WEIGHTED' datasets [<img src='img/timestamps.png' width='20' alt='timestamps' title='timestamps'/>
	<img src='img/WEIGHTED.png' width='20' alt='WEIGHTED' title='WEIGHTED'/>] &  [<img src='img/timestamps.png' width='20' alt='timestamps' title='timestamps'/>
	<img src='img/SIGNED.png' width='20' alt='SIGNED' title='SIGNED'/>]

WEIGHTED datasets consist of weighted edges that appear over time. This type is thus very similar to ADD (NA, EA). If an edge appears (i.e., is added) again, its weight is updated to the new weight (EW). The optional parameters offset and factor are used to adapt the weights: w' = offset + factor * w. If the parameters are not set, the default of (0,1) is used, i.e., weights are not changed.
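The weight adaptation w' = offset + factor * w can be sketched as below. The function name and the `(timestamp, u, v, w)` layout are assumptions for illustration only; the defaults mirror the (0,1) setting mentioned above.

```python
def replay_weighted(events, offset=0, factor=1):
    """Return {edge: adapted weight} after replaying (timestamp, u, v, w) events."""
    weights = {}
    for _, u, v, w in sorted(events, key=lambda e: e[0]):
        # EA on first appearance, EW when the edge appears again
        weights[(u, v)] = offset + factor * w
    return weights


# Made-up events: the edge (a, b) appears twice with different weights.
events = [(1, "a", "b", 3), (2, "a", "b", -1)]
unchanged = replay_weighted(events)
adapted = replay_weighted(events, offset=5, factor=2)
```

With the defaults, the edge keeps the most recent weight, -1. With offset 5 and factor 2, the same weight becomes 5 + 2 * (-1) = 3.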

DNA.Konect
https://github.com/BenjaminSchiller/DNA.Konect
JavaDoc
...