Please use this identifier to cite or link to this item:
Title: Dense graph pattern mining and visualization
Authors: WANG NAN
Keywords: graph mining, dense graph, mining results visualization, triangulation, social network mining, stock market data mining
Issue Date: 8-Mar-2011
Citation: WANG NAN (2011-03-08). Dense graph pattern mining and visualization. ScholarBank@NUS Repository.
Abstract: A graph is an intuitive abstraction that naturally captures data entities as well as the relationships among those entities. It embeds complicated entity relationships more succinctly, compared with the tabular representation in relational databases. With the power of intuition and succinctness, the graph representations are adapted into a wide spectrum of domains. Thanks to the advantage of graph representations, researchers have employed graph representation in advanced domains like bioinformatics and social network study. Complications arise sometimes from sheer size of entities, sometimes due to varieties of relations. Discovering the underlying relationships becomes a more demanding task. This task requires not only identifying critical information (graph patterns), but also presenting it intuitively. The process of pattern identification is termed as graph mining, while presenting it in a graphical form is defined as graph visualization. There is one class of critical information within a graph that catches most research attention, and it is called the dense subgraphs. A dense subgraph (pattern) is one class of critical information within a graph that represents a high level of interactions among entities. Such high level of interactions in many applications implies outstanding level of interactions. It catches most research attention and is also the focus of this thesis. It addresses the computational difficulty, the interpretability and the results' availability during the mining process of dense graphs. Our thesis is organized in the following way. Firstly Chapter 4 introduces an algorithm called CSV, that mines dense patterns (a.k.a. cohesive subgraphs ) effectively. Besides discovering cohesive subgraphs, it also produces an ordering of the vertices for further visualizing of the mining results. As CSV needs to detect cliques (a fully connected pattern) within the graph, which runs in exponential time, we propose a technique to reduce the algorithm's running time. The technique swiftly computes an upper bound on the size of cliques within the graph instead of trying to determine the exact clique size. By this means, we reduce the running time significantly compared with a state-of-the-arts dense pattern mining algorithm CLAN\cite{WZZ06}, based on experiments performed on real datasets. Although CSV performs significantly better than CLAN\cite{WZZ06}, in the worst case, it still exhibits high running time. In Chapter 5 we employed triangle counting in dense subgraph mining, which enables us to handle large graphs more efficiently. In this chapter, we propose a set of triangulation (the process of counting triangles inside a graph) based solutions to mine $DN$-graphs \footnote{Intuitively, the $DN$-graphs are sub-graphs share more neighbors than its surroundings, Chapter 5 will cover it in detail.} from large graphs. This set of solutions target at different dense pattern mining settings, ranging from in-memory to disc based graphs, and from static to dynamics. Experimental study shows that it is able to produce high quality results within one hour for world-wide photo sharing network Flickr \cite{Fli10}. In Chapter 6 we showcase the \textbf{DVIG}, an on-demand visualization system for graph mining pattern. DVIG presents the dynamic patterns in an intuitive manner so that users can capture major trends of the target graph over time. Technical contributions include an intuitive summarization of discovered graph patterns. With above work, we conclude the thesis in Chapter 7 and discuss future work.
Appears in Collections:Ph.D Theses (Open)

Show full item record
Files in This Item:
File Description SizeFormatAccess SettingsVersion 
thesis.pdf1.86 MBAdobe PDF



Page view(s)

checked on May 23, 2019


checked on May 23, 2019

Google ScholarTM


Items in DSpace are protected by copyright, with all rights reserved, unless otherwise indicated.