Friday, November 15, 2019

Maximizing Contrasting Opinions in Signed Social Networks

A summary of the IEEE BigData 2019 research paper by Kaivalya Rawal and
Arijit Khan.

Background: A central characteristic of social networks is that they facilitate rapid dissemination of information among large groups of individuals [1]. Online social networks, such as Facebook, Twitter, LinkedIn, Flickr, and Digg, are used for spreading ideas and messages. Users' behaviors and opinions are highly affected by their friends in social networks, a phenomenon referred to as social influence. Motivated by various real-world applications, e.g., viral marketing [2] and social and political campaigning [3], social influence studies have attracted extensive research attention. The classic influence maximization problem [4], [2] identifies the top-k seed users in a social network such that the expected number of influenced users in the network, starting from those seeds and following an influence diffusion model, is maximized. The budget k on the seed set size usually depends on how many initial users the campaigner can directly influence by advertisements, re-tweets from “bots”, free samples, and discounted prices.

Problem: We investigate a novel influence diffusion problem: COSiNe (Contrasting Opinions Maximization in a Signed Social Network). We find a limited number of influential seed nodes that maximize the adoption of two distinct, antithetical opinions in two non-overlapping user groups with opposing views. The objective behind such influence maximization is to create awareness in a population by improving the quality of the debate on naturally contentious issues.

Applications: An ideal application of our problem would be to increase awareness about infrequently discussed issues that are nonetheless controversial (such as capital punishment, nuclear energy, or affirmative action), in a target population that naturally splits into two distinct ideological groups (such as Democrats and Republicans), and in a forum that extensively debates topics and proposes mutually agreeable solutions based on compromise, diversity, and inclusion (such as the United States Senate or House of Representatives). Contrary to initial expectations, polarization of opinions and increased conflict can often be beneficial [5], [6], [7], [8], as discussed in the following.

The benefit of conflicting opinions among collaborators has been clearly observed in Wikipedia. Controversial articles, such as those on the Syrian Civil War, Israel/Palestine, or George W. Bush, attract a higher number of edits. Higher polarization in the contributing community is associated with higher article quality for a broad range of articles, from politics to science and social issues [6]. Increased diversity is also often correlated with better business performance. Similarly, disagreements among co-workers have been found to improve decision-making capabilities at the organisation level. Thus, encouraging different opinions on the same topic can be leveraged to improve the productivity of the organisation [7]. When dealt with correctly, such differences in thought and opinion are a force for good.

Lastly, we illustrate an example from the world of politics that is most similar to our “ideal” application scenario. Unlike the American presidential system, countries based on the Westminster parliamentary system have an appointed head of government, distinct from the head of state, and an appointed head of opposition. This balance between the government and the opposition is considered integral to the success of a functioning democracy in countries as diverse as Britain and India [8]. An equivalent analysis was made for the political system of the United States of America in 1950 by the American Political Science Association [5], which recommended a stronger two-party system in order to strengthen the democratic process. Both these analyses point to the importance of opposition in political discourse, and go on to show that the policies being enacted and implemented benefit from engagement, and even opposition. Meaningful discourse and spirited debate require people who inherently hold opposing beliefs on a given issue, and thus maximizing opposing influences can be beneficial for a legislative body from the point of view of the general population.

Challenges: Contrasting opinions maximization, as required in our problem setting, is non-trivial. First, one must employ an influence cascade model whose properties differ from those of commercial, one-time, product-purchase-based marketing strategies. For example, people's opinions change over time; thus, activation-based models, such as the independent cascade (IC) and linear threshold (LT) models [4], are less appropriate in political contexts. Second, in reality a signed social network might not be perfectly balanced [9], that is, there may not exist a partition V1, V2 of the node set V such that all edges within V1 and within V2 are positive and all edges across V1 and V2 are negative. Such a network does not follow social balance theory, and adds more complexity to the social influence cascade.
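To make the balance condition concrete, here is a minimal sketch (ours, not from the paper) that tests whether a signed graph is perfectly balanced by two-coloring it: positive edges must keep their endpoints on the same side, negative edges must place them on opposite sides.

```python
from collections import deque

def is_balanced(nodes, signed_edges):
    """Check whether a signed graph admits a partition V1, V2 such that all
    positive edges stay within a side and all negative edges cross sides.
    signed_edges: iterable of (u, v, sign) with sign in {+1, -1}."""
    adj = {u: [] for u in nodes}
    for u, v, sign in signed_edges:
        adj[u].append((v, sign))
        adj[v].append((u, sign))

    side = {}                          # node -> 0 or 1
    for start in nodes:
        if start in side:
            continue
        side[start] = 0
        queue = deque([start])
        while queue:
            u = queue.popleft()
            for v, sign in adj[u]:
                expected = side[u] if sign > 0 else 1 - side[u]
                if v not in side:
                    side[v] = expected
                    queue.append(v)
                elif side[v] != expected:
                    return False       # cycle with an odd number of negative edges
    return True

# A triangle with exactly one negative edge is not balanced.
print(is_balanced([1, 2, 3], [(1, 2, +1), (2, 3, +1), (1, 3, -1)]))  # False
```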

Contributions: In this work, we employ the voter model [9], [10], [11], [12] to characterize influence diffusion in the two population groups of a social network. We define our model such that opposite influences, when applied to the same user, cancel each other, leading to a decay in the influence strength on any given user. Our model does not mandate that a user's choice be frozen upon one-time activation, explicitly allowing the user to switch opinions at later times. Moreover, the voter model, being stochastic (it has a random walk based interpretation), can deal with signed networks that are not perfectly balanced. We then formulate our novel COSiNe problem (contrasting opinions maximization), and design an efficient, exact solution.
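For intuition, the following is a minimal sketch of one synchronous round of a signed voter model: each node copies the opinion of a uniformly random in-neighbor, flipped if the connecting edge is negative. The exact dynamics in the paper (e.g., asynchronous vs. synchronous updates, treatment of seeds) may differ, so treat this purely as an illustration.

```python
import random

def voter_step(opinions, in_neighbors):
    """One synchronous round of a signed voter model.
    opinions: dict node -> +1 or -1 (two antithetical opinions)
    in_neighbors: dict node -> list of (neighbor, sign), sign in {+1, -1}."""
    new_opinions = {}
    for v, nbrs in in_neighbors.items():
        if not nbrs:
            new_opinions[v] = opinions[v]      # isolated node keeps its opinion
            continue
        u, sign = random.choice(nbrs)          # pick a random in-neighbor
        new_opinions[v] = sign * opinions[u]   # flip the opinion across a negative edge
    return new_opinions
```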

For more information, click here for our paper. Code and datasets: GitHub.
Blog post contributed by: Arijit Khan
  
[Reference]

[1] W. Chen, L. V. S. Lakshmanan, and C. Castillo, “Information and Influence Propagation in Social Networks”, Morgan & Claypool, 2013.

[2] P. Domingos and M. Richardson, “Mining the Network Value of Customers”, KDD, 2001.

[3] B. A. Conway, K. Kenski, and D. Wang, “The Rise of Twitter in the Political Campaign: Searching for Intermedia Agenda-Setting Effects in the Presidential Primary”, Journal of Computer-Mediated Communication, vol. 20(4), 2015, pp. 363–380.

[4] D. Kempe, J. Kleinberg, and E. Tardos, “Maximizing the Spread of Influence through a Social Network”, KDD, 2003.

[5] A. Schlesinger Jr., “Toward A More Responsible Two-Party System: A Report”, American Political Science Association, 1950.

[6] F. Shi, M. Teplitskiy, E. Duede, and J. A. Evans, “The Wisdom of Polarized Crowds”, Nature Human Behaviour, 2019.

[7] K. Ferrazzi, “The Benefits of Conflict at Work”, 2014, http://fortune.com/2014/03/11/the-benefits-of-conflict-at-work.

[8] A. Beteille, “Democracy and Its Institutions”, Oxford University Press, Chapter Government and Opposition, 2012.

[9] Y. Li, W. Chen, Y. Wang, and Z.-L. Zhang, “Influence Diffusion Dynamics and Influence Maximization in Social Networks with Friend and Foe Relationships”, WSDM, 2013.

[10] P. Clifford and A. Sudbury, “A Model for Spatial Conflict”, Biometrika, vol. 60(3), 1973, pp. 581–588.

[11] R. A. Holley and T. M. Liggett, “Ergodic Theorems for Weakly Interacting Infinite Systems and the Voter Model”, Ann. Probab., vol. 3(4), 1975, pp. 643–663.

[12] E. Even-Dar and A. Shapira, “A Note on Maximizing the Spread of Influence in Social Networks”, “Internet and Network Economics”, 2007.

Sunday, April 28, 2019

Distance-generalized Core Decomposition

A summary of the SIGMOD 2019 research paper by Francesco Bonchi, Arijit Khan, and Lorenzo Severini

Background: Extracting dense structures from large graphs has emerged as a key graph-mining primitive in a variety of application scenarios, ranging from web mining, to biology, and finance. Many different definitions of dense subgraphs have been proposed (e.g., cliques, n-cliques, n-clans, k-plexes, f-groups, n-clubs, lambda sets), but most of them are NP-hard or at least quadratic. In this respect, the concept of core decomposition is particularly appealing because (i) it can be computed in linear time, and (ii) it is related to many of the various definitions of a dense subgraph and can be used to speed up or approximate their computation. The k-core of a graph is defined as a maximal subgraph in which every vertex is connected to at least k other vertices within that subgraph. The set of all k-cores of a graph, for each k, forms its core decomposition. The core index of a vertex v is the maximal k for which v belongs to the k-core.
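For reference, the peeling procedure underlying classic core decomposition can be sketched as follows. This is a simplified heap-based version (O(m log n)) rather than the linear-time bucket implementation of Batagelj and Zaveršnik [2], and is meant only as an illustration.

```python
import heapq

def core_decomposition(adj):
    """Classic k-core peeling: repeatedly remove a minimum-degree vertex;
    the core index of v is its degree at removal time, taken as a running
    maximum. adj: dict node -> set of neighbors."""
    degree = {v: len(nbrs) for v, nbrs in adj.items()}
    heap = [(d, v) for v, d in degree.items()]
    heapq.heapify(heap)
    removed, core, k = set(), {}, 0
    while heap:
        d, v = heapq.heappop(heap)
        if v in removed or d != degree[v]:
            continue                         # stale heap entry, skip it
        k = max(k, d)
        core[v] = k
        removed.add(v)
        for u in adj[v]:
            if u not in removed:
                degree[u] -= 1               # removal decreases neighbor degrees by 1
                heapq.heappush(heap, (degree[u], u))
    return core
```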

While core decomposition is based on the number of immediate connections that a vertex has within a subgraph (its degree), the importance of studying network structures beyond the horizon of the distance-1 neighborhood of a vertex has been well established for several decades, especially in the social sciences [1]. Following this observation, in this paper we introduce an important generalization of the notion of core decomposition. Looking through the lens of shortest-path distance, one can see the degree of a vertex v as the number of vertices in the graph at distance ≤ 1 from v, or equivalently, the size of the 1-neighborhood of v. From this perspective, a natural generalization is to consider a distance threshold h > 1. This leads smoothly to the notions of h-neighborhood and h-degree, and in turn, to the distance-generalized notion of the (k,h)-core, i.e., the maximal subgraph in which every vertex has at least k other vertices at distance ≤ h within that subgraph. As we formally prove in this work, the (k,h)-core is unique and is contained in the (k − 1,h)-core: these facts allow us to define the notion of distance-generalized core decomposition.
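To make the h-degree concrete, a small illustrative sketch (ours) that computes it with an h-bounded BFS:

```python
from collections import deque

def h_degree(adj, v, h):
    """Number of vertices at shortest-path distance <= h from v within the
    (sub)graph given by adj (dict node -> iterable of neighbors)."""
    dist = {v: 0}
    queue = deque([v])
    while queue:
        u = queue.popleft()
        if dist[u] == h:
            continue                  # do not expand beyond h hops
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
    return len(dist) - 1              # exclude v itself
```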


Figure 1 shows the differences between the classic core decomposition (on the left side) and the (k, 2)-core decomposition (on the right side). For this example graph, the classic core decomposition (i.e., the (k, 1)-core decomposition in our setting) puts all the vertices in core k = 2. On the contrary, by considering distance-2 neighborhood, the (k, 2)-core decomposition is able to capture structural differences among the vertices, thus providing a finer-grained analysis. In particular, it allows detecting the fact that the vertices from 4 to 13 form a denser and more structured region (the (6, 2)-core), while vertices 2 and 3 are in the (5, 2)-core, and vertex 1 only belongs to the (4, 2)-core.

Why Does Core Decomposition of the Power Graph Not Work? One could think of obtaining the distance-generalized (k,h)-core decomposition by first computing the h-th power of the input graph (as in Figure 2), and then applying the state-of-the-art algorithms for core decomposition. However, this does not provide the correct decomposition, as shown next. Figure 2 shows the power graph G^2 of the graph in Figure 1. We can observe that, according to the classic core decomposition of G^2, vertices 2 and 3 have core index 6, while in the (k, 2)-core decomposition of G (right side of Figure 1) they have core index 5. This is because, in G^2, vertices 2 and 3 become adjacent via vertex 1, but vertex 1 does not belong to the (5, 2)-core.

Challenges: In the classic core decomposition, when a vertex is removed, the degree of its neighbors is decreased by 1: this observation allows several optimizations which make the classic case easy to compute and to scale [2, 3, 4, 5, 6, 7]. Instead, in the generalized (k,h)-core decomposition, the removal of a vertex can decrease the h-degree of some of its h-neighbors by more than one. This is the reason why the idea of decomposing the h-power graph does not work, and it is also the main reason why distance-generalized core decomposition (h > 1) is much harder to compute than standard core decomposition (h = 1). In fact, when a vertex is removed, we need to recompute the h-degrees of a large number of vertices, i.e., we need a large number of h-bounded BFS traversals.

Algorithmic Contribution: In spite of these challenges, we devise efficient algorithms. As a baseline, we first extend the state-of-the-art Batagelj and Zaveršnik algorithm [2] to deal with (k,h)-core decomposition (we dub this algorithm h-BZ). Next, we exploit a lower bound on the core index of a vertex to avoid a large number of useless h-degree re-computations. We call this algorithm h-LB. Finally, we propose an algorithm that further improves efficiency by computing an upper bound and processing the vertices with larger h-degrees as early as possible (dubbed h-LB+UB). In order to scale to larger graphs, we also exploit the multi-threading provided by modern architectures to parallelize the computation of h-degrees.
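For illustration only, a naive peeling loop in the spirit of the h-BZ baseline (without the lower/upper-bound pruning of h-LB and h-LB+UB) might look like the sketch below; note how every removal triggers h-degree recomputations over the removed vertex's entire h-neighborhood, which is precisely the cost the optimized algorithms attack.

```python
from collections import deque

def kh_core_decomposition(adj, h):
    """Naive distance-generalized core decomposition: repeatedly peel a vertex
    of minimum h-degree and recompute the h-degrees of its h-neighbors.
    adj: dict node -> set of neighbors."""
    def h_neighborhood(graph, v):
        dist, queue = {v: 0}, deque([v])
        while queue:
            u = queue.popleft()
            if dist[u] == h:
                continue
            for w in graph[u]:
                if w not in dist:
                    dist[w] = dist[u] + 1
                    queue.append(w)
        del dist[v]
        return set(dist)                         # vertices within distance h of v

    alive = {v: set(nbrs) for v, nbrs in adj.items()}
    hdeg = {v: len(h_neighborhood(alive, v)) for v in alive}
    core, k = {}, 0
    while alive:
        v = min(alive, key=hdeg.get)             # vertex with minimum h-degree
        k = max(k, hdeg[v])
        core[v] = k
        affected = h_neighborhood(alive, v)      # these h-degrees may drop
        for u in alive[v]:
            alive[u].discard(v)
        del alive[v], hdeg[v]
        for u in affected:                       # expensive recomputation step
            if u in alive:
                hdeg[u] = len(h_neighborhood(alive, u))
    return core
```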

Applications: We show that distance-generalized core decomposition generalizes many of the nice properties of the classic core decomposition, e.g., its connection with the notion of distance-generalized chromatic number, and its usefulness in speeding up or approximating distance-generalized notions of dense structures, such as the h-club, the (distance-generalized) densest subgraph, and community search problems (i.e., cocktail party) [8]. We also show that it is very informative and performs well as a heuristic for selecting landmarks to build shortest-path-distance oracles.


For more information, click here for our paper. 
Blog post contributed by: Arijit Khan


[Reference]

[1] R. Luce. Connectivity and Generalized Cliques in Sociometric Group Structure. Psychometrika, 15(2):169–190, 1950.

[2] V. Batagelj and M. Zaveršnik. Fast Algorithms for Determining (Generalized) Core Groups in Social Networks. Advances in Data Analysis and Classification, 5(2):129–145, 2011.

[3] J. Cheng, Y. Ke, S. Chu, and M. T. Özsu. Efficient core decomposition in massive networks. In ICDE, 2011.

[4] W. Khaouid, M. Barsky, V. Srinivasan, and A. Thomo. K-core decomposition of large networks on a single pc. PVLDB, 9(1):13–23, 2015.

[5] A. Montresor, F. D. Pellegrini, and D. Miorandi. Distributed k-Core Decomposition. TPDS, 24(2), 2013.

[6] K. Pechlivanidou, D. Katsaros, and L. Tassiulas. MapReduce-Based Distributed K-Shell Decomposition for Online Social Networks. In SERVICES, 2014.

[7] A. E. Sariyüce, B. Gedik, G. Jacques-Silva, K. Wu, and Ü. V. Çatalyürek. Streaming Algorithms for k-Core Decomposition. PVLDB, 6(6), 2013.

[8] M. Sozio and A. Gionis. The community-search problem and how to plan a successful cocktail party. In KDD, 2010.

Wednesday, April 24, 2019

An In-Depth Comparison of s-t Reliability Algorithms over Uncertain Graphs

A summary of the PVLDB 2019 research paper by Xiangyu Ke, Arijit Khan, and Leroy Lim Hong Quan

Uncertain, or probabilistic, graphs have been increasingly used to represent noisy linked data in many emerging applications, and have recently attracted the attention of the database research community [7]. A fundamental problem on uncertain graphs is s-t reliability, which measures the probability that a target node t is reachable from a source node s in a probabilistic (or uncertain) graph, i.e., a graph where every edge is assigned a probability of existence. The s-t reliability estimation has been used in many applications, such as measuring the quality of connections between two terminals in a sensor network, finding other proteins that are highly likely to be connected with a specific protein in a protein-protein interaction (PPI) network, identifying highly reliable peers containing some file to transfer in a peer-to-peer (P2P) network, probabilistic path queries in a road network, and evaluating information diffusion in a social influence network.

Due to the inherent complexity of the problem (#P-hard), although exact reliability computation received attention in the past, the focus nowadays is mainly on approximate and heuristic solutions over large-scale graphs. The broad spectrum of the reliability problem is categorized in Figure 1. Notice that in this work, we focus on sequential algorithms for the fundamental s-t reliability query. In particular, various sampling- and indexing-based efficient algorithms have been proposed in the literature. Estimation of reliability in uncertain graphs began with the use of Monte Carlo (MC) sampling [1]. Subsequently, more advanced sampling methods were proposed in the form of recursive sampling [2, 3], lazy propagation sampling [4], and shared possible worlds [5], as well as other indexing methods [6]. With the wide range of algorithms available for estimating s-t reliability over uncertain graphs, there is an urgent need to understand their trade-offs, and to employ the best algorithm for a given scenario.
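For concreteness, the basic MC estimator that the later methods build upon can be sketched as follows (a simple illustration assuming independent edge probabilities, as in the standard uncertain graph model): sample K possible worlds by flipping a biased coin per edge, and report the fraction of worlds in which t is reachable from s.

```python
import random
from collections import deque

def mc_reliability(edges, s, t, K=1000):
    """Monte Carlo s-t reliability estimate.
    edges: list of directed (u, v, p) edges, p = existence probability.
    Returns the fraction of K sampled possible worlds in which t is reachable from s."""
    hits = 0
    for _ in range(K):
        # sample one possible world by keeping each edge with its probability
        adj = {}
        for u, v, p in edges:
            if random.random() < p:
                adj.setdefault(u, []).append(v)
        # BFS reachability from s in the sampled world
        seen, queue = {s}, deque([s])
        while queue:
            u = queue.popleft()
            if u == t:
                hits += 1
                break
            for w in adj.get(u, []):
                if w not in seen:
                    seen.add(w)
                    queue.append(w)
    return hits / K
```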


Figure 1: The broad spectrum of the reliability problem over uncertain graphs

As depicted in Figure 2, we find serious concerns in the existing experimental comparisons of state-of-the-art reliability estimation algorithms over uncertain graphs. (1) There is no prior work that compared all state-of-the-art methods with each other. It is, therefore, difficult to draw a general conclusion on the superiority and trade-offs of different methods. (2) As shown in Figure 2, with the exception of [4], other experimental studies in [2, 3, 5, 6] either considered a fixed number of samples (e.g., 1000), or limited the maximum number of samples to 2000. However, we observe in our experiments that the number of samples necessary for the convergence of reliability estimation varies a lot depending on the specific algorithm used (e.g., for Recursive Stratified Sampling [3], the number of samples required for convergence is 250∼1000, while for Lazy Propagation [4], it is 500∼1500), and also on the underlying characteristics of the uncertain graph dataset. Therefore, the running time necessary to achieve convergence (and hence, good-quality results) should be reported, as opposed to using the same number of samples in all experiments. (3) The metrics used for empirically comparing these techniques were not consistent in the past literature, making them apples-to-oranges comparisons in the larger context. For example, [3] measured the relative variance of different estimators and their running times for the same number of samples (2000). In both [5] and [6], the authors reported accuracy (with respect to baseline MC sampling) and running times of different algorithms using a maximum of 1000 samples. On the other hand, [2] compared relative variance, accuracy, and running times of various estimators by considering a fixed number (1000) of samples. In addition, surprisingly, none of these studies reported the online memory usage, which, according to our experiments, varied to a great extent. (4) Last but not least, we find certain errors and further optimization scope in past algorithms (e.g., the accuracy of [4], the time complexity analysis of [5]), and by correcting (or updating) them we significantly improve their performance.


Figure 2: State-of-the-art reliability estimation algorithms in uncertain graphs: a directed arrow depicts reported superiority in prior works. Not all algorithms have been thoroughly compared with each other. Moreover, previous works did not employ identical frameworks, datasets, and metrics for comparison. Thus, it is critical to investigate their trade-offs and superiority over each other.

Through our systematic and in-depth analysis of experimental results, we report surprising findings, such as that many follow-up algorithms can actually be several orders of magnitude less efficient, less accurate, and more memory-intensive than the ones proposed earlier. Our findings can be summarized as follows.

For estimator variance, both recursive methods, RHH and RSS, exhibit significantly better performance than the other four MC-based approaches: MC, BFSSharing, ProbTree, and LP+. Methods in the same category share very similar variance. In general, RSS is the best regarding variance and achieves the fastest convergence. To achieve convergence, there is no single sample size K that can be used across various datasets and estimators. Usually, recursive methods require about 500 fewer samples than MC-based methods on the same dataset.

For accuracy, all methods have similar relative error (<1.5%) at convergence. If K is set to 1000 for all estimators, some of them might not reach convergence; their relative errors can then be further reduced by using a larger K until convergence.

For efficiency, RHH and RSS are the fastest when running time is measured at convergence. When K is set to 1000, there is no common winner in terms of running time. Overall, the running times of RHH, RSS, ProbTree, and LP+ are comparable. BFSSharing is 4× slower than MC, since it estimates the reliability of all nodes from the source node.

The memory usage ranking (in increasing order of memory) is: MC < LP+ < ProbTree < BFSSharing < RHH ≈ RSS. 


Table 1: Summary and recommendation
Figure 3: The decision tree for selecting a proper reliability estimator under different scenarios

Recommendation. Table 1 summarizes the recommendation level of each method according to different performance metrics. The scale is from 1 to 4 stars, and a larger number of stars indicates a higher ranking. Clearly, there is no single winner. Considering the various trade-offs, in conclusion we recommend ProbTree for s-t reliability estimation. It provides good performance in accuracy, online running time, and memory cost. Its index can slightly reduce the variance compared to other MC-based estimators. Notably, we adopted MC as ProbTree's reliability-estimating component (as the original paper [6] did). However, one can replace this with any other estimator (e.g., the recursive estimators RHH and RSS) to further improve ProbTree's efficiency and to reduce its variance (as demonstrated in our paper).

The decision tree shown in Figure 3 demonstrates our recommended strategy for estimator selection under different constraints. Following the branches with red ticks, the better estimator(s) under the given conditions can be determined. Notice that the path from the root to the leaf of ProbTree consists of all red ticks, indicating its applicability and trade-off capacity considering various factors.


For more information, click here for our paper. Codebase: GitHub 
Blog post contributed by: Xiangyu Ke and Arijit Khan


[Reference]

[1] G. S. Fishman. A Comparison of Four Monte Carlo Methods for Estimating the Probability of s-t Connectedness. IEEE Tran. Rel., 1986.

[2] R. Jin, L. Liu, B. Ding, and H. Wang. Distance-Constraint Reachability Computation in Uncertain Graphs. PVLDB, 4(9):551–562, 2011.

[3] R. Li, J. X. Yu, R. Mao, and T. Jin. Recursive Stratified Sampling: A New Framework for Query Evaluation on Uncertain Graphs. IEEE TKDE, 28(2):468–482, 2016.

[4] Y. Li, J. Fan, D. Zhang, and K.-L. Tan. Discovering Your Selling Points: Personalized Social Influential Tags Exploration. In SIGMOD, 2017.

[5] R. Zhu, Z. Zou, and J. Li. Top-k Reliability Search on Uncertain Graphs. In ICDM, 2015.

[6] S. Maniu, R. Cheng, and P. Senellart. An Indexing Framework for Queries on Probabilistic Graphs. ACM Trans. Database Syst., 42(2):13:1–13:34, 2017.

[7] A. Khan, Y. Ye, and L. Chen. On Uncertain Graphs. Synthesis Lectures on Data Management. Morgan & Claypool Publishers, 2018.

Wednesday, November 7, 2018

Notes on Expressibility of Vertex Centric Graph Processing Paradigm

A summary of the IEEE BigData 2018 research paper by Siyuan Liu and Arijit Khan

We study the vertex-centric (VC) paradigm for distributed graph processing, popularized by Google's Pregel system [1]. In Pregel, which was inspired by the Bulk Synchronous Parallel (BSP) model [2], graph algorithms are expressed as a sequence of iterations called supersteps (Figure 1). Each superstep is an atomic unit of parallel computation. During a superstep, Pregel executes a user-defined function for each vertex in parallel. The user-defined function specifies the operation at a single vertex v and at a single superstep S. The supersteps are globally synchronous among all vertices, and messages are usually sent along the outgoing edges from each vertex. In 2012, Yahoo! launched Apache Giraph [3] as an open-source project that clones the concepts of Pregel. Since then, there have been several attempts to implement many graph algorithms in a vertex-centric framework, as well as efforts to design optimization techniques for improving their efficiency. Many follow-up works experimentally compared the efficiency and scalability of existing VC systems (for a survey, see [10], [11], [12], [13]). However, to the best of our knowledge, there has not been any systematic study analyzing the expressibility of the VC paradigm itself.
Fig. 1: Vertex-Centric paradigm
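To make the paradigm concrete, here is a toy, single-machine emulation of the superstep loop for single-source shortest paths. This is not the Pregel or Giraph API; the function name and structure are ours, purely for illustrating per-vertex compute and message passing between supersteps.

```python
def sssp_vertex_centric(vertices, out_edges, source):
    """Toy synchronous vertex-centric SSSP (Pregel-style, single machine).
    vertices: iterable of vertex ids; out_edges: dict u -> list of (v, weight).
    All edge endpoints are assumed to appear in `vertices`."""
    INF = float("inf")
    dist = {v: INF for v in vertices}
    inbox = {source: [0]}                     # initial message to the source vertex
    while inbox:                              # each iteration is one superstep
        outbox = {}
        for v, messages in inbox.items():     # "compute" runs per active vertex
            best = min(messages)
            if best < dist[v]:                # value improved: update and notify neighbors
                dist[v] = best
                for w, weight in out_edges.get(v, []):
                    outbox.setdefault(w, []).append(best + weight)
        inbox = outbox                        # vertices receiving messages are active next
    return dist
```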

In this work, we investigate some of the fundamental limitations of the vertex-centric paradigm itself, by implementing a few distributed graph algorithms that are difficult to express in the VC paradigm. Several graph problems have multiple distributed algorithms, which vary in their efficiency. We observe that not all distributed algorithms for a graph problem can be implemented effectively in the vertex-centric framework. In particular, we focus on two specific classes of algorithms: bucketing-based and multi-phased.

Bucketing-based refers to algorithms that rely on a bucket structure to define priorities of vertices. Vertices are processed in batches, from the highest-priority vertices to the lowest-priority ones, and can be added to or removed from buckets during execution to update their priorities. Example algorithms in this class include the ∆-stepping algorithm [4] for single-source shortest paths, weighted BFS [5], k-core [6], and approximate set cover [7].
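For intuition, below is a simplified, sequential ∆-stepping-style skeleton showing the bucket structure (our sketch; the actual algorithm [4] additionally separates light and heavy edges and processes each bucket in parallel).

```python
def delta_stepping(adj, source, delta):
    """Simplified, sequential Delta-stepping-style SSSP skeleton.
    adj: dict u -> list of (v, weight). Vertices live in buckets indexed by
    floor(dist / delta); buckets are processed from the lowest index up, and
    relaxing an edge may move a vertex into a different (lower) bucket."""
    INF = float("inf")
    dist = {source: 0.0}
    buckets = {0: {source}}

    def bucket_of(d):
        return int(d // delta)

    while buckets:
        i = min(buckets)                          # lowest non-empty bucket index
        frontier = buckets.pop(i)
        while frontier:
            u = frontier.pop()
            for v, w in adj.get(u, []):
                nd = dist[u] + w
                if nd < dist.get(v, INF):
                    old = dist.get(v)
                    if old is not None:           # drop v from its previous bucket
                        buckets.get(bucket_of(old), set()).discard(v)
                        frontier.discard(v)
                    dist[v] = nd
                    b = bucket_of(nd)
                    if b == i:
                        frontier.add(v)           # re-process within the current bucket
                    else:
                        buckets.setdefault(b, set()).add(v)
    return dist
```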

Multi-phased refers to algorithms that consist of several distinct phases. Each phase itself may be iterative; however, the algorithm's overall execution follows a predefined order of phases. Example algorithms that fall under this category include the distributed Brandes' algorithm [8] for betweenness centrality, and finding bi-connected and strongly connected components [9].

We find that, often, the more efficient distributed algorithm for a graph problem in the above two categories cannot be effectively implemented in the vertex-centric paradigm. We highlight the key expressibility limitations (e.g., more work, many active vertices, large message overheads) of the VC paradigm with regard to both bucketing-based and multi-phased algorithms. We conduct extensive experiments to demonstrate these expressibility challenges specific to the vertex-centric paradigm, while also showing that such bottlenecks do not exist in traditional message passing interface (MPI) based distributed systems.

We conclude by discussing our recommendations on the road ahead for VC systems, such as (1) directly incorporating priority-based data structures (e.g., buckets) in each worker node, that will permit defining and dynamically updating priorities of vertices, and (2) providing more control to the master node, so that it can re-activate vertices during phase changes.


For more information, click here for our paper.
Blog post contributed by: Arijit Khan

[Reference]

[1] G. Malewicz, M. H. Austern, A. J. Bik, J. C. Dehnert, I. Horn, N. Leiser, and G. Czajkowski. Pregel: A System for Large-scale Graph Processing. In SIGMOD, 2010.
[2] L. G. Valiant. A Bridging Model for Parallel Computation. Commun. ACM, 33(8):103–111, 1990.
[4] U. Meyer and P. Sanders. ∆-stepping: A Parallelizable Shortest Path Algorithm. J. Algorithms, 49(1):114–152, 2003.
[5] R. Dial. Algorithm 360: Shortest-Path Forest with Topological Ordering [H]. Commun. ACM, 12(11):632–633, 1969.
[6] L. Dhulipala, G. Blelloch, and J. Shun. Julienne: A Framework for Parallel Graph Algorithms Using Work-Efficient Bucketing. In SPAA, 2017.
[7] G. Blelloch, R. Peng, and K. Tangwongsan. Linear-Work Greedy Parallel Approximate Set Cover and Variants. In SPAA, 2011.
[8] U. Brandes. A Faster Algorithm for Betweenness Centrality. Journal of Mathematical Sociology, 25(2):163–177, 2001.
[9] D. Yan, J. Cheng, K. Xing, Y. Lu, W. Ng, and Y. Bu. Pregel Algorithms for Graph Connectivity Problems with Performance Guarantees. PVLDB, 7(14):1821–1832, 2014.
[10] M. Han, K. Daudjee, K. Ammar, M. T. Ozsu, X. Wang, and T. Jin. An Experimental Comparison of Pregel-like Graph Processing Systems. PVLDB, 7(12):1047–1058, 2014.
[11] A. Khan and S. Elnikety. Systems for Big-Graphs. PVLDB, 7(13):1709– 1710, 2014.
[12] R. R. McCune, T. Weninger, and G. Madey. Thinking Like a Vertex: A Survey of Vertex-Centric Frameworks for Large-Scale Distributed Graph Processing. ACM Comput. Surv., 48(2):25:1–25:39, 2015.
[13] D. Yan, Y. Bu, Y. Tian, A. Deshpande, and J. Cheng. Big Graph Analytics Systems. In SIGMOD, 2016.

Wednesday, May 30, 2018

On Smart Query Routing: For Distributed Graph Querying with Decoupled Storage

A summary of the USENIX ATC 2018 research paper by Arijit Khan, Gustavo Segovia, and Donald Kossmann

Background: Distributed Graph Querying & Graph Partitioning

Graphs with millions of nodes and billions of edges are ubiquitous, representing highly interconnected structures including the World Wide Web, social networks, knowledge graphs, genome and scientific databases, and medical and government records. To support online search and query services (possibly from many clients) with low latency and high throughput, data centers and cloud operators consider scale-out solutions, in which the graph and its data are partitioned horizontally across cheap commodity servers. We assume that the graph topology and the data associated with nodes and edges are co-located, since they are often accessed together. In keeping with modern database trends toward low-latency operations, we target a fully in-memory system, and use disks only for durability.
                                 
Fig. 1: State-of-the-art distributed graph querying systems (e.g., SEDGE [1], Trinity [2], Horton [3])

For efficiently answering online queries in a distributed environment, state-of-the-art systems (e.g., SEDGE [1], Trinity [2], Horton [3]) first partition the graph, and then place each partition on a separate server, where query answering over that partition takes place (Fig. 1). Since only the server that contains the query node can handle that request, the router maintains a fixed routing table (or a fixed routing strategy, e.g., modulo hashing). Hence, these systems are less flexible with respect to query routing and fault tolerance; e.g., adding more machines requires updating the data partitioning and/or the routing table. Besides, an effective graph partitioning in these systems must achieve: (1) workload balancing to maximize parallelism, and (2) locality of data access to minimize network communication. It has been demonstrated in the literature that sophisticated partitioning schemes improve the performance of graph querying compared to an inexpensive hash partitioning. However, due to the power-law degree distribution of real-world graphs, it is difficult to obtain high-quality partitions. Besides, a one-time partitioning cannot cope with later updates to the graph structure or variations in the query workload. Several graph re-partitioning and replication-based strategies have been proposed. However, online monitoring of workload changes, re-partitioning of the graph topology, and migration of graph data across servers are expensive, and they reduce the efficiency and throughput of online querying.

A Decoupled Graph Querying System: Pros and Cons

In contrast to existing systems, we consider a different architecture that relies less on an effective graph partitioning. Instead, we decouple query processing and graph storage into two separate tiers (Fig. 2). This decoupling happens at a logical level. As an example, query processors can be physical machines different from the storage servers. On the other hand, the same physical machine can also run a query processing daemon while storing a graph partition in its main memory as a storage server. However, the logical separation between the two layers is critical in our design.

In a decoupled framework, the graph is partitioned across the servers allocated to the storage tier, and these storage servers hold the graph data in their main memory. Since a query processor is no longer assigned any fixed part of the graph, it is equally capable of handling any request, thus facilitating load balancing and fault tolerance. At the same time, the query router can send a request to any of the query processors, which adds more flexibility to query routing; e.g., more query processors can be added (or a query processor that is down can be replaced) without affecting the routing strategy. Another benefit of the decoupled design is that each tier can be scaled up independently. If a certain workload is processing intensive, more servers can be allocated to the processing tier. On the contrary, if the graph size increases over time, more servers can be added to the storage tier. This decoupled architecture, being generic, can be employed in many existing graph querying systems.
Fig. 2: Decoupled architecture for graph querying

The idea of decoupling has been effectively used in the past. Facebook implemented a fast caching layer, Memcached, on top of a graph database to scale the performance of graph query answering [4]. Google's F1 [5] and ScaleDB [6] are based on a decoupling principle for scalability. Recently, Loesing et al. [7] and Binnig et al. [8] demonstrated the benefits of a decoupled, shared-data architecture, together with a low-latency, high-throughput InfiniBand network. Shalita et al. [9] employed decoupling for an optimal assignment of HTTP requests over a distributed graph storage.

However, we must also consider the drawbacks of keeping the graph storage apart. First, query processors may need to communicate with the storage tier via the network. This adds a penalty to the response time for answering a query. Second, it is possible that this design causes high contention rates on the network, the storage tier, or both.

Our contribution lies in designing a smart query routing logic that utilizes the cache of the query processors over such a decoupled architecture. Achieving more cache hits is critical in a decoupled architecture, in order to reduce the communication between query processors and storage servers. However, this is a non-trivial problem; e.g., exploiting cache locality and balancing workloads are conflicting in nature. For example, to achieve maximum cache locality, the router could send all queries to the same processor (assuming no cache eviction happens). However, the workload of the processors would then be highly imbalanced, resulting in lower throughput. In addition, graph workloads are significantly different from traditional database applications. The interconnected nature of graph data results in poor locality, and each query usually accesses multiple neighboring nodes spread across the distributed storage. Therefore, to maximize cache hit rates at query processors, it is not sufficient to only route queries on the same nodes to the same processor. Rather, successive queries on neighboring nodes should also be routed to the same processor, since the neighborhoods of two nearby nodes may significantly overlap. To the best of our knowledge, such smart query routing schemes for effectively leveraging cache contents were not considered in existing graph querying systems.

h-Hop Graph Traversal Queries:

In this work, we consider online, h-hop queries that explore a small region of the entire graph and require fast response times. These queries usually start from a query node and traverse its neighboring nodes up to a certain number of hops (e.g., h = 2, 3). Examples include h-hop neighbor aggregation, h-step random walk with restart, and h-hop reachability. These queries are often used as the basis for more complex graph operations. For example, neighborhood aggregation is critical for node labeling and classification: the label of an unlabeled node can be assigned as the most frequent label present within its h-hop neighborhood. The h-step random walk is useful in expert finding, ranking, and discovering functional modules, complexes, and pathways.
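As a concrete example of such a query, here is a small sketch (ours, not the system's code) of h-hop label aggregation over an in-memory adjacency structure:

```python
from collections import Counter, deque

def h_hop_label_aggregation(adj, labels, v, h):
    """Assign v the most frequent label among its h-hop neighbors.
    adj: dict node -> iterable of neighbors; labels: dict node -> label (may be partial)."""
    dist = {v: 0}
    queue = deque([v])
    counts = Counter()
    while queue:
        u = queue.popleft()
        if dist[u] == h:
            continue                          # stop expanding after h hops
        for w in adj[u]:
            if w not in dist:
                dist[w] = dist[u] + 1
                queue.append(w)
                if w in labels:
                    counts[labels[w]] += 1    # count labels of newly discovered neighbors
    return counts.most_common(1)[0][0] if counts else None
```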

Smart Graph Query Routing Schemes:

We identify the following objectives that a smart graph query routing scheme must satisfy.

1. Leverage each processor’s cached data. Let us consider t successive queries received by the router. The router should send them to the query processors in such a way that the average number of cache hits at the processors is maximized. This, in turn, reduces the average query processing latency. However, as stated earlier, to achieve maximum cache hits, it is not sufficient to only route queries on the same nodes to the same processor. Rather, successive queries on neighboring nodes should also be routed to the same processor, since the neighborhoods of two nearby nodes may significantly overlap.

2. Balance workload even if it is skewed or contains hotspots. As earlier, let us consider a set of t successive queries. A naïve approach would be to ensure that each query processor receives an equal number of queries, e.g., a round-robin way of query dispatching by the router. However, each query might have a different workload and would require a different processing time. We, therefore, aim at maximizing the overall throughput via query stealing, which automatically balances the workload across query processors.

3. Make fast routing decisions. The average time at the router to dispatch a query should be minimized, ideally a small constant time, or much smaller than O(n), where n is the number of nodes in the input graph. This reduces the query processing latency.

4. Have low storage overhead in the router. The router may store auxiliary data to enable fast routing decisions. However, this additional storage overhead must be a small fraction compared to the graph size.

Based on the above requirements, we design two smart routing strategies: landmark routing and graph embed routing. Kleinberg et al. [10] discussed the problem of approximating graph distances using a small set of beacons (i.e., landmarks). Graph embedding algorithms, on the other hand, place nodes at points on some surface such that the inherent graph structure is preserved [11, 12]. In this post, we briefly discuss our graph embedding-based query routing strategy: we embed the graph into a lower, D-dimensional Euclidean space such that the hop-count distances between graph nodes are approximately preserved by their Euclidean distances (Fig. 3). Since each node receives D co-ordinates, this requires O(nD) space in total at the router, which is linear in the number of nodes.

Fig. 3: Example of graph embedding in 2D Euclidean plane

In embed routing, the router has access to each node's co-ordinates. By keeping an average of the co-ordinates of the query nodes that it sent to each processor, it is able to infer the cache contents of these processors. As a consequence, the router computes the distance between a query node u and a processor p, denoted d(u, p), defined as the distance of the query node's co-ordinates to the historical mean of the processor's cache contents. As recent queries are more likely to influence the cache contents due to the LRU eviction policy, we use an exponential moving average (i.e., recent queries are given more weight) to compute the mean of the processor's cache contents. We select the processor with the smallest distance d(u, p). Observe that the routing decision time is only O(PD), P being the number of processors and D the number of dimensions. Furthermore, the distance metric d(u, p) is useful not only for finding the best processor for a certain query, but also for load balancing and for dealing with workload skew and hotspots. As an example, assume that the closest processor for a certain query is busy or currently down. Since the distance metric gives us distances to all processors, the router is able to select the second, third, or subsequent closest processor (query stealing). This form of load balancing impacts nearby query nodes in the same way; hence they are routed to the same query processor, thereby leveraging neighborhood-based locality.
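A minimal sketch of this routing decision follows. The class name, the smoothing factor `alpha`, and the cold-start handling (empty processors preferred first) are our assumptions for illustration; the actual gRouting implementation may differ in these details.

```python
import math

class EmbedRouter:
    """Route each query to the processor whose exponentially decayed mean of
    previously routed query co-ordinates is closest to the query node's co-ordinates."""

    def __init__(self, coords, num_processors, alpha=0.2):
        self.coords = coords                      # node -> D-dimensional tuple
        self.means = [None] * num_processors      # EMA of routed co-ordinates per processor
        self.alpha = alpha                        # weight given to the most recent query

    def route(self, query_node):
        x = self.coords[query_node]

        # distance of the query to each processor's inferred cache contents
        def dist(mean):
            if mean is None:
                return 0.0                        # empty processor: always a candidate
            return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, mean)))

        p = min(range(len(self.means)), key=lambda i: dist(self.means[i]))
        # update the chosen processor's exponential moving average
        m = self.means[p]
        self.means[p] = x if m is None else tuple(
            (1 - self.alpha) * a + self.alpha * b for a, b in zip(m, x))
        return p
```

The routing decision touches only P means of D co-ordinates each, matching the O(PD) decision time stated above.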

Our experiments with several real-world, large graphs demonstrate that the proposed framework, gRouting, even with simple hash partitioning, achieves up to an order of magnitude higher query throughput compared to existing distributed graph systems that employ expensive graph partitioning and re-partitioning strategies. In addition to workload balancing and deployment flexibility, gRouting provides linear scalability in throughput with a larger number of query processors, works well in the presence of query hotspots, and is also adaptive to workload changes and graph updates.

For more information, click here for our paper.
Blog post contributed by: Arijit Khan

[Reference]

[1] YANG, S., YAN, X., ZONG, B., AND KHAN, A. Towards Effective Partition Management for Large Graphs. In SIGMOD (2012).
[2] SHAO, B., WANG, H., AND LI, Y. Trinity: A Distributed Graph Engine on a Memory Cloud. In SIGMOD (2013).
[3] SARWAT, M., ELNIKETY, S., HE, Y., AND MOKBEL, M. F. Horton+: A Distributed System for Processing Declarative Reachability Queries over Partitioned Graphs. PVLDB 6, 14 (2013). 
[4] NISHTALA, R., FUGAL, H., GRIMM, S., KWIATKOWSKI, M., LEE, H., LI, H. C., MCELROY, R., PALECZNY, M., PEEK, D., SAAB, P., STAFFORD, D., TUNG, T., AND VENKATARAMANI, V. Scaling Memcache at Facebook. In NSDI (2013).
[5] SHUTE, J., VINGRALEK, R., SAMWEL, B., HANDY, B., WHIPKEY, C., ROLLINS, E., OANCEA, M., LITTLEFIELD, K., MENESTRINA, D., ELLNER, S., CIESLEWICZ, J., RAE, I., STANCESCU, T., AND APTE, H. F1: A Distributed SQL Database That Scales. PVLDB 6, 11 (2013).
[6] http://scaledb.com/pdfs/TechnicalOverview.pdf
[7] LOESING, S., PILMAN, M., ETTER, T., AND KOSSMANN, D. On the Design and Scalability of Distributed Shared-Data Databases. In SIGMOD (2015).
[8] BINNIG, C., CROTTY, A., GALAKATOS, A., KRASKA, T., AND ZAMANIAN, E. The End of Slow Networks: It’s Time for a Redesign. PVLDB 9, 7 (2016).
[9] SHALITA, A., KARRER, B., KABILJO, I., SHARMA, A., PRESTA, A., ADCOCK, A., KLLAPI, H., AND STUMM, M. Social Hash: An Assignment Framework for Optimizing Distributed Systems Operations on Social Networks. In NSDI (2016).
[10] KLEINBERG, J., SLIVKINS, A., AND WEXLER, T. Triangulation and Embedding Using Small Sets of Beacons. J. ACM 56, 6 (2009). 
[11] DABEK, F., COX, R., KAASHOEK, F., AND MORRIS, R. Vivaldi: A Decentralized Network Coordinate System. In SIGCOMM (2004).
[12] ZHAO, X., SALA, A., WILSON, C., ZHENG, H., AND ZHAO, B. Y. Orion: Shortest Path Estimation for Large Social Graphs. In WOSN (2010).