A simple guide to graph analytics

Amy Kelly
3 min read · Jul 5, 2021

1. What is a graph

In recent years, big data has entered a period of accelerated development, and the volume of data has grown rapidly. Much of this data describes associations between individuals, and such data is naturally represented as a graph. "Graph" here is meant in the sense of graph theory in mathematics: a data structure composed of vertices (also called nodes or points) and edges. Vertices play the same role as the nodes of a tree, and a relationship between two vertices is called an edge. For example, if three people are sitting in an office, each person is a vertex, and each relationship between two of them is an edge. Such a relationship might be colleague, friend, project partner, and so on.
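As a minimal sketch, the office example could be stored as an adjacency list in Python. The names and relationship labels are made up for illustration, and the graph is assumed to be undirected.

```python
from collections import defaultdict

# Hypothetical people and relationships, for illustration only.
edges = [
    ("Alice", "Bob", "colleague"),
    ("Bob", "Carol", "friend"),
    ("Alice", "Carol", "project partner"),
]

# Adjacency list: each vertex maps to its neighbours and the edge label.
graph = defaultdict(dict)
for a, b, relation in edges:
    graph[a][b] = relation
    graph[b][a] = relation  # undirected: store the edge in both directions

print(sorted(graph))          # ['Alice', 'Bob', 'Carol']
print(graph["Alice"]["Bob"])  # colleague
```

Each person is a key (a vertex), and each entry under that key is one edge together with its label.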

2. What is graph analytics

Graph analytics uses graph-based methods to analyze connected data. We can query graph data, compute basic statistics, explore and display graphs visually, or preprocess graph information and incorporate it into machine learning tasks. A graph query usually analyzes only part of the data, while graph computation usually involves the entire graph and iterative analysis.
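To make the difference between a query and a computation concrete, here is a rough sketch in plain Python. The small directed graph is hypothetical, and PageRank is chosen only as one familiar example of an iterative whole-graph computation. The simplified version below assumes every vertex has at least one outgoing link.

```python
# Hypothetical directed graph: each key links to the vertices in its list.
links = {
    "A": ["B", "C"],
    "B": ["C"],
    "C": ["A"],
    "D": ["C"],
}

def neighbours(graph, vertex):
    """Graph query: touches only the data around one vertex."""
    return graph.get(vertex, [])

def pagerank(graph, damping=0.85, iterations=50):
    """Graph computation: repeatedly updates a score for every vertex.

    Simplified: assumes every vertex has at least one outgoing link.
    """
    n = len(graph)
    rank = {v: 1.0 / n for v in graph}
    for _ in range(iterations):
        new_rank = {v: (1.0 - damping) / n for v in graph}
        for v, targets in graph.items():
            share = damping * rank[v] / len(targets)
            for t in targets:
                new_rank[t] += share
        rank = new_rank
    return rank

print(neighbours(links, "A"))  # ['B', 'C']
ranks = pagerank(links)
print(max(ranks, key=ranks.get))  # vertex with the highest score
```

The query reads one entry of the graph, while the computation visits every vertex and edge on each of its iterations.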

3. Applications

3.1 Social network

Social networks are a very common type of graph data. They represent the social relationships between individuals or organizations, and a graph can capture these complex relationships in a form that is easy to analyze further. For example, a typical social network records who knows whom, who went to which school, and who lives where. Facebook, Twitter, and LinkedIn use graph data to manage social relationships and make friend recommendations.
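One simple way to think about friend recommendation is to count mutual friends. The sketch below is only an illustration of that idea on a made-up friendship graph. It is not a description of how any of the platforms above actually work.

```python
from collections import Counter

# Hypothetical undirected friendship graph.
friends = {
    "Ann": {"Ben", "Cat"},
    "Ben": {"Ann", "Cat", "Dan"},
    "Cat": {"Ann", "Ben", "Dan", "Eve"},
    "Dan": {"Ben", "Cat"},
    "Eve": {"Cat"},
}

def recommend_friends(graph, person):
    """Rank friends-of-friends by the number of mutual friends."""
    counts = Counter()
    for friend in graph[person]:
        for candidate in graph[friend]:
            # Skip the person themselves and people they already know.
            if candidate != person and candidate not in graph[person]:
                counts[candidate] += 1
    return counts.most_common()

print(recommend_friends(friends, "Ann"))  # [('Dan', 2), ('Eve', 1)]
```

Dan shares two friends with Ann (Ben and Cat), so he is ranked ahead of Eve, who shares only one.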

3.2 E-shopping

E-shopping is a core business on the Internet. In this scenario, nodes fall into two categories: users and commodities. The relationships between them include browsing, collecting, and purchasing. A user and a commodity can be linked by several relationships at once, such as both a collection relationship and a purchase relationship. Such complex data can be described easily with attribute graphs (also known as property graphs). E-shopping gave rise to a well-known technical application: the recommendation system. The interactions between users and products reflect users' shopping preferences. A classic example is the story of beer and diapers: people who buy beer often also buy diapers.
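The user and commodity graph can be sketched as a list of labelled edges, where the same pair may appear with several relationship types. The data and the helper names `relationships` and `also_bought` are hypothetical, and the co-purchase count is just one naive way to produce recommendations.

```python
from collections import Counter

# Hypothetical interaction edges: (user, relationship, commodity).
interactions = [
    ("u1", "browsed", "beer"),
    ("u1", "purchased", "beer"),
    ("u1", "purchased", "diapers"),
    ("u2", "collected", "beer"),
    ("u2", "purchased", "beer"),
    ("u2", "purchased", "diapers"),
    ("u3", "purchased", "beer"),
    ("u3", "purchased", "chips"),
    ("u4", "purchased", "beer"),
]

def relationships(user, item):
    """All relationship types linking one user to one commodity."""
    return {rel for u, rel, i in interactions if u == user and i == item}

def also_bought(item):
    """Count what buyers of `item` purchased besides `item` itself."""
    buyers = {u for u, rel, i in interactions
              if rel == "purchased" and i == item}
    counts = Counter(
        i for u, rel, i in interactions
        if rel == "purchased" and u in buyers and i != item
    )
    return counts.most_common()

print(sorted(relationships("u1", "beer")))  # ['browsed', 'purchased']
print(also_bought("beer"))  # [('diapers', 2), ('chips', 1)]
```

In this toy data, diapers are the item most often bought together with beer, which mirrors the beer and diapers story.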

3.3 Traffic network

Transportation networks take many forms. In a subway network, for example, each station is a node and each connection between two stations is an edge. In transportation networks we usually focus on path-planning problems such as the shortest path problem. We can also attach traffic flow to the nodes as an attribute and use it to predict future changes in traffic flow.
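As a small illustration of the shortest path problem, the sketch below runs a breadth-first search on a made-up subway map. It assumes every connection counts equally, so "shortest" means the fewest stops. If edges carried travel times, an algorithm such as Dijkstra's would be needed instead.

```python
from collections import deque

# Hypothetical subway map: station -> directly connected stations.
subway = {
    "A": ["B"],
    "B": ["A", "C", "E"],
    "C": ["B", "D"],
    "D": ["C", "F"],
    "E": ["B", "F"],
    "F": ["D", "E"],
}

def shortest_path(graph, start, goal):
    """Breadth-first search: returns a path with the fewest stops, or None."""
    previous = {start: None}  # also serves as the set of visited stations
    queue = deque([start])
    while queue:
        station = queue.popleft()
        if station == goal:
            # Walk back through `previous` to rebuild the path.
            path = []
            while station is not None:
                path.append(station)
                station = previous[station]
            return path[::-1]
        for nxt in graph[station]:
            if nxt not in previous:
                previous[nxt] = station
                queue.append(nxt)
    return None

print(shortest_path(subway, "A", "F"))  # ['A', 'B', 'E', 'F']
```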

4. Advantages

Generally speaking, relational analysis studies relationships through one-to-one or one-to-many comparisons, while graph analysis can also compare many-to-many relationships. Relational databases rely on strict schemas, which makes it difficult to add new data relationships to them. Relational analysis is therefore best suited to structured, stable data organized in tables and columns. Graph analysis is supported by graph databases rather than relational databases, and new data and relationships can be added to a graph database relatively easily. As a result, less time and effort go into organizing data and merging data sources and data points, which leaves more for the analysis itself.

In addition, compared with most other data analysis tools and models, graphs are visually more appealing and easier to understand. Graph analysis can also uncover indirect relationships and condense a large amount of complex data, which can improve the accuracy of predictions and decisions and give people deeper insights.

5. Some tools

Some popular graph databases include ArangoDB, Amazon Neptune, Neo4j, OrientDB, Dgraph, and FlockDB.

Graph analysis platforms include TigerGraph, an enterprise-level graph analysis platform, and BigGraph, a large-scale online graph analysis platform from Alibaba.

Graph computing engines include GraphX, Giraph, and GraphScope. GraphX and Giraph are generally well known. GraphScope is a one-stop large-scale graph computing system developed by Alibaba's DAMO Academy. These graph tools will be introduced in detail in the next article.
