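[Keywords]
[Abstract]
As a new approach to data visualization and a quantitative method, network analysis can simplify complex systems and reveal patterns of relationships among their elements. It has many applications in quantitative social science, computer science, machine learning and other fields. In recent years it has attracted growing attention in paleontology, particularly in palaeobiogeography. This paper introduces the basic components of a network, the common network types and their data storage formats, and the important parameters of network analysis and their definitions. It also presents two ways of carrying out network analysis and the associated tools, namely the Gephi software and the R language platform. A comparison of the steps and results of the two approaches shows that although Gephi produces clean and attractive graphs, its algorithmic functionality is limited, and it cannot handle data processing before plotting or multivariate analysis after plotting; we therefore recommend programming in R for network analysis. Using global brachiopod data from the recovery interval after the end-Ordovician mass extinction as an example, the paper demonstrates in detail how to perform network analysis with R and its package "igraph", including the processing of palaeobiogeographic data and the drawing of network diagrams. We hope this will serve as a reference for paleontological researchers who are about to start using such tools for network analysis.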
[Keywords]
[Abstract]
Network analysis (NA) is the analysis of networks through graph theory. It is a new data visualization technique and quantitative method that breaks a complex data system down into its component parts and plots them to show their interdependencies and interrelationships. It has been applied in many studies in quantitative social science, computer science and machine learning. In recent years, this analytical method has attracted increasing attention in quantitative paleontology, especially in palaeobiogeography. In this paper, the basic principles of NA and the common network types are briefly introduced, together with their data storage formats, i.e. the adjacency matrix commonly used for co-occurrence networks and the incidence matrix used for bipartite networks. The important parameters of NA and their definitions, such as average degree, graph density and modularity, are also described. Two ways of carrying out NA and their related tools are presented, namely the Gephi software and the R project. After comparing the procedures and results of the two approaches, we found that although Gephi generates network diagrams easily and its diagrams are more visually appealing than those produced in R, its algorithmic functionality is limited by a lack of parameter settings. Furthermore, Gephi cannot carry out data processing before plotting or the projection of network nodes onto a map after plotting, whereas both tasks can be done in R. Chiefly because of the flexibility of programming in the R language, we recommend R for network analysis. Taking the global dataset of brachiopods from the recovery period after the end-Ordovician mass extinction as an example, we describe in detail how to perform network analysis in R with its package ‘igraph’. The study also compared NA results from raw data with those from revised data, and showed the importance of revising raw data, especially for small datasets. We hope that the program we wrote to process the data and plot network graphs automatically will be helpful to paleontologists and students who wish to use network analysis.
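The data structures and parameters named in the abstract can be illustrated with a small sketch. The paper itself works in R with the ‘igraph’ package; the code below is not the authors' program but a plain Python illustration of the definitions, using only the standard library. The occurrence data, locality and taxon names, the community assignment and all function names are hypothetical. It builds an incidence matrix (localities by taxa) for a bipartite network, projects it to an unweighted co-occurrence adjacency matrix (two localities are linked if they share at least one taxon), and computes average degree, graph density and the modularity of a given partition.

```python
from itertools import combinations

# Hypothetical occurrence data: locality -> set of taxa recorded there.
occurrences = {
    "loc_A": {"taxon_1", "taxon_2"},
    "loc_B": {"taxon_2", "taxon_3"},
    "loc_C": {"taxon_1", "taxon_3", "taxon_4"},
    "loc_D": {"taxon_4", "taxon_5"},
    "loc_E": {"taxon_5"},
}


def incidence_matrix(occ):
    """Return (row labels, column labels, 0/1 matrix) for a bipartite network."""
    rows = sorted(occ)
    cols = sorted(set().union(*occ.values()))
    matrix = [[1 if taxon in occ[loc] else 0 for taxon in cols] for loc in rows]
    return rows, cols, matrix


def cooccurrence_adjacency(matrix):
    """Project the incidence matrix onto its rows: link rows that share a column."""
    n = len(matrix)
    adj = [[0] * n for _ in range(n)]
    for i, j in combinations(range(n), 2):
        shared = sum(a * b for a, b in zip(matrix[i], matrix[j]))
        if shared > 0:
            adj[i][j] = adj[j][i] = 1
    return adj


def average_degree(adj):
    """Mean number of edges per node: 2E / N for an undirected graph."""
    return sum(map(sum, adj)) / len(adj)


def graph_density(adj):
    """Fraction of possible edges that are present: 2E / (N (N - 1))."""
    n = len(adj)
    return sum(map(sum, adj)) / (n * (n - 1))


def modularity(adj, membership):
    """Newman modularity Q of a given partition of an unweighted, undirected graph."""
    n = len(adj)
    two_m = sum(map(sum, adj))          # twice the number of edges
    degree = [sum(row) for row in adj]
    q = 0.0
    for i in range(n):
        for j in range(n):
            if membership[i] == membership[j]:
                q += adj[i][j] - degree[i] * degree[j] / two_m
    return q / two_m


rows, cols, inc = incidence_matrix(occurrences)
adj = cooccurrence_adjacency(inc)
membership = [0, 0, 0, 1, 1]            # assumed partition: {A, B, C} and {D, E}

print("average degree:", average_degree(adj))
print("graph density:", graph_density(adj))
print("modularity:", round(modularity(adj, membership), 4))
```

For this toy dataset the co-occurrence network has five nodes and five edges, giving an average degree of 2.0, a density of 0.5 and a modularity of 0.22 for the assumed partition. In practice, community detection algorithms search for the partition that maximizes modularity; here the partition is simply supplied by hand.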
[CLC number]
[Funding]