From here on you may require many of Graphia’s visualisation and analytical capabilities described in the previous section. In particular you may wish to:
- Adjust the correlation threshold. A slider bar and text box at the top right-hand corner of the screen set the display threshold to any value between the minimum threshold and 1.
- Cluster the graph, if a clustering transform was not selected in the wizard.
- Display the data values associated with clusters - see below.
Correlation Graph - First view
Edges are created for every relationship with a correlation value above the minimum threshold, but only those edges with a value above the defined initial threshold are displayed. This value is adjustable in the transforms list.
Two transforms are automatically added to the transforms list. The first, Remove Edges where Pearson Correlation Value < [Initial Threshold], removes all edges with a correlation score below the value set. This value can be adjusted here, and the effect is applied to the graph immediately. A high value removes all but the most highly correlated relationships, while a low value also displays the weaker relationships. The second, Remove Components where Component Size ≤ 1, removes isolated nodes (those with no connections) from the display.
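The relationship between the minimum threshold, the display threshold and the removal of unconnected nodes can be sketched in a few lines of Python. This is only an illustration of the idea, not Graphia's implementation; the function names (`pearson`, `correlation_edges`, `visible_graph`) and the example data are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (sd_x * sd_y)


def correlation_edges(rows, minimum_threshold):
    """Create an edge for each pair of rows correlated at or above the minimum."""
    edges = {}
    for a, b in combinations(sorted(rows), 2):
        r = pearson(rows[a], rows[b])
        if r >= minimum_threshold:
            edges[(a, b)] = r
    return edges


def visible_graph(edges, display_threshold):
    """Drop edges below the display threshold, then drop unconnected nodes."""
    kept = {pair: r for pair, r in edges.items() if r >= display_threshold}
    nodes = {node for pair in kept for node in pair}
    return nodes, kept


# Hypothetical data: one row of measurements per entity.
rows = {
    "A": [1, 2, 3, 4],
    "B": [2, 4, 6, 8.5],
    "C": [4, 3, 2, 1],
    "D": [1, 3, 2, 4],
}
edges = correlation_edges(rows, minimum_threshold=0.7)
visible_nodes, visible_edges = visible_graph(edges, display_threshold=0.9)
```

With these values, edges exist between A, B and D, but only the A-B edge survives the display threshold of 0.9, so D drops out of view along with C. Lowering the display threshold brings D back without recomputing any correlations, which is why the adjustment takes effect immediately.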
By adjusting the cut-off value for edges, you can ‘open up’ the graph to reveal underlying structure - what connects, and where the nodes are relative to others. Another way to reveal the structure of a highly connected graph is to apply an edge reduction method such as k-NN. The images below show the graph above before and after the k-NN edge reduction algorithm has been applied.
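The idea behind k-NN edge reduction can be sketched as follows. The sketch assumes that an edge is retained if it ranks among the k strongest edges of at least one of its two endpoints; the exact rule Graphia applies may differ, and `knn_reduce` and the example weights are hypothetical.

```python
from collections import defaultdict


def knn_reduce(edges, k):
    """Keep an edge if it is among the k strongest edges of either endpoint."""
    incident = defaultdict(list)
    for pair, weight in edges.items():
        for node in pair:
            incident[node].append((weight, pair))

    keep = set()
    for node, items in incident.items():
        # Rank this node's edges from strongest to weakest and keep the top k.
        items.sort(key=lambda item: item[0], reverse=True)
        keep.update(pair for _, pair in items[:k])

    return {pair: weight for pair, weight in edges.items() if pair in keep}


# Hypothetical fully connected graph of four nodes.
dense = {
    ("A", "B"): 0.95,
    ("A", "C"): 0.90,
    ("A", "D"): 0.85,
    ("B", "C"): 0.80,
    ("B", "D"): 0.75,
    ("C", "D"): 0.70,
}
reduced = knn_reduce(dense, k=1)
```

Here each node keeps only its single strongest edge, which reduces the six edges to the three that connect to A. Unlike a global threshold, the reduction is relative to each node, so weakly correlated nodes stay attached to the graph.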
When performing a correlation analysis, a data plot is also displayed adjacent to the usual attribute table. This plot displays the data values associated with the selected nodes. There are numerous options to change the appearance of this plot; these are discussed below.
Once a correlation graph has been generated, a common first step is to run a clustering algorithm. This partitions the graph into clusters based on the connectivity between nodes, with the result that entities with a similar data profile are located within the same cluster. For correlation graphs where the average node degree is high, we recommend the use of the MCL algorithm. The granularity setting for clustering can be adjusted on the fly using the slider bar, so that the clustering reflects the visible graph structure. Once satisfied with the partition of a graph, a user can then rapidly explore the profile of the resulting clusters.
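For readers unfamiliar with MCL (Markov clustering), the following is a minimal pure-Python sketch of the algorithm's two alternating steps, expansion and inflation. It is not Graphia's implementation. It assumes that the granularity setting plays the role of MCL's inflation parameter, and the function names and example graph are hypothetical.

```python
def normalise_columns(matrix):
    """Scale each column so that its entries sum to 1."""
    n = len(matrix)
    totals = [sum(matrix[i][j] for i in range(n)) for j in range(n)]
    return [[matrix[i][j] / totals[j] for j in range(n)] for i in range(n)]


def mcl(adjacency, inflation=2.0, iterations=50):
    """Cluster a graph, given as a square adjacency matrix, by flow simulation."""
    n = len(adjacency)
    # Add self-loops, then turn the matrix into column-wise transition probabilities.
    m = [[adjacency[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
         for i in range(n)]
    m = normalise_columns(m)
    for _ in range(iterations):
        # Expansion: square the matrix, letting flow spread along longer paths.
        m = [[sum(m[i][k] * m[k][j] for k in range(n)) for j in range(n)]
             for i in range(n)]
        # Inflation: raise entries to a power, strengthening strong flows.
        m = [[value ** inflation for value in row] for row in m]
        m = normalise_columns(m)
    # Each row that still carries flow lists the members of one cluster.
    clusters = {frozenset(j for j, value in enumerate(row) if value > 1e-6)
                for row in m}
    return sorted(sorted(c) for c in clusters if c)


# Hypothetical graph: two triangles (0-1-2 and 3-4-5) joined by the edge 2-3.
adjacency = [
    [0, 1, 1, 0, 0, 0],
    [1, 0, 1, 0, 0, 0],
    [1, 1, 0, 1, 0, 0],
    [0, 0, 1, 0, 1, 1],
    [0, 0, 0, 1, 0, 1],
    [0, 0, 0, 1, 1, 0],
]
clusters = mcl(adjacency, inflation=2.0)
```

On this small graph the two triangles should separate into two clusters. In general, a higher inflation value tends to produce more, smaller clusters, and a lower value fewer, larger ones.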
Go to Edit → Find by Attribute Value and the following user interface will appear in the top left of the window. This finds all the nodes that have a particular value for the selected attribute. In the context of clustering, it allows the user to step through the calculated clusters. Multiple clusters can be selected at once using the option immediately to the right of the attribute selector: Select Multiple.
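Conceptually, finding by attribute value is a filter over the node attribute table. The sketch below illustrates this with a hypothetical attribute name (`"MCL Cluster"`), hypothetical cluster labels and a hypothetical helper, `find_by_attribute_value`; it does not reflect how Graphia stores attributes internally.

```python
def find_by_attribute_value(nodes, attribute, values):
    """Return the names of nodes whose attribute matches any of the given values."""
    wanted = set(values)
    return sorted(name for name, attributes in nodes.items()
                  if attributes.get(attribute) in wanted)


# Hypothetical attribute table: node name -> attributes.
nodes = {
    "gene1": {"MCL Cluster": "Cluster 1"},
    "gene2": {"MCL Cluster": "Cluster 1"},
    "gene3": {"MCL Cluster": "Cluster 2"},
    "gene4": {"MCL Cluster": "Cluster 3"},
}

# Selecting a single cluster.
single = find_by_attribute_value(nodes, "MCL Cluster", ["Cluster 1"])

# Selecting several clusters at once, as with Select Multiple.
multiple = find_by_attribute_value(nodes, "MCL Cluster", ["Cluster 1", "Cluster 3"])
```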
The data profile of the selected nodes is displayed in the lower right of the window. There are numerous options available to control this, examples of which are shown here:
Graphia is designed to let a user quickly explore data clusters, defining what is interesting and what is ‘noise’. Having selected a specific cluster, individual data points within that cluster may be selected in the attribute table. By scrolling up and down within the data table, individual nodes are highlighted and their data profiles displayed in the plot. After an initial analysis it may be necessary to look again at the input data, possibly recalculating values or removing data that is not of interest, thereby focusing the analysis on what is interesting.
Graphia is designed to allow you to explore, analyse and interrogate data. It is up to you how you do this and what you do with the insights obtained.