This article concerns the generation of simple diagrams containing a series of interconnected nodes. Often, simple diagrams such as these can add a new dimension of usability to a data-driven application. For example, you can:
- Show the connections between related entities (e.g. people and relationships)
- Visualise complex structures (e.g. tree data structures)
- Follow logic flows and sequences (e.g. flow charts)
- Produce a navigation map (e.g. for a website)
An ever-present challenge in dynamically-generating diagrams, however, is the layout of the nodes. Aesthetics dictate the need for uniform spacing between nodes, radial distribution and minimal crossing of lines (among other things), yet these are very difficult to achieve using ‘blind’ techniques (rule-based layouts and other algorithms that position nodes without considering the overall diagram).
Force-directed algorithms present an exciting solution to this problem. Here, simple mechanical forces are applied to the diagram (essentially by running a physics simulation), resulting in a layout that exhibits the properties of emergent complexity. The appropriate selection of forces and formulae are capable of producing layouts that meet the aesthetic goals outlined above. In their purest form, force-directed algorithms balance the forces of charged-particle repulsion between each node (Coulomb’s Law) and spring attraction between each pair of connected nodes (Hooke’s Law). After a number of iterations (each representing a slice of time during which the forces act upon the nodes), an equilibrium is reached and the layout process is complete.
I have implemented a force-based layout algorithm in C# and GDI+. It provides a simple Diagram class which acts as a container for descendants of the Node abstract class. The structure and layout of the diagram is separated from the rendering, which is fully overridable. The implementation is not tied to Windows Forms, but is fully compatible, so diagrams can either be drawn onto a System.Windows.Forms.Control (for display and interactivity) or directly onto a System.Drawing.Bitmap (for exporting).
Classes
Node
Node is the base class for a node on the diagram. Conceptually, it has a Location on the cartesian plane and a series of Connections to other nodes on the diagram. It also makes provision for a Size (used when rendering) and abstract functionality for drawing both the node and its (child) connections. Connections between nodes can be hierarchical (parent-child) or two-way. If a diagram contains a mixture of different derived types, parent nodes are responsible for drawing the connections to their children (although a base implementation is provided).
SpotNode is an example concrete implementation of Node that has additional visual properties and contains logic to draw itself as an 8×8 circle.
Diagram
Diagram serves as both a collection of Node objects and the layout provider. The implementation of the force-directed algorithm resides in this class. Although it performs no painting itself, it provides a Draw() method that scales the diagram to match a drawing surface and instructs each node to draw in the correct bounds.
It also defines a nested class, NodeLayoutInfo, which tracks the mechanical properties (velocity, future coordinates) of each node during the force-based simulation.
Vector
I developed the Vector structure as part of an earlier project. It models (on a basic level) a vector whose Magnitude and Direction are System.Double, with the direction expressed in degrees. It uses operator overloading to support vector addition and scalar multiplication. In the force-based algorithm, it is necessary to calculate and express the forces acting on each node in terms of both magnitude and direction, hence a vector representation is desirable.
The Layout Algorithm
At the heart of the force-directed algorithm are the methods that calculate the forces themselves:
The repulsion force is exerted by every node in the diagram, and each node is repelled (however slightly) by every other node in the diagram. This force is likened to the repulsion of charged particles, and so it is based on Coulomb’s Law; F = k(Q1Q2/r2) – where k is a constant, Q is the charge and r is the distance between the particles. Since all of our nodes are equivalent, we can ignore the charge term. In other words, as the distance between the nodes increases, the repulsion force decreases quadratically. This calculation is performed in the CalcRepulsionForce() method:
private Vector CalcRepulsionForce(Node x, Node y) { int proximity = Math.Max(CalcDistance(x.Location, y.Location), 1); double force = -(REPULSION_CONSTANT / Math.Pow(proximity, 2)); double angle = GetBearingAngle(x.Location, y.Location); return new Vector(force, angle); }
The attraction force is exerted on each node by the nodes that are connected to it; isolated nodes are therefore unaffected by this force. The connectors between the nodes are regarded as spring-like, hence the force calculated using Hooke’s Law; F = −kx – where k is a constant and x is the difference between the length of the spring and the distance between the nodes (i.e. the extension of the spring). In other words, as the distance between the nodes increases, the force pulling them back together increases proportionally (at least until the spring is fully relaxed, in which case the force stops applying). This calculation is performed in the CalcAttractionForce() method:
private Vector CalcAttractionForce(Node x, Node y, double springLength) { int proximity = Math.Max(CalcDistance(x.Location, y.Location), 1); double force = ATTRACTION_CONSTANT * Math.Max(proximity - springLength, 0); double angle = GetBearingAngle(x.Location, y.Location); return new Vector(force, angle); }
The force-directed algorithm applies these forces to the nodes via an iterative simulation, as follows:
- Randomise the initial coordinates of each Node – even a tiny variation will be sufficient.
- Until the maximum number of iterations is exceeded (or the stop condition is satisfied):
- Initialise the totalDisplacement to zero.
- For each node:
- Initialise the netForce Vector to zero.
- Call CalcRepulsionForce() on every other node in the diagram, adding the result to netForce.
- Call CalcAttractionForce() on every node connected to the current node, adding the result to netForce.
- Add netForce to the node’s velocity vector.
- Let the node’s new position be set to the vector sum of its current position and its velocity.
- For each node:
- Calculate the displacement caused by moving the node to its new position, adding the result to totalDisplacement.
- Move the node to its new position.
- If totalDisplacement is less than a threshold value, trigger the stop condition.
- Offset each node such that the diagram is centered around the origin (0,0). (During iteration, the diagram may creep.)
Usage Examples
Drawing a basic diagram to a bitmap
Diagram diagram = new Diagram(); // create some nodes Node fruit = new SpotNode(); Node vegetables = new SpotNode(); Node apple = new SpotNode(Color.Green); Node orange = new SpotNode(Color.Orange); Node tomato = new SpotNode(Color.Red); Node carrot = new SpotNode(Color.OrangeRed); diagram.AddNode(fruit); diagram.AddNode(vegetables); // create connections between nodes fruit.AddChild(apple); fruit.AddChild(orange); fruit.AddChild(tomato); vegetables.AddChild(tomato); vegetables.AddChild(carrot); // run the force-directed algorithm diagram.Arrange(); // draw to a Bitmap and save Bitmap bitmap = new Bitmap(640, 480); Graphics g = Graphics.FromImage(bitmap); diagram.Draw(g, new Rectangle(Point.Empty, bitmap.Size)); bitmap.Save("demo.bmp");
Sample application
In the source code package (at the bottom of the article), I have included a simple Windows Forms application that demonstrates how to generate a random hierarchical diagram and paint it persistently on a Form. The layout is performed in a separate thread, and can optionally be displayed in a real-time, animated fashion.
Final Words
Force-directed algorithms are not perfect when it comes to arranging the nodes on a diagram; they are subject to conforming to local extremes rather than finding a 100%-optimal layout. This means that diagrams with a large number of nodes will typically have a higher occurrence of crossed lines. However, even when a less-optimal result is obtained, said result is often still visually striking. Running time can be a problem with particularly dense diagrams, but as already discussed, simpler diagrams produce a more aesthetically-pleasing result anyway. In most practical applications, a force-based approach provides a very good result for dynamically-generated diagrams.
With respect to my implementation, it is a very bare-bones approach, but it is also very scalable; for example, if appropriate node subclasses were written, it could be used to produce flow-charts, people-relationship diagrams, site maps for websites, brainstorming charts or even database diagrams (with some additional functionality). The ability to automatically generate such diagrams can open up a whole new world in terms of application usability, data analysis and the production of deliverables.
A force-directed algorithm will definitely drive my future diagramming/graphing projects.
Download
If you find my code useful, please consider making a donation.
ForceDirected.zip – rev 1 (source code and example application)
- 2013/02/24 – Improved Vector class by replacing Math.Tanh() approach with Math.Atan2()
Vector.zip (drop-in replacement for the Vector class above)
- 2015/02/28 – Added an alternative Vector implementation that produces a ~33% performance boost