I recently attempted to transform a boring, alphabetical list of tags into a more visually-appealing (and more informational) tag cloud. No doubt you’ve seen tag clouds on blogs and other websites (there’s even one on here!) but for the uninformed:

A tag cloud is a form of ‘weighted list’ where reusable tags (keywords, categories, etc) are arranged on a diagram in such a way that more frequently-used tags appear in a larger font. The arrangement of the tags themselves can be ordered (usually alphabetically) or a more organic layout can be used (e.g. a force-based algorithm). Their purpose is to show an extra dimension of information (i.e. frequency) and thus draw attention to more popular tags.

This blog has already dedicated a lot of time to the generation and layout of diagrams, so this article will focus on another important part of any tag cloud algorithm: how to assign a suitable font size to each tag. This is a somewhat tricky problem in that:

  • The total number of tags and upper bound of each tag’s frequency are unknown
  • We obviously want to constrain the font size so that it falls within a sensible range
  • Regardless of the distribution of the frequencies, we want to have strong variation in font sizes

Before I settled on a solution, I tried a number of other approaches:

Proportional sizing

Given a query to determine the frequency of use for tags, such as:

SELECT Tag, COUNT(*) as [Frequency]
FROM PostsTags
GROUP BY Tag

We can use a formula like this to assign font sizes:

size = minSize + ((maxSize-minSize) * (frequency / maxFrequency))
Proportional sizing
Results using proportional sizing

This formula scales each tag according to its proportion to the maximum value, ensuring the font size is between minSize and maxSize. The problem is that it is adversely affected by outliers; e.g. if 90% of tags vary by only 10^1 and 10% vary by 10^3, there will be one or two enormous tags and lots of very small ones (with not much variance between them).

Sizing by random sample

In this approach, n tags are randomly selected and their frequencies become the threshold values for each font size (where n is the number of different font sizes that can be used). This takes advantage of the fact that psuedo-random numbers are uniformly-distributed, so querying for the number of frequency levels we want (instead of all values) will give a good indication of the different values in the set.

The SQL query for a random sample (in this case, of size 10) is of the form:

SELECT TOP(10) Tag, COUNT(*) as [Frequency]
FROM PostsTags
GROUP BY Tag
ORDER BY newid()
Sizing by random sample
Results using sizing by random sample

The result, however, is less-than ideal; the resulting font sizes are random (since we are working with a random sample) and outliers are likely to be ignored (in particular the minimum and maximum, e.g. the most frequent tag will appear as large as the second most-frequent).

Sizing by order using ROW_NUMBER()

This approach uses the T-SQL ROW_NUMBER() windowing function to assign a sequential number to each tag, sorted according to frequency; so, the least-used tag will be numbered as 1, the most-used tag as n (total number of tags). This means that, regardless of the gap between each tag’s frequency, we will get a uniform increase in font size.

The SQL query would look something like this:

SELECT Tag, COUNT(*) as [Frequency], ROW_NUMBER() OVER (ORDER BY COUNT(*)) as [Order]
FROM PostsTags
GROUP BY Tag

Then, sizes would be assigned using the following formula:

size = minSize + ((maxSize - minSize) * ((order - 1) / maxOrder)))

(Since order starts at 1, we subtract 1 because we want the right-hand size of the expression to evaluate to zero, i.e. size = minSize)

Sizing using ROW_NUMBER()
Results using sizing by ROW_NUMBER()

The problem with this approach is that tags with equal frequencies will not be assigned the same font size; rather, they will increase in font size according to table-order. Since a large number of tags are likely to have been used between 0 and 1 times, the result will be unsatisfactory.

The solution: sizing using DENSE_RANK()

The approach above is based on a good premise, but isn’t quite right. If only there was some way of sizing according to order where equal frequencies were assigned the same value… oh wait, there is! The T-SQL DENSE_RANK() windowing function behaves much like ROW_NUMBER(), except that it assigns the same value when there are ties in the set. (We could have used the normal RANK() function, but this leaves gaps in the numbering)

The SQL looks something like this:

SELECT Tag, COUNT(*) as [Frequency], DENSE_RANK() OVER (ORDER BY COUNT(*)) as [Rank]
FROM PostsTags
GROUP BY Tag

Then, sizes are assigned using the following formula:

size = minSize + ((maxSize - minSize) * ((rank - 1) / maxRank))

(I’ve opted to make maxSize an exclusive boundary, to avoid a situation where maxRank is 1 and we risk dividing by zero)

Sizing using DENSE_RANK()
Results using sizing by DENSE_RANK()

The result is a tag cloud where:

  • Font sizes increase uniformly from the least-used tag to the most-used tag
  • Outliers are not ignored, nor do they result in hugely disproportionate font sizes
  • Tags with a frequency of zero are always assigned minSize
  • Upon first use, a tag immediately stands apart from the unused tags

Keeping font sizes low until tags reach maturity

If you don’t want the maximum font size to be assigned until a certain frequency is reached, you can apply a transformation to the ratio:

size = minSize + ((maxSize - minSize) * ((rank - 1) / maxRank) * Math.Min(1, maxFrequency / threshold))

This will allow font sizes to gradually approach maxSize as the most-used tag’s frequency approaches threshold.

Final words

Hopefully this article gives you an idea of some of the complexities involved in producing good-looking tag clouds, and allows you to avoid some of the common pitfalls associated with this.

This is just a quick announcement that i’ve enabled comment voting on my blog. I often get interesting and thoughtful comments from people, and i’d like to be able to show my appreciation – and allow other users to do so as well. I’m not going to retrospectively vote on old comments, but from now on, i’ll be upvoting comments that are insightful and constructive. I encourage my readers to do so as well!

Animation of force-based tessellation algorithm

When I first blogged about force-based algorithms, my focus was firmly on charting and graphing; specifically, producing aesthetically-pleasing diagrams consisting of nodes and connectors. Force-based algorithms have many other applications in computer science, from graphics to simulations to problem-solving and beyond. In this post, i’m going to look at the application of a force-based algorithm to the task of arranging images in a tessellating pattern.

The forces in play

As with my force-directed diagram algorithm, this force-based algorithm requires at least two forces to be in play; if there is no opposing force, then the layout would either collapse or ‘fly away’ from the viewport. In this case, the dominant force is a pulling force that acts towards the middle of the screen; regardless of where an image tile is situated, it will be dragged towards the center point. In this way, the force is much like a localised gravitational field (although it does not strengthen as objects approach the point). The opposing force is that of collision; when one image tile would otherwise cross into the bounds of another, a reaction force is applied in the opposite direction (some of the energy is absorbed during the collision, making it only partially elastic).

Furthermore, when a collision occurs, image tiles are not permitted to overlap. Instead, one of three things will happen:

  • The tile will slide against the horizontal edge of the colliding tile (towards the center)
  • The tile will slide against the vertical edge of the colliding tile (towards the center)
  • The tile will remain in its original position

The choice of action will be determined according to whether the tile’s path is unobstructed.

Together, these forces encourage the image tiles to reach a stable configuration, in which they are clustered together with minimal gaps between them.

The algorithm

LET center_screen represent the middle of the screen
LET damping represent the factor by which velocity decreases between iterations
LET absorbance represent the factor by which energy is absorbed during a collision

INITIALISE the each image tile with a random (or uniformly-distributed) placement

REPEAT
    FOR EACH image tile:
        LET net_force equal zero
        LET pulling_force act at the angle formed between the image tile and center_screen

        SET net_force = net_force + pulling_force
        SET velocity{i} = (velocity{i-1} * damping) + net_force
        SET position{i} = position{i-1} + velocity{i}

        IF a collision occurs
            LET reaction_force act at the angle formed between the colliding image tiles

            SET net_force = (net_force * absorbance) + reaction_force
            SET velocity{i} = (velocity{i-1} * damping) + net_force

            LET proposed_horizontal represent the coordinate (position{i}.x, position{i-1}.y)
            LET proposed_vertical represent the coordinate (position{i-1}.x, position{i}.y)

            IF proposed_horizontal collides with no other image tile
                SET position{i} = proposed_horizontal
            ELSE IF proposed_vertical collides with no other image tile
                SET position{i} = proposed_vertical
            ELSE
                SET position{i} = position{i-1}
            END IF
        END IF
    END FOR EACH
UNTIL (total displacement drops below threshold) OR (maximum iterations reached)

The force, velocity and position variables are represented using vectors. Collisions take into account the entire bounds of each tile (implemented using Rectangle.IntersectsWith() in .NET).

The initial placement of the image tiles is significant; they must be placed in such a manner that they do not overlap. The implementation offers two initial arrangement types; random (in which positions are completely randomised), or uniform (where image tiles are assigned to positions on a uniform grid).

The implementation

The algorithm is implemented in the ImageCollage class. Each instance of ImageCollage holds a collection of ImageInfo objects, which represent the image tiles.

ImageCollage:

  • Images – Collection of ImageInfo objects.
  • Arrange() – Runs the force-based algorithm.
  • Distribute() – Responsible for the initial placement of image tiles.
  • Render() – Renders the montage using the image files and resulting layout.

ImageInfo:

  • Path – Path to the image file.
  • OriginalWidth, OriginalHeight – Dimensions of the image, used to preserve the aspect ratio.
  • Bounds – The bounds of the image tile (will change during layout).
  • Velocity – The velocity of the image tile during the layout algorithm.
  • RelativeSize – Controls the size of the image tile with respect to the others on the canvas.

The full implementation can be found on the project page for the Image Collage Generator, where I have developed a complete Windows Forms application around this concept.

The results

The implementation of the force-based algorithm for image tessellation appears to work well. It copes with small numbers of images as well as larger counts (150+ images) without significant degradation in performance or the quality of layout. Further enhancements would be required to cope with very large number of images (500+) – this algorithm was designed with fewer number of images in mind.

In general, uniform initialisation seems to produce fewer gaps than random initialisation; however, the random positioning tends to produce more organic-looking layouts. Depending on the variation in aspect ratio between the images used, one approach may produce better results than the other.

Final words

Be sure to check out the Image Collage Generator to see this force-based algorithm in action. Once again, the force-based approach lends itself well to a task which would otherwise require manual placement. Genetic algorithms may also be an alternative to the force-based approach; tessellation is a problem for which fitness criteria can be clearly evaluated – however, I would wager than a genetic algorithm would take longer to find a solution and would be more computationally-intensive than the force-based approach.

I hope you find this application of force-based algorithms useful. Perhaps you can think of other problems that could be solved in this manner?

Okay, so batch scripting may be regarded as a bit of a computing dinosaur these days (particularly following the rise of PowerShell)… but I argue that it still has legitimacy on the developer/power-user’s workstation. Batch scripts are still well-suited to:

  • Startup scripts
  • Scheduled tasks
  • Macros
  • …and they even function well as small utilities, installers, configuration wizards, etc

Of course, batch files have come a long way since their first appearance in MS-DOS and OS/2; Windows NT (and later versions) expanded the range of built-in commands that were available, as well as offering a set of console programs to broaden what could be done from the command prompt. There are still a lot of operations (e.g. various shell commands, system configuration, etc) that cannot be included in batch files, but it is possible – sometimes requiring a bit of creativity – to do some very useful things in batch scripts.

This post focuses on a few practical examples relating to the management of desktop applications; including ClickOnce apps, mutual exclusivity, running programs as Administrator, etc

Mutually exclusive applications

I keep several batch scripts on my desktop that each start a set of programs; for example, I have a developer script that launches Visual Studio, SQL Management Studio and a few third-party tools. I also have a social script that launches my twitter client, instant messenger, etc. Some of these scripts overlap (i.e. the same program is included in different scripts) and, also, I may already have one or two of these applications open when I run the batch file.

In light of these requirements, I need to start the programs on my list only if they are not already running. Thankfully, this is not too difficult to do in batch scripts:

tasklist /FI "IMAGENAME eq [process name]" 2>NUL | find /I /N "[process name]">NUL
if "%ERRORLEVEL%" NEQ "0" {cmd /c} start [path to executable]

Where [process name] is the name of the process as it appears in the Windows Task Manager; e.g. firefox.exe – and [path to executable] is either the name of the executable to run (if it falls within the PATH variable) or the full path to the program you want to run. Note that the ‘cmd /c’ is optional (see the explanation below).

So, how does this work?

  • The tasklist command is the command-line equivalent of the Windows Task Manager. By invoking it with the /FI switch, you can apply a filter. You can filter according to a range of criteria, such as the process name, user, window title, etc – here, we just want to get an exact match (using the eq operator) on the name of the process.

The output of the tasklist command looks like this:

C:\Windows\system32>tasklist /FI "IMAGENAME eq firefox.exe"

Image Name                     PID Session Name        Session#    Mem Usage
========================= ======== ================ =========== ============
firefox.exe                   2404 Console                    1    434,096 K
  • By piping the output of the tasklist command into the find command, we can determine whether there was a match; by using the /I and /N switches, we perform a case-insensitive match and place the success/failure in the ERRORLEVEL variable (which is used by most command-line tools for this purpose).
  • We don’t want to output the results of the find command to the console, so we redirect it to NUL (i.e. nowhere).
  • If the find command returns a match, the ERRORLEVEL will be zero; therefore, we only need to run the program if its value is non-zero, i.e. NEQ “0”.
  • Running a command with cmd /c ensures that the script will go on running without waiting for the operation to complete. Some programs (e.g. Firefox) will tie up the console window if started from the command line, and we want to avoid this.
  • The start command is the preferred way to run GUI applications, and can also be used to open documents and URLs – we use it here because it supports the widest variety of targets.

Starting ClickOnce applications

Unlike normal executables, applications deployed using ClickOnce are started using a bootstrapper (which is normally handled by the shell). This includes applications written in Windows Forms, WPF or Silverlight. Shortcuts to ClickOnce apps have the extension .appref-ms (rather than .lnk for regular shortcuts), and these files are not recognised by the start command (as in the previous example).

Thankfully, they can be run using the following syntax:

rundll32.exe dfshim.dll,ShOpenVerbShortcut [path to appref-ms file]

Where [path to appref-ms file] is the full path to the ClickOnce shortcut file. The path may include environment variables such as %USERPROFILE% if, for example, you want to point to a shortcut on the user’s desktop.

How does this work?

  • The rundll32 command executes a function in a Win32 DLL as if it were an executable file. You might want to use this when you have a function which is predominantly called from code (but still operates independently), if you want to avoid creating many executables when a single DLL will suffice, or to deliberately obfuscate a command that users are not intended to start.
  • dfshim.dll is responsible for much of the functionality in ClickOnce; it contains functions to install, remove, start and update applications, and is distributed as part of the .NET Framework.
  • The name of the function we want is ShOpenVerbShortcut, which is the same function that the windows shell uses to run .appref-ms shortcut files. You simply pass a path to the function and it takes care of the rest.

Running processes as Administrator

Unfortunately, when Microsoft designed User Account Control (UAC) in Windows Vista and beyond, they did not provide a convenient way to start processes with elevated permissions from the command line. This may be partly due to the anticipation that PowerShell would take over from batch scripting, however I still think it is a bit of an oversight.

The way that you would normally go about running a program with elevated permissions is to use the ShellExecute function in Win32. Instead of using a conventional verb like ‘Open’ or ‘Print’, you use the special runas verb. Unfortunately, this part of the API is not exposed to the command line, and its complex argument list prevents it from being called using rundll32 (as in the previous example).

Thankfully, there is a handy command-line tool called ShelExec, which you can download and place alongside batch scripts that need to run programs with elevated permissions. Using ShelExec, you can run a program as Administrator using the following syntax:

shelexec /verb:runas [path to exe]

Where [path to exe] is either the name of the process to execute (if in the PATH variable) or the full path to the executable file.

Putting it all together

Okay, so what if I want a script that starts my web browser, e-mail client, Visual Studio (which I need to run with elevated permissions) and my twitter client? Here’s the script:

@ECHO OFF

tasklist /FI "IMAGENAME eq firefox.exe" 2>NUL | find /I /N "firefox.exe">NUL
if "%ERRORLEVEL%" NEQ "0" cmd /c start firefox

tasklist /FI "IMAGENAME eq outlook.exe" 2>NUL | find /I /N "outlook.exe">NUL
if "%ERRORLEVEL%" NEQ "0" start outlook

tasklist /FI "IMAGENAME eq devenv.exe" 2>NUL | find /I /N "devenv.exe">NUL
if "%ERRORLEVEL%" NEQ "0" shelexec /verb:runas devenv

rundll32.exe dfshim.dll,ShOpenVerbShortcut %USERPROFILE%\Desktop\MetroTwit.appref-ms
  • We need the ‘cmd /c’ prefix in the case of Firefox, because it does ugly stuff with the console window if we allow it to use the same instance of cmd.exe that the script runs from.
  • Outlook can be started without having to specify the full path to the exe.
  • We need to use ShelExec to start Visual Studio as Administrator.
  • MetroTwit is a ClickOnce application, so we must run it using the bootstrapper.

I hope you find these tricks useful, and can apply them to save yourself a bit of time and effort.

When it comes to displaying the progress of a long-running operation to the user, there are many options available to a GUI designer:

  • Change the cursor to an hourglass
  • Change the caption on the status bar
  • Show an animated glyph on the window
  • Show a dialog box

A number of factors could affect which method you choose:

  • How long does the operation typically run for? A few seconds? Minutes?
  • Is the operation synchronous or asynchronous? If the operation is asynchronous, are there any restrictions on what the user can do while the operation is running?
  • Can the user cancel the operation?
  • Is the length of the operation known, or can it be easily estimated? Is it unpredictable?
  • Does the operation provide meaningful feedback to the user? (e.g. status messages)

ProgressDialog example

Shell Progress Dialogs provide a mechanism for displaying the progress of a long-running operation. They are suited to non-trivial operations that exceed 10 seconds, showing status messages and a progress bar. They facilitate both synchronous and asynchronous operations (though the latter are always preferable). They provide a mechanism for aborting the operation and for estimating the time remaining to complete the operation (computed using the progress percentage and time measurements taken as the operation progresses). If progress is not measurable (and the user is running Windows Vista or above), a marquee can be displayed instead of a normal progress bar.

This type of dialog box is part of the Windows shell, and is used by the operating system for a variety of purposes, from copying files to running the Windows Experience Index tests. Microsoft makes the dialog available to third party code via COM; exposing the IProgressDialog interface and a concrete implementation of the type (CLSID_ProgressDialog). This makes it easy to use progress dialogs in C/C++ applications, but what about managed code and .NET?

Instantiating COM Types in .NET

As with COM interfaces in general, the .NET Framework has a number of features to enable interoperability. In general, to instantiate a COM type in .NET, you need to:

  • Define the interface implemented by the type (in this case, IProgressDialog), based on its specification. Add any required metadata for marshalling method parameters and return types, as well as any enumerations required. Above all, remember to add the [ComImport] and [Guid] attributes to the interface definition.
  • Obtain a managed Type object for the COM type (not the interface), using the Type.GetTypeFromCLSID() method.
  • Use Activator.CreateInstance() on the Type to instantiate it. Cast the resulting object to the interface type, and you’re ready to start calling methods.
  • Don’t forget to call Marshal.FinalReleaseComObject() on the instance when you’re finished with it. A convenient way to ensure this is to place the cleanup code in a finally block, or implement IDisposable if the object is stored at instance scope in a class.

For IProgressDialog, the CLSID of the interface is {EBBC7C04-315E-11d2-B62F-006097DF5BD4} and the CLSID of the concrete class is {F8383852-FCD3-11d1-A6B9-006097DF5BD4}.

Using IProgressDialog

The MSDN Documentation for the Windows Shell API details the lifecycle of the IProgressDialog object and how to use it. In simple terms:

  • Set up the progress dialog before it is displayed to the user by calling SetTitle() and SetCancelMsg().
  • Show the dialog using StartProgressDialog(). A series of PROGDLG flags control the behaviour and appearance of the dialog box.
  • Show status messages using the SetLine() method. You can use up to 3 lines of text, 2 if you allow the estimated time remaining to be automatically calculated.
  • As you perform work, call SetProgress() to update the progress bar. Each call lets you specify the current value as well as the maximum value for the bar. The user can click the cancel button at any time, therefore you should also check its status using the HasUserCancelled() method.
  • When the operation has finished, call StopProgressDialog(). Note that, once you close the dialog box, you cannot show it again; you must create a new instance for each operation.
Type progressDialogType = Type.GetTypeFromCLSID(new Guid("{F8383852-FCD3-11d1-A6B9-006097DF5BD4}"));
IProgressDialog progressDialog = (IProgressDialog)Activator.CreateInstance(progressDialogType);

try {
    // set up dialog and display to the user
    progressDialog.SetTitle("Progress dialog");
    progressDialog.SetCancelMsg("Aborting...", null);
    progressDialog.StartProgressDialog(Form.ActiveForm.Handle, null, PROGDLG.AutoTime, IntPtr.Zero);

    // do work
    progressDialog.SetLine(1, "Working...", false, IntPtr.Zero);
    progressDialog.SetLine(2, "Please wait while the operation completes", false, IntPtr.Zero);
    for (uint i = 0; i < 100; i++) {
        if (progressDialog.HasUserCancelled()) {
            break;
        }
        else {
            progressDialog.SetProgress(i, 100);
            Application.DoEvents();
            Thread.Sleep(150);
        }
    }

    // close dialog
    progressDialog.StopProgressDialog();
}
finally {
    Marshal.FinalReleaseComObject(progressDialog);
    progressDialog = null;
}

A Managed Wrapper for IProgressDialog

Working directly with COM types is not recommended; due in part to the need to manually release resources, as well as having to work with the archaic programming model, marshalled types, structures, etc. COM types are also limited in comparison to .NET types, in that they do not (natively) support events or properties. For these reasons, I decided to create a managed wrapper for IProgressDialog.

As with other common dialog types already available in .NET (e.g. OpenFileDialog), my implementation inherits from the Component class, allowing it to be dropped onto the design surface of a Form or other component in Visual Studio. Component also has the advantage of implementing the IDisposable pattern, removing the need to write some boilerplate code.

My goals for the wrapper were to:

  • Replace method calls with properties (where possible).
  • Provide get accessors to complete the functionality of each property.
  • Replace the flag options with separate properties (to improve ease of use and intuitiveness).
  • Provide sensible default values for properties.
  • Simplify the operations by removing parameters and relying more on the object state.
  • Provide cleanup code in a Dispose() method.
  • Remove the need to re-instantiate the component for each operation.
  • Further simplify the use of the component by automatically closing the dialog when the progress reaches 100%.

With my wrapper class, ProgressDialog, you can configure the dialog at design time and thus write less code:

// set up dialog and display to the user
wrapper.Show();

// do work
wrapper.Line1 = "Working...";
wrapper.Line2 = "Please wait while the operation completes";
for (uint i = 0; i < 100; i++) {
    if (wrapper.HasUserCancelled) {
        break;
    }
    else {
        wrapper.Value = i;
        Application.DoEvents();
        Thread.Sleep(150);
    }
}

// close dialog
wrapper.Close();

Final Words

As I said in my introduction, there are many different ways to show progress in a GUI application. The real advantage of Shell Progress Dialogs is that they are rendered by the operating system (thus their appearance is ‘upgraded’ automatically on new versions of Windows, and they always fit the OS theme), and present users with a familiar and consistent interface. They’re not suitable for all applications, but I hope you find my implementation useful if you think they suit the needs of your app.

Download

ProgressDialogs.zip (9KB, includes example)