a weird for loop

In Bluefish, Programming on November 5, 2013 by oli4444

Today I had the weirdest debugging session ever. A bug was reported that I could reproduce on Fedora 19, but not on Fedora 18, not on OSX, not on Ubuntu 12.04 (tried both 32bits and 64bits), and not on Ubuntu 13.10.

The bug report mentioned that the highlighting engine did not end single line comments on the first newline. First I tracked this down to a difference in the compiled DFA table. After further debugging I found that a variable in the language file compiler was not set correctly. This variable, only_symbols was defined in a loop:

gboolean only_symbols = TRUE;
gint j;
for (j = 0; j <=127 ; j++) {
  if (characters[j] == 1 && !character_is_symbol(st,context, j)) {
    only_symbols = FALSE;

what I found: j not only loops from 0 to 127, but continues far beyond that number!?!? The function character_is_symbol() is a simple array lookup.

I even added an extra check inside the loop

if (j > 127) {

but although j reached values of 10000, this assertion was never reached????

The fix for the bug was to reverse the loop:
for (j = 127; j >=0 ; j--)

I am still completely baffled by this bug. How is this possible? Does this have to do with loop vectorization? Is this a bug in gcc? Is this a bug in my code? What is going on?

edit: thanks for your comments. It most likely was a combination of a 1-over array size (undefined) with the aggressive loop optimization that the latest gcc has (which then seems to remove the condition in the loop). Problem solved, code fix in svn!


Icons in the file tree

In Bluefish, Gnome, gtk+, open source, Programming on November 2, 2013 by oli4444

The file tree in Bluefish shows icons for the files. To do so, it has an “icon name” column in the tree model, using the special feature of the cell_renderer_pixbuf to render icons by name.

renderer = gtk_cell_renderer_pixbuf_new();
gtk_tree_view_column_set_attributes(column, renderer, "icon-name", COL_ICON_NAME,NULL);

The column COL_ICON_NAME just contains the name (as string) of the icon.

Currently, the icon name is retrieved from a GIcon. The GIcon is retrieved by asking for the property “standard::icon” in g_file_enumerate_children_async().

This means that, for every file, the code creates a GIcon object, just to get a string with the icon name. From browsing trough the glib and gio code I understand that the GIcon is searched for using the mime type, with a binary search in an array that defines all the icons.

I was wondering if I can make the Bluefish code more efficient by caching the icon names for each mime type in a hash table. This has two advantages:

  1. a GIcon object is only created once for each mime type; after we know the corresponding icon name we can do a lookup in the hash table
  2. This needs only one copy of the icon name in memory. In the treestore we can have a pointer to the string in the hash table. Currently 500 text files have 500 copies of the string “text-plain” in memory.

But does this have disdvantages? Any ideas / comments ?


Custom GtkTreeModel for a file browser

In Bluefish, gtk+, open source, Programming on October 17, 2013 by oli4444

Similar to other IDE’s and editors, Bluefish has a filebrowser in a side pane of the GUI. Previously Bluefish used a GtkTreeStore to store the icon and filename for each file/directory. To improve speed and reduce memory usage, I recently wrote a custom GtkTreeModel to replace the old store. This is the first post on the design of the custom TreeModel. The final code can be found here:

Bluefish Screenshot from 2013-10-17 14:14:23

I started with the excellent tutorial from Tim-Philipp Müller

First thing I did was modifying the tutorial code from a listmodel to a treemodel. Therefore each record has a pointer “parent”, and each record should store it’s children.

One of the design decisions was how to store children: as a linked list, or an array of records, or an array of pointers to records. Some thoughts: The linked list is faster if you have to insert an entry somewhere in the middle, or remove one (with an array this requires realloc()). The array is faster if you need the nth element (with a linked list you need to traverse the list). The array also brings another advantage: use qsort for sorting (but the g_list_ functions have a good sort function for lists as well), and bsearch for searching if an entry exists (a linked list requires again to traverse all entries to see if an entry exists). An array of records needs more memory copying during sort, insert or delete. An array of pointers needs more memory (one extra pointer for each record).

In a file tree, most directories have the same files for most of the times. Sometimes a new file is added, or a file is deleted, but most of the time the list is stable. So I chose the array of pointers. But I’m still doubting if I should have chosen an array of records.

The minimal record looks like this:

struct _UriRecord {
gchar *name;
UriRecord *parent;
UriRecord **rows;
guint num_rows;

The file browser pane in Bluefish shows icons, and can show a name in bold or normal weight. The compare function for sorting returns directories before files, so I need to know if a record is a directory or a file. Bluefish also has filtering possibilities based on filename or mime type. So these properties are all added to each record.

Because I chose an array of pointers, it is costly to find the next item in the array from a record. Therefore I added the position in the array as property. The next one is simply pos+1. This has another advantage: after sorting, the TreeModel needs to inform all listerers that the order has changed. For that you need the old and the new order. Since the old order is stored in pos this is easily done in a loop over the array.

In the Bluefish code I oten need to convert from a GFile to a position in the TreeModel and vice versa. I therefore added a hash table to the treemodel with the GFile as key and the record as value. Since a GFile is refcounted, It takes only a pointer to add it to the record as well.

At last I need a way to refresh directories when I re-read them. I could simply delete them all, and add them again, but that would also delete the sub-directories. So I added a “possibly_deleted” property that is set to 1 before directory re-read, and set to “0” if an entry still exists. After closing the directory every record that still has “possibly_deleted” set to 1 can be removed.

To reduce memory usage, I changed several properties to 16bit or 8bit values. This requires an extra shift when accessing the properties, but with many records it reduces memory usage. These are the properties that are only needed when the records change (and directories are usually pretty stable).

The resulting record looks like this:

struct _UriRecord {
gchar *name;
gchar *icon_name;
gchar *fast_content_type;
GFile *uri;
UriRecord *parent;
UriRecord **rows;
guint16 num_rows;
guint16 pos;
guint16 weight;
guint8 isdir;
guint8 possibly_deleted;

On 64 bit systems this results in: 6*8bytes + 3*2bytes + 2*1byte = 56 bytes per record.

On 32 bit systems this results in : 6*4bytes + 3*2bytes + 2*1byte = 32 bytes per record.

More about this code in a few days. Any comments how this design could be improved? Please post a comment.


The recent menu

In Bluefish, Gnome, gtk+, open source, Programming on July 4, 2013 by oli4444

The Bluefish mailinglist currently has a discussion on the working of the recent files menu. The current recent files menu in Bluefish shows the N most recent items that are not currently opened.

So if you have 15 items in the recent list, but 10 of them are currently open, Bluefish will only show 5.

Many other programs have a different approach: show all recent files, regardless if they are open or not. In the example above, depending on N, the list would either show 15 files, and only the last 5 would be actually useful (they open a file that is not open yet), or the list would show 5 files, all of them would be open already.

In a text editor like Bluefish you usually have many files open (I consider 10 files open very normal usage). So showing only files that are already open, or showing a very long list where only the last items are useful doesn’t look like a good user interface design to me.

But what to do now? Having a different behavior makes the learning curve for new users higher. What do you think is the best design for a recent files menu?


How to get good user-testing feedback

In Bluefish, open source, Programming on January 12, 2013 by oli4444

One of the powerful aspects of open source software development is the fantastic end-user engagement. Many users use the bleeding edge code straight from the version control system for production work. Because of this, bugs in new code are quickly detected and solved long before a release. In the case of Bluefish I guess that more than 60% of the bugs is detected before release.

However, because these users do no formal testing but do production work, there is no guarantee that a new feature is actually used, and thus no guarantee that a new feature is actually tested. So you cannot conclude that the code is bug free if there are no bugs reported.

I would like to improve on the information position by collecting inforation what the users have actually used.

To collect this information for example a web application could be used where users can select which features of an application have been used. If a feature has been used, the user should be able to select some additional detail. In the case of Bluefish a user could for example select that the “toggle comment” feature has been used, after which the user can give more detail: added a line comment, removed a line comment, added a block comment, removed a block comment, with a selection, without a selection, etc.

I was searching for a web application to support this, but I haven’t found one. I expected something like this to exist already. Does anybody have a pointer how to collect this kind of end user testing information?


Bluefish on the Raspberry Pi

In Bluefish, gtk+, Linux desktop, open source, Programming on January 1, 2013 by oli4444

After ordering in September, I finally received my Raspberry Pi a few weeks ago (the upside of the long time between order and delivery is that mine is the new revision with 512Mb RAM).


I have no specific plans with the device other than playing a bit around with it. One of the things I obviously had to try was to run Bluefish as editor on the Pi. Installing all the build dependencies and compiling takes a few hours, but Bluefish was running as expected. Entirely true? No, some bits were slow, most notably the auto completion popup. So I dug into the code to find out why.

In the auto-completion popup, Bluefish has a “reference pane”. This shows some rerefence information about the item you are trying to auto-complete. For an HTML tag this might show the valid attributes, for a C function it might shown the arguments and the return codes etc. This is implemented with a hash-table and the “changed” signal on the GtkTreeSelection: if the selection changes, a lookup in the hash table is performed to see if there is reference information available. On the next key-press, bluefish re-calucalates the possible auto-completion candidates, and re-fills the GtkListStore that lies underneath the GtkTreeView. And this is where the problem was: before filling the list of items, Bluefish has to clear the old items. And the selection changed signal is called for each item that is removed from the GtkListStore, which in it’s turn does a hash table lookup and renders the reference information in the reference pane. Do that for 15000 items and you’ll have 100% cpu load for a second on the Raspberry Pi.

So what is improved now: first, the number of items in the auto-completion popup is limited to 500 items. Second a boolean is added that is set to TRUE whenever the popup is clearing or filling items. As long as that boolean is TRUE, the selection changed signal will do nothing at all.


The result: even on the Raspberry Pi, Bluefish auto-completion is again much faster than you can type, and every bit of sluggishness is gone. We’re close to the 2.2.4 release, and this fix will be part of 2.2.4!


Written manual or video howto’s? (and how to make them)

In Bluefish, Gnome, Gnome shell, open source, Screencast, Ubuntu on November 4, 2012 by oli4444

In time, the number of features in Bluefish has grown to a very large number. More and more we receive requests for features that are somewhat superfluous: a similar feature is already present in Bluefish. Often if we explain the submitter how to use a certain existing feature he/she is quite satisfied. This means that some Bluefish features are (too) hard to find.

So what is  the best way to improve this? Is written documentation with screenshots the best way to introduce features to the users nowadays, or are video tutorials (screencasts) better?

From the community the support for our manual project has been declining over the last years (I must admit that I personally think manual writing is not the most rewarding work). Currently the Bluefish manual is mostly improved by a single volunteer, and therefore continuously behind. Is this a sign that written manuals are considered less important nowadays? At least the Bluefish screencasts on YouTube do well (194000 views, but is that high or not?).

So what do you think? Written manual or screencast?

Second subject for this post: what is the best way to create screencasts? Previously I used the built-in screen recorder in gnome-shell: excellent quality with a high compression. Merging the sound, however, always was lousy (tried Pitivi and Openshot): the quality decreased while the file size usually grew a lot. And it was a lot of work. But with the last Ubuntu release (gnome-shell 3.6) the feature seems to be missing (Ubuntu bug or Gnome-shell bug?). What is the best way to create screencasts (with sound) in Gnome-shell?


Get every new post delivered to your Inbox.