Archive for the ‘Programming’ Category

Statuses

Engaging developers

In Bluefish,Gnome,gtk+,open source,Programming on March 22, 2014 by oli4444

As a Bluefish developer I’m not tightly involved with gnome and gtk development. However, it is our platform, and although we ship OSX and Windows binaries, most of our users are on Linux, many of them using Gnome. We are affected as much by a gtk bug as by a bug in any other part of gnome. So if we find any, we try to contribute.

However, for some reason it feels that our “outside” contributions are not so welcome. I believe this is not intentional, but still this is not a good thing. So what makes me feel that these contributions are not so welcome? The lack of feedback and results.

An example: On 2012-09-05 I filed a bugreport https://bugzilla.gnome.org/show_bug.cgi?id=683388 with a patch to improve the GList documentation. I noticed that new developers often make trivial mistakes (for example appending in a loop). I got some feedback and reworked the patch, which I submitted on 2012-09-12. And then nothing happened anymore. For 1 year and 4 months there was total silence. I was originally planning to work through all of the glib documentation to make it easier for novice programmers to work with, but this result did not really motivate me to do any more work on glib. Then on 2014-01-20 the bug was closed with “fixed”. A tiny bit of documentation was added, but the majority of the patch was not committed. Perhaps it was considered low quality, but at least tell the contributor! And tell how to improve it!

A second example: Early this year a reproducible way to crash Bluefish was reported. We realized the bug was not in Bluefish but in Gtk, GtkTreeModelFilter to be more specific. Instead of just filing a bugreport with the problem, we tracked down the problem, found how to fix it, and filed that in bugzilla: https://bugzilla.gnome.org/show_bug.cgi?id=722058. For us this is a critical bug. It makes Bluefish crash. It’s now three months later, and nothing has happened. I’m afraid that the next major gtk release will still have this bug, and Bluefish will remain a crappy application because it crashes on this bug in gtk.

We should improve this! This is not the way to attract new developers! On the contrary, this does discourage new developers!

End note: I realize that I am not heavily involved with gtk or gnome. So there might be many valid reasons why the above two examples are completely wrong. Perhaps my complete feeling about the feedback and results is wrong. Perhaps my expectations are completely unrealistic. But if I have this feeling, there might be more occasional contributing developers that have this feeling. So we could at least then try to explain the true situation.

End note 2: I’m not a native English writer. So if my language was blunt or rude in any way to any person: this was really not intentional and I apologize. I don’t want to attack anyone. I don’t want to complain. I just want to send a signal that we can do better and should do better.

a weird for loop

In Bluefish,Programming on November 5, 2013 by oli4444

Today I had the weirdest debugging session ever. A bug https://bugzilla.gnome.org/show_bug.cgi?id=704108 was reported that I could reproduce on Fedora 19, but not on Fedora 18, not on OSX, not on Ubuntu 12.04 (tried both 32bits and 64bits), and not on Ubuntu 13.10.

The bug report mentioned that the highlighting engine did not end single line comments on the first newline. First I tracked this down to a difference in the compiled DFA table. After further debugging I found that a variable in the language file compiler was not set correctly. This variable, only_symbols, was set in a loop:

gboolean only_symbols = TRUE;
gint j;
for (j = 0; j <=127 ; j++) {
  if (characters[j] == 1 && !character_is_symbol(st,context, j)) {
    only_symbols = FALSE;
    break;
  }
}

What I found: j not only loops from 0 to 127, but continues far beyond that number!?!? The function character_is_symbol() is a simple array lookup.

I even added an extra check inside the loop

if (j > 127) {
   g_assert_not_reached();
}

but although j reached values of 10000, this assertion was never reached????

The fix for the bug was to reverse the loop:
for (j = 127; j >=0 ; j--)

I am still completely baffled by this bug. How is this possible? Does this have to do with loop vectorization? Is this a bug in gcc? Is this a bug in my code? What is going on?

edit: thanks for your comments. It most likely was a combination of reading one element past the end of an array (which is undefined behavior) with the aggressive loop optimization in the latest gcc (which then seems to remove the condition in the loop). Problem solved, code fix in svn!
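
As a minimal sketch of the defensive pattern (not the actual fix that went into svn): size the arrays with one constant and use the same constant as the loop bound, so the index can never run one past the end. NUM_CHARS, all_symbols and the is_symbol array are hypothetical names, and plain C types stand in for the GLib ones.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define NUM_CHARS 128   /* one slot for every 7-bit character, 0..127 */

/* Returns true if every character marked in 'characters' is also
 * marked in 'is_symbol'. Both arrays must have NUM_CHARS elements. */
static bool all_symbols(const unsigned char characters[NUM_CHARS],
                        const unsigned char is_symbol[NUM_CHARS])
{
    size_t j;

    assert(characters != NULL && is_symbol != NULL);
    /* The bound is the array size itself, not a separate magic number. */
    for (j = 0; j < NUM_CHARS; j++) {
        if (characters[j] == 1 && !is_symbol[j])
            return false;
    }
    return true;
}
```

If either array had been declared with 127 elements, the original loop up to and including 127 would read past its end, and the compiler is then free to assume anything about the loop.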

Icons in the file tree

In Bluefish,Gnome,gtk+,open source,Programming on November 2, 2013 by oli4444

The file tree in Bluefish shows icons for the files. To do so, it has an “icon name” column in the tree model, using the special feature of the cell_renderer_pixbuf to render icons by name.

renderer = gtk_cell_renderer_pixbuf_new();
gtk_tree_view_column_set_attributes(column, renderer, "icon-name", COL_ICON_NAME,NULL);

The column COL_ICON_NAME just contains the name (as string) of the icon.

Currently, the icon name is retrieved from a GIcon. The GIcon is retrieved by asking for the property “standard::icon” in g_file_enumerate_children_async().

This means that, for every file, the code creates a GIcon object, just to get a string with the icon name. From browsing through the glib and gio code I understand that the GIcon is searched for using the mime type, with a binary search in an array that defines all the icons.

I was wondering if I can make the Bluefish code more efficient by caching the icon names for each mime type in a hash table. This has two advantages:

  1. a GIcon object is only created once for each mime type; after we know the corresponding icon name we can do a lookup in the hash table
  2. This needs only one copy of the icon name in memory. In the treestore we can have a pointer to the string in the hash table. Currently 500 text files have 500 copies of the string “text-plain” in memory.
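
A rough sketch of such a cache, under several assumptions: in Bluefish a GHashTable would be the natural choice, but to stay self-contained this uses a small fixed-size open-addressing table in plain C. IconCache, icon_cache_get and demo_lookup are hypothetical names, and demo_lookup only stands in for the expensive GIcon based lookup.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>
#include <string.h>

#define CACHE_SLOTS 64  /* power of two; assumed larger than the number of mime types */

typedef const char *(*IconLookupFunc)(const char *mime_type);

typedef struct {
    char *mime_type;    /* key, owned by the cache */
    char *icon_name;    /* value, owned by the cache, shared by all rows */
} CacheEntry;

typedef struct {
    CacheEntry slots[CACHE_SLOTS];
    size_t used;
    IconLookupFunc slow_lookup; /* stands in for creating a GIcon */
} IconCache;

/* Hypothetical stand-in for the expensive lookup; counts its calls. */
static int slow_lookups = 0;
static const char *demo_lookup(const char *mime_type)
{
    slow_lookups++;
    return strcmp(mime_type, "text/plain") == 0 ? "text-plain" : "unknown";
}

static char *copy_string(const char *s)
{
    size_t len = strlen(s) + 1;
    char *copy = malloc(len);
    assert(copy != NULL);
    memcpy(copy, s, len);
    return copy;
}

static size_t hash_string(const char *s)
{
    size_t h = 5381;
    while (*s)
        h = h * 33 + (unsigned char) *s++;
    return h;
}

/* Returns the shared icon name for mime_type. The slow lookup runs
 * only the first time a mime type is seen. */
static const char *icon_cache_get(IconCache *cache, const char *mime_type)
{
    size_t i = hash_string(mime_type) & (CACHE_SLOTS - 1);

    while (cache->slots[i].mime_type != NULL) {
        if (strcmp(cache->slots[i].mime_type, mime_type) == 0)
            return cache->slots[i].icon_name;
        i = (i + 1) & (CACHE_SLOTS - 1);    /* linear probing */
    }
    assert(cache->used < CACHE_SLOTS - 1);  /* always keep one empty slot */
    cache->slots[i].mime_type = copy_string(mime_type);
    cache->slots[i].icon_name = copy_string(cache->slow_lookup(mime_type));
    cache->used++;
    return cache->slots[i].icon_name;
}
```

Every row with the same mime type gets the same pointer back, so 500 text files would share one copy of the string.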

But does this have disadvantages? Any ideas / comments?

Custom GtkTreeModel for a file browser

In Bluefish,gtk+,open source,Programming on October 17, 2013 by oli4444

Similar to other IDEs and editors, Bluefish has a filebrowser in a side pane of the GUI. Previously Bluefish used a GtkTreeStore to store the icon and filename for each file/directory. To improve speed and reduce memory usage, I recently wrote a custom GtkTreeModel to replace the old store. This is the first post on the design of the custom TreeModel. The final code can be found here: https://sourceforge.net/p/bluefish/code/HEAD/tree/trunk/bluefish/src/file_treemodel.c

Bluefish Screenshot from 2013-10-17 14:14:23

I started with the excellent tutorial from Tim-Philipp Müller http://scentric.net/tutorial/

The first thing I did was to modify the tutorial code from a listmodel to a treemodel. Therefore each record has a pointer “parent”, and each record should store its children.

One of the design decisions was how to store children: as a linked list, or an array of records, or an array of pointers to records. Some thoughts: The linked list is faster if you have to insert an entry somewhere in the middle, or remove one (with an array this requires realloc()). The array is faster if you need the nth element (with a linked list you need to traverse the list). The array also brings another advantage: you can use qsort for sorting (but the g_list_ functions have a good sort function for lists as well), and bsearch to check if an entry exists (with a linked list you again need to traverse all entries to see if an entry exists). An array of records needs more memory copying during sort, insert or delete. An array of pointers needs more memory (one extra pointer for each record).

In a file tree, most directories have the same files most of the time. Sometimes a new file is added, or a file is deleted, but most of the time the list is stable. So I chose the array of pointers. But I’m still doubting if I should have chosen an array of records.

The minimal record looks like this:


struct _UriRecord {
gchar *name;
UriRecord *parent;
UriRecord **rows;
guint num_rows;
};

The file browser pane in Bluefish shows icons, and can show a name in bold or normal weight. The compare function for sorting returns directories before files, so I need to know if a record is a directory or a file. Bluefish also has filtering possibilities based on filename or mime type. So these properties are all added to each record.

Because I chose an array of pointers, it is costly to find the next item in the array from a record. Therefore I added the position in the array as a property. The next one is simply pos+1. This has another advantage: after sorting, the TreeModel needs to inform all listeners that the order has changed. For that you need the old and the new order. Since the old order is stored in pos this is easily done in a loop over the array.
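
The sort and reorder step can be sketched roughly as follows. This is an illustration in plain C, not the code from file_treemodel.c: Record is a cut-down hypothetical version of the record, and sort_rows is a hypothetical helper. The new_order array is filled so that new_order[new_pos] holds the old position, which is the mapping gtk_tree_model_rows_reordered() expects.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

typedef struct {
    const char *name;
    unsigned short pos;     /* index of this record in its parent's array */
    unsigned char isdir;
} Record;

/* Directories sort before files, then by name. */
static int compare_records(const void *a, const void *b)
{
    const Record *ra = *(const Record *const *) a;
    const Record *rb = *(const Record *const *) b;

    if (ra->isdir != rb->isdir)
        return rb->isdir - ra->isdir;
    return strcmp(ra->name, rb->name);
}

/* Sorts rows in place. On return new_order[new_pos] holds the old
 * position of the record now at new_pos, and every pos is updated. */
static void sort_rows(Record **rows, size_t num_rows, int *new_order)
{
    size_t i;

    assert(rows != NULL && new_order != NULL);
    qsort(rows, num_rows, sizeof(Record *), compare_records);
    for (i = 0; i < num_rows; i++) {
        new_order[i] = rows[i]->pos;    /* still the pre-sort position */
        rows[i]->pos = (unsigned short) i;
    }
}
```

Only pointers are moved during the sort, which is the advantage of the array of pointers over an array of records.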

In the Bluefish code I often need to convert from a GFile to a position in the TreeModel and vice versa. I therefore added a hash table to the treemodel with the GFile as key and the record as value. Since a GFile is refcounted, it takes only a pointer to add it to the record as well.

Finally, I need a way to refresh directories when I re-read them. I could simply delete all entries and add them again, but that would also delete the sub-directories. So I added a “possibly_deleted” property that is set to 1 before a directory re-read, and set to 0 if an entry still exists. After closing the directory every record that still has “possibly_deleted” set to 1 can be removed.
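
The three steps of this refresh can be sketched as below. It is a simplified illustration: Entry, mark_all, seen and sweep are hypothetical names, the entries live in a flat array of structs instead of the array of pointers used in Bluefish, and freeing of removed records is left out.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *name;
    unsigned char possibly_deleted;
} Entry;

/* Step 1: before re-reading the directory, suspect every entry. */
static void mark_all(Entry *entries, size_t n)
{
    size_t i;
    for (i = 0; i < n; i++)
        entries[i].possibly_deleted = 1;
}

/* Step 2: for each name the directory listing returns, clear the flag.
 * Returns 1 if the name was already known, 0 if it is a new file. */
static int seen(Entry *entries, size_t n, const char *name)
{
    size_t i;
    for (i = 0; i < n; i++) {
        if (strcmp(entries[i].name, name) == 0) {
            entries[i].possibly_deleted = 0;
            return 1;
        }
    }
    return 0;
}

/* Step 3: after the listing is complete, drop entries still flagged.
 * Compacts the array in place and returns the new count. */
static size_t sweep(Entry *entries, size_t n)
{
    size_t i, kept = 0;

    assert(entries != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (!entries[i].possibly_deleted)
            entries[kept++] = entries[i];
    }
    return kept;
}
```

Entries that survive keep their own children untouched, which is the point of not deleting and re-adding everything.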

To reduce memory usage, I changed several properties to 16bit or 8bit values. This requires an extra shift when accessing the properties, but with many records it reduces memory usage. These are the properties that are only needed when the records change (and directories are usually pretty stable).

The resulting record looks like this:


struct _UriRecord {
gchar *name;
gchar *icon_name;
gchar *fast_content_type;
GFile *uri;
UriRecord *parent;
UriRecord **rows;
guint16 num_rows;
guint16 pos;
guint16 weight;
guint8 isdir;
guint8 possibly_deleted;
};

On 64 bit systems this results in: 6*8bytes + 3*2bytes + 2*1byte = 56 bytes per record.

On 32 bit systems this results in: 6*4bytes + 3*2bytes + 2*1byte = 32 bytes per record.

More about this code in a few days. Any comments how this design could be improved? Please post a comment.

The recent menu

In Bluefish,Gnome,gtk+,open source,Programming on July 4, 2013 by oli4444

The Bluefish mailinglist currently has a discussion on the working of the recent files menu. The current recent files menu in Bluefish shows the N most recent items that are not currently opened.

So if you have 15 items in the recent list, but 10 of them are currently open, Bluefish will only show 5.
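
As a small sketch of this behavior in plain C (build_recent_menu and is_open are hypothetical names, and the lists are plain string arrays instead of whatever Bluefish uses internally):

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

static int is_open(const char *file, const char *const *open_files, size_t num_open)
{
    size_t i;
    for (i = 0; i < num_open; i++)
        if (strcmp(file, open_files[i]) == 0)
            return 1;
    return 0;
}

/* recent[] is ordered most recent first. Copies up to max_items entries
 * that are not currently open into menu[] and returns how many were copied. */
static size_t build_recent_menu(const char *const *recent, size_t num_recent,
                                const char *const *open_files, size_t num_open,
                                const char **menu, size_t max_items)
{
    size_t i, count = 0;

    assert(recent != NULL && menu != NULL);
    for (i = 0; i < num_recent && count < max_items; i++) {
        if (!is_open(recent[i], open_files, num_open))
            menu[count++] = recent[i];
    }
    return count;
}
```

The alternative design used by other programs would simply drop the is_open() check.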

Many other programs have a different approach: show all recent files, regardless of whether they are open or not. In the example above, depending on N, the list would either show 15 files, of which only the last 5 would actually be useful (they open a file that is not open yet), or the list would show 5 files, all of which would be open already.

In a text editor like Bluefish you usually have many files open (I consider 10 files open very normal usage). So showing only files that are already open, or showing a very long list where only the last items are useful doesn’t look like a good user interface design to me.

But what to do now? Having a different behavior makes the learning curve for new users higher. What do you think is the best design for a recent files menu?

How to get good user-testing feedback

In Bluefish,open source,Programming on January 12, 2013 by oli4444

One of the powerful aspects of open source software development is the fantastic end-user engagement. Many users use the bleeding edge code straight from the version control system for production work. Because of this, bugs in new code are quickly detected and solved long before a release. In the case of Bluefish I guess that more than 60% of the bugs are detected before release.

However, because these users do no formal testing but do production work, there is no guarantee that a new feature is actually used, and thus no guarantee that a new feature is actually tested. So you cannot conclude that the code is bug free if there are no bugs reported.

I would like to improve on this by collecting information about what the users have actually used.

To collect this information for example a web application could be used where users can select which features of an application have been used. If a feature has been used, the user should be able to select some additional detail. In the case of Bluefish a user could for example select that the “toggle comment” feature has been used, after which the user can give more detail: added a line comment, removed a line comment, added a block comment, removed a block comment, with a selection, without a selection, etc.

I was searching for a web application to support this, but I haven’t found one. I expected something like this to exist already. Does anybody have a pointer how to collect this kind of end user testing information?

Bluefish on the Raspberry Pi

In Bluefish,gtk+,Linux desktop,open source,Programming on January 1, 2013 by oli4444

After ordering in September, I finally received my Raspberry Pi a few weeks ago (the upside of the long time between order and delivery is that mine is the new revision with 512 MB RAM).

I have no specific plans with the device other than playing a bit around with it. One of the things I obviously had to try was to run Bluefish as editor on the Pi. Installing all the build dependencies and compiling takes a few hours, but Bluefish was running as expected. Entirely true? No, some bits were slow, most notably the auto completion popup. So I dug into the code to find out why.

In the auto-completion popup, Bluefish has a “reference pane”. This shows some reference information about the item you are trying to auto-complete. For an HTML tag this might show the valid attributes, for a C function it might show the arguments and the return codes, etc. This is implemented with a hash table and the “changed” signal on the GtkTreeSelection: if the selection changes, a lookup in the hash table is performed to see if there is reference information available. On the next key-press, Bluefish recalculates the possible auto-completion candidates, and refills the GtkListStore that lies underneath the GtkTreeView. And this is where the problem was: before filling the list of items, Bluefish has to clear the old items. The selection “changed” signal is emitted for each item that is removed from the GtkListStore, and the handler in turn does a hash table lookup and renders the reference information in the reference pane. Do that for 15000 items and you’ll have 100% cpu load for a second on the Raspberry Pi.

So what is improved now: first, the number of items in the auto-completion popup is limited to 500 items. Second a boolean is added that is set to TRUE whenever the popup is clearing or filling items. As long as that boolean is TRUE, the selection changed signal will do nothing at all.
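
The guard can be sketched like this. It is a plain C simulation and not the Bluefish code: Popup, selection_changed, clear_items and refill are hypothetical stand-ins for the popup state, the “changed” handler, clearing the GtkListStore and the refill step.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

typedef struct {
    bool in_fill;           /* true while the list is being cleared or refilled */
    size_t num_items;
    int reference_updates;  /* counts the expensive reference pane renders */
} Popup;

/* Stand-in for the "changed" handler on the selection. */
static void selection_changed(Popup *p)
{
    if (p->in_fill)
        return;             /* ignore the storm of signals during clear/fill */
    p->reference_updates++; /* hash lookup and rendering would happen here */
}

/* Stand-in for clearing the store: the handler fires as each row goes. */
static void clear_items(Popup *p)
{
    while (p->num_items > 0) {
        p->num_items--;
        selection_changed(p);
    }
}

/* Clears and refills the list with the guard set, capping the item count. */
static void refill(Popup *p, size_t new_count, size_t max_items)
{
    assert(p != NULL);
    p->in_fill = true;
    clear_items(p);
    p->num_items = new_count < max_items ? new_count : max_items;
    p->in_fill = false;
}
```

GLib also offers g_signal_handler_block() and g_signal_handler_unblock(), which achieve a similar effect without a flag in the handler.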

The result: even on the Raspberry Pi, Bluefish auto-completion is again much faster than you can type, and every bit of sluggishness is gone. We’re close to the 2.2.4 release, and this fix will be part of 2.2.4!

Generating language definition files from xml schemas

In Bluefish,open source,Programming on August 27, 2012 by oli4444

The Bluefish syntax scanner / auto-completion engine can deal pretty well with xml style languages. You can create a language definition file such that only the correct attributes for a tag are highlighted and auto-completed, and only the correct hierarchy of tags is highlighted / auto-completed. Not everything is supported however: entities are for example not expanded, and the syntax scanner is not tag-order-aware (and xml is tag-order-aware).

A while ago Daniel Leidert came up with the idea to generate language definition files on demand from DTDs, relax-ng or xml schema files. I started prototyping some ideas, and I have some basic scripts in python now that can handle simple DTDs and slightly more advanced relax-ng files. However, the scripts are not good enough to parse the XHTML DTD and XHTML Relax-NG definition (yet).

For the impatient, the current scripts can be found here: http://bluefish.svn.sourceforge.net/viewvc/bluefish/trunk/testcode/conversion_scripts/

Improvements for visually impaired people

In Bluefish,Gnome,gtk+,open source,Programming on April 29, 2012 by oli4444

Last week I received an email asking whether Bluefish could be improved for people with a visual impairment. It never occurred to me that there would be people with limited vision wanting to use Bluefish. The most requested features in the email were:

  1. Zoom in/out with ctrl+ / ctrl-
  2. Maximum screen real estate
  3. Better cursor visibility

The first feature was easy. Bluefish already has zoom with ctrl-mousewheel, so I added the accelerators (it turned out that the requester was not aware of this feature).

For the second feature I created an option that automatically hides all menu bars, status bars and toolbars on fullscreen (F11). It displays them again if you hit F11 again. This way basically every bit of the screen is used by the editor itself. The only issue I found is when LXDE is used. LXDE has bound F11 to the window-manager fullscreen, so the application fullscreen never gets called. I moved my code to the configure event handler, where I can detect both the internal fullscreen as well as a window manager fullscreen.

The third feature was the hardest bit. With some help from IRC I managed to make the cursor-aspect-ratio user defined.

In gtk2 it looks like this:

style "bluefish-cursor" {GtkWidget::cursor-aspect-ratio = %f }
class "GtkTextView" style "bluefish-cursor"

which is loaded with gtk_rc_parse_string()

In gtk3 it is slightly nicer:

GtkTextView {-GtkWidget-cursor-aspect-ratio: %f;}

which is loaded with gtk_css_provider_load_from_data() and gtk_style_context_add_provider()

Next to a bigger cursor I made a setting to highlight the cursor position: it paints a differently coloured background on the character left and right of the cursor. I connected that to the mark-set, insert-text and delete-range signals, the last two with g_signal_connect_after() to get the new location of the cursor and not the old location.

This code does have quite a performance impact: scrolling with the arrow keys is significantly slower with this option enabled. I used this code:

     gtk_text_buffer_get_bounds(btv->buffer, &it1, &it2);
     gtk_text_buffer_remove_tag(btv->buffer, btv->cursortag, &it1, &it2);
     it1 = *location;
     it2 = it1;
     gtk_text_iter_backward_char(&it1);
     gtk_text_iter_forward_char(&it2);
     gtk_text_buffer_apply_tag(btv->buffer, btv->cursortag, &it1, &it2);

What this code causes is an update of the internal structure of the GtkTextBuffer (probably something like a balanced tree) that keeps track of where each tag starts and stops, for every cursor move. After rethinking this I remembered this is much easier done in the expose event!

Get the coordinates with gtk_text_view_get_iter_location(), convert them with gtk_text_view_buffer_to_window_coords() and paint with cairo_rectangle() and cairo_fill():

   gtk_text_buffer_get_iter_at_mark(buffer, &it, gtk_text_buffer_get_insert(buffer));
   gtk_text_view_get_iter_location(view,&it,&itrect);
   gtk_text_view_buffer_to_window_coords(view, GTK_TEXT_WINDOW_TEXT
            , itrect.x, itrect.y, &x2, &y2);
   cairo_rectangle(cr, (gfloat)x2-width, (gfloat)y2, (gfloat)(width*2 )
            , (gfloat)itrect.height);
   cairo_fill(cr);

The result is visible below. So now it is test time!

Debugging a reference count bug

In Bluefish,gtk+,open source,Programming,Ubuntu on February 5, 2012 by oli4444

The last few days I have been debugging some weird reports. They all show the same characteristics:

  • the users are on Ubuntu 11.10
  • they use bluefish compiled against gtk 3.2 (so not the bluefish package that is provided by Ubuntu, but a newer one)
  • during the Bluefish run, the sort function of a GtkTreeModelSort is called after the GtkTreeModelSort should have been finalized and freed.

First I used gobject-list.c from http://people.gnome.org/~mortenw/gobject-list.c to see all refs and unrefs on all GtkTreeModelSort objects in Bluefish (luckily there is only 1 used in Bluefish). This showed that there was indeed a GtkTreeModelSort with lots of references left after it should have been finalized. I tried the same thing on Fedora 16 (also gtk-3.2), but it can only be reproduced on Ubuntu 11.10. I tried to get backtraces with gobject-list (which uses libunwind for that) but those backtraces turned out to be useless.

Luckily I received some help on IRC #gtk+ from Company and alex. The first idea was to use systemtap, but since there is no useful kernel for systemtap available for Ubuntu I had to use something more low tech suggested by Company: I set a breakpoint on gtk_tree_model_sort_new to retrieve the pointer of the GtkTreeModelSort. Once I got that pointer I could set a breakpoint on g_object_ref and g_object_unref with a condition on this pointer. Then I created an automatic backtrace on each breakpoint:

break g_object_ref if object == 0x123123123
commands
bt
c
end

I configured gdb to log everything to a file, and did a Bluefish run. This resulted in a 2.1 MB logfile with backtraces. This log also showed there were more refs than unrefs.

In this logfile there were a lot of similar backtraces, with an identical function doing a ref and an unref. I wrote a short python script to parse the backtraces and skip all ‘valid pairs’.
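
The python script itself is not shown here. As a rough sketch of the pairing idea only, written in C and assuming the log has already been reduced to one event per breakpoint hit (the calling function plus whether it was a ref or an unref), a caller whose refs and unrefs cancel out can be skipped. RefEvent and net_refs are hypothetical names, and parsing the gdb backtraces is left out entirely.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

typedef struct {
    const char *caller; /* function that called g_object_ref or g_object_unref */
    int delta;          /* +1 for a ref, -1 for an unref */
} RefEvent;

/* Returns the net reference count change caused by 'caller'.
 * Zero means all of its refs were paired with unrefs. */
static int net_refs(const RefEvent *events, size_t n, const char *caller)
{
    size_t i;
    int net = 0;

    assert(events != NULL || n == 0);
    for (i = 0; i < n; i++) {
        if (strcmp(events[i].caller, caller) == 0)
            net += events[i].delta;
    }
    return net;
}
```

Only callers with a non-zero result need a closer look.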

After this step I had only 15 backtraces left. And from these backtraces the leaking references were easily identified.

Because I was unsure if this is a Ubuntu specific bug or a generic gtk bug the resulting bugreport can be found both at https://bugzilla.gnome.org/show_bug.cgi?id=669376 and at https://bugs.launchpad.net/bugs/926889

Now I am wondering if this approach would work for any reference count leaking problem. I guess the most difficult issue is to find the value of the pointer that is leaking if you have many objects of the same type. Any suggestions how to do this?
