Archive for the ‘Linux desktop’ Category


New desktop GUIs and terminal servers

In enterprise, Gnome, Linux desktop, open source on May 2, 2011 by oli4444

Gnome Shell, and also Unity, make extensive use of the capabilities of modern video hardware, which is a good thing. The downside is that they no longer function without access to that hardware. In an organisation that uses thin-clients and terminal servers over a wide area network this becomes a problem. Protocols such as NX (NoMachine) and VNC (used by many products, such as ThinLinc) that can handle the high latencies of wide area networks do not provide access to these functions of the video hardware.

This means that the thin-clients are limited to the “fallback” Gnome desktop. But how long will that be maintained? When will the first open source product decide to drop support for the old-fashioned Gnome desktop? What if Empathy or NetworkManager no longer works with the fallback desktop? Does that make our thin-clients worthless?

What is your strategy regarding thin-clients?



Storage and backup for desktops and laptops in the enterprise

In enterprise, Linux desktop, open source, security on February 17, 2011 by oli4444

For desktops it is generally considered a good idea to store all user-created data on a NAS for which backup and restore are implemented.

For Linux desktops NFS is commonly used. However, NFSv3 is usually not acceptable because its access control relies on client IP addresses, and in large organisations there is too little control over IP addresses. So NFSv4 with Kerberos authentication is the answer. Large organisations also tend to have large networks, so latency is another factor, and again NFSv4 (with the delegation feature) allows better client-side caching. There is also FS-Cache/CacheFS, which does a lot more caching on clients, but it does not improve performance in all situations (if bandwidth is not an issue, don’t use it).

But now laptops. What you would like for laptops is a situation where users work locally with their data, and whenever they have a network connection the data is synchronised to the enterprise NAS. That way they can disconnect from the network at any time and continue working. There is OFS (offline file system), which works on SMB network file systems, but it does not seem to be completely mature yet.

A second problem with laptops is authentication. A user may want to log on locally without a network, then connect the laptop to the network and expect it to start synchronising data. But that won’t work unless we first get a Kerberos ticket. I wonder what Windows laptops do in this situation: would they cache the password and re-use it in the background to obtain a Kerberos ticket? Related to this: you need a feature sometimes called “cached credentials” to allow you to log on locally if your Kerberos/LDAP server is not available. There are some projects trying to address this, but it is still not well integrated.


Printing in the enterprise

In enterprise, Linux desktop, open source on January 27, 2011 by oli4444

In an enterprise organisation there might be 10000 to 100000 users, and 1000 to 5000 printers. There are a couple of tricky things in such a situation.

First is print queue scalability. Converting the print job to the right format for the right printer is quite CPU intensive. If you let the desktop handle this it scales nicely with all your desktops, but if you want the server to handle this it becomes a scalability nightmare.

Users need to select their printer. Showing a list of 5000 printers doesn’t help the user, who wants to search by location, by name, by department, etc. Worse: showing 5000 printers and trying to show their status (as system-config-printer does) will eat 100% CPU.

Finally you need to configure the printers. How do you deal with 5000 printers on 50000 desktops?

If anybody has a good primer on how to do such things with, for example, CUPS, please leave a comment below!


Theoretical performance and real scalability

In Bluefish, Gnome, Linux desktop, open source, Programming on December 22, 2010 by oli4444

A theoretically scalable design can still be implemented with limitations, as I found out today while fixing a bug.

The file in question is a 12 MB XML file, with about 200000 XML blocks. This file exposed two problems in the Bluefish editor widget implementation.

1) 16 bit limit overflow

The Bluefish editor widget used a 16 bit integer (a glib guint16 type) to keep the reference count of found blocks and found context changes. As you can imagine, the reference count overflowed on this XML file with 200000 blocks.

The solution: use 32 bit integers.
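
To make the wraparound concrete, here is a minimal sketch in Python (not Bluefish code). Masking the counter mimics the unsigned overflow of a C type such as guint16; the function name `count_blocks` is made up for this illustration.

```python
def count_blocks(n_blocks, bits):
    """Count n_blocks increments in an unsigned counter that is `bits` wide."""
    mask = (1 << bits) - 1
    counter = 0
    for _ in range(n_blocks):
        counter = (counter + 1) & mask  # unsigned wraparound, as in C
    return counter


print(count_blocks(200_000, 16))  # 3392: the counter wrapped around three times
print(count_blocks(200_000, 32))  # 200000
```

With 16 bits the counter silently ends up at 200000 mod 65536, which is why the symptom is a wrong count rather than a crash.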

2) clearing GtkTextTags

Every scanning run (100 ms) Bluefish starts by clearing any leftover GtkTextTags from old syntax highlighting. However, the GtkTextView widget needs more than 100 ms to clear the formatting for 12 MB of data. Because the total loop may take only 100 ms, the syntax scanning didn’t even start, and thus never finished.

The solution: clear old syntax highlighting only once, and use a boolean to track when it has to be cleared again. The first run only clears old syntax highlighting, and the next run immediately starts scanning new syntax.
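
The idea can be sketched with a toy model in Python. This is an illustration under simplifying assumptions, not the actual Bluefish code: the class `Scanner`, its `budget` parameter (a stand-in for the 100 ms timeslot) and the method names are all hypothetical.

```python
class Scanner:
    """Toy model of the scan loop; `budget` stands in for one 100 ms timeslot."""

    def __init__(self, text_len, budget):
        self.text_len = text_len
        self.budget = budget        # characters we can scan per timeslot
        self.needs_clear = True     # the boolean that tracks pending clearing
        self.scanned = 0

    def invalidate(self):
        """Old highlighting became stale; clear it again on the next run."""
        self.needs_clear = True
        self.scanned = 0

    def run(self):
        """One timeslot. Returns True while more work remains."""
        if self.needs_clear:
            # Clearing may eat the whole timeslot, but it happens only once.
            self.needs_clear = False
            return True
        self.scanned = min(self.text_len, self.scanned + self.budget)
        return self.scanned < self.text_len


scanner = Scanner(text_len=1000, budget=400)
runs = 1
while scanner.run():
    runs += 1
print(runs)  # 4: one clearing run, then three scanning runs
```

Without the flag, every run would spend its whole timeslot on clearing and `scanned` would never advance.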


Scale of the enterprise desktop – users, accounts, groups and permissions

In enterprise, Gnome, Linux desktop, open source on November 22, 2010 by oli4444

Large corporations have many employees in many departments, and many of them will have an account. 16 bits for the UID number is not big enough for some enterprises (luckily the kernel handles 32 bit numbers fine – but does your app?). All those employees in different projects and different departments mean there are lots of different authorizations, meaning lots of groups, again possibly beyond the 16 bit limit. And as you may guess, the traditional scheme with owner/group/others might not be enough – ACLs are needed.

What does that mean for a GUI? For example a GUI to set file permissions:

  • Ever thought of a dropdown with groups or users? Does that work with 50000 groups or 70000 users?
  • Does it have a search field to select the right user/group?
  • Does it display ACLs in the GUI?
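
As an illustration of the search field idea, here is a minimal sketch in Python of a prefix search over a sorted list of names, so the GUI only ever shows a handful of candidates. The function `prefix_matches` and the generated group names are assumptions made up for this example.

```python
import bisect


def prefix_matches(sorted_names, prefix, limit=20):
    """Return up to `limit` names starting with `prefix` from a sorted list."""
    start = bisect.bisect_left(sorted_names, prefix)  # O(log n) jump to the prefix
    matches = []
    for name in sorted_names[start:start + limit]:
        if not name.startswith(prefix):
            break
        matches.append(name)
    return matches


# 50000 made-up group names: grp00000 ... grp49999
groups = sorted(f"grp{n:05d}" for n in range(50_000))
print(prefix_matches(groups, "grp4999", limit=5))
# ['grp49990', 'grp49991', 'grp49992', 'grp49993', 'grp49994']
```

The cost per keystroke depends on the number of names shown, not on the 50000 names in the list.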

You can imagine that lots of users also means lots of users who forget their passwords. One solution to that is Kerberos. Log on with your password, receive a Kerberos ticket, and log on to every service using that ticket, never using your password again. Or better: log on using your PKI smartcard (with pkinit), receive a Kerberos ticket, and never use a password at all! But this implies that all clients and all services support Kerberos. The basics work well on Linux. Kerberos init on logon works and Firefox understands it (so most internal web servers will work). But what about instant messaging (Empathy?), VoIP and email clients? Let’s make it worse: log on with dual-factor authentication, a PKI smartcard with a PIN code. Again the basics work: pkinit works perfectly on Linux, so you get a Kerberos ticket using your PKI smartcard. Even programs like gnome-screensaver can ask for your PIN code instead of a password. But GDM doesn’t understand it completely: you’re asked for a username when you insert your smartcard (even though it is already passed with your certificate!). And your default Gnome keyring won’t unlock anymore without a password (it would be great if we could unlock it with the PKI certificate as well!).

  • does your app work with Kerberos?
  • will it work with dual factor authentication?

To manage this number of users, the accounts and groups will be in a directory server, probably LDAP. In large enterprises the accounts are mostly kept in one level of the directory server. Smaller organizations sometimes try to organize accounts by department, but in large organizations so many people move between departments or work in multiple departments that they usually keep the departments as attributes of the account, and keep all accounts in one level. So what can go wrong? Imagine an LDAP browser that lists all accounts per level: listing 50,000 of them won’t fit on your screen, and will probably take ages to load. You would think that most LDAP browsers are designed for these situations, but almost all of them suffer from this problem.

  • can your app handle 50,000 results on an LDAP query?
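
One way an application could cope is to fetch results page by page and stop as soon as the screen is full. The sketch below, in Python, only shows that pattern: `fetch_page` is a hypothetical callable standing in for a paged directory search, and `fake_fetch` with its in-memory list simulates a server; none of this is a real LDAP library API.

```python
def iter_paged(fetch_page, page_size=500):
    """Yield entries page by page instead of loading every result at once.

    `fetch_page(cookie, page_size)` is a hypothetical callable returning
    (entries, next_cookie); next_cookie is None after the last page.
    """
    cookie = None
    while True:
        entries, cookie = fetch_page(cookie, page_size)
        yield from entries
        if cookie is None:
            break


# Fake directory standing in for a server with 50,000 accounts in one level.
ACCOUNTS = [f"uid=user{n}" for n in range(50_000)]


def fake_fetch(cookie, page_size):
    start = cookie or 0
    end = start + page_size
    next_cookie = end if end < len(ACCOUNTS) else None
    return ACCOUNTS[start:end], next_cookie


first_screen = []
for entry in iter_paged(fake_fetch):
    first_screen.append(entry)
    if len(first_screen) == 25:  # enough to fill the visible list
        break
```

Here only the first page of 500 entries is ever requested; the remaining pages are fetched only if the user scrolls or searches further.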

The good thing about the LDAP server is that all users have the same account on all systems, with the same permissions, same address, etc. So once you know the email address, you know their jabber and voip account as well. But oh: my email client knows how to look up names in a directory, but my jabber client doesn’t. And I cannot start a VOIP call from my email client – even if I know that the address is the same, I have to copy & paste it into another program.

  • does your app support LDAP directory lookups?

So there is some room for improvement here. And don’t get me wrong – I really like it that most things already work out of the box and how easy this is. It’s just the small things that could be improved.


The enterprise open source desktop – introduction

In enterprise, Linux desktop, open source on November 1, 2010 by oli4444

Many open source developers have never worked with large enterprise / corporate desktops. There is nothing wrong with that (you probably don’t like those desktops anyway), but there are some important differences between a typical home or small business desktop and an enterprise / corporate desktop. It might be good to know these differences and use this knowledge when designing new software, to make it suitable for both the home and small business user and the large corporation. I’m not saying that open source is not good for them – I know there are workarounds for every item I’m going to mention – I’m only saying some things could be better!

Large enterprise corporations spend millions of dollars on desktops and have large IT operations, so they are a prime candidate for cost reduction using open source software. Enterprise corporations often see their desktop just as costs (and no benefits). The benefits are in the applications, such as SAP, IBM Filenet, and the in-house developed application that has been there for 10 years already. If the open source desktop can present those most important applications, it reduces costs without affecting the benefits, and then we have a business case.

Anyway, I’ll try to do a series of postings about the most important characteristics of the enterprise desktop. Some points I will be talking about:

  • scale – many users, many groups, many systems, many administrators, many departments and sub-departments
  • security, authentication, authorization, lockdown
  • consistency
  • networking, latency, bandwidth
  • support (costs), self-service


Bluefish editor widget design

In Bluefish, Gnome, Linux desktop, open source, Programming on August 14, 2010 by oli4444

Many people ask why the Bluefish developers chose not to use the GtkSourceView widget. The reason: we hoped we could build an editor widget better suited for development work. Did we succeed? Try it for yourself.

Before I start with the design, let me briefly describe the old editor widget design:

  • used libpcre regular expression compiler/matcher
  • every type of match (e.g. keyword, function name, etc.) needed its own regular expression pattern
  • because libpcre regular expression patterns are limited in size, some of these were split up in tens of regular expression patterns (for example all php function names)
  • every piece of text thus had to be scanned for matches up to 50 times (each pattern needs a separate matching run)
  • large file scanning could block the widget for a couple of seconds
  • patterns could be bound to scan only within the match of other patterns – having some kind of context sensitive matching (e.g. javascript within html), but not very flexible
  • there was no way we could re-use the scanning for autocompletion

So now the new design. Let’s start with the design requirements:

  1. syntax highlighting should be pretty fast; the user continuously changes the syntax while typing, and the highlighting should keep up with that
  2. syntax should be defined in a language file; new languages can be added and language files can be updated without a change in the scanning engine
  3. the language file should not contain any highlighting colors, it should map to textstyles that the user can define, such that all languages have a similar look
  4. the syntax scanning should support all kinds of languages, markup such as html and xml, and programming languages like javascript and php, and it should be capable of handling thousands of patterns.
  5. the syntax scanning should be context-aware (in a comment? in a php block? in a CSS block?) and block-aware (<p> opened, <b> opened, </b> closed etc.)
  6. the widget should allow context-aware autocompletion
  7. scanning large blocks of text should not block/freeze the gui

We have one additional constraint: because we wanted to use GtkTextView as the base class, the actual highlighting cannot be done in a separate thread or in the background (we have to set GtkTextTags from the main thread).

This resulted in the following high-level design:

  • we use a DFA engine to scan syntax because it is very fast: O(n) on the input data and independent of the number of patterns (O(1) on the number of patterns) [1]
  • because we want to scan context-sensitively we compile a DFA table for each context
  • the complete DFA is in a single contiguous memory block to maximize CPU cache hits and minimize memory paging effects
  • for each context we also compile a GCompletion with all possible autocompletion strings in that context
  • all language file parsing and compiling is done in a separate thread so we exploit the possibilities of multi-core computers
  • we keep a stack of contexts and a stack of blocks during the scanning run
  • we scan for syntax in short timeslots that block the UI, but after the short timeslot we return control back to the gtk main loop.
  • we mark all text that needs scanning with a GtkTextTag, and thus we can quickly find where we should start scanning (the first bit that is marked as needscanning)
  • on text change we simply mark the changed area with the needscanning GtkTextTag and resume scanning
  • we should thus be able to resume scanning at any given position
  • that means that we should be able to reconstruct the block-stack and the context-stack at any given position
  • a very fast way to look up the context-stack and block-stack at a given position is to keep them in a balanced tree, which scales O(log n) on the number of stored positions. But we are in a worst-case situation for normal binary trees: we insert data in sorted order. Glib has a nice Treap implementation that we use, which behaves much better when data is inserted in sorted order [2]
  • for autocompletion we look-up the position in the balanced tree, peek at the context stack to get the current context, and use the corresponding GCompletion to find the possible strings
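
To illustrate why a table-driven scanner does not slow down as patterns are added, here is a much simplified sketch in Python. It is not the Bluefish engine: it handles literal keywords in a single context only, and the names `compile_dfa` and `scan` are made up for this example.

```python
def compile_dfa(patterns):
    """Compile literal keywords into one transition table (a trie used as a DFA).

    table[state] maps a character to the next state; accept[state] holds the
    style name of a keyword that ends in that state.
    """
    table = [{}]
    accept = {}
    for word, style in patterns.items():
        state = 0
        for ch in word:
            nxt = table[state].get(ch)
            if nxt is None:
                table.append({})
                nxt = len(table) - 1
                table[state][ch] = nxt
            state = nxt
        accept[state] = style
    return table, accept


def scan(text, table, accept):
    """Return (start, end, style) for the longest keyword match at each position."""
    found = []
    pos = 0
    while pos < len(text):
        state, i, last = 0, pos, None
        while i < len(text) and text[i] in table[state]:
            state = table[state][text[i]]  # one table lookup per character
            i += 1
            if state in accept:
                last = (pos, i, accept[state])
        if last:
            found.append(last)
            pos = last[1]
        else:
            pos += 1
    return found


table, accept = compile_dfa({"echo": "function", "else": "keyword", "elseif": "keyword"})
print(scan("elseif echo else", table, accept))
# [(0, 6, 'keyword'), (7, 11, 'function'), (12, 16, 'keyword')]
```

Adding more keywords makes the table bigger, but each input character still costs a single lookup. A real engine would also need character classes, a table per context, and the context and block stacks described in the list above.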

Because of this design we get an additional bonus: we can look up found syntax by position very cheaply (because of the balanced tree) and we can link additional information to that structure, so showing a tooltip when you move the mouse over a function is possible as well!
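
The position lookup can be sketched as follows in Python. Note the swap: this uses a sorted list with binary search as a stand-in for the balanced tree, which gives the same O(log n) lookup but is not the Treap that Bluefish uses. The class `ScanCache` and its methods are hypothetical names.

```python
import bisect


class ScanCache:
    """Stores (context stack, block stack) snapshots keyed by text offset."""

    def __init__(self):
        self.offsets = []  # kept sorted
        self.states = []   # states[i] belongs to offsets[i]

    def store(self, offset, context_stack, block_stack):
        i = bisect.bisect_left(self.offsets, offset)
        snapshot = (tuple(context_stack), tuple(block_stack))
        if i < len(self.offsets) and self.offsets[i] == offset:
            self.states[i] = snapshot  # rescanned: replace the old snapshot
        else:
            self.offsets.insert(i, offset)
            self.states.insert(i, snapshot)

    def lookup(self, offset):
        """Return the snapshot at the last stored offset <= `offset`."""
        i = bisect.bisect_right(self.offsets, offset) - 1
        if i < 0:
            return ((), ())  # nothing stored before this offset
        return self.states[i]


cache = ScanCache()
cache.store(0, ["html"], [])
cache.store(40, ["html", "php"], ["<p>"])
cache.store(90, ["html"], ["<p>"])
print(cache.lookup(57)[0][-1])  # php
```

The same lookup serves resuming a scan, picking the autocompletion list for the current context, and finding what is under the mouse pointer for a tooltip.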

Some results on a 64bit 2.7GHz AMD CPU:

  • using the HTML language file (with 2575 patterns) on several fairly large HTML files, the syntax highlighting parses about 1 MB per second. More than 50% of the time is spent on setting GtkTextTags in the GtkTextView widget, so the GtkTextView widget is clearly our main bottleneck with the new scanning engine; the scanner itself runs at over 2 MB/s.
  • using the PHP language file (with over 7500 patterns) on some fairly large PHP files, the syntax highlighting parses about 500 KB of PHP per second. These files have many more colors, and setting GtkTextTags in these files takes 75% of the time. The scanner itself still runs at over 2 MB/s, confirming that our DFA engine performance is independent of the number of patterns

1: actually this is not completely true. It is only O(1) on the CPU, but more patterns mean more memory usage, which means that the cache size of your CPU comes into play, and virtual memory/paging comes into play.
2: because we do relatively few searches and lots of updates on the balanced tree it would be interesting to see if a red/black tree would perform better