November 29, 2005

Joyent and TextDrive

Filed under: General, News — Dimitris Giannitsaros @ 13:40

Now I don’t really have any real business experience of this kind, but it baffles me why Joyent bought TextDrive and not the other way around.

I understand the two companies share founders / owners / executives, so I assume that this was a strategic decision, not a financial one. But TextDrive has a well-known brand while no one knows Joyent (at least for now). It seems more logical to me for a hosting company to offer software products (Joyent’s products, Strongspace etc) than the other way around. Moreover Joyent’s products are still untried, so what happens if they don’t do well? Wouldn’t that hurt TextDrive’s name?

Anyway I guess these people know what they are doing, but I would like to read a better analysis on this.

November 27, 2005

Backups revisited

Filed under: General, Products and Services — Dimitris Giannitsaros @ 20:19

In addition to the tasks I described here, I am now testing Acronis TrueImage, to create complete images of my hard disk. So, in case of a hard disk failure I would be up and running in no time (at least in theory).

However TrueImage makes me feel a bit uneasy, as it lets me continue working while it creates the image! Anyway, I am quite sure TrueImage works correctly, as it gets rave reviews everywhere and everyone recommends it.

Also I’ve made up my mind that my next notebook will have two hard disks and a RAID controller (like this one), so I can have mirroring.

November 23, 2005

Moved features

Filed under: MagnaCRM — Dimitris Giannitsaros @ 21:45

I wrote a draft of this post some time ago titled “To do or not”. Now I’ve finally made up my mind. These features are moving to a later Magna CRM release:

  • Mail client and mailbox integration: This will probably be one of the first to implement after I release v1.0. The point is to archive mail you exchange with your clients / contacts. This is not very simple to do and requires a full-blown mail client (for writing / answering an email), and integration with a mailbox (to read mails that go to specific mailboxes e.g. sales@company.com). Integration with Outlook would be nice too ;-)
  • Custom fields / form designer: A CRM application is first and foremost a data repository. So, it will be needed at some point to let the user define custom fields, design his forms the way he wants them (placing important fields on top etc) and of course get reports based on these custom fields.
  • Labels / Tags / Keywords: Because with tags I can advertise my application as Web 2.0 compliant ;-) Well, the real reason is that tagging accounts, contacts, leads etc seems like a good way to organize them. Of course I would have to try it to see how it feels, especially if custom fields get implemented (as you can certainly organize things using more and more custom fields).
  • API: I really believe an API is important. It’s also hard to get right, as I found out while experimenting. Although an incomplete API will probably be included in v1.0, the full API will have to wait some more.

I really want to finish Magna soon, even if it means I have to cut some (many) features. And I still have a lot of work to do before v1.0 is reached.

November 20, 2005

DB performance and ADODB

Filed under: MagnaCRM, PHP — Dimitris Giannitsaros @ 23:42

One of the things to take into consideration when developing a multi-user application is database performance.

When developing PHP applications, I always use ADODB to provide a database abstraction layer. More specifically I use my own DB layer, built on top of ADODB, but I only do this to make it easy to replace ADODB if the need ever arises (e.g. if it stops being maintained).

ADODB has a great module (part of the standard ADODB package) called Performance Monitoring Library. This little gem provides some great functionality and I suggest you check it out.

An SQL logger and analyzer is provided, among other things. The SQL logger, when enabled, stores all executed queries in a DB table, along with some interesting data, like the time needed to execute each query. The SQL analyzer gives you 3 extremely useful reports:

  • Suspicious SQL: The queries with the highest execution time
  • Expensive SQL: The queries with the highest total execution times (number of executions * average execution time).
  • Invalid SQL: Queries that produced an error.
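
To make the three reports concrete, here is a minimal sketch that computes them from an in-memory log. It does not use ADODB's own code or table layout: the `$log` rows and the `summarize()` / `rankBy()` helpers are hypothetical, and "highest execution time" is interpreted as the highest average time per execution, which is an assumption.

```php
<?php
// Hypothetical log rows: one entry per executed query.
$log = [
    ['sql' => 'SELECT * FROM accounts',              'time' => 0.30, 'error' => false],
    ['sql' => 'SELECT * FROM contacts WHERE id = ?', 'time' => 0.10, 'error' => false],
    ['sql' => 'SELECT * FROM contacts WHERE id = ?', 'time' => 0.10, 'error' => false],
    ['sql' => 'SELECT * FROM contacts WHERE id = ?', 'time' => 0.10, 'error' => false],
    ['sql' => 'SELECT * FROM contacts WHERE id = ?', 'time' => 0.10, 'error' => false],
    ['sql' => 'SELEC * FROM leads',                  'time' => 0.00, 'error' => true],
];

// Group the log by SQL text and collect per-query statistics.
function summarize(array $log): array
{
    $stats = [];
    foreach ($log as $row) {
        $sql = $row['sql'];
        if (!isset($stats[$sql])) {
            $stats[$sql] = ['count' => 0, 'total' => 0.0, 'errors' => 0];
        }
        $stats[$sql]['count']++;
        $stats[$sql]['total'] += $row['time'];
        if ($row['error']) {
            $stats[$sql]['errors']++;
        }
    }
    foreach ($stats as $sql => $s) {
        $stats[$sql]['avg'] = $s['total'] / $s['count'];
    }
    return $stats;
}

// Return the SQL texts ordered by the given statistic, highest first.
function rankBy(array $stats, string $key): array
{
    uasort($stats, fn($a, $b) => $b[$key] <=> $a[$key]);
    return array_keys($stats);
}

$stats      = summarize($log);
$suspicious = rankBy($stats, 'avg');   // slow on every single execution
$expensive  = rankBy($stats, 'total'); // count * average == total time
$invalid    = array_keys(array_filter($stats, fn($s) => $s['errors'] > 0));
```

With this sample data the single slow query tops the "suspicious" ranking, while the cheap but frequently repeated query tops the "expensive" one. That second case is the one where reducing the number of executions pays off more than tuning the query itself.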

During development I leave the SQL logger constantly on and use the analyzer reports to either optimize my queries or find ways to reduce the number of executions. Warning: do not leave the SQL logger on by default, because in real installations it can reduce performance. Also, the analyzer will take a very long time to analyze more than ~500k queries. Anecdote: I once left the SQL logger enabled in a health-related project running in a hospital (~40 concurrent users, 24×7) for about a week. The analyzer never returned any results for the ~5 million records it had logged, so I had to discard the logged queries and leave the logger running for exactly 1 day instead.

Since I’ve made extended use of the Perf library with some really good results in the past, I have integrated it into Magna CRM. Through the use of a single flag (DEBUG_LEVEL) I can do various things like enabling the SQL logger or displaying the SQL analyzer screens to an administrator. I will leave this in, in case I ever need to find what’s wrong with a specific installation that has performance issues.
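
A single debug flag can drive several features if it is treated as a set of bits. The sketch below is only an illustration of that idea: the constant names, the bit values and the `debugFeatures()` helper are assumptions, not Magna CRM's actual code.

```php
<?php
// Hypothetical bit flags; the post only says a single DEBUG_LEVEL flag exists.
const DEBUG_LOG_SQL       = 1; // turn the SQL logger on
const DEBUG_SHOW_ANALYZER = 2; // show the analyzer screens to administrators

// Decide which debugging features are active for this request.
function debugFeatures(int $debugLevel, bool $isAdmin): array
{
    return [
        'log_sql'       => ($debugLevel & DEBUG_LOG_SQL) !== 0,
        'show_analyzer' => $isAdmin && ($debugLevel & DEBUG_SHOW_ANALYZER) !== 0,
    ];
}

$features = debugFeatures(DEBUG_LOG_SQL | DEBUG_SHOW_ANALYZER, true);
// With a real ADODB connection in $conn, this is where the logger would be
// switched on, e.g. if ($features['log_sql']) { $conn->LogSQL(); }
```

Keeping the default level at 0 means a fresh installation never logs queries unless someone turns the flag on deliberately.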

November 17, 2005

CaseDetective released

Filed under: Links, News — Dimitris Giannitsaros @ 13:16

Another mISV released their product! CaseDetective is a desktop reporting tool for FogBugz, Fog Creek’s bug tracker.

CaseDetective was developed by Ian and you can read more about its development process on his blog.

Congratulations!

November 14, 2005

Internationalization (i18n)

Filed under: PHP, Web development — Dimitris Giannitsaros @ 10:42

In order to offer localized versions of an application, there are two things to be taken care of:

  • Internationalization (i18n): this is more development centric and it’s all about designing your application to support translations, foreign character sets, timezones, different number / currency / date formats etc.
  • Localization (L10n): using the mechanism provided by i18n, this includes the actual translation and settings for a new language.

In this article I focus mainly on the translation part of the i18n process, although many other subjects are touched on.

There are two approaches for translating strings:

  • Using a function to wrap your strings in your code. I call this the “gettext” approach (also check PHP gettext support). Of course it’s not the only tool that offers this functionality, but it supports many languages and it’s open source.
  • Using constants or variables instead of strings in your code. One or more language files contain the actual strings and these files are included by your application.
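
The difference between the two approaches is easiest to see side by side. This is a minimal sketch, not real gettext: the `$catalog` array and the `t()` function are hypothetical stand-ins for a translation backend, and the `$lc_` variables would normally live in a separate language file.

```php
<?php
// Approach 1: wrap every string in a lookup function ("gettext" style).
// $catalog and t() are hypothetical stand-ins for a real translation backend.
$catalog = ['Save' => 'Αποθήκευση', 'Cancel' => 'Άκυρο'];

function t(string $text, array $catalog): string
{
    // Fall back to the original text when no translation exists.
    return $catalog[$text] ?? $text;
}

$button1 = t('Save', $catalog);
$button2 = t('Delete', $catalog); // untranslated, stays "Delete"

// Approach 2: the code only knows variable names; a language file
// (normally pulled in with include) assigns the actual strings.
$lc_save   = 'Αποθήκευση';
$lc_cancel = 'Άκυρο';

$button3 = $lc_save;
```

In the first approach the original string doubles as the lookup key, while in the second the key is a name chosen by the developer.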

Let’s see some general facts on these two approaches:

The “gettext” approach:

  • Is more complex.
  • Is suitable for large projects with thousands of strings.
  • Allows strings to be stored in a database or an efficient data structure.
  • May utilize a special translating program.
  • May offer a better management of deprecated and changed strings.
  • Offers better domain management (think of domains as logical areas of the application e.g. the administrator’s area, the user’s area etc)

The “language file” approach:

  • Is significantly simpler.
  • Is better for smaller projects.
  • Stores strings in text files, so editing is easy for anyone.
  • Doesn’t force any good habits upon you, so you have to be careful.

I’ve used both approaches in web and desktop applications (most notably Cheez, a free image cataloguing tool, currently translated into 17 languages). The “language file” approach is my favorite, so what follows is some advice on using this approach:


Language scope

For multiuser applications, you must consider whether all users will have the same language (so language is a system setting) or each user will choose his own language (so language is a user setting).

The same rules apply to both cases, but you’ll need to make some design decisions based on that.

Single file vs. many files

It’s better to use a single file instead of many files. If this is becoming a really big file, maybe you should check the “gettext” approach. Exception: if your application supports plug-ins (e.g. user created stuff), make sure you provide a mechanism where each plug-in has its own language file (preferably a single file per plug-in). You don’t want to litter your core language file with strings specific to plug-ins.

For an example of why multiple files can get out of hand check osCommerce, which uses about 80 files per language in many directories. Moreover each plug-in is allowed to have many language files, making things even worse. Note: Other than this inconvenience, osCommerce is a very good and popular shopping cart solution.

Images

Handling localized images can be a bit tricky. You have at least 3 options:

  • All images have the same name, but are placed in different directories (/english/, /greek/, /german/). Your code uses the right directory based on a string included in the language file.
  • Images have different names, using a standard prefix / suffix (icon_delete_ENGLISH.png, icon_delete_GREEK.png).
  • Images can have any name as long as it’s defined in the language file with the rest of the resource strings. Your code loads images using the appropriate string. Of course it’s a good idea to use a naming convention for images (e.g. a suffix).

Personally I prefer the 3rd option. This way your code doesn’t do anything different than for normal strings, plus translators know what images must be changed just by looking at the language file.
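
As an illustration of the 3rd option, the sketch below keeps the image file name next to the ordinary resource strings. The variable names, the `/images/` directory and the `imageTag()` helper are all hypothetical.

```php
<?php
// Excerpt of a hypothetical Greek language file: image names sit next to
// the ordinary resource strings and share a naming convention (suffix).
$lc_delete     = 'Διαγραφή';
$lc_img_delete = 'icon_delete_greek.png';

// Hypothetical helper: builds an <img> tag from two resource strings.
function imageTag(string $file, string $alt, string $dir = '/images/'): string
{
    return sprintf(
        '<img src="%s" alt="%s">',
        htmlspecialchars($dir . $file, ENT_QUOTES, 'UTF-8'),
        htmlspecialchars($alt, ENT_QUOTES, 'UTF-8')
    );
}

$html = imageTag($lc_img_delete, $lc_delete);
```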

Other i18n settings

Be careful what i18n settings go into the language file. This can be a problem especially for web applications, where users can be anywhere in the world.

Many projects I’ve seen put things like the date / number / currency formats and timezone in the language file. It’s much better to have these as user settings: just because a user prefers a specific language e.g. Greek or German, it doesn’t mean he’s currently based in Greece or Germany.

Of course, which settings should go in the language file and which are made available as a user setting depends a lot on what your application does, what kind of users it has etc.
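
A small sketch of what "user settings instead of language file" can look like for dates. The `$user` array, its keys and the `formatDate()` helper are assumptions made for the example, and `DateTimeImmutable` is a newer PHP class than anything available when this was written.

```php
<?php
// Hypothetical per-user settings: the language is independent of the formats.
$user = [
    'language'       => 'greek',
    'date_order'     => 'MMDDYYYY',
    'date_separator' => '/',
];

// Format a date with the user's own settings, not the language's.
function formatDate(DateTimeImmutable $when, array $user): string
{
    $parts = [
        'DDMMYYYY' => ['d', 'm', 'Y'],
        'MMDDYYYY' => ['m', 'd', 'Y'],
        'YYYYMMDD' => ['Y', 'm', 'd'],
    ];
    // The separator is glued in after formatting, so format() never
    // interprets it as a format character.
    $pieces = array_map(fn($p) => $when->format($p), $parts[$user['date_order']]);
    return implode($user['date_separator'], $pieces);
}

$when  = new DateTimeImmutable('2005-11-14 10:42:00');
$shown = formatDate($when, $user);
```

Here a user who reads the interface in Greek still gets month-first dates, because the two choices are stored separately.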


Charset

If your application supports Unicode you probably need only one charset (e.g. UTF-8), so you can skip this paragraph.

Charset is very important for two reasons: a) It allows users to correctly view localized characters. b) It allows users to correctly enter (and store) localized characters. The first thing that comes to mind is to put charset in the language file.

This is usually good enough, but has a small problem: Imagine your web application currently supports English. Inside your language file you’ve set a variable for the charset (e.g. $charset = "ISO-8859-1") which you use for defining the charset of the html files. Imagine a Greek installs your software and tries to use it. Although he knows English, he would also like to insert data in Greek. Since you have tied the charset (ISO-8859-1) to the interface language (English) he can’t! If he could change the charset to "ISO-8859-7" he would be able to enter and view Greek text (and of course the English strings of the UI would be displayed correctly).

I am not arguing that it’s always the right thing to offer charset as a user setting. Just remember that the language file’s purpose is to have the localized resource strings, without interfering with the way the application works.

Good resource strings

Some general guidelines for creating good resource strings:

  • Be careful to use complete sentences as resource strings while coding. Concatenation must be kept to a minimum and special language functions must be used to format strings (printf(), sprintf() for PHP).

    So instead of

    $location . " contains " . $count . " files";

    which needs 2 resource strings and the translator doesn’t see this as a complete sentence, use

    sprintf("%s contains %d files", $location, $count);

    which needs 1 resource string and actually makes sense to the translator.

  • If support for argument ordering is available, use it (PHP has it). So the above example would become:

    sprintf("%1\$s contains %2\$d files", $location, $count);

    which needs 1 resource string and the translator can change the argument order e.g.

    "%2\$d files are contained by %1\$s"

  • Try to keep related sentences in one resource string. “Command failed. Abort or retry?” should be one resource string, not two.
  • Sometimes it’s best to use different resource strings for the same word / phrase. This is hard to get right, because as a developer you don’t know which words have many different meanings in other languages. One solution is to use a different resource string for all strings. So if your application uses the word “Save” 28 times, then you have 28 different resource strings for “Save”. This can be extra work, both for you and the translator, but guarantees a better translation quality level can be achieved.

    The best solution is somewhere in the middle. This way simple words (“yes”, “no”) can be mapped to a single resource string, while more complex words (“execute”) have a separate resource string for each occurrence.
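
The argument-ordering guideline can be tried out with a short runnable sketch. The `$lc_` names and sample values are made up for the example; `sprintf()` and its `%1$s` position syntax are standard PHP.

```php
<?php
// Two hypothetical language-file entries for the same resource string.
// Single quotes keep PHP from treating $s and $d as variables.
$lc_contains_en = '%1$s contains %2$d files';
$lc_contains_xx = '%2$d files are contained by %1$s'; // reordered by a translator

$location = '/home/docs';
$count    = 12;

// The calling code is identical for every language.
$english   = sprintf($lc_contains_en, $location, $count);
$reordered = sprintf($lc_contains_xx, $location, $count);
```

Because the arguments are addressed by position, the translator can move them around freely without any change to the code that calls `sprintf()`.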


Resource strings naming convention

Obviously it’s a good idea to have a common prefix for resource strings (e.g. lc_). The rest of the name can be either an increasing number or a description:

$lc_res1
$lc_res2
$lc_res3
$lc_res4

or

$lc_yes
$lc_no
$lc_execute1
$lc_execute2

Although the 2nd group seems much clearer, after about 1000 strings it becomes difficult to think of good descriptive names and you end up with things like

$lc_warn_user_after_failed_sql_execution_offer_to_retry

Domains

If you want to logically separate the resource strings based on different areas / parts of your application you may be tempted to use multiple files. I believe it’s always better to keep to one file and just use some comments for domain separation. So you can have:

// Admin area

$lc_admin_res1 = "";
$lc_admin_res2 = "";

// User area

$lc_user_res1 = "";
$lc_user_res2 = "";

Versioning

This is the single most important piece of advice in this article.

After you release your first public version, you must never again change a resource string. Even if you find a typo or something bad you wrote about your boss or wife.

Both new and changed resource strings go at the end of the file. Moreover it’s good to keep a comment about each version:

// Version 1.0

$lc_yes = "Yed";
$lc_no = "No";

// Version 2.0

$lc_yes = "Yes"; // correction
$lc_new = "New";

// Version 3.0
// (v3.0 resource strings will go here)

Note that the $lc_yes resource string was corrected in the new version, while the old resource string stayed the same.

There are a number of reasons to uphold this policy:

  1. Translators have a much easier job with new versions. They just go to the end of the file to find new / changed strings. Changed strings are marked with “correction”, so they can find the old translation by searching. No need to use tools like diff, to try and find what has changed between versions.
  2. You can have a single file per language that works for all versions. If you translate the v2.0 file, you can send it to someone using v1.0 and he’ll have no problem.
  3. When you release a new version (e.g. v3.0) you may not want to wait for translators to translate the new strings. So you just copy-paste everything under “Version 3.0” from the original language file to all other language files and you’re good to go (so for foreign languages, old strings will remain translated while new strings will be untranslated).
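
The effect described in point 3 (old strings translated, new strings left in the original language) can also be produced when the strings are loaded. The sketch below shows that alternative. It is a different technique from the copy-paste described above, and it assumes the strings are kept in arrays rather than in plain variables, with made-up keys such as `lc_archive`.

```php
<?php
// Original (English) language data, grouped by the version that added it.
// Arrays are used here instead of plain variables so the example is easy
// to test; this layout is an assumption.
$english = [
    // Version 1.0
    'lc_yes' => 'Yes',
    'lc_no'  => 'No',
    // Version 2.0
    'lc_new' => 'New',
    // Version 3.0
    'lc_archive' => 'Archive',
];

// A Greek file that was last translated for v2.0.
$greek = [
    'lc_yes' => 'Ναι',
    'lc_no'  => 'Όχι',
    'lc_new' => 'Νέο',
];

// The + operator keeps every key of the left array and only adds keys
// that are missing, so v3.0 strings fall back to English.
$strings = $greek + $english;
```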

Used images are under a CC license. See here:
1st image, 2nd image, 3rd image, 4th image

I was writing an article on PHP and timezones (promised here), but then I changed my mind and decided to write and publish this one first. The one about timezones will be next.

November 10, 2005

State saving

Filed under: MagnaCRM — Dimitris Giannitsaros @ 21:16

Magna CRM has a lot of lists. Actually the main interface consists of tabs (Accounts, Contacts etc) that take you to corresponding lists. These lists can be navigated using filters, a letter index (All, A, B, C, …) and paging.

I implemented state saving for these lists. So, if you e.g. set some filters, go to page 4, click another tab (or insert a record or close the browser) and then return to the same list, you will be at the exact same point (same filters, same page). This was applied to all lists in the application.

While it’s a very nice feature in general, it has a small problem: you sometimes get surprised when a list comes up with preset filters and all, because you have forgotten when you did it. I can’t think of any clever way to avoid this (timing out the state information is not a good solution), not that it’s a big deal.
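
A minimal sketch of per-list state saving, under some assumptions: the helper names and the shape of the state array are hypothetical, and `$store` stands in for wherever the state really lives (the session, or a per-user database row if it must survive closing the browser).

```php
<?php
// Hypothetical helpers; $store stands in for $_SESSION (or a per-user DB
// row, which would also survive closing the browser).
function saveListState(array &$store, string $list, array $state): void
{
    $store['list_state'][$list] = $state;
}

function loadListState(array $store, string $list): array
{
    $defaults = ['filters' => [], 'letter' => 'All', 'page' => 1];
    // Saved values win; anything missing falls back to the defaults.
    return ($store['list_state'][$list] ?? []) + $defaults;
}

$store = [];
saveListState($store, 'contacts', ['filters' => ['city' => 'Athens'], 'page' => 4]);

$contacts = loadListState($store, 'contacts'); // same filters, page 4
$accounts = loadListState($store, 'accounts'); // never visited: defaults
```

Keying the state by list name is what lets every list in the application share the same two functions.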

November 7, 2005

Better error feedback

Filed under: MagnaCRM, Web development — Dimitris Giannitsaros @ 15:38

I use Javascript to give feedback on missing required fields and invalid data (bad numbers, wrong dates, out of range values etc). Before sending user data to the server (where they are validated again of course), a function checks all fields for errors and if any are found, it reports them to the user and sets the focus to the first erroneous field.

So the user would see a javascript alert with something like this:

-Title is missing.
-Start date is missing
-End date is invalid
-Quantity is not a valid number

After clicking OK, focus would move to the “Title” field. The problem here is that the user must remember all other errors or press “Save” again to see the list of remaining errors.

So I made a minor change to the validation functions, so that the erroneous fields’ border color changes to red. This is much better, but let’s say the user fills in the “Title” and hits “Save” again. The title’s border color must be changed back to what it was. The easiest way to do that was to set the border color property to an empty string. This removes the red color and redraws the field using the appropriate CSS. So most checks use this approach:


if (elem.value == '') {
    err_txt += ('Missing value\n');
    // [some irrelevant stuff]
    elem.style.borderColor = '#aa0000';
} else {
    elem.style.borderColor = '';
}

I am not sure the empty string is the “correct” way to do this, as the W3C DOM spec doesn’t define what happens in this case, but it works in IE, Mozilla and Opera.

November 3, 2005

Internationalization issues

Filed under: MagnaCRM, PHP — Dimitris Giannitsaros @ 13:23

One thing I had to take care of while developing Magna was i18n / L10n issues.

By i18n I mean the support for different time zones, date and number and currency formats (what Windows refers to as Regional Options). By L10n I refer to the actual steps needed to offer a translated version.

I want to make sure Magna works in any environment (because your host may be located anywhere in the world) and with at least 3 different databases (Access, SQL Server and MySQL). So, I rolled my own solution, learning a lot in the process. The biggest problem was probably time zones (and DST - daylight saving time). The obvious solution was to keep things in the database in a common format and encode / decode values, based on user settings, when reading or writing to the database.
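
For time zones, one way to realize the "common format in the database" idea is to store everything as UTC and convert at the edges. The sketch below is not Magna's code: it assumes UTC as the common format and uses PHP's `DateTimeImmutable` and `DateTimeZone` classes, which are newer than this post. The function names are hypothetical.

```php
<?php
// Encode: user's local wall-clock time -> UTC string for the database.
function encodeDateTime(string $local, string $userTz): string
{
    $dt = new DateTimeImmutable($local, new DateTimeZone($userTz));
    return $dt->setTimezone(new DateTimeZone('UTC'))->format('Y-m-d H:i:s');
}

// Decode: UTC string from the database -> user's local time.
function decodeDateTime(string $stored, string $userTz): string
{
    $dt = new DateTimeImmutable($stored, new DateTimeZone('UTC'));
    return $dt->setTimezone(new DateTimeZone($userTz))->format('Y-m-d H:i:s');
}

$stored  = encodeDateTime('2005-07-01 12:00:00', 'Europe/Athens'); // summer, DST
$athens  = decodeDateTime($stored, 'Europe/Athens');
$newYork = decodeDateTime($stored, 'America/New_York');
```

Because the conversion goes through named time zones rather than fixed offsets, daylight saving time is applied by the time zone database for the date in question.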

The results seem good and now each user can select these regional options:

  • Timezone
  • Date format (DDMMYYYY, MMDDYYYY etc)
  • Date separator
  • Week first day (Sunday, Monday etc)
  • Number decimal symbol
  • Number grouping symbol
  • Number decimal digits
  • Currency symbol
  • Currency symbol placement (left / right)

For dates, numbers and currencies the solution was straightforward. A simple pair of functions for each was enough (encode / decode). Dates (and timezones) were more difficult to handle. I am writing an article on this issue and I hope to post it in the next few days. Localization issues are also a good subject for a separate article which I also hope to write soon.
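
An encode / decode pair for numbers might look like the sketch below. The `$opts` keys and both function names are hypothetical, and "encode" is taken to mean the direction towards the database, which is an assumption about the naming.

```php
<?php
// Hypothetical per-user regional options.
$opts = ['decimal' => ',', 'grouping' => '.', 'digits' => 2];

// Decode: plain float from the database -> text shown to the user.
function decodeNumber(float $value, array $opts): string
{
    return number_format($value, $opts['digits'], $opts['decimal'], $opts['grouping']);
}

// Encode: text typed by the user -> float for the database.
// Returns null when the input is not a valid number.
function encodeNumber(string $text, array $opts): ?float
{
    $plain = str_replace($opts['grouping'], '', trim($text));
    $plain = str_replace($opts['decimal'], '.', $plain);
    return is_numeric($plain) ? (float) $plain : null;
}

$shown  = decodeNumber(1234.5, $opts);
$parsed = encodeNumber('1.234,50', $opts);
$bad    = encodeNumber('12a', $opts);
```

The grouping symbol is stripped before the decimal symbol is normalized, so a setting where "." groups and "," separates decimals is parsed correctly.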

