Translator 101 Lesson 1: Setting the Stage

This is the first post in a series that will explain some of the details of writing a GlusterFS translator, using some actual code to illustrate.

Before we begin, a word about environments. GlusterFS is over 300K lines of code spread across a few hundred files. That’s no Linux kernel or anything, but you’re still going to be navigating through a lot of code in every code-editing session, so some kind of cross-referencing is essential. I use cscope with the vim bindings, and if I couldn’t do “ctrl-\ g” and such to jump between definitions all the time my productivity would be cut in half. You may prefer different tools, but as I go through these examples you’ll need something functionally similar to follow along. OK, on with the show.

The first thing you need to know is that translators are not just bags of functions and variables. They need to have a very definite internal structure so that the translator-loading code can figure out where all the pieces are. The way it does this is to use dlsym to look for specific names within your shared-object file, as follows (from xlator.c):

        if (!(xl->fops = dlsym (handle, "fops"))) {
                gf_log ("xlator", GF_LOG_WARNING, "dlsym(fops) on %s",
                        dlerror ());
                goto out;
        }    
 
        if (!(xl->cbks = dlsym (handle, "cbks"))) {
                gf_log ("xlator", GF_LOG_WARNING, "dlsym(cbks) on %s",
                        dlerror ());
                goto out;
        }    
 
        if (!(xl->init = dlsym (handle, "init"))) {
                gf_log ("xlator", GF_LOG_WARNING, "dlsym(init) on %s",
                        dlerror ());
                goto out;
        }    
 
        if (!(xl->fini = dlsym (handle, "fini"))) {
                gf_log ("xlator", GF_LOG_WARNING, "dlsym(fini) on %s",
                        dlerror ());
                goto out;
        }

In this example, xl is a pointer to the in-memory object for the translator we’re loading. As you can see, it’s looking up various symbols by name in the shared object it just loaded, and storing pointers to those symbols. Some of them (e.g. init) are functions, while others (e.g. fops) are dispatch tables containing pointers to many functions. Together, these make up the translator’s public interface.
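If you want to play with the lookup-by-name pattern without building a shared object, here is a minimal sketch. Everything in it is hypothetical: struct symbol and lookup_symbol are stand-ins I made up to play the role of the shared object and dlsym, and struct mini_xlator is a four-field toy, not the real xlator_t.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for one symbol exported by a shared object. */
struct symbol {
        const char *name;
        void       *addr;
};

/* Hypothetical, trimmed-down translator object (NOT the real xlator_t). */
struct mini_xlator {
        void *fops;
        void *cbks;
        void *init;
        void *fini;
};

/* Stand-in for dlsym(): return the address registered under `name`,
 * or NULL if the table does not contain that name. */
static void *
lookup_symbol (const struct symbol *table, size_t count, const char *name)
{
        size_t i;

        for (i = 0; i < count; i++) {
                if (strcmp (table[i].name, name) == 0)
                        return table[i].addr;
        }
        return NULL;
}

/* Resolve the four names by string and store the pointers.
 * Returns 0 on success, -1 as soon as one name is missing. */
static int
load_translator (struct mini_xlator *xl, const struct symbol *table,
                 size_t count)
{
        if (!(xl->fops = lookup_symbol (table, count, "fops")))
                return -1;
        if (!(xl->cbks = lookup_symbol (table, count, "cbks")))
                return -1;
        if (!(xl->init = lookup_symbol (table, count, "init")))
                return -1;
        if (!(xl->fini = lookup_symbol (table, count, "fini")))
                return -1;
        return 0;
}
```

The point of the sketch is only that the loader knows nothing about your translator except a handful of agreed-upon names; get one of them wrong and loading stops right there.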

Most of this glue or boilerplate can easily be found at the bottom of one of the source files that make up each translator. We’re going to use the rot-13 translator just for fun, so in this case you’d look in rot-13.c to see this:

struct xlator_fops fops = {
        .readv        = rot13_readv,
        .writev       = rot13_writev
};
 
struct xlator_cbks cbks = {
};
 
struct volume_options options[] = {
        { .key  = {"encrypt-write"},
          .type = GF_OPTION_TYPE_BOOL
        },
        { .key  = {"decrypt-read"},
          .type = GF_OPTION_TYPE_BOOL
        },
        { .key  = {NULL} },
};

The fops table, defined in xlator.h, is one of the most important pieces. This table contains a pointer to each of the filesystem functions that your translator might implement: open, read, stat, chmod, and so on. There are 82 such functions in all, but don’t worry; any that you don’t specify here will be seen as null and filled with defaults from defaults.c when your translator is loaded. In this particular example, since rot-13 is an exceptionally simple translator, we only fill in two entries, for readv and writev.
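The “fill in whatever was left null” step is easy to picture with a toy table. This is a sketch under my own assumptions, not the loader’s actual code: struct mini_fops has three entries instead of the real table’s many, the function signature is invented, and the default_* functions here simply return their argument as a placeholder for whatever the real defaults in defaults.c do.

```c
#include <stddef.h>

/* Hypothetical signature; real fops take frames, translators, and more. */
typedef int (*fop_fn) (int arg);

/* Hypothetical three-entry dispatch table standing in for xlator_fops. */
struct mini_fops {
        fop_fn readv;
        fop_fn writev;
        fop_fn stat;
};

/* Placeholder defaults: return the argument unchanged. */
static int default_readv (int arg)  { return arg; }
static int default_writev (int arg) { return arg; }
static int default_stat (int arg)   { return arg; }

/* A translator-specific override, like rot13_readv in spirit only. */
static int custom_readv (int arg)   { return arg + 13; }

/* Replace every entry the translator left NULL with its default,
 * leaving the entries it did supply untouched. */
static void
fill_defaults (struct mini_fops *fops)
{
        if (!fops->readv)
                fops->readv = default_readv;
        if (!fops->writev)
                fops->writev = default_writev;
        if (!fops->stat)
                fops->stat = default_stat;
}
```

A translator that declares only .readv = custom_readv ends up, after fill_defaults, with a table where every entry is callable, which is why callers never need to test individual fops for NULL in this sketch.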

There are actually two other tables, which must also have predefined names, that are used to find translator functions: cbks (which is empty in this snippet) and dumpops (which is missing entirely). The first of these specifies entry points for when inodes are forgotten or file descriptors are released. In other words, they’re destructors for objects in which your translator might have an interest. Mostly you can ignore them, because the default behavior handles the simpler cases of translator-specific inode/fd context automatically. However, if the context you attach is a complex structure requiring complex cleanup, you’ll need to supply these functions.

As for dumpops, that’s just used if you want to provide functions to pretty-print various structures in logs. I’ve never used it myself, though I probably should. What’s noteworthy here is that we don’t even define dumpops. That’s because all of the code that might call through this table checks for xl->dumpops being NULL first. This is in sharp contrast to fops and cbks, which must be present. Those pointers are not checked on every call, so a NULL table would mean a segfault; instead, translator loading fails up front if either symbol is missing. That’s why we provide an empty definition for cbks; it’s OK for the individual function pointers to be NULL, but not for the whole table to be absent.
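The difference between “the table may be absent” and “the table must exist but its entries may be NULL” is easier to see in code. This is only a sketch with made-up types (struct mini_cbks, struct mini_dumpops, struct demo_xlator) and made-up callers; it mirrors the checks described above rather than quoting them from the GlusterFS source.

```c
#include <stddef.h>

/* Hypothetical callback table: the table must exist, entries may be NULL. */
struct mini_cbks {
        int (*forget)  (void *ctx);
        int (*release) (void *ctx);
};

/* Hypothetical dump table: the whole table may be absent. */
struct mini_dumpops {
        int (*priv) (void *xl);
};

/* Hypothetical translator object holding both tables. */
struct demo_xlator {
        struct mini_cbks    *cbks;     /* never NULL once loaded */
        struct mini_dumpops *dumpops;  /* NULL if the translator defines none */
};

/* Example destructor that just counts how often it was called. */
static int forget_calls;

static int
count_forget (void *ctx)
{
        (void) ctx;
        return ++forget_calls;
}

/* Optional table: check the table itself, then the entry. */
static int
dump_priv (struct demo_xlator *xl)
{
        if (!xl->dumpops || !xl->dumpops->priv)
                return 0;       /* nothing to dump */
        return xl->dumpops->priv (xl);
}

/* Required table: xl->cbks is dereferenced without a check, so it
 * must have been verified at load time; only the entry is tested. */
static int
forget_inode (struct demo_xlator *xl, void *ctx)
{
        if (!xl->cbks->forget)
                return 0;       /* no translator-specific cleanup */
        return xl->cbks->forget (ctx);
}
```

Passing a demo_xlator whose cbks is NULL to forget_inode would crash, which is exactly the situation the load-time check exists to rule out.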

The last piece I’ll cover today is options. As you can see, this is a table of translator-specific option names and some information about their types. GlusterFS actually provides a pretty rich set of types (volume_option_type_t in options.h) which includes paths, translator names, percentages, and times in addition to the obvious integers and strings. Also, the volume_option_t structure can include information about alternate names, min/max/default values, enumerated string values, and descriptions. We don’t see any of these here, so let’s take a quick look at some more complex examples from afr.c and then come back to rot-13.

        { .key  = {"data-self-heal-algorithm"},
          .type = GF_OPTION_TYPE_STR,
          .default_value = "",
          .description   = "Select between \"full\", \"diff\". The "
                           "\"full\" algorithm copies the entire file from "
                           "source to sink. The \"diff\" algorithm copies to "
                           "sink only those blocks whose checksums don't match "
                           "with those of source.",
          .value = { "diff", "full", "" }
        },
        { .key  = {"data-self-heal-window-size"},
          .type = GF_OPTION_TYPE_INT,
          .min  = 1,
          .max  = 1024,
          .default_value = "1",
          .description = "Maximum number blocks per file for which self-heal "
                         "process would be applied simultaneously."
        },
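To make the role of min, max, and default_value concrete, here is a rough sketch of the kind of check such an entry allows for the integer case. The struct int_option type and validate_int_option function are hypothetical names of my own; the real validation code in GlusterFS is organized differently and handles many more types.

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical, reduced option descriptor covering integers only. */
struct int_option {
        const char *key;
        long        min;
        long        max;
        const char *default_value;
};

/* Parse `text` (or the default when `text` is NULL) as a decimal
 * integer and range-check it against the descriptor.
 * Returns 0 and stores the value in *out on success, -1 otherwise. */
static int
validate_int_option (const struct int_option *opt, const char *text,
                     long *out)
{
        char *end = NULL;
        long  value;

        if (!text)
                text = opt->default_value;
        if (!text || *text == '\0')
                return -1;

        errno = 0;
        value = strtol (text, &end, 10);
        if (errno != 0 || *end != '\0')
                return -1;      /* overflow or trailing garbage */
        if (value < opt->min || value > opt->max)
                return -1;      /* outside the declared range */

        *out = value;
        return 0;
}
```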

When your translator is loaded, all of this information is used to parse the options actually provided in the volfile, and then the result is turned into a dictionary and stored as xl->options. This dictionary is then processed by your init function, which you can see being looked up in the first code fragment above. We’re only going to look at a small part of rot-13’s init for now.

        priv->decrypt_read = 1;
        priv->encrypt_write = 1; 
 
        data = dict_get (this->options, "encrypt-write");
        if (data) {
                if (gf_string2boolean (data->data, &priv->encrypt_write) == -1) {
                        gf_log (this->name, GF_LOG_ERROR,
                                "encrypt-write takes only boolean options");
                        return -1;
                }
        }

What we can see here is that we’re setting some defaults in our priv structure, then looking to see if an “encrypt-write” option was actually provided. If so, we convert and store it. This is a pretty classic use of dict_get to fetch a field from a dictionary, and of using one of many conversion functions in common-utils.c to convert data->data into something we can use.
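The same “default first, then override if the option is present” pattern can be tried outside GlusterFS with a toy dictionary. This is a sketch only: struct kv and kv_get stand in for the real dictionary and dict_get, string_to_bool stands in for gf_string2boolean, and the set of spellings it accepts (and the fact that it is case-sensitive) is my assumption, not the real function’s behavior.

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical key/value pair standing in for a dictionary entry. */
struct kv {
        const char *key;
        const char *value;
};

/* Hypothetical private structure, shaped like the one in the fragment. */
struct rot13_like_priv {
        int decrypt_read;
        int encrypt_write;
};

/* Stand-in for dict_get(): return the value for `key`, or NULL. */
static const char *
kv_get (const struct kv *opts, size_t count, const char *key)
{
        size_t i;

        for (i = 0; i < count; i++) {
                if (strcmp (opts[i].key, key) == 0)
                        return opts[i].value;
        }
        return NULL;
}

/* Stand-in for a string-to-boolean converter (assumed spellings).
 * Returns 0 and sets *out on success, -1 on anything unrecognized. */
static int
string_to_bool (const char *s, int *out)
{
        if (!strcmp (s, "on") || !strcmp (s, "true") || !strcmp (s, "1")) {
                *out = 1;
                return 0;
        }
        if (!strcmp (s, "off") || !strcmp (s, "false") || !strcmp (s, "0")) {
                *out = 0;
                return 0;
        }
        return -1;
}

/* Set defaults, then override each field only if its option was given. */
static int
read_options (struct rot13_like_priv *priv, const struct kv *opts,
              size_t count)
{
        const char *value;

        priv->decrypt_read = 1;
        priv->encrypt_write = 1;

        value = kv_get (opts, count, "encrypt-write");
        if (value && string_to_bool (value, &priv->encrypt_write) == -1)
                return -1;

        value = kv_get (opts, count, "decrypt-read");
        if (value && string_to_bool (value, &priv->decrypt_read) == -1)
                return -1;

        return 0;
}
```

With only “encrypt-write” set to “off”, read_options leaves decrypt_read at its default of 1 and clears encrypt_write; an unrecognized value such as “maybe” makes it fail, just as the fragment above returns -1 from init.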

So far we’ve covered the basics of how a translator gets loaded, how we find its various parts, and how we process its options. In my next Translator 101 post, we’ll go a little deeper into other things that init and its companion fini might do, and how some other fields in our xlator_t structure (commonly referred to as this) are typically used.