Separate Clouds

A blog by Ewan Klein

January 2, 2013
by Ewan

Yet Another Ikea DIY Standing Desk

The Spec

I’ve been reading about Standing Desks for a while, and decided to take the plunge over the Christmas break. I was initially impressed by Colin Nederkoorn’s standing desk for $22, and then also came across this IKEA variant by Benson Chou.

Since my screen is a 21.5″ iMac, I decided that the 55 cm height of the Lack side table was going to raise the whole thing too high. Instead, like Jinyoung Kim, I went for the Lack coffee table, which is slightly lower at 45 cm, as well as being wider and possessing a middle shelf. Just for fun, I also changed the shelf brackets to the Ekby Bjärnum, and bought an Ekby Järpen shelf to go with it.

Construction

Although putting it all together was relatively straightforward, I ran into a few potential gotchas.

Like Benson Chou, I screwed the brackets to the front legs before attaching the latter to the table top. However, I didn’t really pay proper attention to two things. First, the legs are oriented, in the sense that there is a pre-drilled hole at one end.
IMG_1154

So you need to take this into account when measuring the height of the bracket. Second, it’s difficult to predict how many turns will be required to get the leg to fix snugly against the underside of the table top. (The two are joined by a double-ended screw.) It follows that you don’t really know in advance which face of the leg will end up facing forward to receive the bracket. So probably the best approach is to screw in the two front legs, mark the front face of each, unscrew them, attach the brackets, then re-attach the legs.

As Benson points out, the legs of this table are hollow! The leg wall is also pretty thin and flimsy (although the vertical strength is fine). This makes it really difficult to get a satisfactory anchorage for the brackets. The solution I tried was to use a hollow wall fixing plug (designed for plasterboard):

IMG_1150

This helped to a certain extent, but was far from perfect, mainly because the plugs are designed for a thicker wall. If I were doing this again, I would adopt Jinyoung Kim’s solution of bolts, washers and nuts, which would definitely be more robust.

It is also worth noticing that the brackets have holes on one side; these take grub screws to ensure that the shelf is firmly lodged in the brackets.

IMG_1155

So obviously you want to make sure that the side with the holes is on the underside. Since the shelf is 119 cm long, while the coffee table is 90 cm long, I had to chop a chunk off the shelf.



Here is the end result:

IMG_1176 - Version 2

The Bill

Total:   £38.00

Still Standing

Three days later, I’m still feeling my way with the experience of standing. Overall, I feel positive about it. Standing up certainly encourages focussing on the task at hand, and goes well with the pomodoro technique, since I can time my stints of standing to 25 minutes. However, I’m also finding it physically quite demanding, which probably suggests that I need to keep on working on my posture.

July 23, 2012
by Ewan

Vocabulary Hacking with SPARQL and UMBEL

In my blog post on finding commodity terms for Trading Consequences, I described how one of the project’s tasks was to carry out Named Entity Recognition of commodities in digitised historical texts. In this post, I want to describe in a bit more technical detail some of the steps we have been taking to carry out this task. Warning: this is not a distilled set of take-away lessons, but more of an in-progress log.

Requirements

We decided initially that we should construct a thesaurus which would use relevant terms from an existing vocabulary and supplement it with more obscure or archaic terms that were discovered in the course of reviewing relevant historical documents. An important design consideration was to include a limited amount of hierarchical structure in order to support querying, both in the database interface and also in the visualisation process. For example, it ought to be possible to summarise the export of limes, apples and oranges under the label Fruit. We also wanted to be able to add other properties to terms, such as noting that both nuts and whales are a SourceOf Oil. Finally, it was important to be able to list multiple alternative forms for the same commodity; for example, rubber might be referred to in several ways, including not just “rubber” but also “India rubber”, “caoutchouc” and “caouchouc”. These factors made SKOS (Simple Knowledge Organization System) an obvious choice of framework for organising the thesaurus.

The following diagram (from SKOS Core Guide 2005) illustrates preferred and alternative lexical labels attached to a concept:

SKOS Preferred and Alternative Lexical Labels


Hierarchical relations between concepts in SKOS are expressed in terms of the skos:broader property (or its inverse, skos:narrower) as seen here (also from SKOS Core Guide 2005):

skos:broader

The skos:broader property


We can read the bottom part of this diagram as saying that the concept Mammals ‘has a broader concept’ Animals.

Although skos:broader is not transitive, SKOS also contains the property skos:broaderTransitive, which is, and this is what we will be using. A concept in SKOS is explicitly intended to be a fuzzier notion than a class of things. Nevertheless, we’ll be using concepts as though they were classes, and consequently, ex:mammals skos:broaderTransitive ex:animals is effectively equivalent to saying that Mammals is a subclass of Animals.

The Base Vocabulary

I looked at existing vocabularies already represented in SKOS which could be taken as a starting point. I decided, without a huge amount of investigation, to take the UMBEL upper ontology as the starting point for our base vocabulary. It seems big enough to provide a good basis, and is small enough (just under 120 MB) to download from GitHub.

After poking around looking at the file structure, it seemed that everything we wanted was contained in umbel_reference_concepts.n3. Note that this is an RDF file, serialised in Notation 3 format (best thought of as Turtle, which is a subset of Notation 3 that has become the de facto alternative to RDF/XML). The data model for RDF is a graph in which ‘subjects’ are related by properties to ‘objects’. While UMBEL uses SKOS concepts and properties, it supplements these with selected properties from the RDF Schema language (RDFS; see http://www.w3.org/TR/rdf-schema/).

Trying to get an overview of what was in UMBEL was an initial challenge. Although I spent a little time looking at SKOS editing tools, nothing stood out as an obvious contender in terms of ease of installation, broad adoption and relevant functionality. However, the Free Edition of TopBraid Composer worked well for browsing umbel_reference_concepts.n3. The following screenshot illustrates the interface.

Screenshot of TopBraid Composer

Pruning UMBEL

In order to construct the base vocabulary, I wanted to extract a subset of the SKOS structure that only dealt with concepts that were relevant to the commodity domain; these seemed to be all subclasses of the concepts Animals, Plants and Natural Substances. This can be carried out using the SPARQL query language. As well as supporting SQL-like queries which return tuples of values, SPARQL has a CONSTRUCT operator which returns an RDF graph. The query shown below will return all subject-predicate-object triples where the subject (shown as the variable ?s) is a subclass of Plant, Animal or NaturalSubstance:
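As a rough sketch only, a CONSTRUCT query of this kind could be written to subgraph.rq as follows. The rc: namespace URI, the VALUES clause and the use of skos:broaderTransitive with a SPARQL 1.1 property path are assumptions made for illustration, and may well differ from the query that was actually run.

```shell
# Write an illustrative CONSTRUCT query to subgraph.rq.
# Assumptions (not confirmed by the post): the rc: namespace URI, and that
# the hierarchy is expressed through skos:broaderTransitive links.
cat > subgraph.rq <<'EOF'
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>
PREFIX rc:   <http://umbel.org/umbel/rc/>

CONSTRUCT { ?s ?p ?o }
WHERE {
  VALUES ?top { rc:Plant rc:Animal rc:NaturalSubstance }
  ?s skos:broaderTransitive+ ?top .
  ?s ?p ?o .
}
EOF
```

The CONSTRUCT template simply copies every triple whose subject sits somewhere below one of the three top concepts, so the result is itself an RDF graph that can be queried again.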

Of course, to actually execute the query, we need to do some more work. While TopBraid Composer allows you to run SPARQL queries in a GUI pane, the results are saved using the SPARQL Query Results XML Format, which was less convenient for me than CSV. In addition, I wanted to be able to run the queries programmatically rather than via a GUI, and to be able to save them in a version control system. In the past, I’ve had good results using the Jena ARQ library, so was disposed to try this again. However, in the last couple of years, Jena has migrated from Sourceforge to being first an Apache Incubator project and, since April 2012, a top-level Apache project. In the spirit of adventure, rather than just using the ARQ query engine again, I decided to have a crack at running the Fuseki SPARQL server. This turned out to be very simple to install, and to query over HTTP. Assuming that Fuseki is running on the default port 3030 and that the SPARQL query is contained in the file subgraph.rq, this command will execute the query and save the results in subgraph.ttl:
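One way such a request might look with curl is sketched below. Only the port (3030) and the file names subgraph.rq and subgraph.ttl come from the text; the dataset name ds, the /query endpoint path, the Accept header and the helper name run_construct are assumptions for illustration.

```shell
# Endpoint of the (assumed) Fuseki dataset "ds"; override via the environment.
FUSEKI_URL="${FUSEKI_URL:-http://localhost:3030/ds/query}"

run_construct() {
    # $1: file holding the SPARQL query
    # $2: file to write the Turtle result to
    # name@file makes curl read the file, URL-encode it and POST it
    # as the form field "query".
    curl --silent --show-error --fail \
         --header 'Accept: text/turtle' \
         --data-urlencode "query@$1" \
         --output "$2" \
         "$FUSEKI_URL"
}

# Usage (needs a running Fuseki server):
#   run_construct subgraph.rq subgraph.ttl
```

Keeping the query in its own file means it can be put under version control and re-run unchanged against a different endpoint.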

Finding Lexical Labels in UMBEL

Now that I had extracted the relevant subgraph from UMBEL, I needed another SPARQL query to identify lexical labels — these could then be converted into a gazetteer and incorporated into the text processing pipeline. The following query extracts tuples consisting of the SKOS concept, the preferred and alternative lexical labels, and any broader concepts:
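A query with this shape could be sketched as below. The file name labels.rq, the variable names and the choice of OPTIONAL patterns are assumptions for illustration, not a transcript of the query that was actually used.

```shell
# Write an illustrative SELECT query to labels.rq (hypothetical file name).
# Assumption: labels and hierarchy are stored with the standard SKOS
# properties prefLabel, altLabel and broaderTransitive.
cat > labels.rq <<'EOF'
PREFIX skos: <http://www.w3.org/2004/02/skos/core#>

SELECT ?concept ?prefLabel ?altLabel ?broader
WHERE {
  ?concept skos:prefLabel ?prefLabel .
  OPTIONAL { ?concept skos:altLabel ?altLabel }
  OPTIONAL { ?concept skos:broaderTransitive ?broader }
}
ORDER BY ?concept
EOF
```

Because each OPTIONAL match produces its own solution, a concept with several alternative labels or several broader concepts shows up on several rows.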

The best way to understand this query is to look at the output, the first 10 lines of which are as follows (slightly simplified to use prefixed names in place of full URIs):

There is a lot of redundancy in this format, since each separate item of information requires a separate row in the results. For example, rows 5–7 contain the information that “herbaceous plant” has three alternative labels, namely “herb”, “herbs” and “herbaceous plants”. How many distinct lexical labels are there altogether? We can use the following Unix command line to extract and count all the unique items that occur in the second field position (preferred labels) of a row (and similarly for the other three fields by changing the -f argument of cut):
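A pipeline of this kind might look like the sketch below. It assumes that the results have been saved as a tab-separated file with one header row; the function name count_unique and the file name labels.tsv are hypothetical.

```shell
count_unique() {
    # $1: field number to count (1 = concept, 2 = preferred label, ...)
    # $2: tab-separated results file with a header row (assumed layout)
    tail -n +2 "$2" |   # skip the header row
        cut -f "$1" |   # keep only the requested field
        sed '/^$/d' |   # drop rows where that field is empty
        sort -u |       # collapse duplicates
        wc -l |         # count what is left
        tr -d ' '       # some wc implementations pad with spaces
}

# Usage:
#   count_unique 2 labels.tsv
```

Changing the first argument from 2 to 1, 3 or 4 gives the corresponding counts for the other three columns.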

The counts of the different items are as follows:

classes 3,445
preferred labels 3,414
alternative labels 5,904
superclasses 898

Next Steps

As I mentioned at the start of this post, we are supplementing the vocabulary derived from UMBEL with terms derived from nineteenth century documentary sources. Jim Clifford has been focussing on capturing and transcribing data from annual Customs reports on the quantity and value of goods arriving in Britain each year. Here’s part of a page showing imports of “Ammunition: Shot, Large and Small”:

Fragment of 1898 Customs record

In a follow-up post, I will describe how we are converting this additional set of terms into a SKOS-compatible form, and how we are integrating it with the base vocabulary from UMBEL.