Daniel Pitts’ Tech Blog

Archive for March, 2008

Look, in the sky! It’s a Relation; It’s an Object System; No, It’s a Graph!

Thursday, March 13th, 2008

Sometimes when dealing with OO design in conjunction with relational persistence (RDBMS), it becomes difficult to reconcile the differences between the two and still maintain a consistent approach.

The problem that I seem to run into time and again is that I’m an OO programmer first.  I tend to think of things top-down.  What does the overall system look like? What are the large pieces? What fits “in” those large pieces?

When you work this way, you tend to end up with a tree of objects or an even more complex graph of objects.  The interesting thing about a graph of objects is that it is, above all else, a graph, and there are many ways to represent graphs.

Fundamentally, every graph is a set of Nodes and (directed) Edges. Let’s examine a few common ways to represent nodes and edges.

  • Each node knows which other nodes it connects to. In other words, the nodes know their outgoing edges. This is close to how most common OO programming languages work; an object is the node, and any reference to another object is a one-way edge.
  • Each node knows which other nodes connect to it. The nodes know about their incoming edges. This is the strict opposite of the above. It is also common in relational models with one-to-one or one-to-many relationships, where the referenced row knows nothing about the foreign keys that point at it.
  • A list of nodes and a list of edges are maintained independently from each other.  This is common in relational models with many-to-many relationships.
  • A combination of all of the above.

Any graph can be represented in any one of the above ways (as well as many others).  This is important to realize, because it gives you the ability to refactor an OO design into something that better fits the relational model.  There will always be a disconnect between OO and RDBMS, because they look at design in fundamentally different ways, but that doesn’t mean you can’t work with both of them in the same system, leveraging their individual strengths.
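
To make that concrete, here is a minimal Python sketch that starts from an independent edge list and derives the other two forms from it. The node names (“order”, “customer”, “item1”, “item2”) and the function names are hypothetical, picked only for illustration.

```python
from collections import defaultdict

# Hypothetical graph: an order refers to a customer and two line items.
edges = [
    ("order", "customer"),
    ("order", "item1"),
    ("order", "item2"),
]

def outgoing(edge_list):
    """Each node knows the nodes it points to (object-reference style)."""
    result = defaultdict(set)
    for source, target in edge_list:
        result[source].add(target)
    return dict(result)

def incoming(edge_list):
    """Each node knows the nodes that point to it (foreign-key style)."""
    result = defaultdict(set)
    for source, target in edge_list:
        result[target].add(source)
    return dict(result)

def to_edge_list(out_map):
    """Recover the independent edge list from the outgoing-edge form."""
    return sorted((s, t) for s, targets in out_map.items() for t in targets)
```

Going from the edge list to the outgoing form and back again gives you the same edges, which is the point: the graph doesn’t change, only the bookkeeping does.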

Escaping and Encoding in HTML and RSS

Thursday, March 13th, 2008

Escaping and encoding are two things that most web developers need to do quite often. Unfortunately, most people are never taught when (or even how) to do so. Depending on what you’re doing, getting it wrong can be both a security risk and a bad user experience.

Encoding and escaping are similar in concept. What they ultimately try to do is represent values (in our context, characters) that either have special meanings, or are otherwise not representable in the underlying format.

For example, the “<” character has special meaning in HTML/XML, so if you want it to actually show up, you have to escape it. To do that, you use “&lt;” in its place.

Another example is in URLs, where “&” has special meaning (it separates query parameters). It is replaced by “%26”. “%” itself also has to be encoded (as “%25”).
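
As a quick illustration, Python’s standard library can do both kinds of replacement. This is only a sketch of one way to do it; the variable names are mine.

```python
import html
from urllib.parse import quote

# HTML/XML layer: "<" has special meaning, so it becomes an entity.
escaped_lt = html.escape("<")

# URL layer: "&" and "%" have special meaning in a query string.
# safe="" tells quote() not to leave any reserved character untouched.
encoded_amp = quote("&", safe="")
encoded_pct = quote("%", safe="")

print(escaped_lt, encoded_amp, encoded_pct)  # &lt; %26 %25
```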

If you wanted to link to “http://virtualinfinity.net/dictionary?word=%nfinity&fun=true” in HTML, you’d first want to URL encode “%nfinity” to “%25nfinity”, and then you’d want to HTML encode the full URL to “http://virtualinfinity.net/dictionary?word=%25nfinity&amp;fun=true”.

Your final output would look something like <a href="http://virtualinfinity.net/dictionary?word=%25nfinity&amp;fun=true">Words ending in nfinity &amp; having fun</a>. Notice the “&amp;” in the href. Most web browsers tolerate a bare “&” there, but leaving it unescaped can cause you problems down the road.

A good way to know what to encode, and which method to use, is to think of each encoding as a layer. You want to put the string “%nfinity” in the URL query parameter layer, so you need to encode it with the URL encoding. You want to put the URL into an HTML document, so you need to HTML escape it. And so on and so forth.
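
Here is a rough Python sketch of that layering, using the same example link. The helper name build_link is hypothetical, and the base URL is hard-coded just to match the example.

```python
import html
from urllib.parse import quote

def build_link(word, label):
    # Layer 1: the word goes into a URL query parameter, so URL-encode it.
    query_value = quote(word, safe="")
    url = "http://virtualinfinity.net/dictionary?word=" + query_value + "&fun=true"
    # Layer 2: the URL goes into an HTML attribute and the label into HTML
    # text, so both are HTML-escaped.
    return '<a href="{}">{}</a>'.format(
        html.escape(url, quote=True), html.escape(label)
    )

link = build_link("%nfinity", "Words ending in nfinity & having fun")
print(link)
```

Each layer is encoded exactly once, from the innermost value outward.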

Things can get even more interesting with RSS feeds. A <description> element’s text value can contain HTML. A naive first attempt might be something like:

<description> <a href="http://virtualinfinity.net/dictionary?word=%25nfinity&amp;fun=true">Words ending in nfinity &amp; having fun</a>. </description>

Unfortunately, this doesn’t work quite as expected. In XML, this actually creates an “a” element within the “description” element. This is *not* valid RSS. So you need to escape the contents of the description element.

<description> &lt;a href="http://virtualinfinity.net/dictionary?word=%25nfinity&amp;amp;fun=true"&gt;Words ending in nfinity &amp;amp; having fun&lt;/a&gt;. </description>

This may look funny, but it’s actually correct.  It is not a typo to have “&amp;amp;”.  The first “&amp;” will be converted back into “&” by the XML parser, so “&amp;amp;” becomes simply “&amp;”. After that, the HTML parser gets hold of it and converts it to the expected “&”.
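
You can watch the two layers peel off with a small Python sketch. It assumes the HTML link has already been built correctly, and uses the standard library’s XML escaping and parser to stand in for the feed generator and the feed reader.

```python
import html
import xml.etree.ElementTree as ET
from xml.sax.saxutils import escape

# The already-correct HTML from the earlier example.
link_html = (
    '<a href="http://virtualinfinity.net/dictionary?word=%25nfinity&amp;fun=true">'
    'Words ending in nfinity &amp; having fun</a>.'
)

# XML layer: escape the HTML so it is plain text inside <description>.
description_xml = "<description>" + escape(link_html) + "</description>"

# The XML parser undoes exactly one layer and hands back the HTML.
recovered_html = ET.fromstring(description_xml).text

# The HTML layer then turns the remaining "&amp;" into "&".
visible_text = html.unescape("Words ending in nfinity &amp; having fun")
```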

So, there you have it. A brief explanation of when and what to encode.  How to encode is left as an exercise for the reader. (Hint: Google is your friend.)