Archive for September, 2008

Issues with a FOAF-based Authentication System

Friday, September 5th, 2008

As I’ve been working on TAAC, I’ve become concerned about a potential weakness in any FOAF-based identity authentication system (be it RDFAuth, OpenID, or FOAF+SSL): in ALL of these systems, with the possible exception of RDFAuth (due to its reliance on PKI), the weakest link is the integrity of the server hosting the FOAF file. All three systems rely on data in the FOAF file to ‘authenticate’ against, and this poses problems. Take, for example, the following scenario:

Alice runs a website that accepts OpenID+FOAF logins (the scenario works equally well with FOAF+SSL). Bob is a client of Alice, and regularly uses the authentication scheme Alice has implemented. When authenticating, he normally authenticates against his FOAF URI, http://www.example.com/bob.rdf#bob. The file bob.rdf has information that links to Bob’s OpenID, http://www.example.com/bob, permitting him to authenticate with his (self-run) OpenID provider.

Eve wants to see the information that Bob gets to see on Alice’s website, and thanks to some shoddy system administration on Bob’s host, finds a security hole that gives her access to the filesystem. Ignoring the other private information acquired in this way, Eve silently replaces bob.rdf with her own FOAF file that has one simple change: the OpenID associated with http://www.example.com/bob.rdf#bob is now http://www.example.com/eve, which is served by Eve’s OpenID provider. Eve authenticates against her own OpenID provider and gets access as Bob to Alice’s website, does her dirty work, and then quietly restores the original FOAF file so that Bob is none the wiser. There’s precious little evidence that Eve intruded, and only an alert sysadmin might note the erroneous login. Meanwhile, Alice is barely aware of any difference other than that the OpenID changed for one particular login.
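To make the scenario concrete, here is a minimal sketch in Python of the kind of check Alice’s site might perform. Everything in it is an assumption for illustration: the `web` dictionary stands in for fetching and parsing the hosted RDF, and `openid_for` and `naive_login` are hypothetical names, not part of any real OpenID or FOAF library.

```python
# Hypothetical in-memory stand-in for the web: document URL -> FOAF data.
# A real system would fetch and parse RDF; plain dicts keep the sketch runnable.
BOB = "http://www.example.com/bob.rdf#bob"

def openid_for(foaf_documents, foaf_uri):
    """Return the OpenID that the hosted FOAF file currently lists for foaf_uri."""
    document_url = foaf_uri.split("#", 1)[0]
    return foaf_documents[document_url][foaf_uri]["openid"]

def naive_login(foaf_documents, foaf_uri, authenticated_openid):
    """Accept the login if the proven OpenID matches the FOAF file, nothing more."""
    return openid_for(foaf_documents, foaf_uri) == authenticated_openid

web = {
    "http://www.example.com/bob.rdf": {
        BOB: {"openid": "http://www.example.com/bob"},
    },
}

# Bob proves control of his own OpenID; Eve's OpenID is rejected for now.
bob_ok = naive_login(web, BOB, "http://www.example.com/bob")
eve_before = naive_login(web, BOB, "http://www.example.com/eve")

# Eve overwrites bob.rdf on the compromised host, changing only the OpenID.
web["http://www.example.com/bob.rdf"][BOB]["openid"] = "http://www.example.com/eve"
eve_after = naive_login(web, BOB, "http://www.example.com/eve")
```

In this sketch the only thing the check ever establishes is that the presented OpenID matches whatever the FOAF file says at that moment, so whoever can write the file decides who counts as Bob.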

In summary, as Henry Story admitted (Point 5 in the FOAF+SSL description), these methods only assert that the person accessing any protected resource has ‘write access’ to their FOAF file… But that doesn’t assert that they’re the same person.

Given how commonly self-hosted domains have poor security practices, a FOAF-based Authentication System could be disastrous. The only plausible ‘stopgap measure’ might be requiring the system as a whole to cache the authentication credentials (e.g. OpenID, public key URL, or X.509 hash) and refuse access to people who present credentials that have changed. This adds a layer of complication to the mix as well, as it would require out-of-band communication to ensure that the ‘cached’ credentials are removed or replaced with new credentials manually… And even so, there is still the risk of incorrect authentication credentials being presented with no evidence that they are incorrect (e.g. Eve logs in before Bob ever does, or does so in the period after Bob’s cached credentials have been deleted, establishing her credentials in place of his). There are ways around this, but they seem a bit kludgy to me (e.g. using the old OpenID/X.509 cert, which may no longer exist due to security risks, to authenticate the new one; or checking against a public key server to see if there’s any indication that a public key has been revoked/replaced).
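As a rough sketch of that stopgap, the cache could behave like the Python below. This is an illustration under assumptions, not how TAAC does it: `CredentialPinStore`, `check`, and `reset` are hypothetical names, and the credential is treated as an opaque string (an OpenID here, but it could as easily be a public key URL or an X.509 hash).

```python
BOB = "http://www.example.com/bob.rdf#bob"

class CredentialPinStore:
    """Remember the first credential seen for each FOAF URI and refuse changes."""

    def __init__(self):
        self._pins = {}

    def check(self, foaf_uri, credential):
        pinned = self._pins.get(foaf_uri)
        if pinned is None:
            # First sight: whoever arrives first gets pinned. This is the gap
            # described above, since nothing proves the first arrival is Bob.
            self._pins[foaf_uri] = credential
            return True
        return pinned == credential

    def reset(self, foaf_uri):
        """Out-of-band step: an administrator clears the cached credential."""
        self._pins.pop(foaf_uri, None)

pins = CredentialPinStore()
first_login = pins.check(BOB, "http://www.example.com/bob")     # Bob is pinned
swapped_login = pins.check(BOB, "http://www.example.com/eve")   # change refused

# After a manual reset, whoever logs in next is pinned instead.
pins.reset(BOB)
eve_after_reset = pins.check(BOB, "http://www.example.com/eve")
bob_after_reset = pins.check(BOB, "http://www.example.com/bob")
```

The last two lines show the remaining hole: once the cached credential is cleared, Eve can establish hers first and Bob is the one who gets refused.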

Are we sure that a FOAF-based Authentication System is secure enough? At the very least, it seems like we need proactive sysadmins maintaining the system to ensure it remains secure… And can we afford that?

Back to TAAC

Wednesday, September 3rd, 2008

So I’ve finally got a chance to return to working on TAAC, an access control mechanism for the web that integrates FOAF-based identification with access control rules. I’ve been doing some more thorough testing on the slow-down issues explained two posts back, and found that the slowdown, while significant, averages about 13 seconds on this server, a Linode virtual private server which I expect typifies an average web host (if not better than average).

Several attempts at profiling (which themselves inflated processing times by up to 10x) led to the conclusion that, in fact, most of that time is spent in the second phase (post-authentication, during reasoning), which is where I’d EXPECT the slowdown to be. Granted, this now becomes a problem that can be solved in part by Moore’s Law, but even so, some speedups would be nice to allow it to be implemented today. I plan on running the same code on a relatively modern test server that’s more or less dedicated to supporting these tests, so it will likely run faster there.
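For a lower-overhead alternative to a full profiler, per-phase wall-clock timing is usually enough to see which phase dominates. The sketch below is only an illustration: `authenticate` and `reason` are hypothetical placeholders for the two phases (with sleeps standing in for real work), not TAAC’s actual functions.

```python
import time
from contextlib import contextmanager

timings = {}

@contextmanager
def timed(phase):
    """Accumulate the wall-clock time spent inside the with-block under `phase`."""
    start = time.perf_counter()
    try:
        yield
    finally:
        timings[phase] = timings.get(phase, 0.0) + time.perf_counter() - start

def authenticate():
    # Placeholder for the first phase (fetching FOAF data, checking credentials).
    time.sleep(0.01)

def reason():
    # Placeholder for the second phase (running the access control rules).
    time.sleep(0.03)

with timed("authentication"):
    authenticate()
with timed("reasoning"):
    reason()

# Name of the phase that took the most time in this run.
slowest = max(timings, key=timings.get)
```

Because this only reads the clock at the start and end of each phase, it adds far less overhead than instrumenting every function call.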

It’s worth considering that this is running on a variant of the cwm reasoner on top of a re-implemented Rete reasoner, and, seeing how it’s all in interpreted Python, rewriting it in compiled C code (or even Java) would probably yield a significant speed boost, but that’s not a terribly productive line of work (except when actually trying to push out a commercial product). It might also be worth exploring other reasoning approaches to improve the speed.

Even so, I’m going to try looking at the other authentication approaches to see what the benefits and costs of them are… I think the more RESTful approach without OpenID may have some arguments in favor of it, but I doubt they’re going to be based solely on speed.