Radovan Semančík's Weblog

Monday, 31 October 2005
Today is a normal working day in Slovakia, but it falls between a weekend and a state holiday tomorrow. Days like this are usually peaceful: no ringing phones calling for help, no urgent e-mails. It's a good time to go through the blog entries and articles that I've marked as interesting but did not have time to read. And here is what I've encountered:

Both Stefan Brands and Kim Cameron pointed out an editorial by Niels J Bjergstrom that deals with the results and consequences of the UK government's Identity Project. I can agree with most parts of the editorial, especially the point that the technology is not ready yet. But I have some remarks:

An identity can only be substantiated through authentication.

The first question is: What is an identity?

Is "identity" the physical person itself? Is "identity" the data record that describes the person? Or what is identity, really? Nobody really seems to know.

Because the definition of identity is so murky, I usually try not to use the word "identity" at all. But as others use it, I will think about "identity" as a link between a physical person and a data set that describes him or her. That's the most reasonable definition we can get in IT, I think.

While "identity" and authentication are closely connected, they cannot be seen as one. First of all, an "identity" can be established without authentication. That's what police and secret services frequently do. They analyse the activities of an individual or group with the goal of inferring unknown information about them. They are trying to uncover the link between an individual (e.g. an unknown criminal) and a data record that describes him (e.g. a citizen registry entry). That's "establishing identity".

Authentication aims to prove that we have the correct physical person on the other end of the wire. Or that it was the correct device that made a specific digital signature (physical persons cannot make digital signatures, only devices can). Authentication always depends on a data record (persona) already existing in the database. No matter whether it is the persona of the entity being authenticated or of some other entity (e.g. a credential authority), you will always need to have some data before you can do authentication.
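
To make the dependency concrete, here is a minimal Python sketch, assuming a simple password scheme. The in-memory `personas` store and the `register` and `authenticate` functions are hypothetical names chosen for illustration; they do not come from any product. The point is only that authentication has nothing to check against until some other process has created the record.

```python
import hashlib
import hmac
import os

# Hypothetical in-memory persona store: persona id -> (salt, password hash).
personas = {}

def _hash(password, salt):
    # Derive a password hash with PBKDF2 from the standard library.
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, 100_000)

def register(persona_id, password):
    """Create the data record (persona). This step is not authentication."""
    salt = os.urandom(16)
    personas[persona_id] = (salt, _hash(password, salt))

def authenticate(persona_id, password):
    """Check a claimed link to an existing persona; no record, no authentication."""
    record = personas.get(persona_id)
    if record is None:
        return False
    salt, expected = record
    return hmac.compare_digest(_hash(password, salt), expected)
```

How the record gets created and maintained (`register` here) is a separate question from how it is later checked, which is exactly the distinction made above.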

But how was that data record (persona) created? How was the record about me in the birth registry created? How is it maintained (modified, deleted)? Doesn't that record form a part of my "identity"? These questions can get very complicated in some environments.

I think that it is a mistake to think of authentication and "identity" as a single technology. They may both manage the link between the physical person and the data record (which I will call "persona"), but they differ in how they do it.

Posted by semancik at 1:46 PM in Identity
Monday, 24 October 2005
The fact that workflow is an essential part of Enterprise IDM was not that clear from the beginning. If it had been, the Single Directory paradigm would never have appeared. It took some years to realize this, but it is quite clear now. You just cannot sell an IDM product that does not include workflow or tightly integrate with it. Or more exactly, you can sell it, but you cannot effectively deploy it.

While still in the sales process, you promise the customer to customize the workflows exactly to his needs. The product has this neat workflow engine that even a trained monkey can configure, you see? And the customer swears that he has all the relevant workflows formalized and documented. You think how easy that part will be and almost forget about workflows while preparing the project. As the analysis begins, you receive a pile of documents describing existing manual workflows and you start to process them and design the implementation. And there the first warning appears.

As you try to compose a mock-up screenshot of the workflow GUI, the customer asks:

Ah, that's nice. But we have this old tool here, in our groupware system, that supports the user provisioning process. And our people are accustomed to this. It would be really nice if you could make it look exactly like this, so we would have less trouble re-training the employees for a new application. You said that the product is that flexible, could you please customize the look&feel for us?
Well, as you are in the "customer satisfaction business", you do that customization. If you have such a great product as we had, it is indeed very easy. A few hours of work. Negligible. You do it and you forget about it. But this was the first warning sign.

Later, after the design is nearly complete, you find out that the "old groupware tool" is in fact much more important than expected. The documentation on the manual workflow processes is in fact a bit outdated. The processes were adapted by changing the "groupware tool" directly, and the changes were not always documented. And later you find out that part of the processes is still carried out "intuitively" by asking someone how to do it. These were not mentioned in the documentation at all. And even the documented processes contained steps that were not strictly algorithmic, based on the principle "if it looks right then it may proceed".

If you have such a good product as we had, you can implement most of the functionality right away, customizing the workflows and user interface. Each "piece" will take you a few hours, and that can easily get lost in the overall project plan. But there are a lot of these little "pieces" that you must implement or compensate for. And if you keep a detailed record of the work that you spend on a project (as I do), you will soon find out that it takes tens of man-days that were not expected in the project plan.

Moral:

  1. The things that are missing in the analysis are much more important than those that are included. Always look for the missing things. Never allow yourself to consider an analysis complete only because it is already 100 pages long. A 20-page analysis document with all important aspects mentioned in low detail may be perfectly OK, yet 200 pages of detailed descriptions may be totally useless if they miss the important points.
  2. A negligible amount of work repeated many times equals a lot of work. Do not underestimate the customer's desire for "cosmetic changes". Especially in user interface and workflows.
  3. A workflow is one of the most important parts of user provisioning. Maybe the most important part. Do not trust anyone telling you that they will not use workflows. They will. Do not trust anyone telling you that they will use the default workflow. That will not suit them and you will have to customize the workflow anyway in order to make the solution work. And do not trust anyone telling you that they have all the existing workflows precisely documented. That's a myth.
Posted by semancik at 2:12 PM in Identity
Thursday, 13 October 2005
Automated User Provisioning using Role-Based Access Control. You know the concept as you sell it to the customer:
Dear Mr. Customer. You have all this big bunch of employees. Each of them has lots of different accounts with myriads of privileges. Wouldn't it be great if you could organize that into a few key roles? Look, you have this great organizational structure chart there hanging on the wall. Imagine that you will automatically assign roles to your employees based on that nice chart. Think about how great it will be. Think about the savings [substitute your favorite sales arguments here].
Sooner or later Mr. Customer decides to implement IDM. And now it is your turn to fulfill what you promised. The first thing that you realize is that the "few key roles" will not be that few. You will need to build the "key roles" from sub-roles, many of which are system-dependent. And soon it turns out that the number of roles will exceed the number of employees. Well, you finally find a way to reduce the number of roles (I will describe this "Role Explosion" problem later), but you have another problem now.
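
A small Python sketch of what "key roles built from sub-roles" can look like. The role names, systems and the `ROLES` catalogue below are invented for illustration and are not taken from the project or from any provisioning product; the sketch only shows how one business role fans out into system-dependent privileges.

```python
# Hypothetical role catalogue: a composite role lists sub-roles,
# a leaf role lists (system, privilege) pairs.
ROLES = {
    "accountant": {"subroles": ["erp-finance-user", "mail-user"]},
    "erp-finance-user": {
        "privileges": [("ERP", "finance-read"), ("ERP", "finance-post")],
    },
    "mail-user": {"privileges": [("Groupware", "mailbox")]},
}

def expand(role, seen=None):
    """Return the set of (system, privilege) pairs implied by a role."""
    seen = set() if seen is None else seen
    if role in seen:
        # Guard against cycles in the catalogue.
        return set()
    seen.add(role)
    definition = ROLES[role]
    result = set(definition.get("privileges", []))
    for sub in definition.get("subroles", []):
        result |= expand(sub, seen)
    return result

accountant_privileges = expand("accountant")
```

Every system added to the picture tends to add its own leaf roles, which is one way the catalogue grows faster than expected.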

You need to assign the roles to the employees. You think: "Ahhh, now it's going to be easy. I just got the electronic version of that organizational chart, I'll put it somewhere and I'll hack the workflow to look at it while assigning roles." That's exactly how I looked at it. But now the reality:

The first fact that you find out is that there is no single source of organizational structure. One structure is in the HR system. And that's for accounting and statistics purposes. Then there is an orgstruct in the groupware system. That's used by all those little cozy applications like the telephone number directory and micro-workflows. And there are other (usually partial or modified) versions of the organizational structure in different applications, beginning with Active Directory and ending with quite-unimportant-department-application-that-was-not-really-used-in-years.

All of the responsible people would swear to you that all these structures are only replicas of the one primary organizational structure (usually the one in the HR system). "Oh, well, that's fine. I can choose the one that best suits me", you think. And then you choose one and start to use it in your workflows. All works fine on the test data, but it suddenly breaks down when deployed in the "real" environment. After a while (that feels like weeks to you, with all these managers screaming around) you'll find out that the organizational structures were not that similar after all. And according to Murphy's law you've chosen the worst one. But that does not really matter, because later you'll find out that there is no consistent source of organizational structure anyway. One version has most of the data that you need but the relations are all broken. The other has the relations fine, but there is no unique key by which to merge the two. And after a while you'll find out that the data there are "a bit outdated" anyway ...

The only thing that you can do is to return to the architecture & design phase and redesign the modules that deal with organizational structure. Current provisioning systems are great at merging people records, but fail terribly when it comes to merging organizational structure. So we've developed our own tools to do that. We merge the structure into a single "view" in the Directory Server and use that as a database for workflow decisions. And it works fairly well.
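
The following Python sketch is not the tooling we built; it is only an illustration of the kind of check that would have saved time. It assumes that a shared unit key has already been established between two sources (which, as described above, was the hard part), and all names and sample data are hypothetical. Instead of silently trusting one source, it reports where the sources disagree.

```python
def merge_org_units(primary, secondary):
    """Merge two {unit_key: parent_key} maps into one view.

    Returns (view, conflicts, unmatched):
      view      - primary data plus units found only in the secondary source
      conflicts - units whose parent differs between the two sources
      unmatched - units that exist only in the secondary source
    """
    view = dict(primary)
    conflicts = {}
    unmatched = []
    for key, parent in secondary.items():
        if key not in primary:
            unmatched.append(key)
            view[key] = parent
        elif primary[key] != parent:
            conflicts[key] = (primary[key], parent)
    return view, conflicts, sorted(unmatched)

# Hypothetical sample data: an HR export and a groupware export.
hr = {"root": None, "fin": "root", "it": "root"}
groupware = {"fin": "root", "it": "fin", "helpdesk": "it"}

view, conflicts, unmatched = merge_org_units(hr, groupware)
```

With this sample data the `it` unit shows up as a conflict and `helpdesk` as unmatched, which is the kind of report worth reading before any workflow starts depending on the merged view.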

Moral:

  1. Never trust anyone telling you that they have a single, consistent source of organizational structure that you can use.
  2. Always analyse the quality of the data, not only its availability.
  3. A provisioning system is not the only tool that you'll need. Prepare to develop your own "gadgets". Put some "padding" in the project plan for the unexpected.
Posted by semancik at 12:46 PM in Identity
More than a year ago a project was started. It took a year and a half of pre-sales activities to get it started, but it was finally launched. It was an Enterprise Identity Management project. Or more specifically, it was only the first part of a larger IDM vision. But what was so unique about this project that it's worth writing about?

First of all, we used new and exciting technology. The basic component was Waveset Lighthouse, now called Sun Java System Identity Manager. A great user provisioning product. The other components were Sun Java System Directory Server and home-brewed integration tools, all of them heavily customized to meet the customer's needs.

Then, it was the first complex IDM project in Central/Eastern Europe. It was quite rapid (less than a year) and it was successful. Something not frequently seen in these longitudes.

And finally, I've personally drafted the first preliminary pre-sales concept, architected the whole solution, technically led the project, took the most difficult technical bits for myself to implement and watched the whole project till the very end. From my egocentric point of view it is The Project to be proud of.

But it is not the success of the project that I want to write about. I think that the problems that we encountered, the little failures and unmet expectations - these are the things worth noting.

In the next few blog posts I want to describe some of the key problems that were not apparent in the early project phases and that struck with hurricane force at the most inconvenient times. These are usually not technological problems, but rather "philosophical" ones, concerning solution design, customer understanding and things like that. I have worked on several other IDM-related activities since then, and all these problems are apparent with other customers as well. So I decided that it is worth sharing the experience. Maybe it can save you a lot of time.

Stay tuned.

Posted by semancik at 11:21 AM in Identity
Monday, 10 October 2005
As I mentioned in my previous posts, I'm no big fan of the word "identity". Think a bit about Webster's definition of "identity":
1. The state or quality of being identical, or the same; sameness.
2. The condition of being the same with something described or asserted, or of possessing a character claimed; as, to establish the identity of stolen goods.
3. (Math.) An identical equation.
This looks like the definition of a mathematical equality operator. It is far away from the multi-faceted, polymorphic, ever-changing thing that is described by "digital identity" concepts.

You may also find a more "social" definition of the word "identity" (WordNet):

1: the distinct personality of an individual regarded as a persisting entity; ...
2: the individual characteristics by which a thing or person is recognized or known; ...
But I think that you are not interacting with the "individual" in the digital world. You are interacting with some representation of him that may not be identical to the real-world person.

Think about this blog. How can you be sure that it is indeed Radovan Semancik who has written these lines? How can you be sure that Radovan Semancik even exists? But does it even matter, as long as you like this site? It is not the real-world identity that matters in this weird digital world. It is the "persona" that matters. The part of the personality that is visibly presented.

As long as you like the way the persona presents itself and the way it acts, you do not need to know the link to the real-world entity behind it. That's the same as in normal social interactions in the real world. When you meet someone you will not ask for a government-issued ID document that states the identity of the person you met. You will likely believe the name that he tells you. And you will judge him by his acts ("persona"), not his name or SSN ("identity").

Forget the "Identity", think about "Persona".

BTW: "Identity" is really a nice buzzword. It sounds good. Say it to yourself: "Identity Management", "Identity Technology". It sounds trustworthy. It sounds like it really means something. Great buzzword, indeed. I can understand why sales people use it. But we had better avoid it, anyway.

Posted by semancik at 2:36 PM in Identity
Tuesday, 4 October 2005
I think there is no "Identity" for a physical person in the digital world. How could a computer know who I am? It's just a computer, after all.

My computer holds a data entity that describes some parts of me. It also knows that if something causes the keys to be pressed in the right sequence (a password entered on a keyboard), then it probably is the object that is described by this data entity. It also knows that it should associate every executed process and created file with that data entity. And it can also claim to other computers that something described by this data entity caused the current action.

But does the computer know me? It looks like we, people, are quite "virtual" objects from the computer's viewpoint. We cannot exist in the digital world. Only the data entities that describe us can really exist there.

I will call these data entities "personae" or "personas", as they are our masks in the digital world. And if you take a closer look at the digital world as it works today, you will notice plenty of personae there: accounts, profiles, database records, LDAP entries, sessions, ...

You as a single individual may maintain several personae that are based on your physical being: an employee persona, a citizen persona, a community persona. And you may also maintain several other personae: a role-playing-game persona or a non-real community persona. (Remember that old one: "On the Internet nobody knows you are a dog"?)

Personae propagate to other systems using claims. One system claims that the persona, as that system believes, has certain characteristics. Another system may evaluate the claims and build its own persona based on that information. You can also link personae together. That's what domains and realms do. And federation also, but in quite a different way.
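
As a rough illustration of this propagation, here is a minimal Python sketch. The `Claim` and `Persona` classes, the `build_persona` function and the idea of a simple set of trusted issuers are my assumptions for the example; they are not part of the essay or of any standard. The receiving system builds its own persona and keeps only what it chooses to believe.

```python
from dataclasses import dataclass, field

@dataclass
class Claim:
    issuer: str      # system that makes the claim
    subject: str     # persona identifier as known to the issuer
    attribute: str   # claimed characteristic, e.g. "department"
    value: str

@dataclass
class Persona:
    system: str      # system that holds this persona
    identifier: str
    attributes: dict = field(default_factory=dict)

def build_persona(system, identifier, claims, trusted_issuers):
    """Build a local persona from the claims this system decides to believe."""
    persona = Persona(system, identifier)
    for claim in claims:
        if claim.issuer in trusted_issuers:
            persona.attributes[claim.attribute] = claim.value
    return persona

claims = [
    Claim("hr-system", "emp-1234", "department", "finance"),
    Claim("unknown-site", "dog42", "species", "human"),
]
local = build_persona("portal", "u-77", claims, trusted_issuers={"hr-system"})
```

Note that the local persona gets its own identifier; the link back to the issuer's persona exists only through the claim that was accepted.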

I've put together an essay that describes this model. You can find it on the nLight web page. There is also a longer elaboration of persona linking, but that has not been published yet (it has been under review since May).

The model is a work in progress. Any comments are welcome. Please use the contact form to leave me a message.

Posted by semancik at 12:24 PM in Identity
Monday, 3 October 2005
Does anonymity really exist? I think not.

Do you think you are anonymous when you read this blog entry? The software that runs this blog knows the source IP address of your connection. How difficult is it to find out your Internet provider? A few queries in public databases. How difficult might it be to resolve the address further? Maybe a few tricks with DNS and I'll know the company you work for or the region you live in. And if you live in a country similar to the one I live in, I may get a bit further. Maybe if I invest some money in a bottle of expensive liquor and invest that bottle in the right person at the ISP, I may experience a sudden "vision" that may reveal part of your customer record.

Do you still think that you are anonymous?

You may use redirectors and anonymizers. But these are still run by somebody. And somebody may be corruptible. Modern cryptography helps a bit, but hey, the IP address is not the only bit of information I have. You have accessed my blog and you are reading this thing about anonymity. Well, you are a techie or researcher or something like that. I have the timestamp of your request. I'll just look at which part of the world has daytime at that moment. I have your User-Agent string - that may sometimes reveal your operating system and native language. I can measure the time difference between the request for the HTML page and the requests for images. If I'm lucky, that will give me an estimate of the "network distance" to you. And maybe also an estimate of the bandwidth of your connection. And what about a little JavaScript, Java applet or ActiveX control that may look around your computer? Are you sure your browser is secure? I'm sure you've got the idea now.
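
To show how little effort the timing trick takes, here is a Python sketch that computes the gap between a page request and the first following image request. The log format (ISO timestamp plus path) and the function name are assumptions made up for the example, and the resulting number is only a crude hint, not a measurement of anything precise.

```python
from datetime import datetime

def page_to_image_gap(entries):
    """Seconds between the first HTML request and the next image request.

    entries: list of (iso_timestamp, path) pairs for a single client.
    Returns None when no page/image pair is found.
    """
    parsed = sorted((datetime.fromisoformat(ts), path) for ts, path in entries)
    page_time = None
    for when, path in parsed:
        if page_time is None:
            if path.endswith(".html"):
                page_time = when
        elif path.endswith((".png", ".gif", ".jpg")):
            return (when - page_time).total_seconds()
    return None

# Hypothetical log lines for one visitor.
sample = [
    ("2005-10-03T17:48:00.250", "/images/logo.png"),
    ("2005-10-03T17:48:00.000", "/index.html"),
]
gap = page_to_image_gap(sample)
```

A handful of such crude signals, combined, narrows things down far more than any single one of them.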

People who worked with "Orange Book" class-B secure systems (Common Criteria LSPP) may tell you long stories about covert and subliminal channels. And these stories have a common moral: you cannot effectively fight the leakage of information. You may limit it, but it gets really expensive quite soon. And it limits usability. If you have ever worked with a class-B system, you know what I'm talking about.

If you would like to be absolutely anonymous, you must eliminate all the data that the other side may gather about you. And that's not practical. It will cost a fortune, you will barely be able to get the data you want, and there will still be a crack that may leak some data about you. Absolute anonymity is a myth; it does not exist in practice. Anonymity is a theory, usable for theoretical research. But not for the Internet. Anonymity is a buzzword, also. Good for selling "privacy" software. But that nice piece of smart software you've just bought may turn out to be just expertly packaged snake oil.

I'm not trying to say that there is no privacy on the Internet (or that there might not be). I'm just saying that there is no absolute anonymity. Part of your identity may be revealed and there is no effective means to stop it. We had better adopt the approach that has worked in the computer security area for decades: there is no absolute security. We should accept the fact that there is no absolute anonymity, no absolute privacy. One can always break privacy, given sufficient resources. The only thing that matters is that the cost of breaking privacy has to be kept really high. That is the primary goal of "privacy" technology.

Posted by semancik at 5:48 PM in Identity
A few weeks ago my friend Peter Fabian from the Budapest Sun office pointed me to the OpenSSO project, but only this week did I find a bit of time to go through the docs.

The documents look at the project from quite a high viewpoint. There are not many details about the system, just a high-level architecture. And that architecture is very similar to Sun Java System Access Manager. That was pretty much expected, and it is good if the "open" implementation and the SJS Access Manager will be compatible.

The release of the Session Module is scheduled for this month. I'm getting really eager to stick my nose into the code. And it looks like I may use the OpenSSO code in a project very soon.

Posted by semancik at 11:21 AM in Software