Radovan Semančík's Weblog

Friday, 11 December 2009

There are lots and lots of sites that hoard and gather content on the web: sites that offer to maintain your photo album, video collection, bookmarks and whatnot. Each and every such site tries to gather a community of its own. How can you tell apart the sites that are worth your attention from the sites that are a plain waste of time? How can you see whether there is a healthy community or just a bunch of uninteresting losers?

I have figured out a three-second test that seems to work quite universally. Just go to the site and use the search input field to search for some controversial topic. I usually search for "nude". If the search results are just porn or a horde of flame-infested discussions, the site is an uncontrolled wilderness. Avoid that site. If nothing relevant turns up, or you can see just some carefully censored bikini shots, the site is too conservative to be useful or entertaining. Avoid such a site as well. If the search results show a decent selection of artistic nudes or some good texts on nudity, it is worth the time to explore the site further.

Posted by rsemancik at 4:31 PM in misc

Tuesday, 1 December 2009

The World Wide Web architecture, and the REST architectural style as well, deal with resources. The resource is one of the central concepts of the web. Web pages are just representations of resources, resources are identified by URIs, the web is all about resources. But what is a resource? Now, that's a mystery.

The World Wide Web Architecture document provides a rather vague and indirect definition:

By design a URI identifies one resource. We do not limit the scope of what might be a resource. The term "resource" is used in a general sense for whatever might be identified by a URI. It is conventional on the hypertext Web to describe Web pages, images, product catalogs, etc. as “resources”. The distinguishing characteristic of these resources is that all of their essential characteristics can be conveyed in a message. We identify this set as “information resources.” [...] However, our use of the term resource is intentionally more broad. Other things, such as cars and dogs (and, if you've printed this document on physical sheets of paper, the artifact that you are holding in your hand), are resources too. They are not information resources, however, because their essence is not information. Although it is possible to describe a great many things about a car or a dog in a sequence of bits, the sum of those things will invariably be an approximation of the essential character of the resource.

That means that anything can be a resource. Dogs, houses, books, a specific version of a book, a specific paper copy of a book, a photograph of the book, files containing data scanned from that book in pixmap format, data containing the content of that book in ASCII format, the HTML-formatted content of that book, the web page that contains the HTML-formatted content of that book and even a web page describing that book in an electronic shop: all of these could be resources. But wait, isn't a web page containing the HTML-formatted content of the book in fact a resource representation? Yes, it is. And many of the objects and concepts mentioned above may be resource representations. And they may, at the same time, themselves be resources. In fact there seems to be no difference between a representation and a resource (maybe except for non-information resources). The world of the web is not black-and-white, with abstract resources and concrete representations (as seems to be at least partially assumed by REST). There are many shades of gray between abstract and concrete. And maybe pure abstractness and pure concreteness are just theoretical extremes that cannot be reached in practice. Such fuzziness of meaning is one of the most difficult parts of Web architecture to understand.
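
The blur between resource and representation can be illustrated with a minimal Python sketch. Everything in it is an assumption made up for illustration: the Resource class, its fields and the example.org URIs come neither from the Web Architecture document nor from REST. It only shows that the HTML rendition of a book is a representation of the book and, once it gets a URI of its own, a resource as well.

```python
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class Resource:
    """Hypothetical model: anything that has been given a URI."""
    uri: str
    # Representations keyed by media type, e.g. "text/html" -> bytes.
    representations: Dict[str, bytes] = field(default_factory=dict)

    def represent(self, media_type: str) -> Optional[bytes]:
        """Return the representation for one media type, or None if absent."""
        return self.representations.get(media_type)


# The book as a rather abstract resource with two representations.
book = Resource(
    uri="http://example.org/books/some-book",  # made-up URI
    representations={
        "text/plain": b"Plain ASCII text of the book",
        "text/html": b"<html><body>HTML text of the book</body></html>",
    },
)

# The HTML rendition gets its own URI, so it is now a resource too,
# while still being a representation of the book above.
html_edition = Resource(
    uri="http://example.org/books/some-book.html",  # made-up URI
    representations={"text/html": book.represent("text/html")},
)
```

The same bytes are reachable under two different URIs, once as a representation of the book and once as a representation of the HTML edition, which is exactly the kind of gray area described above.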

However, allowing real-world things to be resources creates an awful lot of problems. The panorama of these issues starts with the problem of who is authorized to assign a URI to the star known as "Sirius" (as it obviously can be a resource and it should have a single URI). Then it goes through the problem of completeness, as it is quite difficult to imagine that an "information resource" would capture all the aspects, characteristics, features and (potentially conflicting) viewpoints that concern a specific real-world thing. Many more problems follow, and I'm sure we do not yet see most of them. I've tried to capture the obvious problems in my paper. The Semantic Web activity is trying to address some of these issues, but so far it seems that the result is to make the problems machine-processable and efficiently distributable at Internet scale. I have seen no real solution so far.

Therefore I have proposed to limit the definition of a resource to include only so-called "information resources". Information resources may indirectly refer to real-world things and concepts, but the Web in fact does not need to (and cannot) deal with the real world directly.

Posted by rsemancik at 2:41 PM in Web architecture