Archive for December, 1969
Adsense, advertisers and spam tools
I finally caved and added Google Adsense ads to my weblog. The reason is quite simple: I happened to make a couple of Firefox wallpapers that turned out to be quite popular. Even two years after I placed them on my weblog, they still generate about 20 gigabytes of traffic each month. In the beginning it was more than four times as much, but given that I no longer host my weblog on my personal server behind an ADSL connection, I now have to actually pay for the bandwidth that I use.
The sad thing is that just one day after enabling Adsense I discovered the following ad on the main page of my website:
Given that I receive more than four thousand spam attempts every day on this weblog, have written several anti-spam plugins for Nucleus, and just disabled the trackback functionality on my website because of the flood of trackback spam, you can imagine that I was more than a little furious. Sure, I want to earn some money to pay for this website and I even hope it will make me enough money to buy a nice present for my girlfriend. But one thing I will absolutely not allow - no matter how much money is involved - is to let spammers use this weblog to promote their crap.
So I disabled all ads from this spammer - unfortunately this may take up to four hours - and wrote a nice little email to Google informing them of a policy violation. Google doesn't like spammers either, so I really hope they ban this guy's ass. I decided for now that the Adsense ads are staying, but I will keep a close lookout for any abuse from spammers. If anybody sees an ad for a spammer or spam tool on this weblog, please let me know. Leave a comment or e-mail me and I will disable the ad on this website and take it up with Google.
Finally, I want to quote from the website of the spammer:
"We Do Not Condone or Endorse Unethical Methods to Boost Search Engine Ranking"
Sigh...
Trackbacks are dead
I just disabled the trackbacks on my weblog - something that wasn't easy for me because I am the maintainer of the trackback plugin for Nucleus. The signal-to-noise ratio has simply become ridiculous - about one spam attempt every 15 seconds. There is a simple way to block all that spam - with 100% accuracy even - but given that I haven't received a proper trackback in months, I just think it isn't worth it anymore. Trackbacks are dead.
Placebo
Over the weekend I read an article on digg about a way to speed up Safari by removing an initial page loading delay. It reminded me of an article that Dave Hyatt, one of the developers of Safari, wrote about the FOUC problem last September. In this article he explains why browsers sometimes need to delay the rendering of the page because of the interaction between the loading of external CSS stylesheets and scripts accessing the not yet completely loaded DOM. Today Dave responds to the article on digg. The funniest thing I’ve read in a while.
Some of the initial reactions of the digg crowd:
wow..this really did work for me..i mean..the only useful otion (there is a handful of diff options) i used was turning off the delay to open pages. AND IT WORKED! my safari does seem a bit quicker..
Yep, page loading delay is absurd to say the least, that shoud be disabled by default.
Ya, this software saved Safari for me, I was using firefox for a while on my mac, but I found this little hack and started using Safari again simply becuase the pages load so damn fast now. It is definately a noticable difference.
Made a HUGE difference for me! Safari is usable now!
And the reaction of Dave Hyatt:
This just goes to prove how inaccurate people’s powers of perception are when it comes to measuring the performance of browsers. I say this because the preference in question is dead and does absolutely nothing in Safari 1.3 and Safari 2.0.
Never underestimate the power of the placebo effect :)
CSS Selectors: 5 months later
During my work on the CSS selector test for CSS3.info I discovered a number of bugs in the selector implementations of all popular browsers. So I filed a number of bug reports with additional information on how to fix these bugs. It's now five months later and time for a little update.
Konqueror
Konqueror only contained a single bug that resulted in 6 buggy selectors: attribute values should be treated case-sensitive in some cases and case-insensitive in other cases. I've worked together with the developers of Konqueror to come up with a definitive list of how each attribute should be treated. The solution was too late for the release of Konqueror 3.5.5, but it did make the next one. I want to congratulate the developers because this makes it the first browser to pass all of the tests and its developers can rightfully claim that Konqueror 3.5.6 fully supports all of the CSS selectors.
Firefox
Firefox suffered from a number of different bugs, including a variant of the same bug that plagued Konqueror. It did contain a list of how attributes should be treated, but it was simply not complete. I personally wrote a patch that updated the list to match Konqueror's. The patch was accepted and has made it into Gran Paradiso, the test version of the next generation engine for Firefox 3.
Unfortunately this is the only bug that was fixed in its CSS selector support. Firefox still suffers from a number of other problems. It incorrectly counts text nodes as children during the evaluation of :first-child, :last-child and :only-child. Additionally, the rendered view is not updated every time a selector starts or stops matching. For example, during the building of the DOM, each element is at one time the last child of its parent, at least until the next child is processed. After the DOM is built, the :last-child selector should only match the then-current last child. Firefox does this correctly, but it forgets to update the rendered view, so to the user (and also to our CSS selector test) it still looks like every element matches the selector. When you force Firefox to re-render the page - for example by hovering over a link with a :hover selector - it will show you the correct rendered view.
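To make the text node problem concrete, here is a minimal sketch of how the three structural selectors are supposed to count children. The `Node` class and the function names are my own invention for illustration and have nothing to do with the actual Firefox code; the only point is that text nodes are filtered out before the position is checked.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    kind: str                      # "element" or "text" (hypothetical model)
    name: str = ""
    children: List["Node"] = field(default_factory=list)


def element_children(parent: Node) -> List[Node]:
    # Only element nodes count; text between elements is ignored.
    return [child for child in parent.children if child.kind == "element"]


def is_first_child(parent: Node, node: Node) -> bool:
    elements = element_children(parent)
    return bool(elements) and elements[0] is node


def is_last_child(parent: Node, node: Node) -> bool:
    elements = element_children(parent)
    return bool(elements) and elements[-1] is node


def is_only_child(parent: Node, node: Node) -> bool:
    elements = element_children(parent)
    return len(elements) == 1 and elements[0] is node


# <p>some <em>important</em> text</p>
em = Node("element", "em")
p = Node("element", "p", [Node("text"), em, Node("text")])
```

In this example the em element is surrounded by text on both sides, yet it is still the first, the last and the only child of the paragraph.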
Safari
WebKit, the engine on which Safari is built, has made a number of changes to its CSS selector support. First of all, it now properly supports the :lang() selector, and the developers started work on some other CSS selector bugs. Unfortunately they decided to stop working on those bugs and temporarily drop support for these CSS 3 selectors.
Even though I'd rather see full support for all selectors, I do agree with their reasoning that it is better to drop support than to ship a browser with buggy support. Mac OS X 10.5 and Safari 3 are coming this spring and shipping a stable browser is very important. So I was not surprised that the WebKit developers announced that they were temporarily shifting their focus to stability instead of new features. CSS 3 selector support seems to be a victim of this change in focus, but I have full confidence that they will continue working on this after the release of Safari 3.
CSS selector bugs: Case sensitivity
The previous version of the CSS selector test contained a test case for determining if the value of an attribute selector was compared in a case-insensitive way. We only tested the align attribute – which should be treated in a case-insensitive way. But there are also a lot of other attributes which should be tested in a case-sensitive way.
When the CSS attribute selectors are evaluated the browser will compare the value of an element attribute with the value specified in the selector. If there is a match it will apply certain style rules to the element to which that attribute belongs.
There are six ways in which an attribute value can be compared:
- is it equal to the specified value,
- is it a space-separated list that contains the specified value,
- is it a hyphen-separated list of which the first word is equal to the specified value,
- does it begin with the specified value,
- does it contain the specified value,
- does it end with the specified value.
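The six comparisons can be sketched in a few lines of Python. This is only an illustration under my own assumptions: the function name `attribute_matches` and the `case_sensitive` flag are made up, and I use a plain `lower()` for the case-insensitive variant, which is good enough for ASCII attribute values. The empty-string guards follow my reading of the CSS 3 selectors draft, where a substring selector with an empty value matches nothing.

```python
def attribute_matches(op: str, actual: str, wanted: str,
                      case_sensitive: bool = True) -> bool:
    """Check attribute value `actual` against `wanted` for one selector operator."""
    if not case_sensitive:
        # Assumption: simple lowercasing is enough for this sketch.
        actual, wanted = actual.lower(), wanted.lower()
    if op == "=":    # [att=val]  equal to the value
        return actual == wanted
    if op == "~=":   # [att~=val] space-separated list containing the value
        return wanted in actual.split()
    if op == "|=":   # [att|=val] the value itself, or the value followed by "-"
        return actual == wanted or actual.startswith(wanted + "-")
    if op == "^=":   # [att^=val] begins with the value
        return wanted != "" and actual.startswith(wanted)
    if op == "*=":   # [att*=val] contains the value
        return wanted != "" and wanted in actual
    if op == "$=":   # [att$=val] ends with the value
        return wanted != "" and actual.endswith(wanted)
    raise ValueError("unknown operator: " + op)
```

Every operator goes through the same case handling first, which is exactly why the question of case-sensitivity affects all six comparisons at once.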
What does the CSS specification say?
The problem is that all of these comparisons can be handled in two different ways: case-sensitive and case-insensitive. Which one should be used is not defined by the CSS standard. Instead, the specification tells us explicitly that the case-sensitivity is determined by the document language: in our case HTML. So we should look at the HTML standard to find out which should be used.
The case-sensitivity of attribute names and values in selectors depends on the document language.
What does the HTML standard say about this?
The HTML standard makes it quite difficult for us. There is no single correct way. Each attribute has its own rules. So basically we need to make a list of which attributes should be handled in which way.
Each attribute definition includes information about the case-sensitivity of its values.
Fortunately there are a couple categories:
| Category | Description |
|---|---|
| CS | The value is case-sensitive (i.e., user agents interpret "a" and "A" differently). |
| CI | The value is case-insensitive (i.e., user agents interpret "a" and "A" as the same). |
| CN | The value is not subject to case changes, e.g., because it is a number or a character from the document character set. |
| CA | The element or attribute definition itself gives case information. |
| CT | Consult the type definition for details about case-sensitivity. |
The first and second categories are clear. The first should be treated in a case-sensitive way, the second in a case-insensitive way. The third is also clear: it can be treated in either way, it should not matter. The fourth and fifth are bigger problems – we should look at the attribute definition or the type of the attribute.
The only attribute that uses the CA category is the value attribute of the input element. We do not know how each value should be treated, because it depends on the purpose of the input element. If the input is used for entering a phone number it should be considered case-neutral, but for entering most other information it is important that the information is treated in a case-sensitive way. Because we do not know the purpose of the input element, we need to treat everything in a case-sensitive way.
All attributes that use the CT category use one of the following three types: script, uri and uri list. A list should be treated the same way as the type of its items, so we need to look at the script and uri types:
URIs in general are case-sensitive. There may be URIs, or parts of URIs, where case doesn't matter (e.g., machine names), but identifying these may not be easy. Users should always consider that URIs are case-sensitive (to be on the safe side).
So, uris should be treated in a case-sensitive way – just to be on the safe side. But how about scripts?
The case-sensitivity of script data depends on the scripting language.
The case-sensitivity of the script type is determined by the language of the script. We do not know which language is used by the script type, but we do know that JavaScript is case-sensitive. So just like the uri type we can consider this type to be case-sensitive – again just to be sure.
The CT category should be treated in the same way as the CS category.
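Summed up as a small lookup, the reasoning so far looks like this. The names are hypothetical, and falling back to a case-sensitive comparison for the case-neutral category is simply my choice in this sketch, since either comparison gives the same result there.

```python
# True = compare case-sensitively, False = compare case-insensitively,
# None = case-neutral (it does not matter which comparison is used).
CATEGORY_TREATMENT = {
    "CS": True,
    "CI": False,
    "CN": None,
    "CA": True,   # value of input: purpose unknown, so stay on the safe side
    "CT": True,   # script, uri and uri list: treated the same as CS
}


def compare_case_sensitively(category: str) -> bool:
    treatment = CATEGORY_TREATMENT[category]
    # Assumption: for neutral values we arbitrarily pick case-sensitive.
    return True if treatment is None else treatment
```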
Some more problems
Unfortunately there are a couple of problems in the HTML specification. The category for an attribute is dependent on the element to which it belongs. For example: the name attribute should be treated in a case-sensitive way if it belongs to an a element. If it belongs to an input element it should be treated in a case-insensitive way. Luckily this only applies to a limited set of attributes: type, name, value, width and height.
The width and height attributes generally belong to the CN category. It does not matter how they are treated. This changes if the width and height attributes belong to the applet element. In that case they belong to the CI category. This probably is a bug in the specification, because the type is the same in both cases: length. And the length type specifically tells us:
Length values are case-neutral.
So we can treat the width and height attributes as case-neutral regardless of the element to which they belong.
The value attribute can be in the CS, CA or even CN category. We already determined that we should treat attributes in the CA category in a case-sensitive way. We also know that it does not matter how an attribute in the CN category is treated. So we can safely treat all value attributes in a case-sensitive way.
The name attribute is one that we cannot solve. For some elements it should be treated in a case-sensitive way, for example the a element. Other elements, such as the input element, expect it to be treated in a case-insensitive way. There is no way around it. We need to look at the element if we want to know how to treat this attribute.
The type attribute is an even bigger problem. Not only does this attribute depend on the element it belongs to, but in one case it even depends on the parent element of the element it belongs to. Consider the following possibilities: the object element defines the type attribute as a content-type. This should be evaluated in a case-insensitive way. The same applies to the input element and the ul element. Both are case-insensitive. The type attribute of the ol element is case-sensitive: the case of the value directly influences how the list is displayed.
The li element is what makes it even worse. If the li element is a child of a ul element, its type attribute can contain the same values as the type attribute of the ul. This makes the type attribute of the li element case-insensitive. However, if it is a child of an ol element, it can only have the same values as the ol element, making the type attribute of the li element case-sensitive. There is simply no proper way to make it easier.
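To show how awkward this is, here is a small sketch of what a strictly correct lookup for the type attribute would have to do. The function name is made up, and it only covers the elements discussed above; anything else raises an error rather than guessing.

```python
from typing import Optional


def type_is_case_sensitive(element: str, parent: Optional[str] = None) -> bool:
    """Strict treatment of the type attribute for the elements discussed here."""
    if element in ("object", "input", "ul"):
        return False
    if element == "ol":
        return True
    if element == "li":
        # The answer depends on the parent list, not on the li itself.
        if parent == "ol":
            return True
        if parent == "ul":
            return False
    raise ValueError("combination not covered in this sketch")
```

The extra `parent` argument is the ugly part: a simple table that maps attribute names to a treatment can never express this.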
The last problem that we need to solve is that a couple of attributes were simply forgotten in the HTML specification. Nowhere in the specification is there a definition of how they should be treated:
- param > value
- param > name
- img > align
- object > align
- applet > align
Given that there are a couple of other elements that do define a behaviour for the align attribute we can safely assume that these should behave in the same way. Other elements that use the align attribute treat it in a case-insensitive way. So we should also do this for the img, object and applet element.
The problem of the value attribute is also simple to solve. We already discovered that in all other cases it should be treated in a case-sensitive way. We should also do this for the param element.
The name attribute is a bit more difficult. There is no single way to treat this attribute. Some elements treat it in a case-sensitive way, some in a case-insensitive way. We do know that the param element is dependent on external factors. It is up to the browser plugin to determine how it should be handled. And once again, because we do not know how it should be treated, we should be on the safe side. Treat it in a case-sensitive way.
Our list of attributes
| Treatment | Attributes |
|---|---|
| Case-sensitive | title, id, class, content, scheme, datetime, summary, headers, abbr, standby, code, object, alt, label, prompt, for, value, profile, background, cite, href, src, longdesc, usemap, classid, codebase, data, archive, action, onload, onunload, onclick, ondblclick, onmousedown, onmouseup, onmouseover, onmousemove, onmouseout, onfocus, onblur, onkeypress, onkeydown, onkeyup, onsubmit, onreset, onselect, onchange, param:name |
| Case-insensitive | lang, dir, http-equiv, text, link, vlink, alink, compact, align, frame, rules, valign, scope, axis, nowrap, hreflang, rel, rev, charset, codetype, declare, valuetype, shape, nohref, media, bgcolor, clear, color, face, noshade, noresize, scrolling, target, method, enctype, accept-charset, accept, checked, multiple, selected, disabled, readonly, language, defer, img:name |
| Case-neutral | version, width, start, border, cellspacing, cellpadding, char, charoff, span, rowspan, colspan, height, coords, hspace, vspace, style, size, rows, cols, frameborder, marginwidth, marginheight, maxlength, tabindex, accesskey |
What about XHTML?
Up till now we've only talked about HTML and the HTML specification. Although XHTML looks like HTML, it is a different language. Because the case-sensitivity is determined by the language, we need to take a completely new look at all the attributes. What does the XHTML spec say about case-sensitivity?
HTML 4 and XHTML both have some attributes that have pre-defined and limited sets of values (e.g. the type attribute of the input element). In SGML and XML, these are called enumerated attributes. Under HTML 4, the interpretation of these values was case-insensitive, so a value of TEXT was equivalent to a value of text. Under XML, the interpretation of these values is case-sensitive, and in XHTML 1 all of these values are defined in lower-case.
If an attribute consists of a list of defined values it should be treated in a case-sensitive way. Previously these values could be treated in a case-insensitive way. The same also applies to boolean attributes. These should contain their default value, which is defined in the DTD as a case-sensitive string.
The XHTML specification does not indicate that any other attributes should be treated in a different way. We can compare all the other attributes in the same way as we did for HTML attributes. So the table only changes a little for XHTML documents.
| Treatment | Attributes |
|---|---|
| Case-sensitive | title, id, class, content, scheme, datetime, summary, headers, abbr, standby, code, object, alt, label, prompt, for, value, profile, background, cite, href, src, longdesc, usemap, classid, codebase, data, archive, action, onload, onunload, onclick, ondblclick, onmousedown, onmouseup, onmouseover, onmousemove, onmouseout, onfocus, onblur, onkeypress, onkeydown, onkeyup, onsubmit, onreset, onselect, onchange, dir, compact, align, scope, nowrap, frame, rules, valign, declare, valuetype, shape, nohref, clear, noshade, noresize, scrolling, method, checked, multiple, selected, disabled, readonly, defer, param:name |
| Case-insensitive | http-equiv, text, link, vlink, alink, lang, axis, hreflang, rel, rev, charset, codetype, media, bgcolor, color, face, target, enctype, accept-charset, accept, language, img:name |
| Case-neutral | version, width, start, border, cellspacing, cellpadding, char, charoff, span, rowspan, colspan, height, coords, hspace, vspace, style, size, rows, cols, frameborder, marginwidth, marginheight, maxlength, tabindex, accesskey |
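The difference between the HTML and the XHTML table can be written down as a simple set subtraction. The variable names below are mine, and the element-specific entries (param:name and img:name) are left out to keep the sketch short. The second set is just the list of enumerated and boolean attributes that moved from the case-insensitive row to the case-sensitive row.

```python
HTML_CASE_INSENSITIVE = set("""
    lang dir http-equiv text link vlink alink compact align frame rules valign
    scope axis nowrap hreflang rel rev charset codetype declare valuetype shape
    nohref media bgcolor clear color face noshade noresize scrolling target
    method enctype accept-charset accept checked multiple selected disabled
    readonly language defer
""".split())

# Enumerated and boolean attributes: case-sensitive under XHTML.
NOW_CASE_SENSITIVE = set("""
    dir compact align scope nowrap frame rules valign declare valuetype shape
    nohref clear noshade noresize scrolling method checked multiple selected
    disabled readonly defer
""".split())

XHTML_CASE_INSENSITIVE = HTML_CASE_INSENSITIVE - NOW_CASE_SENSITIVE
```

What remains in `XHTML_CASE_INSENSITIVE` is the case-insensitive row of the XHTML table above.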
A workable solution
All these rules and exceptions are not really a workable solution for browser developers. We need something simpler: two simple lists that determine how an attribute should be treated, one for HTML and one for XHTML. Luckily, a workable solution isn't that difficult to distill from our present lists. The two problematic attributes are name and type. By looking at how these attributes are used in the real world we can easily classify them as case-sensitive or case-insensitive.
First is the name attribute. This attribute determines the name of the variable that is created on the server after the form is submitted. Because most server-side scripting languages are case-sensitive, the real-world usage of this attribute is case-sensitive. Another possibility is that this attribute is used on a form element, a usage that is deprecated and replaced by the case-sensitive id attribute. In the real world it should also not give any problems to treat this attribute in a case-sensitive way for the frame and the iframe element. Conclusion: just treat this attribute in a case-sensitive way.
The type attribute can also be easily solved. Because the type attribute is deprecated on the ul, ol and li elements, we can simply ignore these elements. When XHTML is used, the type attribute of button and input is also easily solved in the real world. XHTML demands that you specify the value of both attributes in lower-case. If the XHTML document is valid, it does not matter if the value is compared using a case-insensitive method, even though that is, strictly speaking, not strict enough. In the real world you are not going to run into any problems. So simply treat the type attribute in a case-insensitive way.
So now we have our simple solution. Based on the information above we can produce a list of attributes that should be treated in a case-insensitive way. All other attributes can be treated in a case-sensitive way - even the neutral attributes.
| Document type | Case-insensitive attributes |
|---|---|
| HTML | lang, dir, http-equiv, text, link, vlink, alink, compact, align, frame, rules, valign, scope, axis, nowrap, hreflang, rel, rev, charset, codetype, declare, valuetype, shape, nohref, media, bgcolor, clear, color, face, noshade, noresize, scrolling, target, method, enctype, accept-charset, accept, checked, multiple, selected, disabled, readonly, language, defer, type |
| XHTML | http-equiv, text, link, vlink, alink, lang, axis, hreflang, rel, rev, charset, codetype, media, bgcolor, color, face, target, enctype, accept-charset, accept, language, type |
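As a rough illustration of how little code the simplified solution needs, here is a sketch with the two lists and a single lookup. The function names and the `xhtml` flag are assumptions of mine and do not come from any browser; lowercasing the attribute name for HTML documents is also my own choice, because HTML attribute names are not case-sensitive while XHTML names are.

```python
HTML_CASE_INSENSITIVE = frozenset("""
    lang dir http-equiv text link vlink alink compact align frame rules valign
    scope axis nowrap hreflang rel rev charset codetype declare valuetype shape
    nohref media bgcolor clear color face noshade noresize scrolling target
    method enctype accept-charset accept checked multiple selected disabled
    readonly language defer type
""".split())

XHTML_CASE_INSENSITIVE = frozenset("""
    http-equiv text link vlink alink lang axis hreflang rel rev charset
    codetype media bgcolor color face target enctype accept-charset accept
    language type
""".split())


def is_case_insensitive(attribute: str, xhtml: bool = False) -> bool:
    if xhtml:
        return attribute in XHTML_CASE_INSENSITIVE
    return attribute.lower() in HTML_CASE_INSENSITIVE


def values_equal(attribute: str, actual: str, wanted: str,
                 xhtml: bool = False) -> bool:
    """Compare an attribute value with a selector value using the two lists."""
    if is_case_insensitive(attribute, xhtml):
        return actual.lower() == wanted.lower()
    # Everything else, including the neutral attributes, is case-sensitive.
    return actual == wanted
```

With these lists a selector like [align=left] matches align="LEFT" in an HTML document but not in an XHTML document, while [type=text] matches type="TEXT" in both.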