May 06, 18:13

Google's Accelerator in need of a recall

When car companies produce faulty vehicles that cause horrible accidents or aggravated injuries in ordinary mishaps, they do a recall. Google's Web Accelerator is causing death and destruction at an admittedly much lower level of criticality, but it too is in need of an immediate recall.

Google screwed up on this one. Really bad. But that's not what's important. Or at least not the most important thing. No, the trial of "do no evil" is how quickly and effectively Google can react to their screw-up and set things right. How fast will they make the calculation:

Take the number of accelerators in the field, A, multiply by the probable rate of failure, B, multiply by the average out-of-court outrage over loss of data, C. A times B times C equals X. If X is less than the cost of a recall, we don't do one.

Google needs to acknowledge the size of X immediately and act on it yesterday.


Challenge by cron on May 06, 19:59

GWA has a built-in mechanism to upgrade itself to new versions. The "cost of a recall" would amount to releasing a new version with the more aggressive prefetching disabled.

Challenge by floppy on May 06, 20:06

A bit hyperbolic. It seems you and Backpack are the ones with the problem, as Google Web Accelerator existed first, and is actually following the standards and recommended practices.

Challenge by cron on May 06, 20:20

Is there any detailed info on the WA's behaviour available? i.e. has somebody sat down with some access logs and/or a packet sniffer and determined exactly what it does and hence who's affected?

(Trying hard not to bitch about how much easier this would be if we had source code...)

Challenge by David Heinemeier Hansson on May 06, 20:27

floppy: Heh. Check the comments section on the linked post to 37svn. One guy had leads zapped from his CRM intranet, phpmyadmin could wipe your databases in an instant, Blogger (owned by Google) could wipe all your comments.

This is a huge freaking deal. If my typographic sense (ha!) hadn't got the better of me, I'd be overloading the post with exclamation marks.

Challenge by John Nilsson on May 06, 20:45

This is the exact reason why rfc2616 states that a GET request SHOULD be safe.

Don't blame Google, if anything this will get web developers to respect standards.

Anyone who thinks this is a problem should read section 9.1 of rfc2616 a few more times.
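To make the "safe GET" rule concrete, here is a minimal sketch of a server-side guard that refuses to run state-changing actions when the request arrives via a safe method. It is plain Ruby rather than any framework's API; the action names and the allowed? helper are assumptions made up for illustration.

```ruby
# Hypothetical guard: state-changing actions must not run on safe methods.
# GET and HEAD are the methods RFC 2616 section 9.1.1 singles out as safe.
SAFE_METHODS = %w[GET HEAD].freeze

# Made-up action names standing in for "anything with side effects".
DESTRUCTIVE_ACTIONS = %w[destroy revert update].freeze

# Returns true when the request may proceed, false when it should be rejected.
def allowed?(http_method, action)
  return true unless DESTRUCTIVE_ACTIONS.include?(action)
  !SAFE_METHODS.include?(http_method.upcase)
end
```

With this in place, allowed?("GET", "show") and allowed?("POST", "destroy") pass, while allowed?("GET", "destroy") is rejected, so a prefetcher or spider following the link cannot trigger the delete.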

Challenge by RealWorld on May 06, 21:33

"Anyone who thinks this is a problem should read section 9.1 of rfc2616 a few more times."

Yeah, I read it a few times.

Now point me to the section in the HTML spec that makes it just as simple to use POST as it is to use GET. Now pretend I don't have JavaScript.

The reality is that unless I want a complicated page filled with a million forms, we will continue to use GETs instead of POSTs for most links in our user interface.

Challenge by Aredridel on May 06, 21:59

It was the misuse of GET that bugged me about instiki when I first tried it.

There's a very good reason for that RFC -- it's the reason Instiki had rollback problems, and it keeps lots of neat things from evolving, because one has to be so careful of widespread bad practice.

Challenge by Michael Koziarski on May 06, 22:01

RealWorld: You obviously need a more productive programming environment.

John's right that GET should always be safe; however, ignoring the confirmation JavaScript is a bit off. There's lots of code out there which relies on "GET plus confirmation" being safe. As buggy as that is from the spec's point of view, Google should've tested their application against a wider range of web applications.

Challenge by dru on May 06, 22:42

Since one can make up as many rel="???" values as one wants, it would be nice if Google did a rel="google:noaccel" or something. That might be the simplest solution. But maybe not.

Challenge by Scott Raymond on May 06, 22:57

It's a drag that GWA is causing all these problems, but I think it should be viewed as an opportunity to improve our web applications.

Rails makes it fairly easy to do things the "right" way, and use HTTP POST for operations with side effects. But unfortunately, it has also become standard Rails style to use GET for "dangerous" operations, like deleting posts. John is exactly right that RFC2616 warns against this, and responsible web developers should follow the guidelines.

This problem could be turned into a win for Rails. Just modify the link_to helper method to optionally specify whether to use GET or POST. If the developer wants to use a link to destroy a post, they'd enter something like . The result would be a link with a Javascript handler that simply sends the request as POST rather than GET.

In other words, Rails could make it extremely easy for the developer to do the right thing.

Challenge by Scott Raymond on May 06, 22:58

Sorry, my little hypothetical Rails code sample got cut out. It was:

"link_to('Delete', :action => 'destroy', :use_post => true)"
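For what it's worth, the idea can be sketched outside Rails in a few lines. This is not the real link_to; post_link and the h escaping helper are hypothetical names, and the generated JavaScript is just one plausible way to turn a click into a POST.

```ruby
# Minimal HTML escaping so the sketch needs no libraries.
HTML_ESCAPES = {
  "&" => "&amp;", "<" => "&lt;", ">" => "&gt;", '"' => "&quot;", "'" => "&#39;"
}.freeze

def h(text)
  text.to_s.gsub(/[&<>"']/, HTML_ESCAPES)
end

# Hypothetical helper: builds a link whose click handler creates a form,
# submits it with POST to the link's own href, and cancels the normal GET.
def post_link(label, url)
  js = "var f = document.createElement('form'); " \
       "f.method = 'POST'; f.action = this.href; " \
       "document.body.appendChild(f); f.submit(); return false;"
  %(<a href="#{h(url)}" onclick="#{h(js)}">#{h(label)}</a>)
end

puts post_link("Delete", "/posts/destroy/1")
```

Note that the href is still present, so anything that follows the link without running the script will still issue a GET. The server would have to reject GET for that URL for this to be any protection at all.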

Challenge by jd on May 06, 23:17

Yes, GET *should* be safe - but we all know that it isn't. The problem here is that GET-with-arguments is followed, and that it's enormously difficult getting certain legacy browsers - due partly to limitations in HTML 4.0, and the relative lack of useful DOM interfaces to POSTing data - to handle an above averagely complex user interface without the use of hyperlinks in place of form buttons.

This would be less of a problem if it weren't for people logging in to web pages (via GWA, which almost certainly does mean using POST). The only very-probably-safe approaches I can see that google could take would be:

- put a session into no-prefetch mode as soon as you see any POST at all (ie, the original login)
- or don't do prefetches when there's a cookie to be sent (particularly as the cookie could get rewritten in another window or javascript before the link is clicked anyway)
- or don't prefetch GET-with-args URLs

...or some combination thereof.
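As a rough sketch of how those three rules might combine, assuming the proxy tracks per-session state (the Session struct and prefetch? are invented for illustration and say nothing about how GWA actually works):

```ruby
require "uri"

# Hypothetical view of what a prefetching proxy knows about a browsing session.
#   seen_post  - true once any POST (e.g. a login) has gone through
#   has_cookie - true if a cookie would be sent with the prefetch request
Session = Struct.new(:seen_post, :has_cookie)

# Returns true only when none of the three conservative rules objects.
def prefetch?(session, url)
  return false if session.seen_post               # rule 1: a POST was seen
  return false if session.has_cookie              # rule 2: a cookie would be sent
  return false unless URI.parse(url).query.nil?   # rule 3: GET-with-args URL
  true
end
```

Under these assumptions a plain URL in an anonymous session is prefetched, while anything with a query string, a cookie, or an earlier POST in the session is left alone.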

The *best* solution of course would be to either rewrite the http: URI scheme to support in-line specification of the request method (eg http:POST//some.server/whatever?something) and get everyone to use it, or rewrite HTML 4.0 to support nested/overlapping forms (whatwg.org attempts to suggest such changes). But both are well beyond Google's ability to do.

Adding a rel="noprefetch" link type or rel="destructive" link type, like dru says, would also do the job, but you've still got the problem of making all web sites use it where appropriate, which really is wishful thinking.
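On the consuming side, honouring such a link type would be cheap. A minimal sketch, assuming the hypothetical link types "noprefetch" and "destructive" from above (neither is an established value) and treating rel as the usual space-separated list:

```ruby
# Hypothetical opt-out link types; neither is a standardized rel value.
OPT_OUT_RELS = %w[noprefetch destructive].freeze

# Returns true when the rel attribute value contains any opt-out token.
# rel is a space-separated list, so rel="nofollow noprefetch" also matches.
def opted_out?(rel_value)
  tokens = rel_value.to_s.downcase.split
  !(tokens & OPT_OUT_RELS).empty?
end
```

A link without a rel attribute (nil here) is not opted out, which is exactly the adoption problem: every unmarked destructive link stays exposed.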

MK: Spiders (and proxies) don't run client-side scripts, and they quite definitely shouldn't either. What's more, I doubt that request-confirmation javascript code runs on the overwhelming majority of affected links, so you'd still have the same problem even if google did look for onclick=~/\bconfirm\(/. I'd agree in principle that this *beta feature* should, in retrospect, have had more complete usage testing beforehand, but you can hardly blame Google for looking at proxies, looking at their search engine, adding one and one (safe and safe) and making two (er... still safe?).

Anyway, Google being quite well-staffed with experienced algorithm wranglers, I'm sure they'll have some excellent heuristics for detecting such links within a few days. In the meantime, please remember - and please remind affected users - that no matter how reliable Google usually are, the feature is 'beta' and so should not be trusted with anything that you value.

Challenge by jd on May 06, 23:36

Just to correct myself slightly: of course any link that would trigger a non-null (or non-nil to Ruby-heads ;-) ) onclick event would have to be treated as non-deterministic in terms of the fact that you don't know that the hyperlink as-is will ever be followed.

i.e. - as mentioned by others above - the in-line script could return false (effectively "raise()"ing out of the event), it could rewrite the 'href' attribute before it's actually followed (okay, that's unlikely), or it could do something (eg. window.close()) which would completely stop processing of events in the page. So I think, actually, not prefetching links with a present-and-not-empty "onclick" attribute would be quite sensible even if, as I mentioned above, that only solves the problem for some cases.
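The check itself is tiny. A sketch, assuming the link's attributes have already been parsed into a hash keyed by lowercase attribute name (the function name is made up):

```ruby
# Returns true when a link carries a present-and-not-empty onclick attribute,
# in which case the prefetcher cannot know the href will ever be followed as-is.
def skip_prefetch_for_onclick?(attrs)
  handler = attrs["onclick"]
  !handler.nil? && !handler.strip.empty?
end
```

So a link with onclick="return confirm('Sure?')" would be skipped, while a link with no onclick, or a whitespace-only one, would still be a prefetch candidate.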

Challenge by Tom Moertel on May 06, 23:38

Google didn't screw this one up. The blame for this debacle lies with the programmers and designers who used GET-based links to trigger destructive events in ignorance/spite of the standards.

I know that buttons aren't sexy. But sacrificing all in the name of sexiness does not make sense. The safeness of GET and HEAD are fundamental to the way HTTP is designed. This convention is a part of the HTTP specs going back a decade. It is the foundation for web-wide caching. It is the dividing line that tells smart user agents where they can and cannot tread. It is the contract upon which new and deeply cool web technologies are being created right now.

Should we put these things at risk because some people would rather use GET than follow the standards? No way.

Yes, for the short term, Google may have to provide work-arounds to prevent some nasty situations. Nevertheless, I hope that the creators of broken web apps take this wake-up call seriously. I hope that they see it as an opportunity to fix problems instead of a reason to rally behind and defend a poor practice, even if the practice is all too common in the "real world."

Challenge by Eric on May 06, 23:41

This is ridiculous! Web App designers (e.g., those at 37 Signals) are violating HTTP standards to make things look cool, and then they blame Google when things go wrong!?!? Please!


37 Signals screwed up on this one. Really bad. They should issue a recall.

Challenge by matt colgan on May 07, 1:16

If they'd only used the Mac to develop it, it would be quality software.

Challenge by Mark on May 07, 4:18

I also remember Instiki having terrible rollback problems because the "Revert" link was implemented using a GET which, by rights, should be idempotent.

This was in the days before the GWA - it was simple spiders crawling instiki.org that were inadvertently messing up the wiki.

As much as I hate to do it, David, I'm going to have to side with Google on this one. Following a link, whether by hand or by some automated process, should not change the state of a server's data. I would have thought that your experience with Instiki rollbacks would have driven this message home.

I guess the problem is worse now with GWA. The pre-fetch operation can take place on non-public parts of a site where much of the admin is kept.

Google should, and probably will, disable pre-fetches for GWA. Things might go back to normal for a while but how long will it be before some other browser plug-in or search engine comes along and destroys important data on a site that uses non-idempotent GETs? You cannot expect every new development to take care of this problem if it is not well known (ie, part of some standard or RFC).

Challenge by Jarkko Laine on May 07, 9:46

Eric (and Mark),

Please read the comments on the SvN blog entry and esp. Simon Willison's post. The spec says that developers shouldn't use GET, it doesn't say they are violating the specs if they do. Actually it's specifically said that there can be valid reasons to disobey these recommendations.

I sincerely admit that we as web app developers have a lot to learn from this episode but I still think you're distorting the discussion by bashing 37signals for this. It would be understandable if web application development were starting from scratch today. But it isn't. There's a whole sea of existing applications on the web that will be bitten by this and it's just plain nonsense going around screaming that it's your own fault.

As soon as people start using GWA and wreaking havoc in this imperfect world, they'll just be mad at Google and stop using the Accelerator. That's hardly what Google wants and as it's impossible for them to fix all the broken web apps in the world, there's realistically only one option left for them.

For another (bad) metaphor, this is about the same as leaving all the safety equipment out of a car because "if everyone obeys the traffic rules and laws, there will be no accidents".

Challenge by Brian Andersen on May 07, 13:11

WTF.

I'm sorry you guys, but turning this into some discussion whether or not to use standards when programming is just insane.

Safari and Mozilla don't break on table-based design just because it's not "standard"; they deal with real-world situations and try to make the best of it.

So people don't adhere to standards all the time, fine, that should change in the long run. But until it's changed, everyone better optimize their websites for IE6 and the other crap that's flying around - by realizing your own way isn't the only one.

Jarkko is making sense here; there's only one option, and that is a recall right at this instant.

Challenge by Eric on May 07, 18:51

OK, Jarkko, I read the post. You may not be up to date, so let me quote from the entry, which is evolving through updates:


Update 2: If you haven't been following the comments, I've had a change of heart. Even in the absence of Web Accelerator, hiding behind authentication leaves your application open to some very nasty security vulnerabilities (malicious pages can piggy-back your session and cause havoc making dangerous GET calls). I still think the RFC language covers people who thought long and hard before implementing a dangerous GET, but if you haven't thought about security and accelerating caching proxies such as Web Accelerator you haven't been thinking hard enough. [emphasis added]


So, Jarkko, did you think long and hard enough? Clearly you haven't. As you say in your subsequent post, you just wanted to get things done.

Challenge by Simon Hawkins on May 09, 14:10

As a developer assessing Rails for a new development (previous developments have been PHP) I consider it very important that GET links are idempotent. Whatever fix is considered (and this one looks like it solves most of the problem http://community.moertel.com/ss/space/start/2005-05-08/1 ) it would be great if it did not rely on Javascript. When building interfaces I only use Javascript to add enhancements to the UI, not as a requirement for it, that way guaranteeing usability whatever the browser or settings. From what I have seen so far of Rails everything can be used with or without Javascript (except AJAX of course!) - would be great if this continues to be the case.
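A script-free way to get a POST is a one-button form. A minimal sketch in plain Ruby; post_button and h are hypothetical helper names for illustration, not existing Rails API:

```ruby
# Minimal HTML escaping so the sketch needs no libraries.
HTML_ESCAPES = {
  "&" => "&amp;", "<" => "&lt;", ">" => "&gt;", '"' => "&quot;", "'" => "&#39;"
}.freeze

def h(text)
  text.to_s.gsub(/[&<>"']/, HTML_ESCAPES)
end

# Hypothetical helper: a form containing only a submit button, so the
# destructive request is sent as POST with no Javascript involved.
def post_button(label, url)
  %(<form method="post" action="#{h(url)}">) +
    %(<input type="submit" value="#{h(label)}">) +
    "</form>"
end

puts post_button("Delete", "/posts/destroy/1")
# prints: <form method="post" action="/posts/destroy/1"><input type="submit" value="Delete"></form>
```

The trade-off is purely visual: the control is a button rather than a link, but there is no href for a spider or prefetcher to follow.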

Challenge by Sune Foldager on May 09, 23:59

This is obviously not Google's fault. It is, as someone stated above, a good opportunity to repair essentially broken software (including phpmyadmin etc.). It's not like this problem just appeared out of nowhere; any cache or proxy could in principle cause the exact same situation, only the problem will appear (much) more frequently now that Google introduced their accelerator.

Indeed, web-spiders could (and have, for instance with Instiki) cause the same problems. When there is a clear standard (HTTP 1.1) and one side is breaking it, the other side following it, the breakers should rethink and repair their products, not the followers. Otherwise, standards become meaningless, which for some time seemed to be the case with HTML for example.