Mailing List Archive
Re: [tlug] Re: Japanese in URLs?
- Date: Thu, 7 Feb 2008 10:44:15 +1100
- From: "Jim Breen" <jimbreen@example.com>
- Subject: Re: [tlug] Re: Japanese in URLs?
- References: <5634e9210802051929x4bc51a54n6c075baaf2c3ddeb@mail.gmail.com> <78d7dd350802052123g7761aab5s3057f2615100d359@mail.gmail.com> <87tzklivio.fsf@uwakimon.sk.tsukuba.ac.jp>
On 07/02/2008, Stephen J. Turnbull <stephen@example.com> wrote:
> Nguyen Vu Hung writes:
> > 2008/2/6, Jim Breen <jimbreen@example.com>:
> > > and I don't want the browser to play it back to me as an
> > > expletive in Klingon because it decided it was something in UTF-8.
> > > It's different, of course, if the field has an ACE prefix such as
> > > "xn--".
> > RFC2718[1] says the URL *should* be encoded after the character sequences
> > is translated to UTF-8.
> No, it doesn't.

Thanks, Stephen. You saved me writing something very similar.

> > As far as I know, browsers which display anything but the hex-encoded
> > path are strictly speaking in violation of RFC 3987:

They can get very confusing if they attempt anything else. Consider
the following URI:

http://www.csse.monash.edu.au/~jwb/cgi-bin/wwwjdic.cgi?1W%B6%D0%A4%E1%A4%EB_v1

It is asking for the verb inflection table for 勤める, and since it is
generated by a link within WWWJDIC, it is using WWWJDIC's internal
coding (EUC-JP). I actually use WWWJDIC in UTF-8 (a cookie setting), so
I get that table displayed in UTF-8. If Firefox attempted to display
the URI by treating the %B6%D0%A4%E1%A4%EB as UTF-8, it would simply
get garbage.

> > What Firefox is doing is not wrong but personally, I think the browser
> > should be able to display actual Japanese for better readability.

It would indeed be nice to get Japanese, etc., in URIs or IRIs displaying
correctly, but there is no way a browser can be sure of the coding used.
You could imagine a browser perhaps having an option for suggesting a
URL (de)coding, but in fact the coding of strings such as 勤める above
is usually entirely a matter for the server developer.

Maybe in some rosy future when the whole universe uses Unicode for
everything, and the specs for URIs and IRIs allow for raw UTF-8, we
might see browser specs being relaxed, but for now, I think Firefox is
doing the Right Thing.

Cheers

Jim

PS, I tried the above URL in Opera (9.25). It didn't attempt to decode
the %xx%xx string.

-- 
Jim Breen
Honorary Senior Research Fellow
Clayton School of Information Technology,
Monash University, VIC 3800, Australia
http://www.csse.monash.edu.au/~jwb/
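
For anyone who wants to see the ambiguity described above for themselves, here is a minimal sketch in Python, using only the standard library. It takes the percent-escaped bytes from the WWWJDIC URI and decodes them twice: once as EUC-JP (the coding the message says the server uses) and once as UTF-8 (what a browser might guess). The variable names are mine and purely illustrative; this is not how any particular browser actually behaves.

```python
from urllib.parse import unquote, unquote_to_bytes

# Percent-escaped bytes copied from the query string of the URI in the message.
escaped = "%B6%D0%A4%E1%A4%EB"

# Step 1: undo the %xx escaping. This yields raw bytes with no coding attached.
raw = unquote_to_bytes(escaped)

# Step 2a: interpret the bytes as EUC-JP, the coding the server is said to use.
as_euc_jp = raw.decode("euc_jp")

# Step 2b: interpret the same bytes as UTF-8, as a guessing browser might.
try:
    raw.decode("utf-8")
    utf8_ok = True
except UnicodeDecodeError:
    # The byte sequence is not valid UTF-8 at all.
    utf8_ok = False

# A lenient decoder substitutes U+FFFD replacement characters instead of failing.
as_utf8_lossy = unquote(escaped, encoding="utf-8", errors="replace")

print("raw bytes      :", raw.hex())
print("as EUC-JP      :", as_euc_jp)
print("valid UTF-8?   :", utf8_ok)
print("lenient UTF-8  :", as_utf8_lossy)
```

Nothing in the escaped string itself says which coding applies; the sketch only gets readable Japanese because it was told to use EUC-JP.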