Mailing List Archive
[tlug] Do you whitelist or blacklist utf-8?
- Date: Tue, 22 Feb 2011 19:57:13 +0900
- From: Dave M G <dave@example.com>
- Subject: [tlug] Do you whitelist or blacklist utf-8?
- User-agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.9.2.13) Gecko/20101208 Thunderbird/3.1.7
TLUG,

I've been going a little mental today trying to figure out how to filter out possibly malicious characters from POST data going to my site. I want to block things like <, >, *, etc.

The thing is that I also want to allow CJK characters, and any other language with non-Latin characters. This is a snap to do if you just want to allow 0-9a-zA-Z. But once you get into Unicode land, it seems to be a whole other ballgame.

I've got three stages I want to filter on. First, I want to block characters on the client side with JavaScript, so that the user is aware of what characters are permissible when entering names and whatnot. Then I want to block any bad characters on the server side in PHP, to make sure no script kiddies have tried to POST anything nasty. And also, just for good measure, I want to ensure no nastiness is inserted into my MySQL database.

I'd like all three steps to be consistent with each other, so I'm trying to standardize a set of bad characters that I can filter for at each step.

However, where I've broken down is whether I should:

- blacklist bad characters (where I fear I might miss one),
- whitelist good characters (seems tough to get a whitelist that's utf-8 compatible), or
- do something like make comparisons on HTML entities, or with regex or something using built-in functions (PHP and JavaScript differ on specific functions and their results).

Since you guys are the go-to people for handling utf-8 text, I thought maybe you've encountered this before.

How do you handle filtering malicious code from utf-8 text that contains CJK and other languages? And how do you do it in PHP and JavaScript?

--
Dave M G
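For illustration, the whitelist option might look like the sketch below on the JavaScript side. It is only a rough sketch under stated assumptions, not a complete defence: it assumes a runtime that supports Unicode property escapes in regular expressions, and the names `ALLOWED_NAME`, `isAllowedName`, and `stripDisallowed`, as well as the exact set of permitted characters (letters, combining marks, digits, space, apostrophe, hyphen), are hypothetical choices made for this example. The idea is to whitelist by Unicode category rather than by listing individual characters, so CJK and other scripts pass while `<`, `>`, and `*` do not.

```javascript
// Hypothetical whitelist for a "name" field: any Unicode letter (\p{L}),
// combining mark (\p{M}), or digit (\p{N}), plus space, apostrophe, hyphen.
// The allowed set is an assumption for this example; adjust it per field.
const ALLOWED_NAME = /^[\p{L}\p{M}\p{N} '-]+$/u;
const DISALLOWED_CHAR = /[^\p{L}\p{M}\p{N} '-]/gu;

// Returns true only if every character in the input is on the whitelist.
function isAllowedName(input) {
  // Normalize so composed and decomposed forms are handled the same way.
  const text = String(input).normalize("NFC");
  return ALLOWED_NAME.test(text);
}

// Removes every character that is not on the whitelist.
function stripDisallowed(input) {
  return String(input).normalize("NFC").replace(DISALLOWED_CHAR, "");
}

console.log(isAllowedName("山田太郎"));          // CJK letters are accepted
console.log(isAllowedName("<script>"));          // angle brackets are rejected
console.log(stripDisallowed("<b>山田</b>"));     // markup characters are dropped
```

PHP's PCRE functions accept the same `\p{L}`, `\p{M}`, and `\p{N}` classes when the pattern carries the `u` modifier, so a similar pattern could be reused on the server to keep the two stages consistent. A check like this only constrains which characters get through; it does not replace parameterized queries on the MySQL side or escaping when the text is written back into HTML.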
- Follow-Ups:
- Re: [tlug] Do you whitelist or blacklist utf-8?
- From: Jean-Christian Imbeault
- Re: [tlug] Do you whitelist or blacklist utf-8?
- From: Shmuel Fomberg