Question
Context: ASP.NET MVC running in IIS, with a UTF-8 %-encoded URL.
Using the standard project template, and a test action in HomeController like:
// Echoes the routed id segment back as plain text
public ActionResult Test(string id)
{
    return Content(id, "text/plain");
}
This works fine for most %-encoded UTF-8 routes, such as:
http://mydevserver/Home/Test/%e4%ba%ac%e9%83%bd%e5%bc%81
with the expected result 京都弁.
However, with the route:
http://mydevserver/Home/Test/%ee%93%bb
the URL is not received correctly.
Aside: %ee%93%bb is the %-encoded code point 0xE4FB; it is in the Basic Multilingual Plane, private-use area, but ultimately a valid Unicode code point. You can verify this manually, or via:
string value = ((char) 0xE4FB).ToString();
string encoded = HttpUtility.UrlEncode(value); // %ee%93%bb
Now, what happens next depends on the web server. On the Visual Studio Development Server (aka Cassini), the correct id is received: a string of length one, containing code point 0xE4FB.
If, however, I do this in IIS or IIS Express, I get a different id, specifically "î“»", code points: 0xEE, 0x201C, 0xBB. You will immediately recognise the first and last as the start and end of our percent-encoded string... so what happened in the middle?
Well:
- code point 0x93 is “ (in Windows-1252)
- code point 0x201C is “ (in Unicode)
It looks to me very much like IIS has performed some kind of quote translation when processing my URL. Maybe this has uses in a few scenarios (I don't know), but it is certainly a bad thing when it happens in the middle of a %-encoded UTF-8 block.
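One way to check that hypothesis: decoding the three raw bytes as Windows-1252 instead of UTF-8 reproduces the observed string exactly. The following is a minimal sketch, assuming .NET Framework (where code page 1252 is available by default; on .NET Core or .NET 5+ a code-pages encoding provider would have to be registered first). The class and method names are purely illustrative. It only shows that the symptom is consistent with a Windows-1252 decode, not that this is what IIS actually does internally.

```csharp
using System;
using System.Text;

// Illustrative helper (hypothetical names, not part of any framework API).
public static class MojibakeDemo
{
    // The raw bytes carried by the URL segment %ee%93%bb.
    public static readonly byte[] RawBytes = { 0xEE, 0x93, 0xBB };

    // Correct interpretation: the three bytes form one UTF-8 sequence.
    public static string DecodeAsUtf8(byte[] bytes)
    {
        return Encoding.UTF8.GetString(bytes);
    }

    // Suspected wrong interpretation: each byte is treated as one
    // Windows-1252 character. Assumes code page 1252 is available.
    public static string DecodeAs1252(byte[] bytes)
    {
        return Encoding.GetEncoding(1252).GetString(bytes);
    }

    // Formats each UTF-16 code unit as hex, for easy comparison.
    public static string CodeUnits(string s)
    {
        var sb = new StringBuilder();
        foreach (char c in s)
        {
            if (sb.Length > 0) sb.Append(' ');
            sb.Append("0x").Append(((int)c).ToString("X"));
        }
        return sb.ToString();
    }
}
```

Calling MojibakeDemo.CodeUnits(MojibakeDemo.DecodeAsUtf8(MojibakeDemo.RawBytes)) gives the single code point 0xE4FB that Cassini delivers, while the same call with DecodeAs1252 gives 0xEE 0x201C 0xBB, matching what IIS delivers.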
Note that HttpContext.Current.Request.RawUrl also shows this translation has occurred, so this does not look like an MVC bug; note also Darin's comment, highlighting that it works differently in the path vs the query portion of the URL.
So (two-parter):
- Is my analysis missing some important detail of Unicode/URL handling?
- How do I fix it? (i.e. make it so that I receive the expected character)
Recommended answer
Ultimately, to get around this, I had to use request.ServerVariables["HTTP_URL"] and some manual parsing, with a bunch of error-handling fallbacks (additionally compensating for some related glitches in Uri). Not great, but it only affects a tiny minority of awkward requests.
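For illustration, the manual parsing might look roughly like the sketch below. This is not the code actually used; RawPathDecoder and its methods are hypothetical names, and it assumes the HTTP_URL server variable holds the still-encoded URL (for example /Home/Test/%ee%93%bb). It takes the last path segment and percent-decodes it strictly as UTF-8, so malformed input throws instead of being silently mangled, which is where a fallback could hook in.

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Hypothetical helper, not from the original answer.
public static class RawPathDecoder
{
    // Returns the text after the last '/' of the path, ignoring any query string.
    public static string LastSegment(string rawUrl)
    {
        int q = rawUrl.IndexOf('?');
        string path = q >= 0 ? rawUrl.Substring(0, q) : rawUrl;
        return path.Substring(path.LastIndexOf('/') + 1);
    }

    // Percent-decodes a raw segment, interpreting escaped bytes as strict UTF-8.
    // Throws DecoderFallbackException if the escaped bytes are not valid UTF-8.
    public static string DecodeSegment(string rawSegment)
    {
        var strictUtf8 = new UTF8Encoding(false, true); // throw on invalid bytes
        var result = new StringBuilder();
        var pending = new List<byte>(); // run of consecutive escaped bytes

        for (int i = 0; i < rawSegment.Length; i++)
        {
            char c = rawSegment[i];
            bool isEscape = c == '%'
                && i + 2 < rawSegment.Length
                && Uri.IsHexDigit(rawSegment[i + 1])
                && Uri.IsHexDigit(rawSegment[i + 2]);

            if (isEscape)
            {
                int hi = Uri.FromHex(rawSegment[i + 1]);
                int lo = Uri.FromHex(rawSegment[i + 2]);
                pending.Add((byte)(hi * 16 + lo));
                i += 2; // skip the two hex digits
            }
            else
            {
                // Flush the escaped bytes seen so far, then keep the literal char.
                result.Append(strictUtf8.GetString(pending.ToArray()));
                pending.Clear();
                result.Append(c);
            }
        }

        result.Append(strictUtf8.GetString(pending.ToArray()));
        return result.ToString();
    }
}
```

In the action, this would be used along the lines of RawPathDecoder.DecodeSegment(RawPathDecoder.LastSegment(Request.ServerVariables["HTTP_URL"])), wrapped in a try/catch that falls back to the normally routed id when decoding fails.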