AP Computer Science: A Long Search for the Right Teacher, a Sure Path to a Perfect Score

Encoding Problems When Reading Chinese Characters as Input in Java (Repost)

2015-02-25 17:28:12 | Category: Default


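This simple Java program has a surprising problem: if the input is Chinese, the program does not echo it correctly.
[java]
import java.util.Scanner;

public class Test {
    public static void main(String[] args) {
        // No charset is given, so Scanner decodes with the JVM default charset.
        Scanner scanner = new Scanner(System.in);
        String s = scanner.next();
        System.out.println("你输入了 = " + s);
    }
}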

[plain]
Result:
run:
陶   // typed by the user
你输入了 = ??
BUILD SUCCESSFUL (total time: 7 seconds)

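Why does this happen?
First, consider how Java's input streams work. System.in is a byte stream: the input is read one byte at a time, and the bytes are collected into a byte array.
Scanner is a character-level wrapper around System.in that decodes those bytes into characters. The program below prints the raw bytes read from System.in.
[java]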
import java.io.IOException;

public class Bianma {
    // Print every byte of the array in hex, followed by a newline.
    private static void printBytes(byte[] bytes) {
        for (byte b : bytes) {
            printByte(b);
        }
        System.out.println();
    }

    // Print one byte as two hex digits followed by a tab.
    private static void printByte(byte abyte) {
        // Mask with 0xff so negative bytes do not sign-extend to ffffffxx.
        System.out.print(String.format("%02x", abyte & 0xff) + "\t");
    }

    public static void main(String[] args) throws IOException {
        String s = "陶";
        printBytes(s.getBytes("GBK"));
        printBytes(s.getBytes("UTF-8"));
        // Echo the raw bytes from System.in; read() returns -1 at end of input.
        int b;
        while ((b = System.in.read()) != -1) {
            printByte((byte) b);
        }
    }
}

The output shows that, on this system, the bytes read from System.in are GBK-encoded.
[plain]
run:
cc        d5                  // GBK encoding of 陶
e9        99        b6        // UTF-8 encoding of 陶
陶
cc        d5        0a        // 0a is the line feed produced by pressing Enter

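Comparing the two, the bytes that reach Scanner are GBK-encoded, so they have to be decoded as GBK. A Scanner constructed without a charset decodes with the JVM default charset, and when that default is not GBK the bytes cc d5 are decoded incorrectly.
The following snippet confirms that GBK is the right decoding:
[java]
// GBK bytes of 陶
byte[] b = {(byte) 0xcc, (byte) 0xd5};
System.out.println(new String(b, "GBK"));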

This prints 陶 correctly.
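The mismatch can also be reproduced without a console. The following is a minimal sketch, not part of the original post: the class name `DecodeMismatch` and the helper `decode` are made up for illustration, and it assumes the JVM ships the GBK charset. It decodes the same two bytes once as GBK and once as UTF-8.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

public class DecodeMismatch {
    // Hypothetical helper: decode raw bytes with an explicitly named charset.
    public static String decode(byte[] bytes, Charset charset) {
        return new String(bytes, charset);
    }

    public static void main(String[] args) {
        // GBK bytes of 陶 (U+9676), as printed by the byte dump above.
        byte[] gbkBytes = {(byte) 0xcc, (byte) 0xd5};

        String right = decode(gbkBytes, Charset.forName("GBK"));
        String wrong = decode(gbkBytes, StandardCharsets.UTF_8);

        // "\u9676" is 陶 written as an escape, so the check does not
        // depend on the encoding of this source file.
        System.out.println(right.equals("\u9676"));   // true
        // cc d5 is not valid UTF-8, so the decoder substitutes U+FFFD.
        System.out.println(wrong.contains("\uFFFD")); // true
    }
}
```

When such replacement characters are printed to a console that cannot display them, they typically show up as question marks, which matches the `??` seen at the start.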
There are therefore two ways to solve the problem:
1. Read the input with an explicit encoding:
[java]
Scanner scanner = new Scanner(System.in, "GBK");  
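To try this fix without typing at a console, the input can be simulated with an in-memory stream. This is only a sketch under assumptions: `ScannerGbkDemo` and `firstToken` are illustrative names, the byte array stands in for what the console delivered in the dump above, and the GBK charset must be available in the JVM.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.util.Scanner;

public class ScannerGbkDemo {
    // Hypothetical helper: read the first token, decoding with the named charset.
    public static String firstToken(InputStream in, String charsetName) {
        // Closing the Scanner also closes the stream; that is fine for an
        // in-memory stream, but usually not wanted for System.in.
        try (Scanner scanner = new Scanner(in, charsetName)) {
            return scanner.next();
        }
    }

    public static void main(String[] args) {
        // Simulated console input: GBK bytes of 陶 followed by a line feed.
        byte[] typed = {(byte) 0xcc, (byte) 0xd5, 0x0a};
        String s = firstToken(new ByteArrayInputStream(typed), "GBK");
        System.out.println(s.equals("\u9676")); // true
    }
}
```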

2. Change the default encoding:
[java]
String encoding = System.getProperty("file.encoding");  
System.out.println(encoding);  

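Here this prints UTF-8. If it printed GBK, the problem described at the beginning would not occur. The value of file.encoding can be regarded as the default encoding the Java program uses whenever no charset is specified.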
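What "default encoding" means in practice can be sketched as follows. The class `DefaultCharsetDemo` and the helper `decodeWithDefault` are illustrative names, not from the original post. The sketch shows that decoding without naming a charset gives the same result as decoding with `Charset.defaultCharset()`.

```java
import java.nio.charset.Charset;

public class DefaultCharsetDemo {
    // Hypothetical helper: decode without naming a charset,
    // so Java falls back to the JVM default charset.
    public static String decodeWithDefault(byte[] bytes) {
        return new String(bytes);
    }

    public static void main(String[] args) {
        byte[] bytes = {(byte) 0xcc, (byte) 0xd5};
        Charset def = Charset.defaultCharset();

        System.out.println("file.encoding  = " + System.getProperty("file.encoding"));
        System.out.println("defaultCharset = " + def);

        // Both decodes agree, because the first silently uses the default.
        System.out.println(decodeWithDefault(bytes).equals(new String(bytes, def)));
    }
}
```

The property is commonly set when the JVM is launched, for example with `-Dfile.encoding=GBK`, although how far this is honoured depends on the JDK version. Naming the charset explicitly, as in the first method, avoids relying on it.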
In NetBeans it can be changed as follows:

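Notes:
As shown above, the UTF-8 encoding of 陶 is
e9 99 b6
ef bb bf e9 99 b6 (with a BOM; the BOM is ef bb bf)
The UTF-16 ("Unicode") encoding is fe ff 96 76 in big-endian order, or ff fe 76 96 in little-endian order. The leading fe ff or ff fe is the byte order mark, which states whether the high or the low byte is sent first.
The conversion works as follows: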
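The original illustration is not available here. As a hedged sketch, assuming the conversion meant is the standard mapping from a Unicode code point to UTF-8, the three bytes of 陶 (U+9676) can be derived by hand. `Utf8ByHand`, `encode3` and `hex` are illustrative names; `encode3` only covers code points in the range U+0800 to U+FFFF (excluding surrogates).

```java
import java.nio.charset.StandardCharsets;

public class Utf8ByHand {
    // Encode a code point in U+0800..U+FFFF as three UTF-8 bytes.
    public static byte[] encode3(int codePoint) {
        return new byte[] {
            (byte) (0xE0 | (codePoint >> 12)),         // 1110xxxx: top 4 bits
            (byte) (0x80 | ((codePoint >> 6) & 0x3F)), // 10xxxxxx: middle 6 bits
            (byte) (0x80 | (codePoint & 0x3F))         // 10xxxxxx: low 6 bits
        };
    }

    // Format bytes as space-separated two-digit hex values.
    public static String hex(byte[] bytes) {
        StringBuilder sb = new StringBuilder();
        for (byte b : bytes) {
            if (sb.length() > 0) {
                sb.append(' ');
            }
            sb.append(String.format("%02x", b & 0xff));
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        // 0x9676 = 1001 011001 110110 in binary, split 4 + 6 + 6 bits.
        System.out.println(hex(encode3(0x9676))); // e9 99 b6
        // Java's UTF-16 encoder writes a big-endian BOM first.
        System.out.println(hex("\u9676".getBytes(StandardCharsets.UTF_16))); // fe ff 96 76
    }
}
```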
