
          Handling GB2312 to UTF-8 conversion in Go


          2020-11-24 02:33

          Problem description:


          If you have ever rewritten legacy PHP or Java code in Go, you have probably run into the problem of converting GB2312 to UTF-8.


          A colleague recently used the iconv-go library at work for character-set conversion, hit the error below, and asked me about it, so I looked into it myself.

          The error message is:

          invalid or incomplete multibyte or wide character


          The Go conversion library in use is:

          github.com/djimenez/iconv-go


          The function being called is:

          body, err = iconv.ConvertString(body, "GBK", "utf-8")
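Before changing the call, it can help to find out which bytes the converter is rejecting. The following is a minimal sketch using only the standard library, assuming the usual GBK layout (ASCII as single bytes, lead bytes 0x81 to 0xFE followed by trail bytes 0x40 to 0xFE excluding 0x7F). The function name `firstInvalidGBK` is hypothetical and not part of iconv-go, and the check is structural only: it does not know which code points a given iconv build actually maps.

```go
package main

import "fmt"

// firstInvalidGBK returns the byte offset of the first sequence in b that is
// not structurally valid GBK, or -1 if every sequence passes the range check.
// Hypothetical helper for diagnosis; it checks byte ranges only.
func firstInvalidGBK(b []byte) int {
	for i := 0; i < len(b); {
		c := b[i]
		switch {
		case c < 0x80: // single-byte ASCII
			i++
		case c >= 0x81 && c <= 0xFE: // lead byte of a two-byte character
			if i+1 >= len(b) {
				return i // incomplete: lead byte at the end of the input
			}
			t := b[i+1]
			if t < 0x40 || t == 0x7F || t == 0xFF {
				return i // trail byte outside the allowed range
			}
			i += 2
		default: // 0x80 and 0xFF are not valid lead bytes in this layout
			return i
		}
	}
	return -1
}

func main() {
	good := []byte{0xD6, 0xD0, 0xCE, 0xC4}    // "中文" encoded as GBK
	bad := []byte{'o', 'k', 0xD6, 0xD0, 0xCE} // last character is truncated
	fmt.Println(firstInvalidGBK(good))        // prints -1
	fmt.Println(firstInvalidGBK(bad))         // prints 4
}
```

If the reported offset points at a truncated or corrupted character, the input itself is damaged, and dropping those bytes during conversion is one reasonable way to proceed.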


          Approach:

          Open the source code of github.com/djimenez/iconv-go.

          iconv.ConvertString is implemented in iconv.go:

          func ConvertString(input string, fromEncoding string, toEncoding string) (output string, err error) {
              // create a temporary converter
              converter, err := NewConverter(fromEncoding, toEncoding)

              if err == nil {
                  // convert the string
                  output, err = converter.ConvertString(input)

                  // close the converter
                  converter.Close()
              }

              return
          }

          From this we can see that it calls

          NewConverter(fromEncoding, toEncoding)

          to create a new Converter struct, and then calls that struct's method:

          output, err = converter.ConvertString(input)


          Tracing further, the Converter struct and NewConverter are implemented in converter.go:

          type Converter struct {
              context C.iconv_t
              open    bool
          }

          // Initialize a new Converter. If fromEncoding or toEncoding are not supported by
          // iconv then an EINVAL error will be returned. An ENOMEM error maybe returned if
          // there is not enough memory to initialize an iconv descriptor
          func NewConverter(fromEncoding string, toEncoding string) (converter *Converter, err error) {
              converter = new(Converter)

              // convert to C strings
              toEncodingC := C.CString(toEncoding)
              fromEncodingC := C.CString(fromEncoding)

              // open an iconv descriptor
              converter.context, err = C.iconv_open(toEncodingC, fromEncodingC)

              // free the C Strings
              C.free(unsafe.Pointer(toEncodingC))
              C.free(unsafe.Pointer(fromEncodingC))

              // check err
              if err == nil {
                  // no error, mark the context as open
                  converter.open = true
              }

              return
          }

          As this shows, the conversion is ultimately done by calling the C library through cgo:

          converter.context, err = C.iconv_open(toEncodingC, fromEncodingC)


          The C library documentation (man iconv_open) says the following in its DESCRIPTION section:

          The empty encoding name "" is equivalent to "char": it denotes the locale dependent character encoding.
          When the string "//TRANSLIT" is appended to tocode, transliteration is activated. This means that when a character cannot be represented in the target character set, it can be approximated through one or several similarly looking characters.
          When the string "//IGNORE" is appended to tocode, characters that cannot be represented in the target character set will be silently discarded.
          The resulting conversion descriptor can be used with iconv any number of times. It remains valid until deallocated using iconv_close.
          A conversion descriptor contains a conversion state. After creation using iconv_open, the state is in the initial state. Using iconv modifies the descriptor's conversion state. (This implies that a conversion descriptor can not be used in multiple threads simultaneously.) To bring the state back to the initial state, use iconv with NULL as inbuf argument.

          The key sentence is this one:

          When the string "//IGNORE" is appended to tocode, characters that cannot be represented in the target character set will be silently discarded.


          In other words, if "//IGNORE" is appended to tocode, any character that cannot be represented in the target character set is silently dropped. That is exactly what I wanted.


          Given this chain of calls:

          ConvertString(input string, fromEncoding string, toEncoding string)
          NewConverter(fromEncoding string, toEncoding string) (converter *Converter, err error)
          C.iconv_open(toEncodingC, fromEncodingC)


          toEncoding is passed through unchanged at every level, so we only need to include //IGNORE in it for the C library to pick it up.


          So the code becomes:

          body, err = iconv.ConvertString(body, "GBK", "utf-8//IGNORE")


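          In testing, the call no longer returned an err. Keep in mind that characters iconv cannot convert are dropped from the output rather than reported.



          To restate the solution: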

          body, err = iconv.ConvertString(body, "GBK", "utf-8//IGNORE")
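If the same fix is needed in many call sites, the suffix can be applied in one place. The sketch below is only an illustration: `convertFunc`, `withIgnore`, and `lenient` are hypothetical names that do not exist in iconv-go. `convertFunc` mirrors the signature of iconv.ConvertString quoted earlier, and a stand-in converter is used so the example runs without cgo.

```go
package main

import (
	"fmt"
	"strings"
)

// convertFunc mirrors the signature of iconv.ConvertString shown in the article.
type convertFunc func(input, fromEncoding, toEncoding string) (string, error)

// withIgnore appends "//IGNORE" to the target encoding name unless the name
// already contains it. Hypothetical helper, not part of iconv-go.
func withIgnore(toEncoding string) string {
	if strings.Contains(strings.ToUpper(toEncoding), "//IGNORE") {
		return toEncoding
	}
	return toEncoding + "//IGNORE"
}

// lenient wraps a converter so that every call uses the //IGNORE form of the
// target encoding.
func lenient(convert convertFunc) convertFunc {
	return func(input, from, to string) (string, error) {
		return convert(input, from, withIgnore(to))
	}
}

func main() {
	// Stand-in converter that just echoes the target encoding it received.
	// In real code, pass iconv.ConvertString here instead.
	echo := func(input, from, to string) (string, error) { return to, nil }

	out, _ := lenient(echo)("body", "GBK", "utf-8")
	fmt.Println(out) // prints "utf-8//IGNORE"
}
```

With the real library, `lenient(iconv.ConvertString)(body, "GBK", "utf-8")` would behave like the call above, assuming the import of github.com/djimenez/iconv-go is in place.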



