Examples of reading and generating Unicode CSV files in PHP

This article collects a few examples of reading and generating Unicode CSV files in PHP. Hopefully they will be of some help.

First, a quick introduction to the BOM (byte order mark):

  Bytes          Encoding form
  EF BB BF       UTF-8
  FF FE          UTF-16 aka UCS-2, little endian
  FE FF          UTF-16 aka UCS-2, big endian
  FF FE 00 00    UTF-32 aka UCS-4, little endian
  00 00 FE FF    UTF-32 aka UCS-4, big endian

Reading a Unicode CSV file; the code is as follows:

  function fopen_utf8($filename) {
      $encoding = '';
      $handle = fopen($filename, 'r');
      if ($handle === false) {
          return false; // the file could not be opened
      }
      $bom = fread($handle, 2); // the first two bytes may hold a UTF-16 BOM
      rewind($handle);
      if ($bom === chr(0xff).chr(0xfe) || $bom === chr(0xfe).chr(0xff)) {
          // UTF-16 byte order mark present
          $encoding = 'UTF-16';
      } else {
          // Read the first 1000 bytes as a sample;
          // appending 'e' is a workaround for an mbstring bug
          $file_sample = fread($handle, 1000) . 'e';
          rewind($handle);
          $encoding = mb_detect_encoding($file_sample, 'UTF-8, UTF-7, ASCII, EUC-JP,SJIS, eucJP-win, SJIS-win, JIS, ISO-2022-JP');
      }
      if ($encoding) {
          // Convert the stream to UTF-8 on the fly as it is read
          stream_filter_append($handle, 'convert.iconv.'.$encoding.'/UTF-8');
      }
      return $handle;
  }
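As a rough illustration of how such a converted stream can be consumed, the sketch below reads CSV rows from a UTF-16 file with `fgetcsv`. It is a variation on the function above rather than the same code: it reads the BOM, picks the byte order explicitly, and does not rewind, so the BOM never reaches the first field. The helper name `read_utf16_csv_rows` and the sample data are made up for this example, and it assumes the iconv extension is available.

```php
<?php
// Hypothetical helper: reads all rows of a UTF-16 CSV file (with BOM) as UTF-8 arrays.
function read_utf16_csv_rows(string $filename): array {
    $handle = fopen($filename, 'r');
    if ($handle === false) {
        return [];
    }
    // Consume the two BOM bytes so they do not end up in the first field.
    $bom = fread($handle, 2);
    if ($bom === "\xFF\xFE") {
        $source = 'UTF-16LE';
    } elseif ($bom === "\xFE\xFF") {
        $source = 'UTF-16BE';
    } else {
        fclose($handle);
        return []; // no UTF-16 BOM: outside the scope of this sketch
    }
    // Everything read from this point on is converted to UTF-8.
    stream_filter_append($handle, 'convert.iconv.' . $source . '/UTF-8', STREAM_FILTER_READ);
    $rows = [];
    while (($row = fgetcsv($handle, 0, ',', '"', '\\')) !== false) {
        $rows[] = $row;
    }
    fclose($handle);
    return $rows;
}

// Usage: write a small UTF-16LE sample file, then read it back.
$path = tempnam(sys_get_temp_dir(), 'csv');
file_put_contents($path, "\xFF\xFE" . iconv('UTF-8', 'UTF-16LE', "name,city\n张三,北京\n"));
$rows = read_utf16_csv_rows($path);
unlink($path);
```

Naming the byte order explicitly avoids depending on how a particular iconv build treats the BOM when the source encoding is given as plain `UTF-16`.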

Generating a Unicode CSV. The PHP file itself must be saved as UTF-8 without a BOM.

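  // $content is assumed to hold the CSV text, encoded as UTF-8
  $content = iconv("UTF-8", "UTF-16LE", $content);
  $content = "\xFF\xFE" . $content; // prepend the UTF-16LE BOM
  header("Content-Type: text/csv; charset=UTF-16LE");
  header("Content-Disposition: attachment; filename=test.csv");
  echo $content; // send the converted CSV as the response body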
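The snippet above starts from a ready-made `$content` string. As a minimal sketch of where that string might come from, the following builds the CSV text from an array of rows and then applies the same conversion and BOM. The helper name `build_utf16le_csv`, the choice to quote every field, and the `\r\n` line endings are assumptions made for this example, not something the original code prescribes.

```php
<?php
// Hypothetical helper: turns rows of UTF-8 strings into UTF-16LE CSV bytes with a BOM.
function build_utf16le_csv(array $rows): string {
    $lines = '';
    foreach ($rows as $row) {
        $fields = [];
        foreach ($row as $field) {
            // Quote every field and double any embedded quotes.
            $fields[] = '"' . str_replace('"', '""', (string) $field) . '"';
        }
        $lines .= implode(',', $fields) . "\r\n";
    }
    // Convert the UTF-8 text to UTF-16LE, then prepend the little-endian BOM.
    return "\xFF\xFE" . iconv('UTF-8', 'UTF-16LE', $lines);
}

$csv = build_utf16le_csv([['name', 'city'], ['张三', '北京']]);
```

In a download script, `$csv` would take the place of `$content` after the two `header()` calls and be written out with `echo`.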

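Finally, here is code from a helper for working with ANSI-encoded, comma-separated files. The part shown detects a file's UTF encoding from its BOM: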

  <?php
  // The Unicode BOM is U+FEFF; once encoded, it appears as one of these byte sequences.
  define('UTF32_BIG_ENDIAN_BOM', chr(0x00) . chr(0x00) . chr(0xFE) . chr(0xFF));
  define('UTF32_LITTLE_ENDIAN_BOM', chr(0xFF) . chr(0xFE) . chr(0x00) . chr(0x00));
  define('UTF16_BIG_ENDIAN_BOM', chr(0xFE) . chr(0xFF));
  define('UTF16_LITTLE_ENDIAN_BOM', chr(0xFF) . chr(0xFE));
  define('UTF8_BOM', chr(0xEF) . chr(0xBB) . chr(0xBF));

  function detect_utf_encoding($filename) {
      // Only the first four bytes are needed to recognize a BOM
      $text = file_get_contents($filename, false, null, 0, 4);
      $first2 = substr($text, 0, 2);
      $first3 = substr($text, 0, 3);
      $first4 = substr($text, 0, 4);
      // UTF-32 is checked before UTF-16 because the UTF-32LE BOM starts with the UTF-16LE BOM
      if ($first3 === UTF8_BOM) return 'UTF-8';
      elseif ($first4 === UTF32_BIG_ENDIAN_BOM) return 'UTF-32BE';
      elseif ($first4 === UTF32_LITTLE_ENDIAN_BOM) return 'UTF-32LE';
      elseif ($first2 === UTF16_BIG_ENDIAN_BOM) return 'UTF-16BE';
      elseif ($first2 === UTF16_LITTLE_ENDIAN_BOM) return 'UTF-16LE';
      return null; // no BOM found
  }
  ?>
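Once the encoding is known, a natural next step is to drop the BOM and convert the rest to UTF-8. The following is a minimal sketch of that idea, working on a string already loaded into memory rather than on a filename. The function name `bom_to_utf8` is hypothetical, and treating input without a BOM as already being UTF-8 is an assumption of this example.

```php
<?php
// Hypothetical helper: strips a leading BOM and returns the text as UTF-8.
function bom_to_utf8(string $bytes): string {
    // UTF-32 comes first: the UTF-32LE BOM begins with the UTF-16LE BOM.
    $boms = [
        'UTF-32BE' => "\x00\x00\xFE\xFF",
        'UTF-32LE' => "\xFF\xFE\x00\x00",
        'UTF-8'    => "\xEF\xBB\xBF",
        'UTF-16BE' => "\xFE\xFF",
        'UTF-16LE' => "\xFF\xFE",
    ];
    foreach ($boms as $encoding => $bom) {
        if (strncmp($bytes, $bom, strlen($bom)) === 0) {
            $body = substr($bytes, strlen($bom));
            if ($encoding === 'UTF-8') {
                return $body; // already UTF-8, only the BOM is removed
            }
            $converted = iconv($encoding, 'UTF-8', $body);
            return $converted === false ? '' : $converted;
        }
    }
    return $bytes; // no BOM: returned unchanged (assumed to be UTF-8 already)
}

$utf8 = bom_to_utf8("\xFF\xFE" . iconv('UTF-8', 'UTF-16LE', '张三,北京'));
```

A file could be passed through it with something like `bom_to_utf8(file_get_contents($filename))` before the text is split into CSV fields.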