Efficient zip compression and extraction in Node.JS: piping child_process output into a stdout stream


Posted by ourjs, published 1517026678422
Keywords: Learning JS, Node.JS
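Node.js has many libraries for zip compression and extraction, such as node-unzip, adm-zip, and archiver. On low-end ARM chips, however, they are very slow when compressing large directories.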

Node.JS is not well suited to compute-intensive work such as compression and decompression, and it performs poorly on cheaper embedded devices.

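Packages such as archiver and node-unzip are also large, taking up several megabytes of space. Wrapping tools written in C/C++, such as 7-Zip (Windows) or zip (Linux), in a thin layer of code gives much better performance.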


Compressing and extracting with system tools


First install 7-Zip on a Windows machine, or zip on a Linux machine (apt-get install zip).
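Before relying on the system tool, it can help to confirm that it can actually be started. The following is a minimal sketch and not part of the original code: the helper name commandExists is hypothetical, and it assumes that a missing executable shows up as the error property on the result of child_process.spawnSync.

```javascript
var cp = require('child_process')

// Hypothetical helper: returns true if the command could be started at all.
// spawnSync does not throw when the executable is missing; it reports the
// failure (typically ENOENT) on the `error` property of its result.
function commandExists(cmd, args) {
  var result = cp.spawnSync(cmd, args || [], { stdio: 'ignore' })
  return !result.error
}

// On Linux, `zip -h` only prints the help text, so it is a cheap probe
if (process.platform !== 'win32' && !commandExists('zip', ['-h'])) {
  console.log('zip not found, try: apt-get install zip')
}
```

The check only tells whether the process could be spawned; it does not look at the exit code or the version of the tool.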

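Then test it on the command line, for example by compressing the onceio directory in the current directory into onceio.zip in the parent directory.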

Linux

zip -r ../onceio.zip ./onceio

Windows:

"C:\Program Files\7-Zip\7z.exe" a -tzip ..\onceio.zip .\onceio


Writing the compression/extraction result to a stream


The tricky part is how to send the result to a stream.

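For example, when a web page needs to download many files, it is best to compress them into a stream and send that stream straight to the response. The archive file does not have to be created first, and its content does not have to be loaded into memory or written to disk beforehand.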

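Almost every command-line tool can read its input from stdin and write its output to stdout, so all we need to do is pipe our stream objects to the child process's stdin/stdout streams.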
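The piping idea can be sketched independently of zip. In the example below, the helper name pipeCommandOutput is hypothetical, and a child Node.js process (process.execPath) stands in for the archiver so that the sketch runs even where zip or 7-Zip is not installed.

```javascript
var cp = require('child_process')
var stream = require('stream')

// Hypothetical helper: run a command and pipe whatever it writes to stdout
// into any writable stream (a file stream, an HTTP response, ...).
function pipeCommandOutput(cmd, args, output, cb) {
  var proc = cp.spawn(cmd, args)
  var done = false
  var finish = function(err) {
    // Make sure the callback runs only once, even if both events fire
    if (done) return
    done = true
    cb(err)
  }

  proc.stdout.pipe(output)
  // 'error' fires when the command cannot be started
  proc.on('error', finish)
  // 'close' fires after the process has exited and its stdio streams are closed
  proc.on('close', function(code) {
    finish(code === 0 ? null : new Error(cmd + ' exited with code ' + code))
  })
}

// Stand-in for `zip -r - ...`: a child process that prints to stdout
var chunks = []
var sink = new stream.Writable({
  write: function(chunk, encoding, next) {
    chunks.push(chunk)
    next()
  }
})

pipeCommandOutput(process.execPath, ['-e', 'process.stdout.write("zipped bytes")'], sink, function(err) {
  console.log(err || Buffer.concat(chunks).toString())
})
```

The data is forwarded chunk by chunk, so the whole archive never has to sit in memory at once; the in-memory sink here exists only to make the result visible.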

Linux: compress a file to the terminal's stdout. The - in place of the archive name means the result is not written to a file.

zip -r - ./zip-util.js

Windows: the -so switch writes the result to stdout

# Extract archive.gz through stdout into Doc.txt
7z x archive.gz -so > Doc.txt
# Compress Doc.txt into archive.gz
7z a dummy -tgzip -so Doc.txt > archive.gz
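The download scenario described above could look roughly like the following sketch. It is not the author's code: the factory name createArchiveServer, the port 8787, and the onceio paths in the usage comment are assumptions for illustration, and the command is passed in as a parameter so that the platform-specific tool can be chosen by the caller.

```javascript
var cp = require('child_process')
var http = require('http')

// Hypothetical factory: every request spawns the archiver and pipes its
// stdout straight into the response, so no temporary archive is created.
function createArchiveServer(cmd, args, fileName) {
  return http.createServer(function(req, res) {
    var proc = cp.spawn(cmd, args)

    res.writeHead(200, {
      'Content-Type': 'application/zip',
      'Content-Disposition': 'attachment; filename="' + fileName + '"'
    })
    proc.stdout.pipe(res)

    proc.on('error', function(err) {
      // The tool could not be started; log it and close the response
      console.error(err.message)
      res.end()
    })

    // Stop compressing when the connection goes away, for example because
    // the client disconnected mid-download. 'close' also fires after a
    // normal finish, where the kill is harmless.
    res.on('close', function() {
      proc.kill()
    })
  })
}

// Assumed usage on Linux:
// createArchiveServer('zip', ['-r', '-', './onceio'], 'onceio.zip').listen(8787)
```

Because the headers are sent before the tool has finished, a failure halfway through can only be signalled by a truncated download; a production handler would need to decide how to deal with that.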

Node.JS reference code

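var cp    = require('child_process')
var fs    = require('fs')
var path  = require('path')

var isWin32   = process.platform === 'win32'
//Assumed default 7-Zip install location; adjust if it is installed elsewhere
var path7Zip  = 'C:\\Program Files\\7-Zip\\7z.exe'

var zip   = function(input, output, cb) {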
  var isOutStream = output && output.pipe
  var isInputArr  = Array.isArray(input)

  var zipHandler  = function(err, states) {
    var exePath       = ''
    var params

    var isOutDir      = states && states.isDirectory()
    var archive_name  = ''
    if (isOutDir || isOutStream) {
      //Output is a directory or a stream: derive the archive name from the input
      archive_name    = path.basename(isInputArr ? input[0] : input) + '.zip'
    } else {
      //Output is a file name
      archive_name    = path.basename(output)
    }

    if (isOutStream) {
      //https://superuser.com/questions/148498/7zip-how-to-extract-to-std-output
      if (isWin32) {
        //7z x archive.gz -so > Doc.txt
        //7z a dummy -tgzip -so Doc.txt > archive.gz
        exePath   = path7Zip
        params    = ['a', '-tzip', '-so', archive_name]
      } else {
        //https://stackoverflow.com/questions/18933975/zip-file-and-print-to-stdout
        //zip -r - files_to_be_archived | nc remotehost 8787
        exePath   = 'zip'
        params    = ['-r', '-']
      }
    } else {
      var output_dir    = ''
      if (isOutDir) {
        output_dir      = output
      } else {
        output_dir      = path.dirname(output)
      }

      var archive_path  = path.join(output_dir, archive_name)

      if (isWin32) {
        //7z.exe a -tzip -y onceio D:\\github\\oncedoc\\oncedoc\\svr\\onceio
        exePath   = path7Zip
        params    = ['a', '-tzip', '-y', archive_path]
      } else {
        //https://www.cnblogs.com/chinareny2k/archive/2010/01/05/1639468.html
        //zip -r yasuo.zip abc.txt dir1
        exePath   = 'zip'
        params    = ['-r', archive_path]
      }
    }

    if (isInputArr) {
      params = params.concat(input)
    } else {
      params.push(input)
    }

    console.log(exePath)
    console.log(params)

    var zipProc = cp.spawn(exePath, params)
    var errMsg  = ''
    var outMsg  = ''

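    if (isOutStream) {
      zipProc.stdout.pipe(output)
    } else {
      //Drain stdout, otherwise the child blocks once the pipe buffer fills up
      zipProc.stdout.resume()
    }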

    zipProc.stderr.on('data', function (data) {
      errMsg += data.toString()
    })

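    //Spawn failures (e.g. the executable is missing) arrive as an 'error' event
    zipProc.on('error', function(err) {
      cb && cb(err)
      cb = null
    })

    zipProc.on('close', function(code) {
      //Treat a non-zero exit code as a failure and pass stderr along
      var err = null
      if (code !== 0) {
        err = new Error(errMsg || ('exit code ' + code))
      }

      cb && cb(err)
    })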
  }

  isOutStream
    ? zipHandler()
    : fs.stat(output, zipHandler)
}
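The reference code above only covers compression. An extraction counterpart could follow the same pattern; the sketch below is an assumption-laden illustration rather than the author's implementation. The names buildUnzipCommand and unzip are hypothetical, it assumes the unzip tool is installed on Linux (it is a separate package from zip), and it assumes 7-Zip lives at the path used in the Windows command earlier.

```javascript
var cp = require('child_process')

// Assumed install location, as in the 7-Zip command shown earlier
var path7Zip = 'C:\\Program Files\\7-Zip\\7z.exe'

// Hypothetical helper: build the extraction command for a platform.
// Linux:   unzip -o archive.zip -d destDir   (-o overwrites without asking)
// Windows: 7z x archive.zip -odestDir -y     (-y answers yes to all queries)
function buildUnzipCommand(archive, destDir, platform) {
  if (platform === 'win32') {
    return { exePath: path7Zip, params: ['x', archive, '-o' + destDir, '-y'] }
  }
  return { exePath: 'unzip', params: ['-o', archive, '-d', destDir] }
}

// Hypothetical counterpart to zip(): extract archive into destDir
function unzip(archive, destDir, cb) {
  var cmd = buildUnzipCommand(archive, destDir, process.platform)
  var proc = cp.spawn(cmd.exePath, cmd.params)
  var errMsg = ''
  var done = false
  var finish = function(err) {
    // Call back only once, even if both 'error' and 'close' fire
    if (done) return
    done = true
    cb && cb(err)
  }

  // Discard the file listing so the child never blocks on a full pipe
  proc.stdout.resume()
  proc.stderr.on('data', function(data) {
    errMsg += data.toString()
  })
  proc.on('error', finish)
  proc.on('close', function(code) {
    finish(code === 0 ? null : new Error(errMsg || 'exit code ' + code))
  })
}
```

Keeping the command construction in a separate function makes the platform differences easy to inspect without actually running either tool.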


Usage examples


Compress onceio into the parent directory; the archive is automatically named onceio.zip
zip('./onceio', '../', function() {
  console.log('zipped')
})

Compress oncedb into oncedb.new.zip in the parent directory
zip('./oncedb', '../oncedb.new.zip', function() {
  console.log('zipped')
})

Compress several directories
zip(['./onceio', './oncedb'], '../oncelal.zip', function() {
  console.log(arguments)
})

Compress a file into the main process's output stream, process.stdout
zip('./zip-util.js', process.stdout, function() {
  console.log(arguments)
})

Appendix: more 7-Zip and zip options in detail

Usage: 7z <command> [<switches>...] <archive_name> [<file_names>...]

<Commands>
  a : Add files to archive
  b : Benchmark
  d : Delete files from archive
  e : Extract files from archive (without using directory names)
  h : Calculate hash values for files
  i : Show information about supported formats
  l : List contents of archive
  rn : Rename files in archive
  t : Test integrity of archive
  u : Update files to archive
  x : eXtract files with full paths

<Switches>
  -- : Stop switches parsing
  @listfile : set path to listfile that contains file names
  -ai[r[-|0]]{@listfile|!wildcard} : Include archives
  -ax[r[-|0]]{@listfile|!wildcard} : eXclude archives
  -ao{a|s|t|u} : set Overwrite mode
  -an : disable archive_name field
  -bb[0-3] : set output log level
  -bd : disable progress indicator
  -bs{o|e|p}{0|1|2} : set output stream for output/error/progress line
  -bt : show execution time statistics
  -i[r[-|0]]{@listfile|!wildcard} : Include filenames
  -m{Parameters} : set compression Method
    -mmt[N] : set number of CPU threads
    -mx[N] : set compression level: -mx1 (fastest) ... -mx9 (ultra)
  -o{Directory} : set Output directory
  -p{Password} : set Password
  -r[-|0] : Recurse subdirectories
  -sa{a|e|s} : set Archive name mode
  -scc{UTF-8|WIN|DOS} : set charset for for console input/output
  -scs{UTF-8|UTF-16LE|UTF-16BE|WIN|DOS|{id}} : set charset for list files
  -scrc[CRC32|CRC64|SHA1|SHA256|*] : set hash function for x, e, h commands
  -sdel : delete files after compression
  -seml[.] : send archive by email
  -sfx[{name}] : Create SFX archive
  -si[{name}] : read data from stdin
  -slp : set Large Pages mode
  -slt : show technical information for l (List) command
  -snh : store hard links as links
  -snl : store symbolic links as links
  -sni : store NT security information
  -sns[-] : store NTFS alternate streams
  -so : write data to stdout
  -spd : disable wildcard matching for file names
  -spe : eliminate duplication of root folder for extract command
  -spf : use fully qualified file paths
  -ssc[-] : set sensitive case mode
  -sse : stop archive creating, if it can't open some input file
  -ssw : compress shared files
  -stl : set archive timestamp from the most recently modified file
  -stm{HexMask} : set CPU thread affinity mask (hexadecimal number)
  -stx{Type} : exclude archive type
  -t{Type} : Set type of archive
  -u[-][p#][q#][r#][x#][y#][z#][!newArchiveName] : Update options
  -v{Size}[b|k|m|g] : Create volumes
  -w[{path}] : assign Work directory. Empty path means a temporary directory
  -x[r[-|0]]{@listfile|!wildcard} : eXclude filenames
  -y : assume Yes on all queries

zip [-options] [-b path] [-t mmddyyyy] [-n suffixes] [zipfile list] [-xi list]
  The default action is to add or replace zipfile entries from list, which
  can include the special name - to compress standard input.
  If zipfile and list are omitted, zip compresses stdin to stdout.
  -f   freshen: only changed files  -u   update: only changed or new files
  -d   delete entries in zipfile    -m   move into zipfile (delete OS files)
  -r   recurse into directories     -j   junk (don't record) directory names
  -0   store only                   -l   convert LF to CR LF (-ll CR LF to LF)
  -1   compress faster              -9   compress better
  -q   quiet operation              -v   verbose operation/print version info
  -c   add one-line comments        -z   add zipfile comment
  -@   read names from stdin        -o   make zipfile as old as latest entry
  -x   exclude the following names  -i   include only the following names
  -F   fix zipfile (-FF try harder) -D   do not add directory entries
  -A   adjust self-extracting exe   -J   junk zipfile prefix (unzipsfx)
  -T   test zipfile integrity       -X   eXclude eXtra file attributes
  -y   store symbolic links as the link instead of the referenced file
  -e   encrypt                      -n   don't compress these suffixes
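Both help texts list a compression-level option (-mx[N] for 7-Zip, -1 to -9 for zip). On a slow device, a low level trades archive size for speed. The following sketch shows one way such a switch could be added to a parameter array like the ones built in the reference code; the helper name withCompressionLevel is hypothetical.

```javascript
// Hypothetical helper: return a copy of params with a compression-level
// switch inserted right after the first element (the 7z command or zip's -r).
// Level 1 is the fastest, level 9 compresses best.
function withCompressionLevel(params, level, isWin32) {
  var flag = isWin32 ? '-mx' + level : '-' + level
  return [params[0], flag].concat(params.slice(1))
}

// zip -r -1 ../onceio.zip
console.log(withCompressionLevel(['-r', '../onceio.zip'], 1, false))
// 7z a -mx1 -tzip -y ..\onceio.zip
console.log(withCompressionLevel(['a', '-tzip', '-y', '..\\onceio.zip'], 1, true))
```

The input paths are still appended afterwards, as in the reference code.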
