What's New in Node.js v0.12: A Synchronous API for Child Processes

Although Node.js was invented primarily for writing web servers, developers have since discovered other uses (and misuses!) for it. Surprisingly, one of those uses is writing shell scripts. And that actually makes sense: Node's cross-platform support is already quite good, and if both the front end and the back end are written in JavaScript, wouldn't it be nice if the build system were too?

The downside of async for shell scripts

The noteworthy library in this space is Grunt, which is built on top of ShellJS. ShellJS, however, has one hard nut to crack: Node forces asynchronous I/O upon it. While async I/O is great for a web server, which must remain responsive at all times, it makes little sense for a shell script that executes step by step.

So the authors of ShellJS came up with an "interesting" workaround that lets it run a shell command and then wait for the command to finish. It looks roughly like this:

var child_process = require('child_process');
var fs = require('fs');

function execSync(command) {
  // Run the command in a subshell.
  child_process.exec(command + ' 2>&1 1>output && echo done! > done');

  // Block the event loop until the command has completed.
  while (!fs.existsSync('done')) {
    // Do nothing.
  }

  // Read the output.
  var output = fs.readFileSync('output');

  // Delete the temporary files.
  fs.unlinkSync('output');
  fs.unlinkSync('done');

  return output;
}
In other words, while the shell executes your command, ShellJS keeps spinning, relentlessly polling the file system to check whether the file that signals the command's completion has appeared yet. A bit like a donkey.

This inefficient and ugly workaround provoked the Node core team into implementing a real solution – Node v0.12 should finally support running child processes synchronously. In fact, the feature had been on the roadmap for a very long time – I remember sitting down with now-retired Node maintainer Felix Geisendoerfer at JSConf.eu in 2011 (!) and sketching out an implementation for execSync. Now, more than two years later, the feature has finally landed on the master branch.

Congratulations, ShellJS people, you picked a fine nit! 🙂

The upside of sync for shell scripts

The API we just added, spawnSync, is similar to its asynchronous sibling: it is a low-level API that gives you full control over how the child process is set up. It also returns all the information we can gather: the exit code, the termination signal, a possible startup error, and all of the process's output. Of course there is no point in using streams with spawnSync – it is synchronous, so no event handlers can run before the process has exited – so all of the process's output is buffered into a single string or buffer object.
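
For example, here is a minimal sketch of what a spawnSync call looks like (the command and its arguments are just an illustration):

var child_process = require('child_process');

// Run `ls -l` synchronously; nothing streams, everything is collected.
var result = child_process.spawnSync('ls', ['-l'], { encoding: 'utf8' });

console.log(result.status);  // the exit code, e.g. 0
console.log(result.signal);  // the termination signal, or null
console.log(result.stdout);  // everything the process wrote to stdout
console.log(result.stderr);  // everything the process wrote to stderr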

And just like the familiar exec (which runs a shell command) and execFile (which runs an executable file) methods, we've added execSync and execFileSync for the common cases; they are easier to use than spawnSync. With these APIs, Node assumes that all you care about is the data the process writes to stdout. If the process or the shell returns a nonzero exit code, Node considers that an error, and exec(Sync) throws.

For example, getting a project's git history is as simple as:

var history = child_process.execSync('git log', { encoding: 'utf8' });
process.stdout.write(history);
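
And because exec(Sync) throws when the command fails, a minimal sketch of guarding against a failing command looks like this:

try {
  var history = child_process.execSync('git log', { encoding: 'utf8' });
  process.stdout.write(history);
} catch (err) {
  // The command exited with a nonzero code, or could not be started at all.
  console.error('git log failed: ' + err.message);
}
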
Now you may be thinking, "what took you so long?" At first glance, starting a child process and reading its output seems like a piece of cake. And it is – if you only care about the very common cases. But we didn't want to ship a solution that did only half the job.

When you need to send input and read one or more output streams at the same time, there are two options: use threads, or use an event loop. Looking at Python's implementation, for example, we found that they use either an event loop (on Unix-like platforms) or threads (on Windows). And their implementation is certainly no piece of cake either.

Back in 2011 we already realized that Node ships an excellent event-loop library, namely libuv. In theory, all the pieces needed to implement this feature were in place. Yet there were always problems, big and small, that kept it from working truly reliably.

For example, when a child process exits, the kernel notifies Node with a SIGCHLD signal, but for a long time libuv could not handle signals correctly when multiple event loops existed. Also, the ability to tear down an event loop without leaving traces behind was added only recently. Before that, Node simply didn't care: it would exit at some point and let the OS clean up the battlefield. That strategy doesn't work when you need a temporary event loop that can be disposed of while the program keeps running.

Slowly, over time, all of these issues were resolved. So if you now manage to look past all the buffer management, argument parsing, timeout handling, and so on, you'll find that the heart of this feature is just an event loop with a child process, some timers, and a bunch of pipes attached to it.

If you don't care how it all works, just take a look at the documentation and be amazed by the wealth of options Node gives you for controlling child processes. Now, who wants to go and fix ShellJS? :)

About the author

This article was originally published by Bert Belder on StrongLoop. Bert Belder has been working on Node.js since 2010, and he is one of the primary authors of libuv, the library that Node.js is built on. Besides being a technical lead for StrongLoop and Node core, he is working on features that will keep Node at the forefront of innovation, even after version 1.0 ships. StrongLoop makes it easier to develop APIs in Node, and adds DevOps capabilities such as monitoring, clustering, and support for private registries.

Read the original English article: What's New in Node.js v0.12 – execSync: a Synchronous API for Child Processes (March 12, 2014)

Caching Static Assets in Nginx with Proxy Cache

Introduction

Nginx is a high-performance HTTP server, and with Proxy Cache it can cache static assets. It works by storing static resources on the local disk according to a set of rules, while also keeping frequently used resources cached in memory, which speeds up responses for static assets.

Configuring Proxy Cache

Here is an nginx configuration snippet:

proxy_temp_path   /usr/local/nginx/proxy_temp_dir 1 2;

#keys_zone=cache1:100m means the zone is named cache1 and gets 100MB of shared memory
#/usr/local/nginx/proxy_cache_dir/cache1 is the directory where files for the cache1 zone are stored
#levels=1:2 means the first-level cache directory name is 1 character and the second level is 2 characters, e.g. /usr/local/nginx/proxy_cache_dir/cache1/a/1b
#inactive=1d means a cached file in this zone is deleted by the cache manager process if it is not accessed for 1 day
#max_size=10g caps the zone's disk usage at 10GB

proxy_cache_path  /usr/local/nginx/proxy_cache_dir/cache1  levels=1:2 keys_zone=cache1:100m inactive=1d max_size=10g;

server {
    listen 80;
    server_name *.example.com;

    #add $upstream_cache_status to the log format
    log_format format1 '$remote_addr - $remote_user [$time_local]  '
        '"$request" $status $body_bytes_sent '
        '"$http_referer" "$http_user_agent" $upstream_cache_status';

    access_log log/access.log format1;

    #$upstream_cache_status reports the cache status of the resource: HIT, MISS or EXPIRED
    add_header X-Cache $upstream_cache_status;
    location ~ \.(jpg|png|gif|css|js)$ {
        proxy_pass http://127.0.0.1:81;

        #select the cache zone to use for this resource
        proxy_cache cache1;

        #define the cache key
        proxy_cache_key $host$uri$is_args$args;

        #cache responses with status code 200 or 304, for 10 minutes
        proxy_cache_valid 200 304 10m;

        expires 30d;
    }
}
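
To verify that the cache works, request the same asset twice and watch the X-Cache header added above (host and path are illustrative):

$ curl -I http://www.example.com/images/logo.png   # first request: X-Cache: MISS
$ curl -I http://www.example.com/images/logo.png   # repeat request: X-Cache: HIT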

Installing the purge module

The purge module is used to remove entries from the cache:

$ wget http://labs.frickle.com/files/ngx_cache_purge-1.2.tar.gz
$ tar -zxvf ngx_cache_purge-1.2.tar.gz

Check the existing build flags:

$ /usr/local/nginx/sbin/nginx -V 

Append --add-module=/usr/local/ngx_cache_purge-1.2 to the original configure flags:

$ ./configure --user=www --group=www --prefix=/usr/local/nginx \
--with-http_stub_status_module --with-http_ssl_module \
--with-http_realip_module --add-module=/usr/local/ngx_cache_purge-1.2
$ make && make install

Stop nginx, then start it again:

$ /usr/local/nginx/sbin/nginx -s quit
$ /usr/local/nginx/sbin/nginx

Configuring purge

Here is the purge configuration snippet for nginx:

location ~ /purge(/.*) {
    #IPs allowed to purge
    allow 127.0.0.1;
    deny all;
    proxy_cache_purge cache1 $host$1$is_args$args;
}

Purging the cache

Usage:

$ wget http://example.com/purge/uri

Here uri is the URI of the static asset. For example, if the cached resource's URL is http://example.com/js/jquery.js, requesting http://example.com/purge/js/jquery.js removes it from the cache.
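
With ngx_cache_purge, a successful purge returns a small confirmation page, while purging a URL that is not currently in the cache returns a 404, so the result is easy to check with curl:

$ curl -i http://example.com/purge/js/jquery.js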

Hit rate

Save the following code as hit_rate.sh:

#!/bin/bash
# author: Jeremy Wei <shuimuqingshu@gmail.com>
# proxy_cache hit rate

if [ "$1"x != x ]; then
    if [ -e "$1" ]; then
        HIT=`cat "$1" | grep HIT | wc -l`
        ALL=`cat "$1" | wc -l`
        Hit_rate=`echo "scale=2;($HIT/$ALL)*100" | bc`
        echo "Hit rate=$Hit_rate%"
    else
        echo "$1 does not exist!"
    fi
else
    echo "usage: ./hit_rate.sh file_path"
fi

Usage:

$ ./hit_rate.sh /usr/local/nginx/log/access.log

Reference:

http://wiki.nginx.org/HttpProxyModule

(End)

 

Author: JeremyWei | Reproduction permitted, provided the original source, author information, and copyright notice are attributed with a hyperlink
网址: http://weizhifeng.net/nginx-proxy-cache.html

6 Must Have Node.js Modules

So you’re thinking about using node.js: awesome. If you’re new to the community you’re probably thinking “what’s the best node.js module / library for X?” I think it’s really true when experienced language gurus say “80% of your favorite language is your favorite library.” This is the first in a series of articles that will give you a high-level overview of some of our favorite node.js libraries at Nodejitsu. Today we’ll take a look at these libraries:

 

  1. cradle: A high-level, caching, CouchDB library for Node.js
  2. findit: Walk a directory tree in node.js
  3. node_redis: Redis client for node
  4. node-static: RFC2616 compliant HTTP static-file server module, with built-in caching.
  5. optimist: Light-weight option parsing for node.js
  6. xml2js: Simple XML to JavaScript object converter.

 

cradle: A high-level, caching, CouchDB library for Node.js

If you’re using CouchDB you should be using cradle. Cradle stands above the other CouchDB libraries in the node.js community: it has a robust LRU (least recently used) cache, bulk document processing, and a simple and elegant API:

 

var cradle = require('cradle');

//
// Create a connection
//
var conn = new(cradle.Connection)('http://living-room.couch', 5984, {
  cache: true,
  raw: false
});

//
// Get a database
//
var database = conn.database('newyorkcity');

//
// Now work with it
//
database.save('flatiron', {
  description: 'The neighborhood surrounding the Flatiron building',
  boundaries: {
    north: '28 Street',
    south: '18 Street',
    east: 'Park Avenue',
    west: '6 Avenue'
  }
}, function (err, res) {
  console.log(res.ok) // True
});


findit: Walk a directory tree in Node.js

A common set of problems that I see on the nodejs mailing list is advanced file system operations: watching all the files in a directory, enumerating an entire directory, and so on. Recently, when working on my fork of docco to make the generated documentation respect directory structure, I needed such a feature. It was surprisingly easy:

 

var findit = require('findit');

findit.find('/dir/to/walk', function (file) {
  //
  // This function is called each time a file is enumerated in the dir tree
  //
  console.log(file);
});


node_redis: Redis client for Node.js

There have been a lot of redis clients released for node.js. The question has become: which client is the right one to use? When selecting an answer to this question for any library you want to look for a few things including: the author, the recent activity, and the number of followers on GitHub. In this case the author is Matt Ranney, a member of the node.js core team. The most recent commit was yesterday, and the repository has over 300 followers.

Redis is really fast, and extremely useful for storing volatile information like sessions and cached data. Let’s take a look at some sample usage:

 

var redis = require("redis"),
    client = redis.createClient();

client.on("error", function (err) {
  console.log("Error " + err);
});

client.set("string key", "string val", redis.print);
client.hset("hash key", "hashtest 1", "some value", redis.print);
client.hset(["hash key", "hashtest 2", "some other value"], redis.print);
client.hkeys("hash key", function (err, replies) {
  console.log(replies.length + " replies:");
  replies.forEach(function (reply, i) {
      console.log("    " + i + ": " + reply);
  });
  client.quit();
});


node-static: RFC2616 compliant HTTP static-file server module, with built-in caching

I bet you’re wondering “What the $%^@ is RFC2616?” RFC2616 is the standards specification for HTTP 1.1, released in 1999. This spec is responsible for outlining how (among other things) files should be served over HTTP. Thus, when choosing a node.js static file server, it’s important to understand which libraries are standards compliant and which are not: node-static is. In addition, it has some great built-in caching which will speed up your file serving in highly concurrent scenarios.

Using node-static is easy; let’s make a static file server in 7 lines of JavaScript:

 

var static = require('node-static');

//
// Create a node-static server instance to serve the './public' folder
//
var file = new(static.Server)('./public');

require('http').createServer(function (request, response) {
  request.addListener('end', function () {
    //
    // Serve files!
    //
    file.serve(request, response);
  });
}).listen(8080);


optimist: Light-weight option parsing for Node.js

One of the great things about node.js is how easy it is to write (and later publish with npm) simple command-line tools in Javascript. Clearly, when one is writing a command line tool one of the most important things is to have a robust command line options parser. Our library of choice for this at Nodejitsu is optimist by substack.

Let’s take a look at a sample CLI script reminiscent of FizzBuzz:

 

#!/usr/bin/env node
var argv = require('optimist').argv;

if (argv.rif - 5 * argv.xup > 7.138) {
  console.log('Buy more riffiwobbles');
}
else {
  console.log('Sell the xupptumblers');
}

 

Using this CLI script is easy:

$ ./node-optimist.js --rif=55 --xup=9.52
Buy more riffiwobbles

$ ./node-optimist.js --rif 12 --xup 8.1
Sell the xupptumblers

This library has support for -a style arguments and --argument style arguments. In addition, any arguments passed without an option will be available in argv._. For more information on this library check out the repository on GitHub.
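
For instance, a quick sketch of how positional arguments end up in argv._ (the file names are made up):

$ ./node-optimist.js --rif=55 --xup=9.52 foo.txt bar.txt
Buy more riffiwobbles

Inside the script, argv._ would then be ['foo.txt', 'bar.txt'].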

 

xml2js: Simple XML to JavaScript object converter

Writing clients in node.js for APIs that expose data through JSON is almost too easy. There is no need for the complex, language-specific JSON parsing library that one might find in languages such as Ruby or Python. Just use the built-in JavaScript JSON.parse method on the data returned and voila! you’ve got native JavaScript objects.

But what about APIs that only expose their data through XML? You could use the native libxmljs module from polotek, but the overhead of dealing with individual XML nodes is non-trivial and (in my opinion) can lead to excess complexity. There is another, simpler option: the lesser known xml2js library available on npm and GitHub.

Let’s suppose that we had some XML (/me dies a little inside):

 

<?xml version="1.0" encoding="UTF-8"?>
<root>
  <child foo="bar">
    <grandchild baz="fizbuzz">grandchild content</grandchild>
  </child>
  <sibling>with content!</sibling>
</root>

 

Parsing this using xml2js is actually surprisingly easy:

 

var fs = require('fs'),
    eyes = require('eyes'),
    xml2js = require('xml2js');

var parser = new xml2js.Parser();

parser.on('end', function(result) {
  eyes.inspect(result);
});

fs.readFile(__dirname + '/foo.xml', function(err, data) {
  parser.parseString(data);
});

 

The output we would see is:

 

{
  child: {
    @: { foo: 'bar' },
    grandchild: {
      #: 'grandchild content',
      @: { baz: 'fizbuzz' }
    }
  },
  sibling: 'with content!'
}

 

If you haven’t already noticed, xml2js transforms arbitrary XML to JSON in the following way:

  • All entity tags like <child> become keys in the corresponding JSON.
  • Simple tags like <sibling>with content</sibling> become simple key:value pairs (e.g. sibling: ‘with content!’)
  • More complex tags like <child>... and <grandchild>... become complex key:value pairs where the value is an Object literal with two important properties:
    1. @: An Object representing all attributes on the specified tag
    2. #: Any text content for this XML node.

This simple mapping can greatly simplify the XML parsing logic in your node.js application and is worth checking out if you ever have to deal with the three-headed dog we all love to hate.

 

Just getting started

This is the first in a series of articles where we will outline at a high level the best-of-the-best modules, libraries and techniques in node.js that you should be aware of. If you’re interested in writing your own node.js modules and publishing them to npm, check out isaacs’ new article: How to Module over at howtonode.org.

Troubleshooting Resin build errors

At some point, Resin 4 stopped building for me.

Error 1: aclocal: couldn't open directory `m4': No such file or directory

[root@datanode2 resin-pro-4.0.36]# make
CDPATH="${ZSH_VERSION+.}:" && cd . && aclocal -I m4
perl: warning: Setting locale failed.
perl: warning: Please check that your locale settings:
LANGUAGE = (unset),
LC_ALL = (unset),
LANG = "en.UTF-8"
are supported and installed on your system.
perl: warning: Falling back to the standard locale ("C").
aclocal: couldn't open directory `m4': No such file or directory
make: *** [aclocal.m4] Error 1

Fix: create an m4 directory in the resin source directory:

mkdir m4

Problem solved.

Error 2: Makefile.in not found

[root@datanode2 resin-pro-4.0.36]# make
cd . && automake --foreign --ignore-deps
configure.ac:1568: required file `../pro/modules/c/src/Makefile.in' not found
configure.ac:1568: required file `../pro/modules/c/src/resin/Makefile.in' not found
configure.ac:1568: required file `../pro/modules/c/src/resinssl/Makefile.in' not found
make: *** [Makefile.in] Error 1

Fix:
mkdir ../pro
cp -r modules ../pro/

 

These two errors probably have little to do with the new Resin release itself; more likely my build toolchain is just a bit old.

SIEGE

Siege is an http load testing and benchmarking utility. It was designed to let web developers measure their code under duress, to see how it will stand up to load on the internet. Siege supports basic authentication, cookies, HTTP and HTTPS protocols. It lets its user hit a web server with a configurable number of simulated web browsers. Those browsers place the server “under siege.”

Download the latest version:

wget http://www.joedog.org/pub/siege/siege-latest.tar.gz

tar -zxf siege-latest.tar.gz

cd siege-xxx

./configure && make && make install

vi /tmp/tmpurl
http://127.0.0.1/index.html
http://127.0.0.1/images/banner/1.jpg
http://127.0.0.1/images/banner/8.jpg

siege -c 100 -b -i -r 100 -f /tmp/tmpurl

100 concurrent users, each running 100 repetitions (-c sets the concurrency, -b means benchmark mode with no delay between requests, -i picks URLs from the file at random, -r sets the repetitions, -f supplies the URL file).

** SIEGE 3.0.0
** Preparing 100 concurrent users for battle.
The server is now under siege.. done.

Transactions: 10000 hits
Availability: 100.00 %
Elapsed time: 2.44 secs
Data transferred: 1396.79 MB
Response time: 0.02 secs
Transaction rate: 4098.36 trans/sec
Throughput: 572.45 MB/sec
Concurrency: 91.24
Successful transactions: 10000
Failed transactions: 0
Longest transaction: 0.23
Shortest transaction: 0.00

Serving static files directly from nginx

#css|js|ico|gif|jpg|jpeg|png|txt|html|htm|xml|swf|wav are all static files, but they should be told apart: js and css may change often, so give them a short expiry; images and html rarely change, so their expiry can be longer
location ~* ^.+\.(ico|gif|jpg|jpeg|png|html|htm)$ {
    root /var/www/poseidon/root/static;
    access_log off;
    expires 30d;
}
location ~* ^.+\.(css|js|txt|xml|swf|wav)$ {
    root /var/www/poseidon/root/static;
    access_log off;
    expires 24h;
}
#Note: location matching does not include the query string after '?', so the regexes above also match http://192.168.1.16/image/sxxx.jpg?a=xxx
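
A quick way to confirm that the expiry headers are applied (host and path are illustrative):

$ curl -I http://192.168.1.16/image/sxxx.jpg

For a .jpg the response should carry an Expires header 30 days in the future and Cache-Control: max-age=2592000.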

Nginx worker_cpu_affinity explained

The worker_cpu_affinity directive in the configuration file binds each nginx worker process to specific CPUs.
The official explanation is:
#---------------------------- quote begins ----------------------------
Syntax: worker_cpu_affinity cpumask [cpumask...]
Default: none
Linux only.
With this option you can bind the worker process to a CPU, it calls sched_setaffinity().
For example,
worker_processes 4;
worker_cpu_affinity 0001 0010 0100 1000;
Bind each worker process to one CPU only.
worker_processes 2;
worker_cpu_affinity 0101 1010;
Bind the first worker to CPU0/CPU2, bind the second worker to CPU1/CPU3. This is suitable for HTT.
#---------------------------- quote ends ----------------------------

The most crucial point is left unexplained: how is each CPU represented?

After much searching and repeated experimentation, the answer turns out to be: each cpumask is a string of bits, one bit per CPU, read from right to left. Setting bit N to 1 binds the worker to CPU N, and you supply one mask per worker process.

So for an 8-core machine it would be:

worker_cpu_affinity 00000001 00000010 00000100 00001000 00010000 00100000 01000000 10000000;

如此类推~~