Scribe Distributed Logging System

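1. Introduction
Scribe is a distributed logging system developed by Facebook. It transports logs over Thrift, so a project written in any language can collect logs and ship them to a remote server, either directly or relayed through a local server. Scribe consists of two parts: the central scribe server and the local scribe server. In a distributed system, every node runs a local scribe server that collects the node's logs and forwards them to the central scribe server.
Each data source sends its data to scribe through Thrift, and every record consists of a category and a message. The number of Thrift threads can be set in the scribe configuration; the default is 3. On the back end, scribe can store data of different categories in different directories so that each can be processed separately. Back-end storage is provided by stores of several types: file; buffer (two-tier storage with a primary store and a secondary store); network (another scribe server); bucket (contains multiple stores and hashes each message to one of them); null (discards the data); thriftfile (writes to a Thrift TFileTransport file); and multi (writes the data to several stores at once).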
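The two-tier behaviour of the buffer store can be illustrated with a small sketch. This is not Scribe's implementation: the class names `FlakyPrimary` and `BufferStoreSketch` are hypothetical, the secondary store is modelled as an in-memory list rather than a local file, and the replay policy (oldest first, stop at the first failure) is an assumption made for the example.

```python
class FlakyPrimary:
    """Hypothetical stand-in for a primary store that can go offline."""

    def __init__(self):
        self.online = True
        self.received = []

    def write(self, category, message):
        if not self.online:
            raise ConnectionError("primary store unavailable")
        self.received.append((category, message))


class BufferStoreSketch:
    """Writes to the primary store and queues messages while it is down."""

    def __init__(self, primary):
        self.primary = primary
        self.secondary = []  # stand-in for the secondary (local) store

    def write(self, category, message):
        if self.secondary:
            # Older messages are still waiting: queue behind them to keep order.
            self.secondary.append((category, message))
            return
        try:
            self.primary.write(category, message)
        except ConnectionError:
            self.secondary.append((category, message))

    def replay(self):
        """Resend buffered messages oldest first; stop at the first failure."""
        while self.secondary:
            category, message = self.secondary[0]
            try:
                self.primary.write(category, message)
            except ConnectionError:
                return False
            self.secondary.pop(0)
        return True


primary = FlakyPrimary()
store = BufferStoreSketch(primary)
store.write("app", "first")    # delivered directly
primary.online = False
store.write("app", "second")   # held in the secondary store
primary.online = True
store.replay()                 # buffered message is delivered
print(primary.received)
```

In a real deployment the replay step would run periodically rather than being called by hand; the sketch only shows why a message written during an outage is not lost.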

2. Installation
# yum install gcc-c++ libevent libevent-devel automake autoconf m4 bison zlib zlib-devel bzip2 bzip2-devel flex pkgconfig python python-devel ruby ruby-devel mono-devel libxml2 libxml2-devel ant openssl-devel
# wget http://nchc.dl.sourceforge.net/project/boost/boost/1.45.0/boost_1_45_0.tar.gz
# tar zxvf boost_1_45_0.tar.gz -C ../software/
# ./bjam
# ./bjam -s HAVE_ICU=1 --prefix=/usr/local/boost-1.45 --includedir=/usr/local/boost-1.45/include --libdir=/usr/local/boost-1.45/lib
# ./bjam install --prefix=/usr/local/boost-1.45
# wget http://www.apache.org/dist//thrift/0.7.0/thrift-0.7.0.tar.gz
# tar zxvf thrift-0.7.0.tar.gz -C ../software/
# ./configure --prefix=/usr/local/thrift-0.7.0 --with-boost=/usr/local/boost-1.45
# make
# make install
# cd ./contrib/fb303/
# ./bootstrap.sh --with-boost=/usr/local/boost-1.45
# ./configure --prefix=/usr/local/thrift-0.7.0/fb303 --with-boost=/usr/local/boost-1.45 --with-thriftpath=/usr/local/thrift-0.7.0
# make
# make install
Download pcting-scribe-2ee14d3.tar.gz from https://github.com/pcting/scribe/downloads
# tar zxvf pcting-scribe-2ee14d3.tar.gz -C ../software/
# export BOOST_ROOT=/usr/local/boost-1.45
# export LD_LIBRARY_PATH=/usr/local/thrift-0.7.0/lib:/usr/lib:/usr/local/lib:/usr/local/boost-1.45/lib:/usr/local/thrift-0.7.0/fb303/lib
# ./bootstrap.sh --with-boost=/usr/local/boost-1.45
# ./configure --prefix=/usr/local/scribe --with-boost=/usr/local/boost-1.45 --with-thriftpath=/usr/local/thrift-0.7.0 --with-fb303path=/usr/local/thrift-0.7.0/fb303
# make
# make install

3. Configuration
# mkdir etc
# cp /usr/local/src/software/pcting-scribe-2ee14d3/examples/example1.conf ./scribe.conf
# mkdir /tmp/scribetest
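The configuration parameters are explained below.
For the full parameter reference, see: http://sourceforge.net/apps/mediawiki/scribeserver/index.php?title=Configuration
//Global configuration parameters
port=1463
max_msg_per_second=2000000 //maximum number of messages accepted per second
check_interval=3 //how often each store is checked, in seconds
max_queue_size=10000000 //maximum queue size, in bytes
new_thread_per_category //whether to create a new thread for each category; enabled by default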
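//Store configuration: scribe logs messages according to its store configuration. In the configuration file each store is written as a <store>...</store> block, and a buffer store nests its two sub-stores in <primary> and <secondary> blocks. Every store must specify the message category it handles, with two special forms:
default store: there can be only one default store. It handles every category that no other store handles.
prefix store: if the specified category ends with *, the store handles every category that begins with that prefix.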
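The matching rules above can be sketched in a few lines. This is an illustration, not Scribe's code: `resolve_store` is a hypothetical helper, the store names are labels chosen for the example, and the lookup order (exact category first, then the longest matching prefix, then the default store) is an assumption. The category patterns are the ones used in the central server configuration later in this post.

```python
def resolve_store(category, stores):
    """Return the store label that would handle `category`.

    `stores` maps a configured category pattern to a store label.
    Assumed order: exact match, longest matching prefix pattern, default.
    """
    if category in stores:
        return stores[category]
    prefixes = [
        pattern for pattern in stores
        if pattern.endswith("*") and category.startswith(pattern[:-1])
    ]
    if prefixes:
        return stores[max(prefixes, key=len)]
    return stores.get("default")


# Patterns taken from the central configuration in this post; labels are illustrative.
stores = {"ignore*": "null", "bucket_me": "bucket", "default": "file"}

for name in ("ignore_me", "bucket_me", "scribe_central"):
    print(name, "->", resolve_store(name, stores))
```

With these patterns, `ignore_me` falls under the `ignore*` prefix store, `bucket_me` matches exactly, and `scribe_central` falls through to the default store, which agrees with the counters and files shown in the test session later in the post.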

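category=default //determines which messages this store handles
type=buffer //store type (see Store::createStore)
target_write_size=20480 //how large the message queue may grow before the messages are processed, in bytes
max_write_interval=1 //how long messages may queue before they are processed, in seconds
buffer_send_rate=2 //how many times per check_interval a group of messages is read from the secondary store and sent to the primary store
retry_interval=30 //how long to wait, in seconds, before trying the primary store again after a failed write
retry_interval_range=10 //the actual wait is picked at random within a range of this many seconds around retry_interval
fs_type=std //file system type; currently only std is supported
file_path=/tmp/scribetest //storage directory
base_filename=thisisoverwritten //base file name; defaults to the category name
max_size=1000000 //file size, in bytes, at which the file is rotated
add_newlines=1 //whether to add a newline after each message
rotate_period=daily //how often a new file is created: hourly, daily or never; default is never
rotate_hour=0 //0-23, default 1; hour at which to rotate when rotate_period=daily
rotate_minute=10 //0-59, default 15; minute at which to rotate when rotate_period is daily or hourly
remote_host //host of the remote scribe server that logs are forwarded to
remote_port //port of the remote scribe server
timeout //socket timeout, in ms
use_conn_pool //whether to use a connection pool instead of opening multiple connections; default is false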
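To make the interplay of retry_interval and retry_interval_range concrete, here is a minimal sketch of a randomized retry delay. The function name `next_retry_delay` is hypothetical, and centring the window on retry_interval is an assumption about how the range is applied, not something the post states.

```python
import random


def next_retry_delay(retry_interval=30, retry_interval_range=10, rng=random):
    """Pick a delay in seconds from a window of width retry_interval_range.

    Assumption: the window is centred on retry_interval, so the defaults
    used here (30 and 10, as in the example config) give 25..35 seconds.
    """
    low = retry_interval - retry_interval_range / 2
    return low + rng.uniform(0, retry_interval_range)


delays = [next_retry_delay() for _ in range(5)]
print([round(d, 1) for d in delays])
```

Randomizing the delay spreads out reconnection attempts, so that many local servers that lost the same central server do not all retry at the same instant.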

# vi scribe.conf
port=1463
max_msg_per_second=2000000
check_interval=3

category=default
type=buffer
target_write_size=20480
max_write_interval=1
buffer_send_rate=2
retry_interval=30
retry_interval_range=10

type=file
fs_type=std
file_path=/tmp/scribetest
base_filename=xuhh
max_size=1000000
add_newlines=1
# ../bin/scribed ./scribe.conf &   # start the scribed service
# echo "test scribe" | ../bin/scribe_cat scribe_test   # send a test message
# cat /tmp/scribetest/scribe_test/scribe_test_current   # view the logged message
# ../bin/scribe_ctrl counters   # view the counters
scribe_test:received good: 2

Configuration for centralized storage on scribe-central
Central server configuration:
# vi scribe-central.conf
port=1463
max_msg_per_second=2000000
check_interval=3

category=ignore*
type=null

category=bucket_me
type=buffer
target_write_size=20480
max_write_interval=1
buffer_send_rate=2
retry_interval=30
retry_interval_range=10

type=bucket
num_buckets=5
bucket_subdir=bucket
bucket_type=key_hash
delimiter=58

type=file
fs_type=std
file_path=/tmp/scribetest
base_filename=bucket_me
max_size=10000

category=default
type=buffer
target_write_size=20480
max_write_interval=1
buffer_send_rate=2
retry_interval=30
retry_interval_range=10

type=file
fs_type=std
file_path=/tmp/scribetest
base_filename=thisisoverwritten
max_size=1000000
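The bucket_me store above uses bucket_type=key_hash with delimiter=58, which is the ASCII code of ":". A rough sketch of how such a store might choose a bucket follows. It is a sketch under assumptions, not Scribe's code: `pick_bucket` and `bucket_dir` are hypothetical names, `zlib.crc32` stands in for whatever hash Scribe uses, and numbering keyed messages from 1 is an assumption. Sending messages without a key to bucket 0 is inferred from the test session in this post, where a message without a colon ends up in bucket000.

```python
import zlib


def pick_bucket(message, num_buckets=5, delimiter=58):
    """Return 0 when no key can be extracted, otherwise a bucket in 1..num_buckets."""
    key, sep, _rest = message.partition(chr(delimiter))
    if not sep or not key:
        return 0  # no delimiter, or empty key
    # crc32 is a stand-in hash chosen for this example only.
    return zlib.crc32(key.encode("utf-8")) % num_buckets + 1


def bucket_dir(bucket, bucket_subdir="bucket"):
    """Directory name for a bucket: bucket_subdir plus a three-digit number."""
    return f"{bucket_subdir}{bucket:03d}"


print(bucket_dir(pick_bucket("this message will be bucketed")))
print(bucket_dir(pick_bucket("user42:logged in")))
```

Because the bucket depends only on the key before the delimiter, all messages that share a key land in the same directory.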

Client configuration:
# vi scribe-client.conf
port=1464
max_msg_per_second=2000000
check_interval=3

category=default
type=buffer
target_write_size=20480
max_write_interval=1
buffer_send_rate=1
retry_interval=30
retry_interval_range=10

type=network
remote_host=192.168.1.173
remote_port=1463

[client]# echo "from scribe_client to scribe_central" | ../bin/scribe_cat -h localhost:1464 scribe_central
[Thu Dec 1 14:30:37 2011] "Successfully sent messages to remote scribe server "
[client]# echo "this message will be ignored" | ../bin/scribe_cat -h localhost:1464 ignore_me   # this message will be discarded
[client]# echo "this message will be bucketed" | ../bin/scribe_cat -h localhost:1464 bucket_me
[Thu Dec 1 15:04:21 2011] "[bucket_me] Creating new category store from model default"
[Thu Dec 1 15:04:21 2011] "Opened connection to remote scribe server "
[Thu Dec 1 15:04:21 2011] "[bucket_me] Opened file for writing"
[Thu Dec 1 15:04:21 2011] "[bucket_me] Changing state from <DISCONNECTED> to <SENDING_BUFFER>"
[Thu Dec 1 15:04:24 2011] "[bucket_me] read entries of bytes from file "
[Thu Dec 1 15:04:28 2011] "Successfully sent messages to remote scribe server "
[Thu Dec 1 15:04:28 2011] "[bucket_me] No more buffer files to send, switching to streaming mode"
[Thu Dec 1 15:04:28 2011] "[bucket_me] Changing state from <SENDING_BUFFER> to <STREAMING>"
[central]# cat /tmp/scribetest/scribe_central/scribe_central_current
from scribe_client to scribe_central
[central]# cat bucket000/bucket_me_current
this message will be bucketed
[central]# /home/xuhh/scribe-2.2/bin/scribe_ctrl counters
bucket_me:received good: 1
scribe_overall:received good: 5
scribe_central:received good: 3
scribe_overall:ignored: 1
ignore_me:ignored: 1
ignore_me:received good: 1
[client]# /usr/local/scribe/bin/scribe_ctrl counters 1464
bucket_me:received good: 1
scribe_overall:received good: 4
scribe_central:received good: 2
ignore_me:received good: 1
scribe_overall:sent: 4
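When comparing the counters of several servers, it can help to load the scribe_ctrl output into a dictionary. The parser below is a small sketch based only on the line format shown in this post (`category:counter name: value`); `parse_counters` is a hypothetical helper, and the sample text is the client output above.

```python
def parse_counters(text):
    """Parse 'category:counter name: value' lines into nested dictionaries."""
    counters = {}
    for line in text.splitlines():
        line = line.strip()
        if ":" not in line:
            continue  # skip blank or unrelated lines
        name, _, value = line.rpartition(":")
        category, _, counter = name.partition(":")
        counters.setdefault(category, {})[counter.strip()] = int(value)
    return counters


sample = """\
bucket_me:received good: 1
scribe_overall:received good: 4
scribe_central:received good: 2
ignore_me:received good: 1
scribe_overall:sent: 4
"""

client = parse_counters(sample)
print(client["scribe_overall"])
```

For the client output, "received good" and "sent" under scribe_overall are both 4, which is a quick way to confirm that everything the local server accepted was forwarded.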

Setting up the PHP interface:
Generate the PHP libraries:
# cd /usr/local/scribe/
# /usr/local/thrift-0.7.0/bin/thrift -o . -I /usr/local/thrift-0.7.0/fb303/share --gen php /usr/local/thrift-0.7.0/fb303/share/fb303/if/fb303.thrift
# /usr/local/thrift-0.7.0/bin/thrift -o . -I /usr/local/thrift-0.7.0/fb303/share/ --gen php /data/src/software/pcting-scribe-2ee14d3/if/scribe.thrift
# cp -r /data/src/software/thrift-0.7.0/lib/php/src includes
# mkdir -p includes/packages/fb303
# mkdir -p includes/packages/scribe
# mv gen-php/fb303/FacebookService.php ./includes/packages/fb303/
# mv gen-php/fb303/fb303_types.php ./includes/packages/fb303/
# mv gen-php/scribe/scribe_types.php ./includes/packages/scribe/
# mv gen-php/scribe/scribe.php ./includes/

For a test program, see: http://www.ruturaj.net/scribe-php-logging