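doryokujin's Tech Blog: Notes on a (failed) GlusterFS 3.3-beta1 -> 3.3-beta2 upgrade and a (failed) attempt at Hadoop integration

I upgraded GlusterFS from the 3.3-beta1 release I currently run to 3.3-beta2 and got stuck in several places. I also tried to use it through Hadoop's InputFormat, and that did not work either. In the end I rolled back instead of upgrading; these are my notes for the next attempt.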

1. GlusterFS 3.3-beta2

Note: 3.3-beta2 added support for Hadoop's InputFormat.

Download links for beta1 and beta2:

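Even before this release, running

# getfattr -m . -n trusted.glusterfs.pathinfo client_path

against the path of a file on a mounted GlusterFS volume returned the file's physical location, so Hadoop support seems like a natural next step. Quite a few things changed between beta1 and beta2 to go with it.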

1-1. Changes to the trusted.glusterfs.pathinfo output

From beta2 on, querying the physical location returns more fields. I will note the differences first.

# gluster volume info home

    Volume Name: home
    Type: Replicate
    Status: Started
    Number of Bricks: 2
    Transport-type: tcp
    Bricks:
    Brick1: delta5:/home/doryokujin/test_vol_server
    Brick2: delta6:/home/doryokujin/test_vol_server

For a Replicate volume like this one, run

# mount -t glusterfs localhost:home /home/doryokujin/test_vol_client

on delta6 to mount it with the "-t glusterfs" (FUSE) option.
beta1:

# getfattr -m . -n trusted.glusterfs.pathinfo /home/doryokujin/test_vol_client/some_file

    getfattr: Removing leading '/' from absolute path names
    # file: home/doryokujin/test_vol_client/some_file
    trusted.glusterfs.pathinfo="(
    repository-dht delta6:home/doryokujin/test_vol_server/some_file
    )

beta2:

# getfattr -m . -n trusted.glusterfs.pathinfo /home/doryokujin/test_vol_client/some_file

    getfattr: Removing leading '/' from absolute path names
    # file: home/doryokujin/test_vol_client/some_file
    trusted.glusterfs.pathinfo="(
    <REPLICATE:home-replicate-0> 
    <POSIX:delta5:/home/doryokujin/test_vol_server/some_file> 
    <POSIX:delta6:/home/doryokujin/test_vol_server/some_file>)

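As shown above, when the volume is replicated the output now lists the physical location on every server, replicas included. Fields such as the volume type ("REPLICATE") have also been added. The purpose is that, when the volume is used as a Hadoop InputFormat, the volume type and the physical location can be read separately from each <(.*):(.*)> field and acted on accordingly. The new output is also very useful when you write your own MapReduce, because this one command tells you where the data physically lives.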
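Note that

getfattr: Removing leading '/' from absolute path names

is written to standard error, but I am ignoring it for now (I believe I saw this message discussed somewhere and will add a note later). It is an informational message rather than a failure; getfattr's --absolute-names option suppresses it.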
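
As a rough illustration of how the <(.*):(.*)> fields in the beta2 output could be split into volume types and physical locations, here is a minimal Python sketch. The function name parse_pathinfo and the stricter regular expression are my own assumptions for illustration; this is not the code the Hadoop plugin uses, and the sample string is the beta2 output shown above joined onto one line.

```python
import re

# One <TYPE:rest> field, e.g. <POSIX:delta5:/path/to/file>.
# TYPE stops at the first colon; rest may itself contain colons.
FIELD_RE = re.compile(r"<([^:<>]+):([^<>]*)>")


def parse_pathinfo(value):
    """Split a beta2-style pathinfo value into volume types and locations.

    Hypothetical helper: returns (volume_types, locations), where
    volume_types is a list of (type, name) pairs and locations is a list
    of (host, brick_path) pairs taken from the POSIX fields.
    """
    volume_types = []
    locations = []
    for kind, rest in FIELD_RE.findall(value):
        if kind == "POSIX":
            # rest looks like "host:/brick/path"; split on the first colon.
            host, _, path = rest.partition(":")
            locations.append((host, path))
        else:
            volume_types.append((kind, rest))
    return volume_types, locations


sample = (
    "(<REPLICATE:home-replicate-0> "
    "<POSIX:delta5:/home/doryokujin/test_vol_server/some_file> "
    "<POSIX:delta6:/home/doryokujin/test_vol_server/some_file>)"
)

volume_types, locations = parse_pathinfo(sample)
print(volume_types)
for host, path in locations:
    print(host, path)
```

A home-grown MapReduce scheduler could use the (host, path) pairs to run a task on a node that holds a replica of the file.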

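1-2. ID check when recreating a volume (and the resulting error)

In addition, creating a volume on a server path that belonged to a previously created volume now fails with an error. This error is why my upgrade to beta2 is on hold. In the example below, repository is a volume that I had created earlier and deleted in preparation for the beta2 upgrade.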

# gluster volume create repository replica 2 transport tcp \
 delta1:/repository delta11:/repository \
 delta2:/repository delta12:/repository... \
 delta6:/repository delta16:/repository

    'delta6:/repository' has been part of a deleted volume with id fbbfeb57-4355-4a9b-a727-9c9f26b79220. Please re-create the brick directory.

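As shown, the command fails with an error that amounts to "this brick was part of a deleted volume, so please recreate the brick directory". beta1 never asked for anything like this... A similar bug has been posted to the mailing list, but it has not received a clear answer. The following command shows the attributes currently set on the brick directory.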

# attr -l /repository                                          
    
    Attribute "gfid" has a 16 byte value for /repository
    Attribute "glusterfs.volume-id" has a 16 byte value for /repository
    Attribute "glusterfs.dht" has a 16 byte value for /repository
    Attribute "afr.repository-client-10" has a 12 byte value for /repository
    Attribute "afr.repository-client-11" has a 12 byte value for /repository

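Perhaps some stale attributes were left behind by an earlier mistake of mine? Incidentally, these attributes cannot be removed with an operation like

# sudo attr -r "gfid" /repository
   
    attr_remove: Operation not supported
    Could not remove "gfid" for /repository

Later, @vbellur, a GlusterFS developer, told me how to solve this problem. With the following command I was able to recreate the volume properly.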

# setfattr -x "trusted.glusterfs.volume-id" /repository
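
When a volume spans many bricks, the same command has to be run on each server. The following is a small sketch, under my own assumptions, that only prints the setfattr command for each brick grouped by host; it does not execute anything. The helper name volume_id_cleanup_commands is hypothetical, and the brick list is a subset of the bricks from the create command above.

```python
def volume_id_cleanup_commands(bricks):
    """Build one setfattr argv per brick, grouped by host.

    Hypothetical helper: `bricks` is a list of "host:/path" strings as
    passed to `gluster volume create`. Nothing is executed here.
    """
    by_host = {}
    for brick in bricks:
        # "delta6:/repository" -> host "delta6", path "/repository"
        host, _, path = brick.partition(":")
        by_host.setdefault(host, []).append(
            ["setfattr", "-x", "trusted.glusterfs.volume-id", path]
        )
    return by_host


bricks = [
    "delta1:/repository",
    "delta11:/repository",
    "delta6:/repository",
    "delta16:/repository",
]

for host, commands in volume_id_cleanup_commands(bricks).items():
    for argv in commands:
        # Print for review; run the command by hand on the named host.
        print(host + ":", " ".join(argv))
```

Printing rather than executing is deliberate: removing extended attributes from a brick is not something to automate blindly.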

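These are the two changes I noticed in beta2. There are probably more; the release changed quite a lot. Even after the volume was created, however, things still did not work: the mount gets dropped partway through with the following error.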

[2011-09-07 17:45:43.277953] E [client-handshake.c:1157:client_query_portmap_cbk] 0-code-client-0: failed to get the port number for remote subvolume
[2011-09-07 17:45:43.278083] E [client-handshake.c:1157:client_query_portmap_cbk] 0-code-client-1: failed to get the port number for remote subvolume
[2011-09-07 17:46:28.412140] W [fuse-bridge.c:1751:fuse_readv_cbk] 0-glusterfs-fuse: 360: READ => -1 (No such file or directory)
[2011-09-07 17:46:28.412182] E [mem-pool.c:468:mem_put] 0-mem-pool: invalid argument
pending frames:

patchset: git://git.gluster.com/glusterfs.git
signal received: 11
time of crash: 2011-09-07 17:46:28
configuration details:
argp 1
backtrace 1
dlfcn 1
fdatasync 1
libpthread 1
llistxattr 1
setfsid 1
spinlock 1
epoll.h 1
xattr.h 1
st_atim.tv_nsec 1
package-string: glusterfs 3.3beta2
/lib/x86_64-linux-gnu/libc.so.6(+0x33d80)[0x7f2593807d80]
/lib/x86_64-linux-gnu/libpthread.so.0(pthread_spin_lock+0x0)[0x7f2593b748e0]
/usr/local/lib/libglusterfs.so.0(fd_ref+0x23)[0x7f25943fb893]
/usr/local/lib/glusterfs/3.3beta2/xlator/mount/fuse.so(+0x9a5d)[0x7f25927bda5d]
/usr/local/lib/glusterfs/3.3beta2/xlator/mount/fuse.so(+0x181cf)[0x7f25927cc1cf]
/lib/x86_64-linux-gnu/libpthread.so.0(+0x6d8c)[0x7f2593b6ed8c]
/lib/x86_64-linux-gnu/libc.so.6(clone+0x6d)[0x7f25938ba04d]
---------

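The rest of this post records my trial and error with a freshly created test volume, set up so that the error above does not occur, while trying to have Hadoop read its input from GlusterFS.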

2. For Hadoop
2-1. Configuration to add and what it means
First, see the documentation. Then add or change the following properties in conf/core-site.xml.


  
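    <property>
        <name>fs.glusterfs.impl</name>
        <value>org.apache.hadoop.fs.glusterfs.GlusterFileSystem</value>
        <description>glusterfs-0.20.2-0.1.jar goes under $HADOOP_HOME/lib/</description>
    </property>
    <property>
        <name>fs.default.name</name>
        <value>glusterfs://namenode.geishatokyo.com:9000</value>
        <description>Host and port of the namenode</description>
    </property>
    <property>
        <name>fs.glusterfs.volname</name>
        <value>repository</value>
        <description>Name of the GlusterFS volume to use</description>
    </property>
    <property>
        <name>fs.glusterfs.mount</name>
        <value>/mnt/disk1/glusterfs/client/repository</value>
        <description>Mount point of the volume. Take care if the mount point differs from node to node.</description>
    </property>
    <property>
        <name>fs.glusterfs.server</name>
        <value>delta6</value>
        <description>Hostname of a node that serves the volume</description>
    </property>
    <property>
        <name>quick.slave.io</name>
        <value>Off</value>
    </property>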
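
Since a wrong or missing value only surfaces when Hadoop starts, it can help to check the file beforehand. The sketch below reads Hadoop-style property XML and reports which of the three GlusterFS settings are missing. The helper names and the inline sample (which deliberately omits fs.glusterfs.mount) are assumptions for illustration, not part of the plugin.

```python
import xml.etree.ElementTree as ET

# The three settings the plugin uses to mount the volume.
REQUIRED = ("fs.glusterfs.volname", "fs.glusterfs.server", "fs.glusterfs.mount")


def load_properties(xml_text):
    """Return {name: value} for every <property> under <configuration>."""
    root = ET.fromstring(xml_text)
    props = {}
    for prop in root.findall("property"):
        name = prop.findtext("name")
        value = prop.findtext("value")
        if name is not None and value is not None:
            props[name.strip()] = value.strip()
    return props


def missing_properties(props, required=REQUIRED):
    """List the required keys that are absent or empty."""
    return [key for key in required if not props.get(key)]


# Sample input: fs.glusterfs.mount is left out on purpose.
sample = """<configuration>
  <property>
    <name>fs.glusterfs.volname</name>
    <value>repository</value>
  </property>
  <property>
    <name>fs.glusterfs.server</name>
    <value>delta6</value>
  </property>
</configuration>"""

props = load_properties(sample)
print("missing:", missing_properties(props))
```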

When Hadoop runs, the plugin checks whether GlusterFS is properly mounted, using fs.glusterfs.volname, fs.glusterfs.server, and fs.glusterfs.mount as arguments (GlusterFileSystem.java, line 82).

mountCmd = "mount -t glusterfs " + server + ":" + "/" + volname + " " + mount; // line 82
// e.g. mount -t glusterfs delta6:/repository /mnt/disk1/glusterfs/client/repository
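
To preview the exact command the three settings produce on a given node, the concatenation can be mirrored outside Hadoop. This Python sketch reproduces only the string building quoted above; the function name build_mount_command is hypothetical, and nothing is mounted.

```python
import shlex


def build_mount_command(server, volname, mount):
    """Mirror the concatenation quoted from GlusterFileSystem.java."""
    return "mount -t glusterfs " + server + ":" + "/" + volname + " " + mount


# Values taken from the configuration shown earlier.
command = build_mount_command(
    server="delta6",
    volname="repository",
    mount="/mnt/disk1/glusterfs/client/repository",
)
print(command)

# Split into an argument list, e.g. for inspection or subprocess.run().
print(shlex.split(command))
```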

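Because the plugin calls the mount command internally (along with others such as getfattr), Hadoop has to be run with sudo privileges. If any of these parameters is set incorrectly, initialization fails at this point and the process exits via

System.out.println("Failed to initialize GlusterFS"); // line 127
System.exit(-1);

Below is an example run of ip_count with Pydoop. The settings above are written in conf.xml.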

# hadoop pipes -D mapred.reduce.tasks=12 \
 -D hadoop.pipes.java.recordreader=true \
 -D hadoop.pipes.java.recordwriter=true \
 -D mapred.output.compress=false \
 -conf conf.xml \
 -program ip_count.py \
 -input /mnt/disk1/glusterfs/client/repository/log_repository/some_file \
 -output out_ip_count

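Notes:

-program: the script name given here is loaded from the path formed by joining it to the value of fs.glusterfs.mount.
-input: do not prefix the path with "gluster://"; doing so actually causes an error.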

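With these settings the command failed with the error below. A late-night struggle got me nowhere, and it was morning before I knew it. I currently run a self-built MapReduce framework on top of GlusterFS in production, so the dropped-mount problem is fatal; I went no further and rolled back to beta1. I plan to set up a proper test environment and try again when I have the time and energy.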

java.lang.RuntimeException: java.lang.ClassNotFoundException: org.apache.hadoop.fs.glusterfs.GlusterFileSystem
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:996)
        at org.apache.hadoop.fs.FileSystem.createFileSystem(FileSystem.java:1508)
        at org.apache.hadoop.fs.FileSystem.access$200(FileSystem.java:67)
        at org.apache.hadoop.fs.FileSystem$Cache.getInternal(FileSystem.java:1548)
        at org.apache.hadoop.fs.FileSystem$Cache.get(FileSystem.java:1530)
        at org.apache.hadoop.fs.FileSystem.get(FileSystem.java:228)
        at org.apache.hadoop.fs.Path.getFileSystem(Path.java:183)
        at org.apache.hadoop.mapred.JobHistory$JobInfo.logSubmitted(JobHistory.java:1243)
        at org.apache.hadoop.mapred.JobInProgress$3.run(JobInProgress.java:681)
        at java.security.AccessController.doPrivileged(Native Method)
        at javax.security.auth.Subject.doAs(Subject.java:396)
        at org.apache.hadoop.security.UserGroupInformation.doAs(UserGroupInformation.java:1115)
        at org.apache.hadoop.mapred.JobInProgress.initTasks(JobInProgress.java:678)
        at org.apache.hadoop.mapred.JobTracker.initJob(JobTracker.java:4013)
        at org.apache.hadoop.mapred.EagerTaskInitializationListener$InitJob.run(EagerTaskInitializationListener.java:79)
        at java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
        at java.lang.Thread.run(Thread.java:662)
Caused by: java.lang.ClassNotFoundException: org.apache.hadoop.fs.glusterfs.GlusterFileSystem
        at java.net.URLClassLoader$1.run(URLClassLoader.java:202)
        at java.security.AccessController.doPrivileged(Native Method)
        at java.net.URLClassLoader.findClass(URLClassLoader.java:190)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:307)
        at sun.misc.Launcher$AppClassLoader.loadClass(Launcher.java:301)
        at java.lang.ClassLoader.loadClass(ClassLoader.java:248)
        at java.lang.Class.forName0(Native Method)
        at java.lang.Class.forName(Class.java:247)
        at org.apache.hadoop.conf.Configuration.getClassByName(Configuration.java:943)
        at org.apache.hadoop.conf.Configuration.getClass(Configuration.java:994)
        ... 17 more
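
A ClassNotFoundException like this means the JVM that raised it (here the JobTracker, judging from the trace) could not find the class on its classpath. One quick check is whether a given jar actually contains the class. The sketch below is only an illustration: the helper name jar_contains_class is hypothetical, and the demo builds a stand-in archive in a temporary directory rather than touching a real Hadoop installation.

```python
import os
import tempfile
import zipfile


def jar_contains_class(jar_path, class_name):
    """Return True if the jar has an entry for the fully qualified class."""
    entry = class_name.replace(".", "/") + ".class"
    with zipfile.ZipFile(jar_path) as jar:
        return entry in jar.namelist()


CLASS_NAME = "org.apache.hadoop.fs.glusterfs.GlusterFileSystem"

# Demo with a stand-in jar; on a real node, pass the path of the plugin
# jar under $HADOOP_HOME/lib/ instead.
with tempfile.TemporaryDirectory() as tmp:
    demo_jar = os.path.join(tmp, "demo.jar")
    with zipfile.ZipFile(demo_jar, "w") as jar:
        jar.writestr(CLASS_NAME.replace(".", "/") + ".class", b"")
    print(jar_contains_class(demo_jar, CLASS_NAME))
```

If the class is present in the jar, the next thing to check is whether that jar is on the classpath of every Hadoop daemon, not only on the machine that submits the job.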

1 comment:

reznor, September 8, 2011, 15:16
Hi,

You need not give the full path in org.apache.hadoop.fs.glusterfs.GlusterFileSystem tag

fs.glusterfs.impl
org.apache.hadoop.fs.glusterfs.GlusterFileSystem
glusterfs-0.20.2-0.1.jar goes under $HADOOP_HOME/lib/

it should be

fs.glusterfs.impl
org.apache.hadoop.fs.glusterfs.GlusterFileSystem
Wednesday, September 7, 2011
