
1. Source : http://www.joinc.co.kr/modules/moniwiki/wiki.php/Site/System_management/ZFS

-------------------------------------------------------


1 ZFS

A file system developed by Sun to replace the traditional Unix file system, introduced in Solaris 10. As its feature list shows, it is hard to see it as just a file system; it is better viewed as a volume manager that also bundles in all the miscellaneous management functions. It is an extended file system that combines a regular file system with a logical volume manager, snapshots, copy-on-write clones, continuous integrity checking, automatic repair, and other features you might not even recognize at first glance. People who have used ZFS praise it as the greatest file system on earth, and it really is.

I started looking into it while searching for an all-round file system for a cloud environment: one with reliability, scalability, performance, and manageability.

Here is a summary of ZFS features.
  1. Data Integrity: the biggest characteristic that sets it apart from other file systems. It protects the user data on disk from bit rot, cosmic radiation, current spikes, bugs in disk firmware, ghost writes, and so on. Ext, XFS, JFS, ReiserFS, NTFS and the like also have features for protecting user data, but ZFS does a far better job.
  2. Storage Pool: like LVM, one or more devices can be aggregated and managed together. This logical storage pool is called a zpool and behaves like a block device. The underlying block devices can be combined at various levels: non-redundant (similar to RAID 0), mirror (similar to RAID 1), RAID-Z (similar to RAID 5), RAID-Z2 (similar to RAID 6), and so on (see the sketch after this list).
  3. Capacity : ZFS is a 128-bit file system, so for practical purposes there is no capacity limit. There is a reason it is called a petabyte-scale file system.
    • up to 2^48 entries in any individual directory
    • maximum file size of 16 exabytes (16 x 10^18 bytes)
    • maximum zpool size of 256 zettabytes (2^78 bytes)
    • up to 2^64 zpools per system
    • up to 2^64 file systems per zpool
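For reference, the pool layouts listed in item 2 map onto zpool commands roughly like the following. This is a hedged sketch: the pool name pool0 and the device names are made up for illustration, and each command is meant to be run on its own.
# zpool create pool0 sdb sdc                  (non-redundant, RAID 0-like)
# zpool create pool0 mirror sdb sdc           (mirror, RAID 1-like)
# zpool create pool0 raidz sdb sdc sdd        (RAID-Z, RAID 5-like)
# zpool create pool0 raidz2 sdb sdc sdd sde   (RAID-Z2, RAID 6-like)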

2 Linux and ZFS

Work is underway to port ZFS to Linux, in two directions.
  1. native ZFS
    a project to make the Linux kernel itself support ZFS
  2. zfs-fuse
    a project to support ZFS through FUSE
Neither is yet at a level where you can use it on Linux without worry. zfs-fuse has significant performance problems, so it is really only usable for getting a feel for what ZFS is like. Native ZFS is worth keeping an eye on; development seems steady, but it is hard to say when it will reach 1.0 and become something you can trust. A year ago it was at 0.6.x, and now (September 2013) it is still at 0.6.x.

2.1 zfs-fuse

  1. apt-get install zfs-fuse
  2. zpool create mypool /dev/sdb /dev/sdc /dev/sdd /dev/sde
  3. zpool status mypool
  4. zfs create mypool/testzfs
  5. Measure file system performance with Bonnie++ (see the example below)
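A Bonnie++ run against the new file system might look like this; a minimal sketch, assuming bonnie++ is installed and mypool/testzfs is mounted at its default mountpoint /mypool/testzfs:
# bonnie++ -d /mypool/testzfs -u root -s 4g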

3 Solaris and ZFS

3.1 Installing OpenSolaris

Solaris uses ZFS as its default file system, so I decided to install x86-based OpenSolaris and get some ZFS experience. I brought it up as a virtual machine using VirtualBox.
  • hypervisor : VirtualBox. Both VirtualBox and OpenSolaris are developed by Oracle, so I figured they would play well together.
  • OpenSolaris
Installation is simpler than Windows, so I will skip the install steps.

The last time I used Solaris was probably about nine years ago, around 2002. Back then it was the rather clunky (but somehow cool-looking) CDE desktop; now GNOME comes up. It looks nice.


3.2 zpool

The core of a logical volume manager is creating a single pool that spans devices. In ZFS this pool is called a zpool.

For the test I prepared two 2 GB SATA devices. I tried to check the devices with fdisk -l as I would on Linux, but it is not the fdisk I wanted. On Solaris the devices can be checked with format.
# format
AVAILABLE DISK SELECTIONS:
       0. c7d0 <DEFAULT cyl 2085 alt 2 hd 255 sec 63>
          /pci@0,0/pci-ide@1,1/ide@0/cmdk@0,0
       1. c9t0d0 <ATA-VBOX HARDDISK-1.0-2.00GB>
          /pci@0,0/pci8086,2829@d/disk@0,0
       2. c9t1d0 <ATA-VBOX HARDDISK-1.0-2.00GB>
          /pci@0,0/pci8086,2829@d/disk@1,0

I decided to combine c9t0d0 and c9t1d0 into a zpool named tank.
# zpool create tank c9t0d0 c9t1d0

Check that it was created properly.
# zpool list
NAME    SIZE   USED  AVAIL    CAP  HEALTH  ALTROOT
rpool  15.9G  3.80G  12.1G    23%  ONLINE  -
tank   3.97G   232K  3.97G     0%  ONLINE  -

Use zfs to check the details of the file systems.
# zfs list
NAME                         USED  AVAIL  REFER  MOUNTPOINT
rpool                       4.16G  11.5G  77.5K  /rpool
rpool/ROOT                  3.16G  11.5G    19K  legacy
rpool/ROOT/opensolaris      3.16G  11.5G  3.02G  /
rpool/dump                   511M  11.5G   511M  -
rpool/export                5.04M  11.5G    21K  /export
rpool/export/home           5.02M  11.5G    21K  /export/home
rpool/export/home/yundream     5M  11.5G     5M  /export/home/yundream
rpool/swap                   512M  11.8G   137M  -
tank                        74.5K  3.91G    19K  /tank

zfs can manage the pool as a directory-like hierarchy. I created three file systems under tank: music, movie, and source.
# zfs create tank/music
# zfs create tank/movie
# zfs create tank/source
# zfs list
NAME                         USED  AVAIL  REFER  MOUNTPOINT
...
tank                         154K  3.91G    23K  /tank
tank/movie                    19K  3.91G    19K  /tank/movie
tank/music                    19K  3.91G    19K  /tank/music
tank/source                   19K  3.91G    19K  /tank/source
The three additional file systems were created, but they all share the same volume. My goal is to set a quota for each file system, so I assigned a quota allowing /tank/movie to use 1 GB.
# zfs set quota=1g tank/movie
# zfs list | grep tank
tank                         154K  3.91G    23K  /tank
tank/movie                    19K  1024M    19K  /tank/movie
tank/music                    19K  3.91G    19K  /tank/music
tank/source                   19K  3.91G    19K  /tank/source

# df -h /tank/movie
Filesystem            Size  Used Avail Use% Mounted on
tank/movie            1.0G   19K  1.0G   1% /tank/movie

3.3 mirror, RAIDZ, RAIDZ2

ZFS supports three RAID levels: mirror, RAIDZ, and RAIDZ2. I added six 10 GB SATA disks and tested each mode (see the sketch after the list below).

RAIDZ and RAIDZ2 are very similar to RAID 5 and RAID 6 respectively:
  • RAIDZ : like RAID 5, block-based striping with a single distributed parity block.
  • RAIDZ2 : like RAID 6, block-based striping with two distributed parity blocks.
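With the six extra disks, each mode can be created roughly as follows, one layout at a time (a hedged sketch; the pool name tank2 and the c9t*d0 device names are hypothetical):
# zpool create tank2 mirror c9t2d0 c9t3d0
# zpool create tank2 raidz c9t2d0 c9t3d0 c9t4d0 c9t5d0 c9t6d0 c9t7d0
# zpool create tank2 raidz2 c9t2d0 c9t3d0 c9t4d0 c9t5d0 c9t6d0 c9t7d0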


4 Native ZFS on Linux

  • Native ZFS on Linux : from here on I plan to test zfs on Linux instead of OpenSolaris.

5 Nexenta

Nexenta is an OpenSolaris-based NAS/SAN appliance product. It provides HA (High Availability clustering), namespace clustering, and CLI and (web-based) GUI interfaces so that enterprises can build reliable storage systems on ZFS.

6 History

  1. Written : September 6, 2011
  2. Revision history
    1. Written : 2011/9/6
    2. 2013-08-31 : polished the text, added links


------------------------------------------------------------------------------------------------------------


Most blog posts about the zfs file system point to spl-0.6.0-rc6.tar.gz and zfs-0.6.0-rc6.tar.gz, but at the time I am writing this I downloaded 0.6.3. Linux kernel support differs from version to version.

Get the packages from the official ZFS on Linux site linked below.


http://zfsonlinux.org/


 


------------------------------------------------------------------------------------------------------------

2. Source : http://blog.lovetonight.net/m/post/164


ZFS on Centos6

마성민 | 2013/04/19 11:37 | Linux

To back up about 30 TB of data I set up a backup server, formatted it with XFS, and started copying the data over.

With roughly 100 million files, the XFS partition kept getting corrupted during the backup.

To make matters worse, a server migration suddenly added another 10 TB of data.

Ugh... do I have to rebuild the server? I wondered...

 

I decided to take this opportunity to move to ZFS.

The notes below are copied from what I had organized in OneNote.

 

[Basic environment]

  • OS : CentOS release 6.3 (Final) 64Bit
  • Kernel : 2.6.32-279.19.1.el6.x86_64

 

[Installation]

  • There are two ways to use ZFS on a Linux system, Native ZFS and ZFS-FUSE, and the available ZFS version differs depending on which method you use.

 

  1. First, install the packages that ZFS needs.

Shell > yum install kernel-devel zlib-devel libuuid-devel libblkid-devel libselinux-devel parted lsscsi

Native ZFS runs as a kernel module, so kernel-devel above must be installed.

 

  2. Download the SPL and ZFS packages.

They can be downloaded from http://www.zfsonlinux.org.

 

  3. Extract the downloaded packages, then build and install them.

# spl install

Shell > tar xvfz spl-0.6.0-rc13.tar.gz

Shell > cd spl-0.6.0-rc13

Shell > ./configure && make rpm

Shell > rpm -Uvh *.x86_64.rpm

Shell > cd ..

 

# zfs install

Shell > tar xvfz zfs-0.6.0-rc13.tar.gz

Shell > cd zfs-0.6.0-rc13

Shell > ./configure && make rpm

Shell> rpm -Uvh *.x86_64.rpm

Shell > cd ..

 

  4. Load the installed zfs kernel module.

Shell > modprobe zfs

Shell > lsmod | grep -i zfs

 

  5. Create the ZFS pool.

Here you decide whether the pool will use a single disk or combine several disks in a software-RAID-like layout. Pools are created with the zpool command, which has many options; see the official documentation for the details.

The examples below cover a single disk, a two-disk mirror, and a RAID 5-like layout with four or more disks.

# EX : zpool [command] [options] pool

# CREATE EX : zpool create [-f|-n] [-o property=value] [-m mountpoint] pool vdev

 

# Create a pool named backup from the /dev/sdb disk, with compression enabled by default.

Shell > zpool create -O compression=on backup /dev/sdb

 

# Create a pool named backup from /dev/sdb and /dev/sdc, configured as a RAID 1 mirror with compression enabled.

Shell > zpool create -O compression=on backup mirror /dev/sdb /dev/sdc

 

# Out of the five /dev/sd[b-f] disks, build the backup pool from four of them in a RAID 5-like raidz1 layout (one parity disk), configure the remaining disk as a hot spare, and mount the pool at the /backup_spool directory.

Shell > zpool create -m /backup_spool backup raidz1 /dev/sdb /dev/sdc /dev/sdd /dev/sde spare /dev/sdf

 

  6. Create ZFS file systems in the pool.

File systems are carved out of the pool's space, and each file system inherits the property options set on the pool.

# EX : zfs [command] [options]

# CREATE EX : zfs create [-p] [-o] filesystem

 

# Using space from the backup pool, create a file system named maildata with compression disabled.

Shell > zfs create -o compression=off backup/maildata

 

# Using space from the backup pool, create a file system named mysql with compression disabled, mounted at /data/mysql.

Shell > zfs create -o compression=off -o mountpoint=/data/mysql backup/mysql

 

# Using space from the backup pool, create a file system named user with compression enabled, a 500 GB quota, and mounted at /home/user.

Shell > zfs create -o compression=on -o mountpoint=/home/user -o quota=500G backup/user

 

After moving to ZFS, thanks to the compression option, roughly 8 TB of data reportedly ended up occupying only about 5.6 TB...
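The actual savings can be checked with the compressratio property; a minimal sketch in the same style as above:

Shell > zfs get compressratio backup

Shell > zfs get compressratio backup/user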

 

I love ZFS~!


------------------------------------------------------------------------------------------------------------






------------------------------------------------------------------------------------------------------------

Source : https://github.com/zfsonlinux/pkg-zfs/wiki/HOWTO-install-Ubuntu-to-a-Native-ZFS-Root-Filesystem


HOWTO install Ubuntu to a Native ZFS Root Filesystem


Note: These instructions were originally created for older version of Ubuntu (12.04). Some required resource (e.g. grub ppa) is not available on newer versions, thus can lead to errors. If you use Ubuntu 14.04 or newer, see this page instead, which also allows things like raidz root, boot from snapshot, zfs-only setup (no need for separate /boot/grub), and lz4 compression.


These instructions are for Ubuntu. The procedure for Debian, Mint, or other distributions in the DEB family is similar but not identical.

System Requirements

  • 64-bit Ubuntu Live CD. (Not the alternate installer, and not the 32-bit installer!)
  • AMD64 or EM64T compatible computer. (ie: x86-64)
  • 8GB disk storage available.
  • 2GB memory minimum.

Computers that have less than 2GB of memory run ZFS slowly. 4GB of memory is recommended for normal performance in basic workloads. 16GB of memory is the recommended minimum for deduplication. Enabling deduplication is a permanent change that cannot be easily reverted.

Recommended Version

  • Ubuntu 12.04 Precise Pangolin
  • spl-0.6.3
  • zfs-0.6.3

Step 1: Prepare The Install Environment

1.1 Start the Ubuntu LiveCD and open a terminal at the desktop.

1.2 Input these commands at the terminal prompt:

$ sudo -i
# apt-add-repository --yes ppa:zfs-native/stable
# apt-get update
# apt-get install debootstrap spl-dkms zfs-dkms ubuntu-zfs

1.3 Check that the ZFS filesystem is installed and available:

# modprobe zfs
# dmesg | grep ZFS:
ZFS: Loaded module v0.6.3-2~trusty, ZFS pool version 5000, ZFS filesystem version 5

Step 2: Disk Partitioning

This tutorial intentionally recommends MBR partitioning. GPT can be used instead, but beware of UEFI firmware bugs.

2.1 Run your favorite disk partitioner, like parted or cfdisk, on the primary storage device. /dev/disk/by-id/scsi-SATA_disk1 is the example device used in this document.

2.2 Create a small MBR primary partition of at least 8 megabytes. 256mb may be more realistic, unless space is tight. /dev/disk/by-id/scsi-SATA_disk1-part1 is the example boot partition used in this document.

2.3 On this first small partition, set type=BE and enable the bootable flag.

2.4 Create a large partition of at least 4 gigabytes. /dev/disk/by-id/scsi-SATA_disk1-part2 is the example system partition used in this document.

2.5 On this second large partition, set type=BF and disable the bootable flag.

The partition table should look like this:

# fdisk -l /dev/disk/by-id/scsi-SATA_disk1

Disk /dev/sda: 10.7 GB, 10737418240 bytes
255 heads, 63 sectors/track, 1305 cylinders
Units = cylinders of 16065 * 512 = 8225280 bytes
Sector size (logical/physical): 512 bytes / 512 bytes
I/O size (minimum/optimal): 512 bytes / 512 bytes
Disk identifier: 0x00000000

Device    Boot      Start         End      Blocks   Id  System
/dev/sda1    *          1           1        8001   be  Solaris boot
/dev/sda2               2        1305    10474380   bf  Solaris

Remember: Substitute scsi-SATA_disk1-part1 and scsi-SATA_disk1-part2 appropriately below.

Hints:

  • Are you doing this in a virtual machine? Is something in /dev/disk/by-id missing? Go read the troubleshooting section.
  • Recent GRUB releases assume that the /boot/grub/grubenv file is writable by the stage2 module. Until GRUB gets a ZFS write enhancement, the GRUB modules should be installed to a separate filesystem in a separate partition that is grub-writable.
  • If /boot/grub is in the ZFS filesystem, then GRUB will fail to boot with this message: error: sparse file not allowed. If you absolutely want only one filesystem, then remove the call to recordfail() in each grub.cfg menu stanza, and edit the /etc/grub.d/10_linux file to make the change permanent.
  • Alternatively, if /boot/grub is in the ZFS filesystem you can comment each line with the text save_env in the file /etc/grub.d/00_header and run update-grub.

Step 3: Disk Formatting

3.1 Format the small boot partition created by Step 2.2 as a filesystem that has stage1 GRUB support like this:

# mke2fs -m 0 -L /boot/grub -j /dev/disk/by-id/scsi-SATA_disk1-part1

3.2 Create the root pool on the larger partition:

# zpool create -o ashift=9 rpool /dev/disk/by-id/scsi-SATA_disk1-part2

Always use the long /dev/disk/by-id/* aliases with ZFS. Using the /dev/sd* device nodes directly can cause sporadic import failures, especially on systems that have more than one storage pool.

Warning: The grub2-1.99 package currently published in the PPA for Precise does not reliably handle a 4k block size, which is ashift=12.

Hints:

  • # ls -la /dev/disk/by-id will list the aliases.
  • The root pool can be a mirror. For example, zpool create -o ashift=9 rpool mirror /dev/disk/by-id/scsi-SATA_disk1-part2 /dev/disk/by-id/scsi-SATA_disk2-part2. Remember that the version and ashift matter for any pool that GRUB must read, and that these things are difficult to change after pool creation.
  • If you are using a mirror with a separate boot partition as described above, don't forget to edit the grub.cfg file on the second HD partition so that the "root=" partition refers to that partition on the second HD also; otherwise, if you lose the first disk, you won't be able to boot from the second because the kernel will keep trying to mount the root partition from the first disk.
  • The pool name is arbitrary. On systems that can automatically install to ZFS, the root pool is named "rpool" by default. Note that system recovery is easier if you choose a unique name instead of "rpool". Anything except "rpool" or "tank", like the hostname, would be a good choice.
  • If you want to create a mirror but only have one disk available now you can create the mirror using a sparse file as the second member then immediately off-line it so the mirror is in degraded mode. Later you can add another drive to the spool and ZFS will automatically sync them. The sparse file won't take up more than a few KB so it can be bigger than your running system. Just make sure to off-line the sparse file before writing to the pool.

3.2.1 Create a sparse file at least as big as the larger partition on your HDD:

# truncate -s 11g /tmp/sparsefile

3.2.2 Instead of the command in section 3.2 use this to create the mirror:

# zpool create -o ashift=9 rpool mirror /dev/disk/by-id/scsi-SATA_disk1-part2 /tmp/sparsefile

3.2.3 Offline the sparse file. You can delete it after this if you want.

# zpool offline rpool /tmp/sparsefile

3.2.4 Verify that the pool was created and is now degraded.

# zpool list
NAME       SIZE  ALLOC   FREE    CAP  DEDUP  HEALTH  ALTROOT
rpool     10.5G   188K  10.5G     0%  1.00x  DEGRADED  -

3.3 Create a "ROOT" filesystem in the root pool:

# zfs create rpool/ROOT

3.4 Create a descendant filesystem for the Ubuntu system:

# zfs create rpool/ROOT/ubuntu-1

On Solaris systems, the root filesystem is cloned and the suffix is incremented for major system changes through pkg image-update or beadm. Similar functionality for APT is possible but currently unimplemented.

3.5 Dismount all ZFS filesystems.

# zfs umount -a

3.6 Set the mountpoint property on the root filesystem:

# zfs set mountpoint=/ rpool/ROOT/ubuntu-1

3.7 Set the bootfs property on the root pool.

# zpool set bootfs=rpool/ROOT/ubuntu-1 rpool

The boot loader uses these two properties to find and start the operating system. These property names are not arbitrary.

Hint: Putting rpool=MyPool or bootfs=MyPool/ROOT/system-1 on the kernel command line overrides the ZFS properties.

3.9 Export the pool:

# zpool export rpool

Don't skip this step. The system is put into an inconsistent state if this command fails or if you reboot at this point.

Step 4: System Installation

Remember: Substitute "rpool" for the name chosen in Step 3.2.

4.1 Import the pool:

# zpool import -d /dev/disk/by-id -R /mnt rpool

If this fails with "cannot import 'rpool': no such pool available", you can try importing the pool without specifying the device directory, e.g.:

    # zpool import -R /mnt rpool

4.2 Mount the small boot filesystem for GRUB that was created in step 3.1:

# mkdir -p /mnt/boot/grub
# mount /dev/disk/by-id/scsi-SATA_disk1-part1 /mnt/boot/grub

4.4 Install the minimal system:

# debootstrap trusty /mnt

The debootstrap command leaves the new system in an unconfigured state. In Step 5, we will only do the minimum amount of configuration necessary to make the new system runnable.

Step 5: System Configuration

5.1 Copy these files from the LiveCD environment to the new system:

# cp /etc/hostname /mnt/etc/
# cp /etc/hosts /mnt/etc/

5.2 The /mnt/etc/fstab file should be empty except for a comment. Add this line to the /mnt/etc/fstab file:

/dev/disk/by-id/scsi-SATA_disk1-part1  /boot/grub  auto  defaults  0  1

The regular Ubuntu desktop installer may add dev, proc, sys, or tmp lines to the /etc/fstab file, but such entries are redundant on a system that has a /lib/init/fstab file. Add them now if you want them.

5.3 Edit the /mnt/etc/network/interfaces file so that it contains something like this:

# interfaces(5) file used by ifup(8) and ifdown(8)
auto lo
iface lo inet loopback

auto eth0
iface eth0 inet dhcp

Customize this file if the new system is not a DHCP client on the LAN.

5.4 Make virtual filesystems in the LiveCD environment visible to the new system and chroot into it:

# mount --bind /dev  /mnt/dev
# mount --bind /proc /mnt/proc
# mount --bind /sys  /mnt/sys
# chroot /mnt /bin/bash --login

5.5 Install PPA support in the chroot environment like this:

# locale-gen en_US.UTF-8
# apt-get update
# apt-get install ubuntu-minimal software-properties-common

Even if you prefer a non-English system language, always ensure that en_US.UTF-8 is available. The ubuntu-minimal package is required to use ZoL as packaged in the PPA.

5.6 Install ZFS in the chroot environment for the new system:

# apt-add-repository --yes ppa:zfs-native/stable
# apt-add-repository --yes ppa:zfs-native/grub
# apt-get update
# apt-get install --no-install-recommends linux-image-generic linux-headers-generic
# apt-get install ubuntu-zfs
# apt-get install grub2-common grub-pc
# apt-get install zfs-initramfs
# apt-get dist-upgrade

Warning: This is the second time that you must wait for the SPL and ZFS modules to compile. Do not try to skip this step by copying anything from the host environment into the chroot environment.

Note: This should install a kernel package and its headers, a patched mountall and dkms packages. Double-check that you are getting these packages from the PPA if you are deviating from these instructions in any way.

Choose /dev/disk/by-id/scsi-SATA_disk1 if prompted to install the MBR loader.

Ignore warnings that are caused by the chroot environment like:

  • Can not write log, openpty() failed (/dev/pts not mounted?)
  • df: Warning: cannot read table of mounted file systems
  • mtab is not present at /etc/mtab.

5.7 Set a root password on the new system:

# passwd root

Hint: If you want the ubuntu-desktop package, then install it after the first reboot. If you install it now, then it will start several processes that must be manually stopped before dismounting.

Step 6: GRUB Installation

Remember: All of Step 6 depends on Step 5.4 and must happen inside the chroot environment.

6.1 Verify that the ZFS root filesystem is recognized by GRUB:

# grub-probe /
zfs

And that the ZFS modules for GRUB are installed:

# ls /boot/grub/zfs*
/boot/grub/zfs.mod  /boot/grub/zfsinfo.mod

Note that after Ubuntu 13, these are now in /boot/grub/i386-pc/zfs*

# ls /boot/grub/i386-pc/zfs*
/boot/grub/i386-pc/zfs.mod  /boot/grub/i386-pc/zfsinfo.mod

Otherwise, check the troubleshooting notes for GRUB below.

6.2 Refresh the initrd files:

# update-initramfs -c -k all
update-initramfs: Generating /boot/initrd.img-3.2.0-40-generic

6.3 Update the boot configuration file:

# update-grub
Generating grub.cfg ...
Found linux image: /boot/vmlinuz-3.2.0-40-generic
Found initrd image: /boot/initrd.img-3.2.0-40-generic
done

Verify that boot=zfs appears in the boot configuration file:

# grep boot=zfs /boot/grub/grub.cfg
linux /ROOT/ubuntu-1/@/boot/vmlinuz-3.2.0-40-generic root=/dev/sda2 ro boot=zfs $bootfs quiet splash $vt_handoff
linux /ROOT/ubuntu-1/@/boot/vmlinuz-3.2.0-40-generic root=/dev/sda2 ro single nomodeset boot=zfs $bootfs

6.4 Install the boot loader to the MBR like this:

# grub-install $(readlink -f /dev/disk/by-id/scsi-SATA_disk1)
Installation finished. No error reported.

Do not reboot the computer until you get exactly that result message. Note that you are installing the loader to the whole disk, not a partition.

Note: The readlink is required because recent GRUB releases do not dereference symlinks.

Step 7: Cleanup and First Reboot

7.1 Exit from the chroot environment back to the LiveCD environment:

# exit

7.2 Run these commands in the LiveCD environment to dismount all filesystems:

# umount /mnt/boot/grub
# umount /mnt/dev
# umount /mnt/proc
# umount /mnt/sys
# zfs umount -a
# zpool export rpool

The zpool export command must succeed without being forced or the new system will fail to start.

7.3 We're done!

# reboot

Caveats and Known Problems

This is an experimental system configuration.

This document was first published in 2010 to demonstrate that the lzfs implementation made ZoL 0.5 feature complete. Upstream integration efforts began in 2012, and it will be at least a few more years before this kind of configuration is even minimally supported.

Gentoo, and its derivatives, are the only Linux distributions that are currently mainlining support for a ZoL root filesystem.

zpool.cache inconsistencies cause random pool import failures.

The /etc/zfs/zpool.cache file embedded in the initrd for each kernel image must be the same as the /etc/zfs/zpool.cache file in the regular system. Run update-initramfs -c -k all after any /sbin/zpool command changes the /etc/zfs/zpool.cache file.

Pools do not show up in /etc/zfs/zpool.cache when imported with the -R flag.

This will be a recurring problem until issue zfsonlinux/zfs#330 is resolved.

Every upgrade can break the system.

Ubuntu systems remove old dkms modules before installing new dkms modules. If the system crashes or restarts during a ZoL module upgrade, which is a failure window of several minutes, then the system becomes unbootable and must be rescued.

This will be a recurring problem until issue zfsonlinux/pkg-zfs#12 is resolved.

When doing an upgrade remotely, an extra precaution is to run it inside screen; that way, if you get disconnected, the installation will not be interrupted (see the example below).
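A minimal sketch of that precaution, assuming the screen package is available:

# apt-get install screen
# screen -S zol-upgrade
# apt-get dist-upgrade

If the connection drops, log back in and reattach with screen -r zol-upgrade.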

Troubleshooting

(i) MPT2SAS

Most problem reports for this tutorial involve mpt2sas hardware that does slow asynchronous drive initialization, like some IBM M1015 or OEM-branded cards that have been flashed to the reference LSI firmware.

The basic problem is that disks on these controllers are not visible to the Linux kernel until after the regular system is started, and ZoL does not hotplug pool members. See https://github.com/zfsonlinux/zfs/issues/330.

Most LSI cards are perfectly compatible with ZoL, but there is no known fix if your card has this glitch. Please use different equipment until the mpt2sas incompatibility is diagnosed and fixed, or donate an affected part if you want a solution sooner.

(ii) Areca

Systems that require the arcsas blob driver should add it to the /etc/initramfs-tools/modules file and run update-initramfs -c -k all.
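For example, a hedged sketch using the module name mentioned above:

# echo arcsas >> /etc/initramfs-tools/modules
# update-initramfs -c -k all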

Upgrade or downgrade the Areca driver if something like RIP: 0010:[<ffffffff8101b316>] [<ffffffff8101b316>] native_read_tsc+0x6/0x20 appears anywhere in kernel log. ZoL is unstable on systems that emit this error message.

(iii) GRUB Installation

Verify that the PPA for the ZFS enhanced GRUB is installed:

# apt-add-repository ppa:zfs-native/grub
# apt-get update

Reinstall the zfs-grub package, which is an alias for a patched grub-common package:

# apt-get install --reinstall zfs-grub

Afterwards, this should happen:

# apt-cache search zfs-grub
grub-common - GRand Unified Bootloader (common files)

# apt-cache show zfs-grub
N: Can't select versions from package 'zfs-grub' as it is purely virtual
N: No packages found

# apt-cache policy grub-common zfs-grub
grub-common:
 Installed: 1.99-21ubuntu3.9+zfs1~precise1
 Candidate: 1.99-21ubuntu3.9+zfs1~precise1
 Version table:
*** 1.99-21ubuntu3.9+zfs1~precise1 0
      1001 http://ppa.launchpad.net/zfs-native/grub/ubuntu/precise/main amd64 Packages
       100 /var/lib/dpkg/status
    1.99-21ubuntu3 0
      1001 http://us.archive.ubuntu.com/ubuntu/ precise/main amd64 Packages
zfs-grub:
 Installed: (none)
 Candidate: (none)
 Version table:

For safety, grub modules are never updated by the packaging system after initial installation. Manually refresh them by doing this:

# cp /usr/lib/grub/i386-pc/*.mod /boot/grub/

If the problem persists, then open a bug report and attach the entire output of those apt-get commands.

Packages in the GRUB PPA are compiled against the stable PPA. Systems that run the daily PPA may experience failures if the ZoL library interface changes.

Note that GRUB does not currently dereference symbolic links in a ZFS filesystem, so you cannot use the /vmlinux or /initrd.img symlinks as GRUB command arguments.

(iv) GRUB does not support ZFS Compression

If the /boot hierarchy is in ZFS, then that pool should not be compressed. The grub packages for Ubuntu are usually incapable of loading a kernel image or initrd from a compressed dataset.
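To check this on an existing pool, the compression property can be inspected and, if necessary, turned off for the dataset that holds the /boot hierarchy; a minimal sketch using the dataset names from this HOWTO:

# zfs get -r compression rpool
# zfs set compression=off rpool/ROOT/ubuntu-1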

(v) VMware

  • Set disk.EnableUUID = "TRUE" in the vmx file or vsphere configuration. Doing this ensures that /dev/disk aliases are created in the guest.

(vi) QEMU/KVM/XEN

  • In the /etc/default/grub file, enable the GRUB_TERMINAL=console line and remove the splash option from the GRUB_CMDLINE_LINUX_DEFAULT line. Plymouth can cause boot errors in these virtual environments that are difficult to diagnose.

  • Set a unique serial number on each virtual disk. (eg: -drive if=none,id=disk1,file=disk1.qcow2,serial=1234567890)

(vii) Kernel Parameters

The zfs-initramfs package requires that boot=zfs always be on the kernel command line. If the boot=zfs parameter is not set, then the init process skips the ZFS routine entirely. This behavior is for safety; it makes the casual installation of the zfs-initramfs package unlikely to break a working system.

ZFS properties can be overridden on the kernel command line with rpool and bootfs arguments. For example, at the GRUB prompt:

linux /ROOT/ubuntu-1/@/boot/vmlinuz-3.0.0-15-generic boot=zfs rpool=AltPool bootfs=AltPool/ROOT/foobar-3

(viii) System Recovery

If the system randomly fails to import the root filesystem pool, then do this at the initramfs recovery prompt:

# zpool export rpool
: now export all other pools too
# zpool import -d /dev/disk/by-id -f -N rpool
: now import all other pools too
# mount -t zfs -o zfsutil rpool/ROOT/ubuntu-1 /root
: do not mount any other filesystem
# cp /etc/zfs/zpool.cache /root/etc/zfs/zpool.cache
# exit

This refreshes the /etc/zfs/zpool.cache file. The zpool command emits spurious error messages regarding missing or corrupt vdevs if the zpool.cache file is stale or otherwise incorrect.



------------------------------------------------------------------------------------------------------------

Source : http://www.oracle.com/technetwork/articles/servers-storage-admin/howto-build-openstack-zfs-2248817.html



About OpenStack in Oracle Solaris 11


OpenStack, a popular open source project that provides cloud management infrastructure, is integrated into Oracle Solaris 11.2. OpenStack storage features include Cinder for block storage access (see Figure 1) and Swift for object storage that also provides redundancy and replication.

ZFS, a file system that integrates volume management features, provides a simple interface for managing large amounts of data. It has a robust set of data services and also supports a variety of storage protocols.

Cinder provisions a ZFS block device or a volume for your project (or tenant) instances. An Oracle Solaris Kernel Zone or a non-global zone is created and deployed for each project instance. After you create a deployable image of the zone and launch an instance of the zone image, Cinder allocates a ZFS volume to contain the instance's image as the guest's root device.

Oracle Solaris 11.2 provides additional Cinder drivers to provision the following devices:

  • iSCSI targets from a pool on a Cinder volume node
  • FC LUNs as block devices
  • iSCSI targets from an Oracle ZFS Storage Appliance

However, using these features is beyond the scope of this article.

A good way to get started with OpenStack is to run a small, all-in-one configuration where all OpenStack services are enabled, along with the Cinder volume service on the same system node, and to use ZFS as the back-end storage.

ZFS provides robust redundancy and doesn't need any special software or hardware arrays to provide data redundancy. ZFS is simple to configure and manage.

This article describes cloud storage practices for deploying a cloud infrastructure environment on Oracle Solaris and using Cinder to provide block storage devices as ZFS volumes on the same system.

This article does not describe how to set up OpenStack. For information on setting up OpenStack, see "Getting Started with OpenStack on Oracle Solaris 11.2."

Figure 1. Diagram of OpenStack Cinder Block Storage Service

OpenStack Block Storage Prerequisites and Deployment Process

The prerequisites require that Oracle Solaris 11.2 OpenStack already be running on a single SPARC or x86 system as the compute node that runs the primary OpenStack services and has multiple local or SAN-based devices.

The components of the configuration include the following:

  • Compute node (Nova): The system node where zones for tenant or project instances are managed, but zones are not installed as part of the process described in this article. The Cinder volume service runs on this node as well.
  • Volume service (Cinder): The location where the Cinder volume service allocates ZFS volumes for tenant or project instances, which is customized in this article.
  • User authorization (Keystone): Both admin and tenant user names and passwords must already have been created and can be provided to Keystone, the authentication service.

The following general steps describe how to customize a single system that runs OpenStack services and runs the Cinder volume service to deploy ZFS volumes. Data redundancy is configured and ZFS compression and encryption can also be added in this configuration.

The remaining sections of this article describe these steps in detail.

Create the ZFS Components

Oracle Solaris runs on a ZFS storage pool that is typically called rpool. This usually small pool is not an ideal environment for hosting a cloud infrastructure, because it contains the Oracle Solaris components that run the system.

A general recommendation is to keep your root pool (rpool) small and host your application, user, and cloud data in a separate pool. Mirrored pool configurations perform best for most workloads.

The following steps describe how to configure the components shown in Figure 2: a mirrored ZFS storage pool (tank), the primary file system (cinder), and the ZFS volumes that will contain the tenant (or project) cloud instances.

Figure 2. A Mirrored ZFS Storage Pool with File System Components

  1. Create a separate, mirrored ZFS storage pool that provides data redundancy and also configures two spares.

    The following example creates a mirrored storage pool called tank that contains two pairs of mirrored disks and two spare disks:

    # zpool create tank mirror c0t5000C500335F4C7Fd0 \
    c0t5000C500335F7DABd0 mirror c0t5000C500335FC6F3d0 \
    c0t5000C500336084C3d0 spare c0t5000C500335E2F03d0 \
    c0t50015179594B6F52d0
    

    Size the mirrored storage pool according to your estimated cloud data needs. You can always add another mirrored pair of devices to your mirrored ZFS storage pool if you need more space.

    For more information about ZFS administration syntax, see Managing ZFS File Systems in Oracle Solaris 11.2.

    Review your pool's current status:

    # zpool status tank
    

    Identify the pool's raw available space:

    # zpool list tank
    
  2. Create a ZFS file system:

    Note: If you want to use encryption to secure your cloud data, encryption must be enabled (as described below) when the ZFS file system is created. For more information about ZFS encryption and other encryption key methods besides being prompted for a passphrase, see Managing ZFS File Systems in Oracle Solaris 11.2.

    # zfs create tank/cinder
    

    Review the actual available space that is available to your file system:

    # zfs list -r tank/cinder
    

    Frequently review the space available for your ZFS volumes by monitoring the USED space and the AVAIL space.

    If you want to conserve disk space, enable compression on the tank/cinder file system. ZFS volumes that are allocated for project instances are automatically compressed.

    # zfs set compression=on tank/cinder
    

    If you want to secure your cloud data, consider enabling encryption.

    # zfs create -o encryption=on tank/cinder
    Enter passphrase for 'tank/cinder': xxxxxxxx
    Enter again: xxxxxxxx
    

Customize the Cinder Storage Location

  1. Modify the zfs_volume_base parameter in /etc/cinder/cinder.conf to identify an alternate pool/file-system.

    For example, change this line:

    # zfs_volume_base = rpool/cinder 
    

    To this:

    # zfs_volume_base = tank/cinder
    
  2. Refresh the Cinder volume services:

    # svcadm restart svc:/application/openstack/cinder/cinder-volume:setup
    # svcadm restart svc:/application/openstack/cinder/cinder-volume:default
    

Test Your Cinder Configuration

  1. Set the authentication environment variables:

    Cinder expects the following Keystone authorization parameters to be presented as options on the command line, or you can set them as environment variables.

    # export OS_AUTH_URL=http://localhost:5000/v2.0 
    # export OS_USERNAME=admin-user-name 
    # export OS_PASSWORD=password 
    # export OS_TENANT_NAME=tenant-user-name 
    
  2. Create a 1-GB test volume:

    # cinder create --display_name test 1
    +---------------------+--------------------------------------+
    |       Property      |                Value                 |
    +---------------------+--------------------------------------+
    |     attachments     |                  []                  |
    |  availability_zone  |                 nova                 |
    |       bootable      |                false                 |
    |      created_at     |      2014-07-17T20:19:33.423744      |
    | display_description |                 None                 |
    |     display_name    |                 test                 |
    |          id         | 258d80e9-2ef3-eab8-fbea-96a4d176360d |
    |       metadata      |                  {}                  |
    |         size        |                  1                   |
    |     snapshot_id     |                 None                 |
    |     source_volid    |                 None                 |
    |        status       |               creating               |
    |     volume_type     |                 None                 |
    +---------------------+--------------------------------------+
    
  3. After you create a Cinder volume, confirm that it is created and the space is consumed:

    # zfs list -r tank/cinder
    NAME                             USED  AVAIL REFER MOUNTPOINT
    tank/cinder                      1.03G 547G  31K  /tank/cinder
    tank/cinder/volume-258d80e9-...  1.03G  546G 16K  -
    

    You can also confirm that the volume is visible from OpenStack's Horizon interface. When you launch a project instance through Horizon, a new Cinder volume is created automatically, so you can remove the test volume from the Horizon->volume menu by using the Delete Volume feature.
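    The test volume can also be removed from the command line instead of through Horizon; a minimal sketch (cinder delete accepts the volume name or ID reported by cinder create):

    # cinder delete test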

(Optional) Perform Additional Cinder Configuration Customization

The following are additional customizations you can do:

  • Monitor your ZFS pool for disk failures by setting up smtp-notify alert notifications.
  • Use ZFS snapshots to replicate ZFS volumes (see the sketch below).
  • Create a separate archive pool with ZFS compression enabled to reduce the storage footprint when archiving tenant data.

For more information, see Managing ZFS File Systems in Oracle Solaris 11.2.
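As an illustration of the snapshot-based replication item above, a minimal sketch (the snapshot name, the archive pool, and the backuphost machine are hypothetical):

# zfs snapshot tank/cinder/volume-258d80e9-2ef3-eab8-fbea-96a4d176360d@backup1
# zfs send tank/cinder/volume-258d80e9-2ef3-eab8-fbea-96a4d176360d@backup1 | ssh backuphost zfs receive archive/volume-258d80e9-2ef3-eab8-fbea-96a4d176360d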

Summary

A redundant ZFS storage pool that is serving Cinder volumes for OpenStack can be hosted on any local or SAN storage to provide cloud data protection. You can also apply robust data services, such as encryption for data security or compression for data reduction, to your cloud storage.

See Also

The ZFS blog.



------------------------------------------------------------------------------------------------------------




------------------------------------------------------------------------------------------------------------

Source : http://prefetch.net/blog/index.php/2012/02/13/installing-zfs-on-a-centos-6-linux-server/


Installing ZFS on a CentOS 6 Linux server

As most of my long term readers know I am a huge Solaris fan. How can you not love an operating system that comes with ZFS, DTrace, Zones, FMA and Network Virtualization amongst other things? I use Linux during my day job, and I’ve been hoping for quite some time that Oracle would port one or more of these technologies to Linux. Well the first salvo has been fired, though it wasn’t from Oracle. It comes by way of the ZFS on Linux project, which is an in-kernel implementation of ZFS (this project is different from the FUSE ZFS port).

I had some free time this weekend to play around with ZFS on Linux, and my initial impressions are quite positive. The port on Linux is based on the latest version of ZFS that is part of OpenSolaris (version 28), so things like snapshots, de-duplication, improved performance and ZFS send and recv are available out of the box. There are a few missing items, but from what I can tell from the documentation there is plenty more coming.

The ZFS file system for Linux comes as source code, which you build into loadable kernel modules (this is how they get around the license incompatibilities). The implementation also contains the userland utilities (zfs, zpool, etc.) most Solaris admins are used to, and they act just like their Solaris counterparts! Nice!

My testing occurred on a CentOS 6 machine, specifically 6.2:

$ cat /etc/redhat-release
CentOS release 6.2 (Final)

The build process is quite easy. Prior to compiling source code you will need to install a few dependencies:

$ yum install kernel-devel zlib-devel libuuid-devel libblkid-devel libselinux-devel parted lsscsi

Once these are installed you can retrieve and build spl and zfs packages:

$ wget http://github.com/downloads/zfsonlinux/spl/spl-0.6.0-rc6.tar.gz

$ tar xfvz spl-0.6.0-rc6.tar.gz && cd spl*6

$ ./configure && make rpm

$ rpm -Uvh *.x86_64.rpm

Preparing...                ########################################### [100%]
   1:spl-modules-devel      ########################################### [ 33%]
   2:spl-modules            ########################################### [ 67%]
   3:spl                    ########################################### [100%]

$ wget http://github.com/downloads/zfsonlinux/zfs/zfs-0.6.0-rc6.tar.gz

$ tar xfvz zfs-0.6.0-rc6.tar.gz && cd zfs*6

$ ./configure && make rpm

$ rpm -Uvh *.x86_64.rpm

Preparing...                ########################################### [100%]
   1:zfs-test               ########################################### [ 17%]
   2:zfs-modules-devel      ########################################### [ 33%]
   3:zfs-modules            ########################################### [ 50%]
   4:zfs-dracut             ########################################### [ 67%]
   5:zfs-devel              ########################################### [ 83%]
   6:zfs                    ########################################### [100%]

If everything went as planned you now have the ZFS kernel modules and userland utilities installed! To begin using ZFS you will first need to load the kernel modules with modprobe:

$ modprobe zfs

To verify the module loaded you can tail /var/log/messages:

Feb 12 17:54:27 centos6 kernel: SPL: Loaded module v0.6.0, using hostid 0x00000000
Feb 12 17:54:27 centos6 kernel: zunicode: module license 'CDDL' taints kernel.
Feb 12 17:54:27 centos6 kernel: Disabling lock debugging due to kernel taint
Feb 12 17:54:27 centos6 kernel: ZFS: Loaded module v0.6.0, ZFS pool version 28, ZFS filesystem version 5

And run lsmod to verify they are there:

$ lsmod | grep -i zfs

zfs                  1038053  0 
zcommon                42478  1 zfs
znvpair                47487  2 zfs,zcommon
zavl                    6925  1 zfs
zunicode              323120  1 zfs
spl                   210887  5 zfs,zcommon,znvpair,zavl,zunicode

To create our first pool we can use the zpool utilities create option:

$ zpool create mysqlpool mirror sdb sdc

The example above created a mirrored pool out of the sdb and sdc block devices. We can see this layout in the output of `zpool status`:

$ zpool status -v

  pool: mysqlpool
 state: ONLINE
 scan: none requested
config:

	NAME        STATE     READ WRITE CKSUM
	mysqlpool   ONLINE       0     0     0
	  mirror-0  ONLINE       0     0     0
	    sdb     ONLINE       0     0     0
	    sdc     ONLINE       0     0     0

errors: No known data errors

Awesome! Since we are at pool version 28 lets disable atime updates and enable compression and deduplication:

$ zfs set compression=on mysqlpool

$ zfs set dedup=on mysqlpool

$ zfs set atime=off mysqlpool

For a somewhat real world test, I stopped one of my MySQL slaves, mounted the pool on /var/lib/mysql, synchronized the previous data over to the ZFS file system and then started MySQL. No errors to report, and MySQL is working just fine. Next up, I trash one side of the mirror and verified that resilvering works:
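The migration steps themselves are not shown above; a minimal sketch, assuming the old MySQL data directory had been moved aside to /var/lib/mysql.orig beforehand, might look like:

$ zfs set mountpoint=/var/lib/mysql mysqlpool

$ rsync -a /var/lib/mysql.orig/ /var/lib/mysql/

$ service mysqld start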

$ dd if=/dev/zero of=/dev/sdb

$ zpool scrub mysqlpool

I let this run for a few minutes then ran `zpool status` to verify the scrub fixed everything:

$ zpool status -v

  pool: mysqlpool
 state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
	attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
	using 'zpool clear' or replace the device with 'zpool replace'.
   see: http://www.sun.com/msg/ZFS-8000-9P
 scan: scrub repaired 966K in 0h0m with 0 errors on Sun Feb 12 18:54:51 2012
config:

	NAME        STATE     READ WRITE CKSUM
	mysqlpool   ONLINE       0     0     0
	  mirror-0  ONLINE       0     0     0
	    sdb     ONLINE       0     0   175
	    sdc     ONLINE       0     0     0

I beat on the pool pretty good and didn’t encounter any hangs or kernel oopses. The file systems port is still in its infancy, so I won’t be trusting it with production data quite yet. Hopefully it will mature in the coming months, and if we’re lucky maybe one of the major distributions will begin including it! That would be killer!!



------------------------------------------------------------------------------------------------------------


------------------------------------------------------------------------------------------------------------

Source : https://wiki.archlinux.org/index.php/Installing_Arch_Linux_on_ZFS



Installing Arch Linux on ZFS


This article details the steps required to install Arch Linux onto a root ZFS filesystem. This article supplements the Beginners' guide.

Installation

See ZFS#Installation for installing the ZFS packages. If installing Arch Linux onto ZFS from the archiso, it would be easier to use the demz-repo-archiso repository.

Embedding archzfs into archiso

See ZFS article.

Partition the destination drive

Review Beginners' guide#Prepare_the_storage_drive for information on determining the partition table type to use for ZFS. ZFS supports GPT and MBR partition tables.

ZFS manages its own partitions, so only a basic partition table scheme is required. The partition that will contain the ZFS filesystem should be of the type bf00, or "Solaris Root".

Partition scheme

Here is an example, using MBR, of a basic partition scheme that could be employed for your ZFS root setup:

Part     Size   Type
----     ----   -------------------------
   1     512M   Ext boot partition (8300)
   2     XXXG   Solaris Root (bf00)

Here is an example using GPT. The BIOS boot partition contains the bootloader.

Part     Size   Type
----     ----   -------------------------
   1       2M   BIOS boot partition (ef02)
   1     512M   Ext boot partition (8300)
   2     XXXG   Solaris Root (bf00)

An additional partition may be required depending on your hardware and chosen bootloader. Consult Beginners' guide#Install_and_configure_a_bootloader for more info.

Tip: Bootloaders with support for ZFS are described in #Install and configure the bootloader.
Warning: Several GRUB bugs (bug #42861, zfsonlinux/grub/issues/5) prevent or complicate installing it on ZFS partitions, use of a separate boot partition is recommended

Format the destination disk

Format the boot partition as well as any other system partitions. Do not do anything to the Solaris partition nor to the BIOS boot partition. ZFS will manage the first, and your bootloader the second.
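For example, with the MBR layout above, the ext boot partition could be formatted like this (a hedged sketch; substitute your actual boot partition device):

# mkfs.ext4 -L boot /dev/sda1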

Setup the ZFS filesystem

First, make sure the ZFS modules are loaded,

# modprobe zfs

Create the root zpool

# zpool create zroot /dev/disk/by-id/id-to-partition
Warning: Always use id names when working with ZFS, otherwise import errors will occur.

Create necessary filesystems

If so desired, sub-filesystem mount points such as /home and /root can be created with the following commands:

# zfs create zroot/home -o mountpoint=/home
# zfs create zroot/root -o mountpoint=/root

Note that if you want to use other datasets for system directories (/var or /etc included) your system will not boot unless they are listed in /etc/fstab! We will address that at the appropriate time in this tutorial.

Swap partition

See ZFS#Swap volume.

Configure the root filesystem

First, set the mount point of the root filesystem:

# zfs set mountpoint=/ zroot

and optionally, any sub-filesystems:

# zfs set mountpoint=/home zroot/home
# zfs set mountpoint=/root zroot/root

and if you have separate datasets for system directories (e.g. /var or /usr)

# zfs set mountpoint=legacy zroot/usr
# zfs set mountpoint=legacy zroot/var

and put them in /etc/fstab

/etc/fstab
# <file system>        <dir>         <type>    <options>             <dump> <pass>
zroot/usr              /usr          zfs       defaults,noatime      0      0
zroot/var              /var          zfs       defaults,noatime      0      0

Set the bootfs property on the descendant root filesystem so the boot loader knows where to find the operating system.

# zpool set bootfs=zroot zroot

Export the pool,

# zpool export zroot
Warning: Do not skip this, otherwise you will be required to use -f when importing your pools. This unloads the imported pool.
Note: This might fail if you added a swap partition above. Need to turn it off with the swapoff command.

Finally, re-import the pool,

# zpool import -d /dev/disk/by-id -R /mnt zroot
Note: -d is not the actual device id, but the /dev/by-id directory containing the symbolic links.

If there is an error in this step, you can export the pool to redo the command. The ZFS filesystem is now ready to use.

Be sure to bring the zpool.cache file into your new system. This is required later for the ZFS daemon to start.

# cp /etc/zfs/zpool.cache /mnt/etc/zfs/zpool.cache

if you don't have /etc/zfs/zpool.cache, create it:

# zpool set cachefile=/etc/zfs/zpool.cache zroot

Install and configure Arch Linux

Follow the following steps using the Beginners' guide. It will be noted where special consideration must be taken for ZFSonLinux.

  • First mount any boot or system partitions using the mount command.
  • Install the base system.
  • The procedure described in Beginners' guide#Generate an fstab is usually overkill for ZFS. ZFS usually auto mounts its own partitions, so we do not need ZFS partitions in fstab file, unless the user made datasets of system directories. To generate the fstab for filesystems, use:
# genfstab -U -p /mnt | grep boot >> /mnt/etc/fstab
  • Edit the /etc/fstab:
Note:
  • If you chose to create datasets for system directories, keep them in this fstab! Comment out the lines for the /, /root, and /home mountpoints, rather than deleting them. You may need those UUIDs later if something goes wrong.
  • Anyone who just stuck with the guide's directions can delete everything except for the swap file and the boot/EFI partition. It seems convention to replace the swap's uuid with /dev/zvol/zroot/swap.
  • When creating the initial ramdisk, first edit /etc/mkinitcpio.conf and add zfs before filesystems. Also, move keyboard hook before zfs so you can type in console if something goes wrong. You may also remove fsck (if you are not using Ext3 or Ext4). Your HOOKS line should look something like this:
HOOKS="base udev autodetect modconf block keyboard zfs filesystems"
  • Regenerate the initramfs with the command:
# mkinitcpio -p linux

Install and configure the bootloader

For BIOS motherboards

Follow GRUB#BIOS_systems_2 to install GRUB onto your disk. grub-mkconfig does not properly detect the ZFS filesystem, so it is necessary to edit grub.cfg manually:

/boot/grub/grub.cfg
set timeout=2
set default=0

# (0) Arch Linux
menuentry "Arch Linux" {
    set root=(hd0,msdos1)
    linux /vmlinuz-linux zfs=zroot rw
    initrd /initramfs-linux.img
}

If you did not create a separate /boot partition, the kernel and initrd paths have to be in the following format:

 /dataset/@/actual/path  

Example:

   linux /@/boot/vmlinuz-linux zfs=zroot rw
   initrd /@/boot/initramfs-linux.img

For UEFI motherboards

Use EFISTUB and rEFInd for the UEFI boot loader. See Beginners' guide#For UEFI motherboards. The kernel parameters in refind_linux.conf for ZFS should include zfs=bootfs or zfs=zroot so the system can boot from ZFS. The root and rootfstype parameters are not needed.
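A matching refind_linux.conf entry might look like the following; a hedged sketch where the menu label is arbitrary and the pool name must match your setup:

"Boot Arch Linux on ZFS"  "zfs=zroot rw"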

Unmount and restart

We are almost done!

# exit
# umount /mnt/boot
# zfs umount -a
# zpool export zroot

Now reboot.

Warning: If you do not properly export the zpool, the pool will refuse to import in the ramdisk environment and you will be stuck at the busybox terminal.

After the first boot

If everything went fine up to this point, your system will boot. Once. For your system to be able to reboot without issues, you need to enable the zfs.target to auto mount the pools and set the hostid.

For each pool you want automatically mounted execute:

# zpool set cachefile=/etc/zfs/zpool.cache <pool>

Enable the target with systemd:

# systemctl enable zfs.target

When running ZFS on root, the machine's hostid will not be available at the time of mounting the root filesystem. There are two solutions to this. You can either place your spl hostid in the kernel parameters in your boot loader, for example by adding spl.spl_hostid=0x00bab10c (use the hostid command to get your number).

The other, and suggested, solution is to make sure that there is a hostid in /etc/hostid and then regenerate the initramfs image, which will copy the hostid into the initramfs image. To write the hostid file safely you need to use a small C program:

#include <stdio.h>
#include <errno.h>
#include <unistd.h>

int main() {
    int res;
    res = sethostid(gethostid());
    if (res != 0) {
        switch (errno) {
            case EACCES:
            fprintf(stderr, "Error! No permission to write the"
                         " file used to store the host ID.\n"
                         "Are you root?\n");
            break;
            case EPERM:
            fprintf(stderr, "Error! The calling process's effective"
                            " user or group ID is not the same as"
                            " its corresponding real ID.\n");
            break;
            default:
            fprintf(stderr, "Unknown error.\n");
        }
        return 1;
    }
    return 0;
}

Copy it, save it as writehostid.c and compile it with gcc -o writehostid writehostid.c, finally execute it and regenerate the initramfs image:

# ./writehostid
# mkinitcpio -p linux

You can now delete the two files writehostid.c and writehostid. Your system should work and reboot properly now. 


------------------------------------------------------------------------------------------------------------





------------------------------------------------------------------------------------------------------------

Source : http://blog.boxcorea.com/wp/archives/129



My E450, which I had been using for a long time, developed problems, and while repairing it I replaced the solaris9 it had been running with a fresh solaris10 install. I was going to set it up with DiskSuite, but I remembered a video I had seen long ago and decided to configure it with zfs instead.

The concept is very simple, and using it is simpler than setting up metadb. But I have not fully sorted out the concepts yet...

There are six disks in total: three 9GB and three 18GB. One 9GB disk holds the OS, and the rest are unused. The disks are as follows.

bash-3.00# format
Searching for disks…done
AVAILABLE DISK SELECTIONS:
0. c0t0d0 <SUN9.0G cyl 4924 alt 2 hd 27 sec 133>
/pci@1f,4000/scsi@3/sd@0,0
1. c0t1d0 <FUJITSU-MAE3091L SUN9.0G-0706-8.43GB>
/pci@1f,4000/scsi@3/sd@1,0
2. c0t2d0 <IBM-DDRS39130SUN9.0G-S98E-8.43GB>
/pci@1f,4000/scsi@3/sd@2,0
3. c2t0d0 <FUJITSU-MAG3182L SUN18G-1111-16.87GB>
/pci@1f,4000/scsi@4/sd@0,0
4. c3t2d0 <SEAGATE-ST318203LSUN18G-034A-16.87GB>
/pci@1f,4000/scsi@4,1/sd@2,0
5. c3t3d0 <SEAGATE-ST318203LC-0002-16.96GB>
/pci@1f,4000/scsi@4,1/sd@3,0
Specify disk (enter its number):

First, use zpool to create a disk pool (I named it fox_pool) from the three 18GB disks.

#zpool create fox_pool c2t0d0 c3t2d0 c3t3d0

#zpool list
NAME                    SIZE    USED   AVAIL    CAP  HEALTH     ALTROOT
fox_pool               58.8G    102K   58.7G     0%  ONLINE     -

I added one more disk, a 9GB one.

 #zpool add -f fox_pool c0t2d0

Whereas the old DiskSuite was picky about disk sizes, zfs happily accepts disks of different sizes. Checking the status:

# zpool status
pool: fox_pool
state: ONLINE
scrub: none requested
config:

NAME        STATE     READ WRITE CKSUM
fox_pool    ONLINE       0     0     0
  c2t0d0    ONLINE       0     0     0
  c3t2d0    ONLINE       0     0     0
  c3t3d0    ONLINE       0     0     0
  c0t2d0    ONLINE       0     0     0

errors: No known data errors

When a pool is created like this, it is automatically mounted under / with the pool's name and is ready to use. I did not like that at all, so I destroyed the pool with zpool destroy and created it again (because I did not know how to change where it was mounted... ;--)

#zpool create -f -m /export/home fox_pool c2t0d0 c3t2d0 c3t3d0 c0t2d0

Then again, I am not sure this was really necessary, because zfs itself can mount file systems at /export/home. In other words, it seems a single disk pool can be mounted at several different directories. So I created a zfs file system for the Oracle software and tried mounting it.

 #zfs create fox_pool/oracle mount /oracle  —> error

#zfs create fox_pool/oracle

# zfs list
NAME              USED  AVAIL  REFER  MOUNTPOINT
fox_pool          130K  57.8G    31K  /export/home
fox_pool/oracle  24.5K  57.8G  24.5K  /export/home/oracle

It was created fine, but there is a problem: I want the oracle file system mounted at /oracle, not under /export/home.

# zfs destroy fox_pool/oracle
# zfs list
NAME       USED  AVAIL  REFER  MOUNTPOINT
fox_pool  99.5K  57.8G    30K  /export/home

So I just destroyed it.

At this point I was not sure what to do...... Well, I found the answer:

#zfs create fox_pool/oracle
#zfs set mountpoint=/oracle   fox_pool/oracle

Changing the mount point solved it.
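As a side note, the mount point can apparently also be set in a single step at creation time: zfs create accepts -o property=value on reasonably recent Solaris 10 / OpenSolaris releases. A sketch, not verified on this particular box:

# zfs create -o mountpoint=/oracle fox_pool/oracle

That would avoid the create-then-set two-step above.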

Now, I deliberately caused an error and tried to recover from it.

#dd if=/dev/urandom of=/dev/c3t3d0s0 bs=1024 count=10000

Slice s0 appears to be the slice where the disk's data is recorded; I filled it with garbage values.

#zpool scrub fox_pool

This error could not be repaired, and in fact this wasn't what I was after anyway. After several rounds of trial and error, I figured out that if you give zpool create no options at all, the disks are simply striped together; you need to pass the mirror option or, most importantly for me, the raidz option (a mirror would be created as sketched below). What I really wanted was RAID 5. A mirror is nice too, but you only get to use one disk's worth of capacity out of every two.
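For reference, the mirror layout mentioned above would be created along these lines; the two 18GB disks are used here purely as an example:

# zpool create fox_pool mirror c2t0d0 c3t2d0

With a mirror, losing either disk leaves the data intact, at the cost of half the raw capacity.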

Anyway, I set the disks up as RAID-Z. One difference from RAID 5 is that it can be built with as few as two disks, although in that case it doesn't seem much different from a mirror.

# zpool create fox_pool raidz c2t0d0 c3t2d0
# zpool status
pool: fox_pool
state: ONLINE
scrub: none requested
config:

NAME        STATE     READ WRITE CKSUM
fox_pool    ONLINE       0     0     0
  raidz1    ONLINE       0     0     0
    c2t0d0  ONLINE       0     0     0
    c3t2d0  ONLINE       0     0     0

errors: No known data errors
# df -h

fox_pool                16G    24K    16G     1%    /fox_pool

# zpool list
NAME                    SIZE    USED   AVAIL    CAP  HEALTH     ALTROOT
fox_pool               33.5G    178K   33.5G     0%  ONLINE     -

Next I injected an error into the two-disk pool again; this time I wrote garbage values to the second disk, c3t2d0s0.

# zpool scrub fox_pool
# zpool status
pool: fox_pool
state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
using 'zpool clear' or replace the device with 'zpool replace'.
see: http://www.sun.com/msg/ZFS-8000-9P
scrub: scrub completed with 0 errors on Tue May 27 22:00:49 2008
config:

NAME        STATE     READ WRITE CKSUM
fox_pool    ONLINE       0     0     0
  raidz1    ONLINE       0     0     0
    c2t0d0  ONLINE       0     0     0
    c3t2d0  ONLINE       0     0    27

errors: No known data errors

There are two ways to deal with the error: clear it or replace the device. Since I had one more disk available, I replaced it.

#zpool replace fox_pool c3t2d0 c3t3d0
# zpool status
pool: fox_pool
state: ONLINE
scrub: resilver completed with 0 errors on Tue May 27 22:02:22 2008
config:

NAME           STATE     READ WRITE CKSUM
fox_pool       ONLINE       0     0     0
  raidz1       ONLINE       0     0     0
    c2t0d0     ONLINE       0     0     0
    replacing  ONLINE       0     0     0
      c3t2d0   ONLINE       0     0    27
      c3t3d0   ONLINE       0     0     0

errors: No known data errors

Checking again a little later, you can see that the disk has been swapped out.

# zpool status
pool: fox_pool
state: ONLINE
scrub: scrub completed with 0 errors on Tue May 27 22:09:20 2008
config:

NAME        STATE     READ WRITE CKSUM
fox_pool    ONLINE       0     0     0
  raidz1    ONLINE       0     0     0
    c2t0d0  ONLINE       0     0     0
    c3t3d0  ONLINE       0     0     0

errors: No known data errors

 

To add the leftover disk, c3t2d0, to fox_pool, I used the zpool add command.

The result was disappointing. The disk was not merged into the raidz; instead it ended up striped alongside the existing raidz1 vdev. I did try passing the raidz option to zpool add as well, but that didn't work either. (The usual way to grow a raidz pool is sketched after the status output below.)

# zpool status
pool: fox_pool
state: ONLINE
scrub: none requested
config:

NAME        STATE     READ WRITE CKSUM
fox_pool    ONLINE       0     0     0
  raidz1    ONLINE       0     0     0
    c2t0d0  ONLINE       0     0     0
    c3t3d0  ONLINE       0     0     0
  c3t2d0    ONLINE       0     0     0

errors: No known data errors
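As far as I can tell, the usual way to grow a raidz-backed pool is not to attach a single disk to the existing raidz vdev but to add a whole new raidz vdev of two or more disks. A sketch, assuming the two spare 9GB disks c0t1d0 and c0t2d0 were used (zpool may still demand -f here because of the mismatched single-disk vdev already in the pool):

# zpool add fox_pool raidz c0t1d0 c0t2d0

The pool then stripes across the two raidz vdevs, and each vdev keeps its own parity.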

So I created fox_pool again with three disks, once more injected an error, and tested.

# zpool scrub fox_pool
# zpool status
pool: fox_pool
state: ONLINE
status: One or more devices has experienced an unrecoverable error.  An
attempt was made to correct the error.  Applications are unaffected.
action: Determine if the device needs to be replaced, and clear the errors
using 'zpool clear' or replace the device with 'zpool replace'.
see: http://www.sun.com/msg/ZFS-8000-9P
scrub: scrub completed with 0 errors on Tue May 27 21:44:42 2008
config:

NAME        STATE     READ WRITE CKSUM
fox_pool    ONLINE       0     0     0
  raidz1    ONLINE       0     0     0
    c2t0d0  ONLINE       0     0     0
    c3t2d0  ONLINE       0     0     0
    c3t3d0  ONLINE       0     0    38

errors: No known data errors

I caused the error the same way as before; c3t3d0 now shows a checksum error count of 38. This can be fixed with the command below.

bash-3.00# zpool clear fox_pool c3t3d0
bash-3.00# zpool status
pool: fox_pool
state: ONLINE
scrub: scrub completed with 0 errors on Tue May 27 21:44:42 2008
config:

NAME        STATE     READ WRITE CKSUM
fox_pool    ONLINE       0     0     0
  raidz1    ONLINE       0     0     0
    c2t0d0  ONLINE       0     0     0
    c3t2d0  ONLINE       0     0     0
    c3t3d0  ONLINE       0     0     0

errors: No known data errors

Next I tried snapshots. A snapshot seems to capture the data only as of the moment it is taken; quite literally, it appears to make a backup of the filesystem as it was at that point in time.

I created a ZFS filesystem in fox_pool, put three files in it (test.txt, last.txt, words), and then took a snapshot.
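The exact setup commands were not recorded; it was presumably something along these lines, where the file sources are only illustrative guesses (/usr/share/lib/dict/words is the stock Solaris word list, which matches the 206663-byte words file in the listing below):

# zfs create fox_pool/home
# cd /fox_pool/home
# cp /usr/share/lib/dict/words words
# last > last.txt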

# ls -al
total 576
drwxr-xr-x   2 root     sys            5 May 27 22:26 .
drwxr-xr-x   3 root     sys            3 May 27 22:25 ..
-rw-r--r--   1 root     root        7105 May 27 22:26 last.txt
-rw-r--r--   1 root     root       16566 May 27 22:26 test.txt
-r--r--r--   1 root     root      206663 May 27 22:26 words

# zfs snapshot fox_pool/home@snap1
bash-3.00# zfs list
NAME                  USED  AVAIL  REFER  MOUNTPOINT
fox_pool              424K  33.1G  35.3K  /fox_pool
fox_pool/home         316K  33.1G   316K  /fox_pool/home
fox_pool/home@snap1      0      -   316K  -

# rm words
# ls -al
total 58
drwxr-xr-x   2 root     sys            4 May 27 22:37 .
drwxr-xr-x   3 root     sys            3 May 27 22:25 ..
-rw-r--r--   1 root     root        7105 May 27 22:26 last.txt
-rw-r--r--   1 root     root       16566 May 27 22:26 test.txt
bash-3.00# zfs snapshot fox_pool/home@snap2
bash-3.00# zfs list
NAME                  USED  AVAIL  REFER  MOUNTPOINT
fox_pool              467K  33.1G  35.3K  /fox_pool
fox_pool/home         348K  33.1G  57.9K  /fox_pool/home
fox_pool/home@snap1   290K      -   316K  -
fox_pool/home@snap2      0      -  57.9K  -

The snapshots are stored under /fox_pool/home/.zfs/snapshot/snap1 and /fox_pool/home/.zfs/snapshot/snap2, respectively.

# pwd
/fox_pool/home/.zfs/snapshot/snap2
# ls
last.txt  test.txt
# cd ../snap2
# ls
last.txt  test.txt

I tried rolling back to snap1.

# zfs rollback fox_pool/home@snap1
cannot rollback to 'fox_pool/home@snap1': more recent snapshots exist
use '-r' to force deletion of the following snapshots:
fox_pool/home@snap2
# zfs rollback -r fox_pool/home@snap1
cannot unmount '/fox_pool/home': Device busy
bash-3.00# pwd
/fox_pool/home
# ls
# cd ..
# zfs rollback -r fox_pool/home@snap1
# cd home
# ls
last.txt  test.txt  words

At this point snap2, the snapshot taken later, was gone: since it did not exist at the time snap1 was taken, rolling back to snap1 removed it. :-( So that's what the warning message was about…

In any case, snapshots occupy disk space, so remove them when they are no longer needed.

#zfs  destroy fox_pool/home@snap1
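To check how much space each snapshot is actually holding before removing it, zfs list can be restricted to snapshots with its standard -t type filter:

# zfs list -t snapshot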

The .zfs directory where snapshots are kept did not show up in ls -al, but everything below it could be listed normally. You can also copy individual files you need straight out of it.
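For example, the words file removed earlier could have been restored without a full rollback just by copying it out of snap1 (assuming snap1 had not yet been destroyed):

# cp /fox_pool/home/.zfs/snapshot/snap1/words /fox_pool/home/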

My impression after using it this far: it is really convenient. No newfs, no manual mounting, and filesystems are created fast. I can no longer find a reason to keep using DiskSuite (though of course, if you are not on Solaris 10 you have no choice anyway…).

Finally, a reference site: http://docs.sun.com/app/docs/doc/819-5461?l=en  (found it far too late ;ㅡㅡ)




------------------------------------------------------------------------------------------------------------


