A filesystem is the methods and
data structures that an operating system uses to keep track of files on a disk
or partition; that is, the way the files are organised on the disk. The word
is also used to refer to a partition or disk that is used to store the
files, or to the type of the filesystem. Before a partition or disk can be used as
a filesystem, it needs to be initialised, and the bookkeeping data structures
need to be written to the disk. This process is called making a filesystem.
The central concepts are superblock,
inode, data block, directory block, and indirection
block. The superblock contains information about the filesystem
as a whole, such as its size (the exact information here depends on the filesystem).
An inode contains all information about a file, except its name.
The name is stored in the directory, together with the number of the inode. A directory
entry consists of a filename and the number of the inode which represents
the file. The inode contains the numbers of several data blocks, which are
used to store the data in the file. There is space only for a few data block
numbers in the inode, however, and if more are needed, more space for pointers
to the data blocks is allocated dynamically. These dynamically allocated blocks
are indirect blocks; the name indicates that in order to find the data block,
one has to find its number in the indirect block first.
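The lookup just described can be sketched in Python. This is a minimal illustration under stated assumptions, not the layout of any particular filesystem: the 12 direct slots, the 1024-byte block, the 4-byte block number, and the name locate_block are all hypothetical choices made for the example, and only one level of indirection is modelled.

```python
# Hypothetical parameters, chosen only for illustration.
DIRECT_SLOTS = 12          # block numbers stored directly in the inode
BLOCK_SIZE = 1024          # bytes per block
POINTER_SIZE = 4           # bytes per block number
POINTERS_PER_BLOCK = BLOCK_SIZE // POINTER_SIZE  # numbers held by one indirect block


def locate_block(file_block_index):
    """Say where the number of the given data block of a file is stored.

    Returns ("direct", slot) when the number sits in the inode itself, or
    ("indirect", slot) when it must first be looked up in the indirect block.
    Only a single level of indirection is modelled.
    """
    if file_block_index < 0:
        raise ValueError("block index must be non-negative")
    if file_block_index < DIRECT_SLOTS:
        return ("direct", file_block_index)
    slot = file_block_index - DIRECT_SLOTS
    if slot < POINTERS_PER_BLOCK:
        return ("indirect", slot)
    raise ValueError("beyond the single indirect block modelled here")


if __name__ == "__main__":
    for index in (0, 11, 12, 267):
        print(index, locate_block(index))
```

With these assumed sizes, a file's first 12 blocks are found straight from the inode, and each later block costs one extra read of the indirect block before the data block itself can be located.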
Linux supports several types of filesystems:
minix
The oldest, presumed to be the most reliable, but
quite limited in features (some time stamps are
missing, at most 30 character filenames) and
restricted in capabilities (at most 64 MB per filesystem).
ext2
The most featureful of the native Linux filesystems,
currently also the most popular one. It is
designed to be easily upwards compatible, so that new
versions of the filesystem code do not require re-making the existing
filesystems.
ext
An older version of ext2 that wasn't upwards
compatible. It is hardly ever used in new installations any more, and most
people have converted to ext2.
In addition, support for several foreign filesystems exists, to make it easier to exchange files with other operating systems.
msdos
Compatibility with MS-DOS (and OS/2 and Windows NT) FAT
filesystems.
vfat
An extension of the FAT filesystem driver that adds support for long
file names and for FAT32, which supports larger disk sizes than the older
FAT variants. Most MS Windows disks are vfat.
iso9660
The standard CD-ROM filesystem; the popular Rock Ridge
extension to the CD-ROM standard that allows longer file names is supported
automatically.
nfs
A networked filesystem that allows sharing a
filesystem between many computers to allow easy
access to the files from all of them.
smbfs
A network filesystem which allows sharing of a
filesystem with an MS Windows computer. It is
compatible with the Windows file sharing protocols.
The choice of filesystem to use depends on the
situation. If compatibility or other reasons make one of the non-native
filesystems necessary, then that one must be used. If one can choose freely,
then it is probably wisest to use ext2, since it has all the features but does
not suffer from lack of performance.
Creating a file system
Filesystems are created, i.e., initialised, with the mkfs
command. There is actually a separate program for each filesystem type. mkfs
is just a front end that runs the appropriate program depending on the
desired filesystem type. The type is selected with the -t fstype option.
To create an ext2 filesystem on a floppy, one would
give the following command:
$ mkfs -t ext2 -c /dev/fd0H1440
mke2fs 0.5a, 5-Apr-94 for EXT2 FS 0.5, 94/03/10
360 inodes, 1440 blocks
72 blocks (5.00%) reserved for the super user
First data block=1
Block size=1024 (log=0)
Fragment size=1024 (log=0)
1 block group
8192 blocks per group, 8192 fragments per group
360 inodes per group
Checking for bad blocks (read-only test): done
Writing inode tables: done
Writing superblocks and filesystem accounting information: done
$
Overview of the File System Hierarchy
This section is loosely based on the Filesystem
Hierarchy Standard (FHS) version 2.1, which attempts to set a
standard for how the directory tree in a Linux system is organised.
The full directory tree is intended to be
breakable into smaller parts, each capable of being on its own disk or
partition, to accommodate disk size limits and to ease backup and other
system administration tasks. The major parts are the root (/), /usr, /var, and
/home filesystems (see Figure 1). Each part has a different
purpose. The directory tree has been designed so that it works well in a
network of Linux machines which may share some parts of the filesystems over a
read-only device (e.g., a CD-ROM), or over the network with NFS.

Figure 1
The root filesystem is specific for each
machine and contains the files that are necessary for booting the system up,
and to bring it up to such a state that the other filesystems may be mounted.
The contents of the root filesystem will therefore be sufficient for the single
user state. It will also contain tools for fixing a broken system, and for
recovering lost files from backups.
The root filesystem should generally be small, since
it contains very critical files and a small, infrequently modified filesystem
has a better chance of not getting corrupted. A corrupted root filesystem will
generally mean that the system becomes unbootable except with special measures
(e.g., from a floppy), so you don't want to risk it.
The /usr filesystem contains all
commands, libraries, manual pages, and other unchanging files
needed during normal operation. No files in /usr
should be specific to any given machine, nor
should they be modified during normal use. This allows
the files to be shared over the network,
which can be cost-effective since it saves disk space
(there can easily be hundreds of megabytes, and
increasingly multiple gigabytes, in /usr). Mounting /usr over the
network can also make administration easier, since only the master
/usr needs to be changed when updating an application,
not each machine separately. Even if the
filesystem is on a local disk, it could be mounted read-only, to lessen the
chance of filesystem corruption during a crash.
The /var filesystem contains files that
change, such as spool directories (for mail, news, printers, etc), log files,
formatted manual pages, and temporary files. Traditionally everything in /var
has been somewhere below /usr, but that made it impossible to mount /usr
read-only.
The /home
filesystem contains the users' home directories, i.e., all the real
data on the system.
Separating home directories to their own directory
tree or filesystem makes backups easier; the other parts often do not have to
be backed up, or at least not as often as they seldom change. A big /home might
have to be broken across several filesystems, which requires adding an extra
naming level below /home, for example /home/students and /home/staff.
The /etc directory contains configuration
files specific to the machine. Many networking configuration files are in /etc.
The /dev directory contains
the special device files for all the devices. The device files are named using
special conventions. The device files are created during installation, and
later with the /dev/MAKEDEV script. The /dev/MAKEDEV.local is a
script written by the system administrator that creates local-only device files
or links (i.e. those that are not part of the standard MAKEDEV, such as
device files for some non-standard device driver).
The /proc filesystem is an
illusory filesystem. It does not exist on a disk. Instead, the kernel
creates it in memory. It is used to provide information about the system.
Files can be classified along two independent distinctions: shareable
vs. unshareable, and variable vs. static.
· Shareable vs. unshareable
Shareable files are those that can be
accessed by various hosts; unshareable files are
not available to any other hosts.
· Static vs. variable
Variable files can
change at any time without any intervention; static
files, such as read-only documentation and binaries, do not change without
an action from the system administrator or an agent that the system
administrator has placed in motion to accomplish that task.
The reason for looking at
files in this way is to help you understand the type of permissions given to
the directory that holds them. The way in which the operating system and its
users need to use the files determines the directory where those files should
be placed, whether the directory is mounted read-only or read-write, and the
level of access allowed on each file. The top level of this organization is
crucial, as the access to the underlying directories can be restricted or
security problems may manifest themselves if the top level is left disorganized
or without a widely-used structure.
Mounting and unmounting
Before one can use a filesystem, it has to
be mounted. Since all files in UNIX are in a single directory tree, the
mount operation will make it look like the contents of the new filesystem are
the contents of an existing subdirectory in some already mounted filesystem.
For example, Figure 2 shows three separate filesystems, each with their own root
directory. When the last two filesystems are mounted below /home and /usr,
respectively, on the first filesystem, we can get a single directory tree, as
in Figure 3.
Figure 2. Three separate filesystems.

Figure 3. /home and /usr have been mounted.

The mounts could be done as in the following example:
$ mount /dev/hda2 /home
$ mount /dev/hda3 /usr
$
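The effect of these mounts on path lookup can be sketched in Python: a path belongs to the filesystem whose mount point is the longest matching prefix of that path. This is an illustration only; the name filesystem_for is hypothetical, and the device shown for the root filesystem is an assumed placeholder, since the text does not name it.

```python
import posixpath

# Mount table matching the example above. The root device name is a
# hypothetical placeholder because the text does not say what it is.
MOUNTS = {
    "/": "/dev/hda1",      # assumed
    "/home": "/dev/hda2",
    "/usr": "/dev/hda3",
}


def filesystem_for(path, mounts=MOUNTS):
    """Return (mount_point, device) for the filesystem that holds `path`.

    The table must contain an entry for "/".
    """
    path = posixpath.normpath(path)
    best = "/"
    for mount_point in mounts:
        prefix = mount_point.rstrip("/") + "/"
        inside = path == mount_point or path.startswith(prefix)
        if inside and len(mount_point) > len(best):
            best = mount_point   # a longer mount point is a closer match
    return best, mounts[best]


if __name__ == "__main__":
    for example in ("/home/liw/notes", "/usr/bin/ls", "/etc/fstab", "/homework"):
        print(example, filesystem_for(example))
```

Note that /homework stays on the root filesystem: the match is made on whole path components, not on raw string prefixes.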
To mount an MS-DOS floppy, you could use the following
command:
$ mount -t msdos /dev/fd0 /floppy
$
When a filesystem no longer needs to be
mounted, it can be unmounted with umount.
To unmount the directories of the previous
example, one could use the commands
$ umount /dev/hda2
$ umount /usr
$
In a multi-user
system you need a way to protect each user from the others. One reason
is that a user
could otherwise abuse the system, or read, modify, or delete another
user's work. Even
if you are the only user of your Linux box, you need to protect
yourself from making serious
mistakes that can damage your system.
The chmod Command
What is chmod?
chmod is used to change the access permissions of a named file, directory,
device or program. Permissions are set for three different classes of user:
the owner (user), the group, and the world. Each of these classes
can have permission to read, write or execute the file, depending on
your preference.
Permissions & Values
In Linux, every file and directory has three sets of access permissions:
those applied to the owner of the file, those applied to the group the file
belongs to, and those applied to all other users on the system.
You can see these permissions when you do an ls -l.
The output will look like:
$ ls -l
total 16
drwx------ 2 ty ty 4096 Jun 9 00:01 mail
-rw------- 1 ty ty 557 Jul 4 12:22 mbox
drwx------ 2 ty ty 4096 Apr 5 20:55 nsmail
drwx---r-x 4 ty ty 4096 Jun 11 21:34 public_html
What does all this mean?
The first column of the listing shows the permissions of the file.
In drwx---r-x, the first character (d) represents the type of file; here d
marks a directory. The next nine characters (rwx---r-x) are the file
permissions.
The following letters are used to represent those permissions:
Letter | Meaning
r | Read
w | Write
x | Execute
Each permission has a corresponding value. Seen here:
Read = 4
Write = 2
Execute = 1
When you combine attributes, you add their value.
Permission | Value | Meaning
--- | 0 | No permissions
r-- | 4 | Read only
rw- | 6 | Read and write
rwx | 7 | Read, write and execute
r-x | 5 | Read and execute
--x | 1 | Execute only
Other combinations exist, but these are the ones you will need most often. When you
combine these values for the owner, the group, and the world, you get three digits
that make up the file's permissions. Here are some examples:
Permission | Value | Meaning
-rw------- | 600 | The owner has read and write permissions. Nobody else has any privileges. This is what you'll want to set for the majority of your files.
-rw-r--r-- | 644 | The owner has read and write permissions. The group and world have read-only permissions. Use this when you're sure you want to let others read this file.
-rw-rw-rw- | 666 | *THIS IS BAD* Everybody has read and write permissions. You don't want people to be allowed to change your files.
-rwx------ | 700 | The owner has read, write and execute permissions. This is what you'd use for programs you'll want to run.
-rwxr-xr-x | 755 | The owner has read, write and execute permissions. The group and the rest of the world have read and execute permissions.
-rwxrwxrwx | 777 | *THIS IS BAD* Everyone has read, write and execute permissions. Allowing people to edit your files is just asking for trouble.
-rwx--x--x | 711 | The owner has read, write and execute permissions. The group and the rest of the world have execute-only permissions. This is useful for letting others run programs without being able to copy them.
drwx------ | 700 | This is a directory. Only the owner can read and write to it. (Note: a directory needs its execute bit set before its contents can be accessed.)
drwxr-xr-x | 755 | This directory can be changed only by the owner, but everyone else can view its contents.
drwx--x--x | 711 | This is useful when a directory must be accessible to the world, but you don't want people to be able to list its contents. Only someone who knows the name of the file they are looking for will be able to reach it.
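The arithmetic behind these tables can be sketched in Python: each class (owner, group, world) contributes one digit, obtained by adding 4, 2 and 1 for r, w and x. The function name symbolic_to_octal is hypothetical and the sketch handles only the nine basic permission characters.

```python
import stat

VALUES = {"r": 4, "w": 2, "x": 1, "-": 0}


def symbolic_to_octal(permissions):
    """Convert nine permission characters such as 'rwxr-xr-x' to '755'."""
    if len(permissions) != 9:
        raise ValueError("expected nine characters, e.g. 'rw-r--r--'")
    digits = []
    for start in (0, 3, 6):            # owner, group, world
        triplet = permissions[start:start + 3]
        for char, expected in zip(triplet, "rwx"):
            if char not in (expected, "-"):
                raise ValueError(f"unexpected {char!r} in {triplet!r}")
        digits.append(str(sum(VALUES[c] for c in triplet)))
    return "".join(digits)


if __name__ == "__main__":
    # Drop the leading file-type character before converting.
    for listing in ("-rw-------", "-rw-r--r--", "-rwx--x--x", "drwxr-xr-x"):
        print(listing, symbolic_to_octal(listing[1:]))
    # The standard library can go the other way for comparison.
    print(stat.filemode(stat.S_IFDIR | 0o755))
```

The last line uses the standard library's stat.filemode to turn a numeric mode back into the ls -l style string.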
Chmod Usage
To change the permissions on a file,
you must be its owner or root. Enter the following:
# chmod permissions filename
where permissions is a numeric value (the three digits described above)
and filename is the name of the file you want to affect.
For example, to make the ty.html file readable and writable by the owner, while
allowing the group and world read access only, the command would be:
# chmod 644 ty.html
To recursively change the permissions on all the files in a specific directory,
use the -R option in the command. For example, to set everything under
/home/ty/html to the permission 755, you would enter:
# chmod -R 755 /home/ty/html
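The same operations can be carried out from a program. The following Python sketch uses the standard os.chmod call on a POSIX system; the helper names set_mode and set_mode_recursive are hypothetical, and the recursive helper is only a rough counterpart of chmod -R (for instance, it follows symbolic links, which is an assumption of the sketch rather than a statement about chmod itself).

```python
import os
import stat
import tempfile


def set_mode(path, mode):
    """Apply `mode` (e.g. 0o644) to `path` and return the mode now in effect."""
    os.chmod(path, mode)
    return stat.S_IMODE(os.stat(path).st_mode)


def set_mode_recursive(top, mode):
    """Rough counterpart of `chmod -R`: apply `mode` to `top` and all below it."""
    os.chmod(top, mode)
    for directory, subdirs, files in os.walk(top):
        for name in subdirs + files:
            os.chmod(os.path.join(directory, name), mode)


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as scratch:
        page = os.path.join(scratch, "ty.html")   # throwaway file for the demo
        open(page, "w").close()
        print(oct(set_mode(page, 0o644)))
```

Note that the mode is written as an octal literal (0o644), so the three digits keep the meaning given in the tables above.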
The File Systems Table
fstab stands for File System TABle. It is where the system
administrator can tell the OS about any filesystems the machine may have access
to. It also allows default parameters to be provided for each filesystem.
A typical fstab looks something like
the following:
#
# /etc/fstab
#
# <device> <mountpoint> <filesystemtype> <options> <dump> <fsckorder>
/dev/hdb5 / ext2 defaults 1 1
/dev/hdb2 /home ext2 defaults 1 2
/dev/hdc /mnt/cdrom iso9660 noauto,ro,user 0 0
/dev/hda1 /mnt/dos/c msdos defaults 0 0
/dev/hdb1 /mnt/dos/d msdos defaults 0 0
/dev/fd0 /mnt/floppy ext2 noauto,user 0 0
/dev/hdb4 none ignore defaults 0 0
none /proc proc defaults
/dev/hdb3 none swap sw
Note that this system has two IDE partitions, one of which is used as
/, and the other as /home. It also has two DOS partitions, which are
mounted under /mnt. Note the user option provided for the CD-ROM and the floppy
drive. This is one of the many options you can specify. In this case
it means that any user can mount a CD-ROM or floppy disk.
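The column layout of the table can be made concrete with a small Python parser. This is a sketch, not the parser used by mount: the names FstabEntry, parse_fstab and user_mountable are hypothetical, and filling in missing trailing fields with "defaults", 0 and 0 is an assumption of the example.

```python
from typing import NamedTuple


class FstabEntry(NamedTuple):
    device: str
    mount_point: str
    fs_type: str
    options: tuple
    dump: int
    fsck_order: int


# A few lines taken from the example table above.
SAMPLE = """\
# <device> <mountpoint> <filesystemtype> <options> <dump> <fsckorder>
/dev/hdb5 / ext2 defaults 1 1
/dev/hdb2 /home ext2 defaults 1 2
/dev/hdc /mnt/cdrom iso9660 noauto,ro,user 0 0
/dev/fd0 /mnt/floppy ext2 noauto,user 0 0
none /proc proc defaults
"""


def parse_fstab(text):
    """Parse fstab-style text into a list of FstabEntry records."""
    entries = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue                      # skip blank lines and comments
        fields = line.split()
        if len(fields) < 3:
            raise ValueError(f"too few fields: {line!r}")
        # Fill in missing trailing fields (an assumption of this sketch).
        fields += ["defaults", "0", "0"][len(fields) - 3:]
        device, mount_point, fs_type, options, dump, order = fields[:6]
        entries.append(FstabEntry(device, mount_point, fs_type,
                                  tuple(options.split(",")), int(dump), int(order)))
    return entries


def user_mountable(entries):
    """Mount points that carry the `user` option."""
    return [e.mount_point for e in entries if "user" in e.options]


if __name__ == "__main__":
    print(user_mountable(parse_fstab(SAMPLE)))
```

Applied to the sample, the helper picks out the CD-ROM and floppy mount points, the two entries that carry the user option.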
Motivation of the second extended file system
The Second Extended File System was
designed and implemented to fix some problems present in the first
Extended File System. Ext2fs was designed to offer excellent performance, to be
robust enough to reduce the risk of data loss under intensive use, and
to include provision for extensions, so that users can benefit from new
features without reformatting their filesystem.
Ext2fs supports standard Unix file types: regular files, directories, device
special files and symbolic links. Ext2fs is able to manage filesystems created
on really big partitions. While the original kernel code restricted the maximal
filesystem size to 2 GB, recent work in the VFS layer has raised this limit to
4 TB. Thus, it is now possible to use big disks without the need to create
many partitions.
Ext2fs
provides long file names. It uses variable length directory entries. The
maximal file name size is 255 characters. This limit could be extended to 1012
if needed.
Ext2fs
reserves some blocks for the super user (root).
Normally, 5% of the blocks are reserved. This allows the administrator to
recover easily from situations where user processes fill up filesystems.
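As a worked example of the reservation, the floppy created earlier had 1440 blocks, of which mke2fs reported 72 (5.00%) as reserved. The Python sketch below reproduces that figure; the function name reserved_blocks is hypothetical, and rounding down is an assumption of the sketch.

```python
def reserved_blocks(total_blocks, reserved_percent=5):
    """Number of blocks set aside for the super user, rounded down."""
    return total_blocks * reserved_percent // 100


if __name__ == "__main__":
    total = 1440                      # blocks on the example floppy
    reserved = reserved_blocks(total)
    print(reserved, "blocks reserved,", total - reserved, "not reserved")
```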
Ext2fs
allows the administrator to choose the logical block size when creating the
filesystem. Block sizes can typically be 1024, 2048 and 4096 bytes. Using big
block sizes can speed up I/O since fewer I/O requests, and thus fewer disk head
seeks, need to be done to access a file.
Ext2fs
implements fast symbolic links. A fast symbolic link does not use any data
block on the filesystem. The target name is not stored in a data block but in
the inode itself. This policy can save some disk space (no data block needs to
be allocated) and speeds up link operations (there is no need to read a data
block when accessing such a link).
Ext2fs
keeps track of the filesystem state. A special field in the superblock is used
by the kernel code to indicate the status of the file system. When a filesystem
is mounted in read/write mode, its state is set to ``Not Clean''. When it is
unmounted or remounted in read-only mode, its state is reset to ``Clean''. At
boot time, the filesystem checker uses this information to decide if a
filesystem must be checked. The kernel code also records errors in this field.
When an inconsistency is detected by the kernel code, the filesystem is marked
as ``Erroneous''. The filesystem checker tests this to force the check of the
filesystem regardless of its apparently clean state.
Always
skipping filesystem checks may sometimes be dangerous, so Ext2fs forces checks
at regular intervals.
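The decision the filesystem checker makes at boot time can be sketched in Python. This is a simplified model, not the actual superblock layout: the class SuperblockState, the function needs_check, the use of a mount counter to express "regular intervals", and the default of 20 mounts are all assumptions made for the illustration.

```python
from dataclasses import dataclass


@dataclass
class SuperblockState:
    clean: bool               # False while mounted read/write ("Not Clean")
    errors_detected: bool     # set when the kernel finds an inconsistency ("Erroneous")
    mounts_since_check: int   # hypothetical counter for the periodic check


def needs_check(sb, max_mounts_between_checks=20):
    """Decide whether the checker should examine the filesystem at boot."""
    if sb.errors_detected:
        return True           # forced, even if the filesystem looks clean
    if not sb.clean:
        return True           # it was not unmounted properly
    # Periodic forced check; counting mounts is an assumption of this sketch.
    return sb.mounts_since_check >= max_mounts_between_checks


if __name__ == "__main__":
    print(needs_check(SuperblockState(clean=True, errors_detected=False,
                                      mounts_since_check=3)))
    print(needs_check(SuperblockState(clean=True, errors_detected=True,
                                      mounts_since_check=3)))
```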
The
Ext2fs kernel code contains many performance optimizations, which tend to
improve I/O speed when reading and writing files.
Ext2fs takes advantage of the buffer cache
management by performing readaheads: when a block has to be read, the kernel
code requests the I/O on several contiguous blocks. This way, it tries to
ensure that the next block to read will already be loaded into the buffer
cache. Readaheads are normally performed during sequential reads on files and
Ext2fs extends them to directory reads.
Ext2fs
also contains many allocation optimizations. Block groups are used to cluster
together related inodes and data: the kernel code always tries to allocate data
blocks for a file in the same group as its inode. This is intended to reduce
the disk head seeks made when the kernel reads an inode and its data blocks.
When
writing data to a file, Ext2fs preallocates up to 8 adjacent blocks when
allocating a new block. Preallocation hit rates are around 75% even on very
full filesystems. This preallocation achieves good write performance under
heavy load. It also allows contiguous blocks to be allocated to files, which
speeds up later sequential reads.
Why do you want to migrate from ext2 to ext3? Four main reasons:
availability, data integrity, speed, and easy transition.
Availability
After an unclean system shutdown (unexpected power failure, system
crash), each ext2 file system cannot be mounted until the e2fsck program has
checked its consistency. The amount of time that the e2fsck program takes is
determined primarily by the size of the file system, and for today's relatively
large (many tens of gigabytes) file systems, this takes a long time. Also, the
more files you have on the file system, the longer the consistency check takes.
File systems that are several hundreds of gigabytes in size may take an hour or
more to check. This severely limits availability.
By contrast, ext3 does not
require a file system check, even after an unclean system shutdown, except for
certain rare hardware failure cases (e.g. hard drive failures). This is because
the data is written to disk in such a way that the file system is always
consistent. The time to recover an ext3 file system after an unclean system
shutdown does not depend on the size of the file system or the number of files;
rather, it depends on the size of the "journal" used to maintain
consistency. The default journal size takes about a second to recover
(depending on the speed of the hardware).
Data Integrity
Using the ext3 file system
can provide stronger guarantees about data integrity in case of an unclean
system shutdown. You choose the type and level of protection that your data
receives. You can choose to keep the file system consistent, but allow for
damage to data on the file system in the case of unclean system shutdown; this
can give a modest speed up under some but not all circumstances. Alternatively,
you can choose to ensure that the data is consistent with the state of the file
system; this means that you will never see garbage data in recently-written
files after a crash. The safe choice, keeping the data consistent with the
state of the file system, is the default.
Speed
Despite writing some data
more than once, ext3 is often faster (higher throughput) than ext2 because
ext3's journaling optimizes hard drive head motion. You can choose from three
journaling modes to optimize speed, optionally choosing to trade off some data
integrity.
Tuning Suggestions
Most Linux block device drivers use a generic tunable
"elevator" algorithm for scheduling block I/O. The /sbin/elvtune
program can be used to trade off between throughput and latency.