Sunday 7 June 2015

Soft Links and Hard Links in Linux


Soft link (symbolic link / symlink):


It is similar to a "shortcut" in the Windows operating system. Removing the original file does not remove the symbolic link, but without the original file the symlink is left dangling and is useless.
      
The command to create a soft link is:
# ln -s <target file> <new file (name of symlink)>

A symlink simply contains a pointer to the location of the destination file. In more technical terms, a new file is created with its own inode, and the data of that file holds the path of the original file (not its inode number).

Usage:
Links to directories: If you want to link directories, you must use soft links, as you can't create a hard link to a directory.
Links across filesystems: If you want to link files across filesystems, you can only use symlinks/soft links.

A symlink is a special file type whose data part carries a path to another file.
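The behaviour described above can be checked with a small Python sketch. The file names (original.txt, shortcut.txt) are made up for illustration, and the sketch assumes a Linux system where the current user may create symlinks.

```python
import os
import tempfile

def symlink_details(directory):
    """Create a file and a symlink to it, then report what the symlink holds."""
    target = os.path.join(directory, "original.txt")   # hypothetical name
    link = os.path.join(directory, "shortcut.txt")     # hypothetical name
    with open(target, "w") as f:
        f.write("hello")
    os.symlink(target, link)

    stored_path = os.readlink(link)         # the path kept in the symlink's data
    link_inode = os.lstat(link).st_ino      # inode of the symlink itself
    target_inode = os.stat(target).st_ino   # inode of the original file

    os.remove(target)                       # delete the original file
    # The symlink still exists, but it no longer resolves to anything.
    dangling = os.path.islink(link) and not os.path.exists(link)
    return stored_path == target, link_inode != target_inode, dangling

with tempfile.TemporaryDirectory() as d:
    print(symlink_details(d))
```

The three values reported are: the symlink stores the target path, the symlink has its own inode, and the symlink dangles once the original is removed.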

Hard link:


In technical terms, a hard link is only a new entry in the directory structure that carries the inode number of the original file. This means no new inode is created for a hard link. The inode of the file maintains a count of the links pointing to it; when the count becomes zero, the inode is free to be deallocated.

A hard link can only refer to data that exists on the same file system.
Example: '..' is a hard link to the directory that is the immediate parent of the current directory.

  • If you move the source file to some other location on the same filesystem, the hard link will still work, but the soft link will fail.
  • Hard links take a very small amount of space, as no new inodes are created when a hard link is made.
  • If you want to keep your data safe, use a hard link: the data stays available until all the links to the file are deleted. With a soft link, you lose access to the data as soon as the original file is deleted.


The command to create a hard link is:
# ln <target file> <hardlink name>


If you run # ls -il, you will see that both files (the link and the source file) have the same inode number, the same file permissions and the same size. Because that size is reported for the same inode, a hard link does not occupy any extra space on your disk.
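The same check that ls -il performs can be sketched in Python. The file names are invented for the example; os.link and os.stat are the standard library calls for creating a hard link and reading inode data.

```python
import os
import tempfile

def hard_link_details(directory):
    """Create a file and a hard link to it, then compare their inode data."""
    source = os.path.join(directory, "source.txt")     # hypothetical name
    link = os.path.join(directory, "hardlink.txt")     # hypothetical name
    with open(source, "w") as f:
        f.write("some data")
    os.link(source, link)

    s, l = os.stat(source), os.stat(link)
    same_inode = s.st_ino == l.st_ino
    same_size = s.st_size == l.st_size
    links_before = s.st_nlink               # link count kept in the inode

    os.remove(source)                       # drop one of the two names
    links_after = os.stat(link).st_nlink
    with open(link) as f:
        surviving_data = f.read()           # data is still reachable
    return same_inode, same_size, links_before, links_after, surviving_data

with tempfile.TemporaryDirectory() as d:
    print(hard_link_details(d))
```

Removing the source name only decrements the link count; the data remains reachable through the remaining name.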

SCSI Report LUNS Command


The SCSI REPORT LUNS command (opcode A0h) requests the inventory of logical units accessible to the I_T nexus on which the command is sent. The logical unit inventory is a list that shall include the logical unit numbers of all logical units having a valid PERIPHERAL QUALIFIER value.
Below is the CDB format of this command. 




The SELECT REPORT field specifies the types of logical unit addresses that shall be reported.
If SELECT REPORT = 00h
The list shall contain the logical units accessible to the I_T nexus with the following addressing methods:
a) Logical unit addressing method;
b) Peripheral device addressing method;
c) Flat space addressing method; and
d) Extended logical unit addressing method.
If there are no logical units, the LUN LIST LENGTH field shall be zero.

If SELECT REPORT = 01h
The list shall contain only well-known logical units, if any. If there are no well-known logical units, the LUN LIST LENGTH field shall be zero.

If SELECT REPORT = 02h
The list shall contain all logical units accessible to the I_T nexus.

The ALLOCATION LENGTH field specifies the maximum number of bytes that the initiator has allocated in the Data-In Buffer (result buffer).
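As a rough illustration of the fields described above, the sketch below packs a REPORT LUNS CDB in Python. The 12-byte layout (opcode in byte 0, SELECT REPORT in byte 2, ALLOCATION LENGTH in bytes 6 to 9, CONTROL in byte 11) is my reading of SPC-4 and should be checked against the standard; the function name and default values are hypothetical.

```python
import struct

REPORT_LUNS_OPCODE = 0xA0

def build_report_luns_cdb(select_report=0x00, allocation_length=4096, control=0x00):
    """Pack a 12-byte REPORT LUNS CDB (layout assumed from SPC-4)."""
    # B  : OPERATION CODE
    # x  : reserved
    # B  : SELECT REPORT
    # 3x : reserved
    # I  : ALLOCATION LENGTH (big-endian, 4 bytes)
    # x  : reserved
    # B  : CONTROL
    return struct.pack(">BxB3xIxB", REPORT_LUNS_OPCODE, select_report,
                       allocation_length, control)

cdb = build_report_luns_cdb(select_report=0x02, allocation_length=16)
print(len(cdb), cdb.hex())
```

Here SELECT REPORT = 02h asks for all logical units, and the initiator announces a 16-byte result buffer.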

The SCSI device reports the logical unit inventory using the format below.



The LUN LIST LENGTH field shall contain the length in bytes of the LUN list that is available to be transferred. The LUN list length is the number of logical unit numbers in the logical unit inventory multiplied by eight.
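The relationship between LUN LIST LENGTH and the number of LUNs can be sketched as a small parser. It assumes the SPC-4 parameter data layout (a 4-byte big-endian LUN LIST LENGTH, 4 reserved bytes, then one 8-byte entry per logical unit); the sample buffer is fabricated purely for the demonstration.

```python
import struct

def parse_report_luns(data):
    """Return (LUN LIST LENGTH, list of 8-byte LUN entries) from parameter data."""
    (lun_list_length,) = struct.unpack_from(">I", data, 0)
    lun_count = lun_list_length // 8        # each LUN entry is eight bytes
    luns = []
    for i in range(lun_count):
        offset = 8 + i * 8                  # entries start after the 8-byte header
        if offset + 8 > len(data):
            break                           # data was cut short by ALLOCATION LENGTH
        luns.append(data[offset:offset + 8])
    return lun_list_length, luns

# Fabricated sample: header announcing 16 bytes of LUN list, then two entries.
sample = (struct.pack(">I4x", 16)
          + bytes([0, 0, 0, 0, 0, 0, 0, 0])
          + bytes([0, 1, 0, 0, 0, 0, 0, 0]))
length, luns = parse_report_luns(sample)
print(length, [lun.hex() for lun in luns])
```

Because LUN LIST LENGTH describes what is available rather than what was transferred, the parser stops early if the buffer holds fewer entries than announced.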



Reference: SPC-4 Document

SCSI Test Unit Ready Command


The SCSI TEST UNIT READY command (opcode 00h) is used to determine whether a SCSI device is ready to transfer data (read/write). The SCSI target device returns either GOOD status or CHECK CONDITION. Implementation of this command is mandatory for target devices.
This command allows a SCSI initiator to poll a logical unit until it is ready. At the end of the command the SCSI target returns a status code byte, which is usually 00h (success), 02h (Check Condition, i.e. error) or 08h (busy).
When the target returns a Check Condition in response to a command, the initiator usually issues a SCSI REQUEST SENSE command (03h) to obtain more information: the sense key and additional sense codes.

Below is the 6-byte CDB structure for this command.
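A minimal sketch of the CDB and of the polling loop described above follows. It assumes the usual 6-byte layout (opcode 00h, four reserved bytes, CONTROL byte). The send_cdb callable is hypothetical: it stands in for whatever transport actually delivers the CDB and returns the status byte.

```python
TEST_UNIT_READY_OPCODE = 0x00
STATUS_NAMES = {0x00: "GOOD", 0x02: "CHECK CONDITION", 0x08: "BUSY"}

def build_test_unit_ready_cdb(control=0x00):
    """6-byte CDB: opcode, four reserved bytes, CONTROL (assumed layout)."""
    return bytes([TEST_UNIT_READY_OPCODE, 0, 0, 0, 0, control])

def describe_status(status):
    """Map a status byte to a readable name."""
    return STATUS_NAMES.get(status, "other (0x%02x)" % status)

def poll_until_ready(send_cdb, max_tries=5):
    """Send TEST UNIT READY until GOOD status; return the attempt number or None."""
    cdb = build_test_unit_ready_cdb()
    for attempt in range(1, max_tries + 1):
        status = send_cdb(cdb)              # hypothetical transport call
        if status == 0x00:
            return attempt
    return None

# Fake target for the demo: busy, then check condition, then ready.
replies = iter([0x08, 0x02, 0x00])
print(poll_until_ready(lambda cdb: next(replies)))
```

A real initiator would normally follow a Check Condition with REQUEST SENSE before retrying; the loop above leaves that out for brevity.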


Reference: SPC-4 doc 

SCSI Inquiry Command


The INQUIRY command (opcode 12h) is a SCSI primary command used by the application server (SCSI initiator) to fetch information about the logical unit and the SCSI target device. This request is processed by all SCSI device servers. Implementation of this command is mandatory for SCSI targets.

The initiator can ask a SCSI target for standard device information, vital product data (VPD), or information about which commands are supported by the device server.

When the SCSI initiator sends this command, it also tells the target which type of information is required. This is done with the EVPD bit and, for vital product data, a VPD page code. Below is the CDB (command descriptor block) structure of the INQUIRY command.




If EVPD = 0, the EVPD (enable vital product data) bit is cleared, so the SCSI target sends standard device information. This information includes:
1. Peripheral device type
2. T10 vendor identification (8 bytes),
3. Product identification (16 bytes)
4. Product revision level (4 bytes)
5. Version descriptors identifying up to eight standards to which the device claims conformance (in 2-byte fields, such as 0900h for FCP-2, 0940h for SRP, and 0960h for iSCSI).
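The fields listed above can be pulled out of standard INQUIRY data with a few slices. The byte offsets used here (vendor at 8 to 15, product at 16 to 31, revision at 32 to 35, version descriptors at 58 to 73) are my understanding of the SPC-4 layout; the sample buffer, including the vendor and product strings, is invented for the demonstration.

```python
import struct

def parse_standard_inquiry(data):
    """Extract the main fields of standard INQUIRY data (offsets assumed from SPC-4)."""
    descriptors = []
    for offset in range(58, 74, 2):         # up to eight 2-byte version descriptors
        if len(data) >= offset + 2:
            value = int.from_bytes(data[offset:offset + 2], "big")
            if value:
                descriptors.append(value)
    return {
        "peripheral_qualifier": data[0] >> 5,
        "peripheral_device_type": data[0] & 0x1F,
        "vendor": data[8:16].decode("ascii").strip(),
        "product": data[16:32].decode("ascii").strip(),
        "revision": data[32:36].decode("ascii").strip(),
        "version_descriptors": descriptors,
    }

# Fabricated 74-byte response for illustration only.
sample = (bytes([0x00, 0, 0x06, 0x02, 69, 0, 0, 0])
          + b"ACME".ljust(8) + b"VirtualDisk".ljust(16) + b"1.0".ljust(4)
          + bytes(22)
          + struct.pack(">8H", 0x0960, 0, 0, 0, 0, 0, 0, 0))
info = parse_standard_inquiry(sample)
print(info["vendor"], info["product"], info["revision"],
      [hex(v) for v in info["version_descriptors"]])
```

The single version descriptor in the sample, 0960h, is the iSCSI value mentioned in the list above.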

If EVPD = 1, the SCSI initiator requests VPD data by specifying a VPD page code.
All SCSI targets must respond to inquiries for the VPD pages below.
1. Supported VPD page , Page code = 00h
2. Device identification, Page code = 83h
Optionally, the INQUIRY command may support the unit serial number page (code 80h), the target operating definition page (code 82h) and other VPD pages.

If the SCSI target does not implement the requested VPD page, the command shall be terminated with CHECK CONDITION status and an appropriate sense code.

Allocation Length specifies the length of inquiry data the initiator is prepared to accept in bytes.
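To make the EVPD, page code and allocation length fields concrete, here is a sketch that packs an INQUIRY CDB. It assumes the 6-byte SPC-4 layout (opcode, EVPD in bit 0 of byte 1, PAGE CODE in byte 2, 2-byte ALLOCATION LENGTH, CONTROL); the function name and the default allocation length of 96 are my own choices.

```python
import struct

INQUIRY_OPCODE = 0x12

def build_inquiry_cdb(evpd=False, page_code=0x00, allocation_length=96, control=0x00):
    """Pack a 6-byte INQUIRY CDB (layout assumed from SPC-4)."""
    if not evpd and page_code != 0:
        # A page code only makes sense when vital product data is requested.
        raise ValueError("PAGE CODE must be zero when EVPD is 0")
    return struct.pack(">BBBHB", INQUIRY_OPCODE, 0x01 if evpd else 0x00,
                       page_code, allocation_length, control)

print(build_inquiry_cdb().hex())                                    # standard data
print(build_inquiry_cdb(evpd=True, page_code=0x83,
                        allocation_length=255).hex())               # device identification page
```

The first call requests standard INQUIRY data; the second requests the device identification VPD page (83h).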


Reference : SPC-4 doc

Monday 25 May 2015

MCS locks in Linux kernel


This lock was added to the kernel between versions 3.15 and 3.18. It is a fair version of the spinlock in which each CPU trying to acquire the lock spins on its own local variable.

MCS lock is defined as
struct mcs_spinlock {
        struct mcs_spinlock *next;
        int locked; /* 1 if lock acquired */
};


To acquire lock:-

void mcs_spin_lock(struct mcs_spinlock **global_queue, struct mcs_spinlock *node);

Arg1: the global queue, one for each lock.
Arg2: a per-CPU structure of the same type as Arg1, which is added to the queue.

This function tries to acquire the lock by performing an "unconditional" atomic exchange that stores the address of its own spinlock struct in the next field of the global spinlock struct (the main lock) and returns the previous value.

The next field of the global spinlock struct always points to the tail of the queue of waiting CPUs.

If no one holds the lock, that field is NULL, so the exchange returns its previous value (NULL) and the CPU has acquired the lock.

If the previous value is not NULL, the lock is already held. The value points to the last member of the queue, so the CPU also saves the address of its own struct in that member's next field. It then spins on its local variable (locked) until it becomes non-zero.

To release lock:

void mcs_spin_unlock(struct mcs_spinlock **global, struct mcs_spinlock *node)

When a thread (CPU) is finished with the lock, it does a compare-and-swap operation on the main lock (global), trying to set the next pointer to NULL on the assumption that this pointer still points to its own structure.

If that operation succeeds, the lock was never contended and the job is done.

If the pointer has been changed by some other thread (CPU), the compare-and-swap will fail. In that case, the thread does not change the main lock at all; instead, it sets the locked value to "1" in the lock structure pointed to by its own next field, which hands the lock to the next CPU in the queue.
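The acquire and release steps above can be traced with a single-threaded Python model. This is only a sketch of the queue bookkeeping, not the kernel code: plain assignments stand in for the atomic exchange and compare-and-swap, nothing actually spins, and the class and function names are invented for the example.

```python
class McsNode:
    """Per-CPU node, mirroring struct mcs_spinlock."""
    def __init__(self, name):
        self.name = name
        self.next = None
        self.locked = 0        # becomes 1 when the lock is handed to this node

class McsLock:
    """Main (global) lock: a pointer to the tail of the waiter queue."""
    def __init__(self):
        self.tail = None

def mcs_lock_step(lock, node):
    """Enqueue node. Return True if the lock was free, False if node must wait."""
    node.next = None
    node.locked = 0
    prev, lock.tail = lock.tail, node      # stands in for the atomic exchange
    if prev is None:
        return True                        # previous value NULL: lock acquired
    prev.next = node                       # link behind the old tail
    return False                           # real code would now spin on node.locked

def mcs_unlock_step(lock, node):
    """Release the lock held by node. Return the node that now owns it, if any."""
    if lock.tail is node:                  # stands in for compare-and-swap
        lock.tail = None                   # uncontended: job done
        return None
    successor = node.next                  # real code waits until this is set
    successor.locked = 1                   # hand the lock over
    return successor

lock, a, b, c = McsLock(), McsNode("A"), McsNode("B"), McsNode("C")
print(mcs_lock_step(lock, a), mcs_lock_step(lock, b), mcs_lock_step(lock, c))
print(mcs_unlock_step(lock, a).name, mcs_unlock_step(lock, b).name,
      mcs_unlock_step(lock, c))
```

The hand-over order B, then C matches the order in which the waiters queued, which is the FIFO property of the lock.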


MCS lock v/s Spinlock

  • An MCS lock maintains FIFO order while acquiring the lock, so whoever comes first to acquire the lock gets it first. A simple spinlock does not maintain FIFO order of acquisition.
  • With a spinlock every waiter spins on a single shared variable. An MCS lock reduces the number of waiters per variable from n to 1, because each CPU spins on its own local variable.
  • Every attempt to acquire a spinlock requires moving the cache line containing that lock to the local CPU, so a lot of cache-line bouncing happens. This issue is avoided with an MCS lock.


References:
http://lwn.net/Articles/590243/

Defined in linux/3.18.1/kernel/locking/mcs_spinlock.h

Friday 22 May 2015

Hard links and Soft links


Soft link (symbolic link / symlink):

It is similar to a "shortcut" in the Windows operating system. Removing the original file does not remove the symbolic link, but without the original file the symlink is left dangling and is useless.

To create a soft link to an already existing file:
# ln -s <target file> <new file (name of symlink)>

It simply contains a pointer to the location of the destination file. In more technical terms, a new file is created with its own inode, and the data of that file holds the path of the original file (not its inode number). It is a special file type whose data part carries a path to another file.

Usage:

  • If you want to link directories, you must use soft links, as you can't create a hard link to a directory.
  • If you want to link files across filesystems (e.g. from EXT4 to NFS), you can only use symlinks/soft links.

Hard link:

It simply gives one more name or label to an existing file.

In technical terms, a hard link is only a new entry in the directory structure that carries the inode number of the original file. This means no new inode is created for a hard link. The inode of the file maintains a count of the links pointing to it; when the count becomes zero, the inode is free to be deallocated.

To create a hard link to an existing file:
# ln <target file> <hardlink name>

If you run # ls -il, you will see that both files have the same inode number, the same file permissions and the same size. Because that size is reported for the same inode, a hard link does not occupy any extra space on your disk.
  • '..' is a hard link to the directory that is the immediate parent of the current directory.
  • If you move the source file to some other location on the same filesystem, the hard link will still work, but the soft link will fail.
  • Hard links take a very small amount of space, as no new inodes are created when a hard link is made.
  • If you want to keep your data safe, use a hard link: the data stays available until all the links to the file are deleted. With a soft link, you lose access to the data as soon as the original file is deleted.

Limitations:
  • A hard link can only refer to data that exists on the same file system.
  • A hard link cannot be made to a directory.
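The point about moving the source file can be tried out with a short Python sketch. The directory and file names are hypothetical, and the move stays inside one temporary directory so that it remains on the same filesystem.

```python
import os
import tempfile

def move_source(directory):
    """Link a file both ways, move the original, and report which links still work."""
    source = os.path.join(directory, "data.txt")           # hypothetical names
    hard = os.path.join(directory, "data.hard")
    soft = os.path.join(directory, "data.soft")
    new_dir = os.path.join(directory, "moved")
    os.mkdir(new_dir)
    with open(source, "w") as f:
        f.write("payload")

    os.link(source, hard)        # hard link: another name for the same inode
    os.symlink(source, soft)     # soft link: stores the path of data.txt

    os.rename(source, os.path.join(new_dir, "data.txt"))   # move on the same filesystem

    with open(hard) as f:
        hard_ok = f.read() == "payload"     # inode is unchanged, so this still works
    soft_ok = os.path.exists(soft)          # stored path no longer exists
    return hard_ok, soft_ok

with tempfile.TemporaryDirectory() as d:
    print(move_source(d))
```

The hard link keeps working because it refers to the inode, while the soft link breaks because it refers to the old path.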



Thursday 21 May 2015

Bash script variables

Below is a list of standard variables that can be used in a bash script.

  • $$ = Expands to the process ID of the shell
  • $0 = Name of the bash script file
  • $# = Number of arguments passed to the bash script
  • $1 = Argument 1
  • $2 = Argument 2
  • $@ = Expands to the positional parameters, starting from one, i.e. "$@" is equivalent to "$1" "$2" …
  • $? = Exit status of the most recently executed command
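The variables above can be observed by running small bash snippets from Python. This sketch assumes bash is installed and on the PATH; with bash -c, the first argument after the script text becomes $0 and the remaining ones become $1, $2 and so on. The name demo.sh is only a placeholder, no such file is needed.

```python
import subprocess

def run_bash(script, *args):
    """Run a bash snippet; args[0] becomes $0 and the rest become $1, $2, ..."""
    result = subprocess.run(["bash", "-c", script, *args],
                            capture_output=True, text=True)
    return result.stdout.strip()

# $0, $#, $1 and $2
print(run_bash('echo "$0 $# $1 $2"', "demo.sh", "alpha", "beta"))
# "$@" keeps each argument as one word, even when it contains spaces
print(run_bash('for a in "$@"; do echo "[$a]"; done', "demo.sh", "one two", "three"))
# $? holds the exit status of the command that just ran (false exits with 1)
print(run_bash('false; echo $?', "demo.sh"))
```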


Monday 23 March 2015

Multipathing in Storage:


  • Multipathing in SAN is a technique of creating more than one physical path between the server and its storage devices.
  • This technique uses redundant physical path components (adapters, cables, and switches) to create logical paths between the server and the storage device. If one or more of these components fails (e.g. an HBA, FC cable, SAN switch or array controller port failure) and causes a path to fail, the multipathing logic uses an alternate path for I/O so that applications can still access their data.
  • Each network interface card (in the iSCSI case) or HBA should be connected by using redundant switch infrastructures to provide continued access to storage in the event of a failure in a storage fabric component.
  • The multipathing software handles all I/O requests, passes them through the best available path, and fails over to another path if one of the paths dies. With multipathing, automatic failover and recovery and optimized load balancing can be implemented to ensure application performance and availability.
  • It can be implemented in:-
    • Active/Active mode: I/O can be spread among all paths.
    • Active/Passive mode: I/O uses only the active paths; the passive paths stay on standby and take over if an active path fails.

  • Each server requires a multipath driver to support this feature:
    • On a Windows server, the MPIO driver is required.
    • On a Linux server, the dm_multipath device mapper is used along with the mpathconf utility, the multipath command, and the multipathd daemon. Multipath configuration information is stored in /etc/multipath.conf
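The difference between the two modes, and the failover behaviour, can be sketched with a toy path selector in Python. This is not how MPIO or dm_multipath are implemented; the class, its round-robin policy and the path names (hba0, hba1) are assumptions made purely to illustrate the idea.

```python
class Multipath:
    """Toy path selector illustrating active/active and active/passive modes."""
    def __init__(self, paths, mode="active/active"):
        self.paths = list(paths)
        self.mode = mode
        self.failed = set()
        self._counter = 0

    def fail(self, path):
        self.failed.add(path)        # e.g. HBA, cable or switch failure

    def restore(self, path):
        self.failed.discard(path)    # path recovered

    def select(self):
        """Pick the path for the next I/O request."""
        healthy = [p for p in self.paths if p not in self.failed]
        if not healthy:
            raise IOError("all paths to the storage device are down")
        if self.mode == "active/passive":
            return healthy[0]        # standby path is used only after a failure
        path = healthy[self._counter % len(healthy)]   # round-robin over all paths
        self._counter += 1
        return path

mp = Multipath(["hba0", "hba1"])
print([mp.select() for _ in range(4)])   # I/O spread over both paths
mp.fail("hba0")
print(mp.select())                       # failover to the surviving path
```

In active/passive mode the same class keeps returning the first healthy path and only switches to the second one after fail() is called on the first.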