roger :: Blog


June 21, 2008

RAID stands for Redundant Array of Independent Disks. The term describes a storage system's resilience to disk failure, achieved by using multiple disks together with data distribution and error correction techniques.

RAID can be implemented in software, in hardware, or as a combination of both.

SOFTWARE RAID uses more system resources, as more disk ports and channels are required. It may cost less than hardware RAID because it has no dedicated RAID controller, but it typically has lower performance.

HARDWARE RAID offloads parity generation and checking to a dedicated controller. It also allows for greater disk capacity per disk port. The drawback is that it requires an expensive RAID controller.

RAID is defined at several levels.

Level 0: Also known as disk striping, because it uses a disk file system called a stripe set. This level does not provide fault tolerance. Data is divided into blocks and spread in a fixed order among all the disks in the array. This improves read and write performance by spreading operations across multiple disks, so that they can be performed independently.
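
As a rough illustration of the fixed-order distribution described above, the sketch below maps a logical block number to a disk and an offset on that disk. The names StripeLocation and locate_block are hypothetical, and the sketch assumes a plain round-robin layout with a stripe unit of one block; real arrays use configurable stripe sizes.

```cpp
#include <cstddef>

// Hypothetical helper type: where a logical block lives in the array.
struct StripeLocation {
    std::size_t disk;    // index of the disk holding the block
    std::size_t offset;  // block offset on that disk
};

// Round-robin striping: consecutive logical blocks go to consecutive disks.
// Assumes num_disks > 0 and a stripe unit of exactly one block.
constexpr StripeLocation locate_block(std::size_t logical_block,
                                      std::size_t num_disks) {
    return StripeLocation{logical_block % num_disks,
                          logical_block / num_disks};
}
```

With four disks, logical blocks 0 to 3 land on disks 0 to 3 at offset 0, and block 5 lands on disk 1 at offset 1, so neighbouring blocks can be read or written in parallel.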

 

Level 1: This level is also known as disk mirroring because it uses a disk file system called a mirror set. This level provides fault tolerance. It keeps a redundant, identical copy of a selected disk: all data written to the primary disk is also written to the mirror disk. It generally improves read performance but may degrade write performance.

 

Level 2: RAID level 2 uses an error correction algorithm combined with a disk striping strategy that breaks a file into bits and spreads them across multiple disks. The error correction method requires several disks. RAID level 2 is more advanced than level 0 because it provides fault tolerance, but it is not as efficient as other RAID levels and is hence not generally used.

 

Level 3: It is similar to level 2 in that it stripes data in very small units, but it requires only one disk for parity data. This level suffers from a write bottleneck, because all parity data is written to a single drive, but it still provides some read and write performance improvement.

 

Level 4: It is similar to level 3, because it also requires only one disk for parity data, but it stripes the data in much larger blocks or segments. This level is not as efficient as level 5 because (as in level 3) all parity data is written to a single drive, so it also suffers from a write bottleneck.

 

Level 5: Known as striping with parity. It is the most popular level and is similar to level 4 in that it stripes the data in large blocks across all the disks in the array. It differs in that it writes the parity across all the disks. Data redundancy is provided by the parity information. Data and parity information are arranged on the disk array so that the two are always on different disks. It can also offer better performance than level 1 while still providing fault tolerance.
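
To make the role of the parity information concrete, here is a minimal sketch that assumes XOR parity, the scheme commonly used for this level. The alias Block and the function xor_blocks are hypothetical names chosen for illustration, not part of any RAID implementation.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

using Block = std::vector<std::uint8_t>;

// XOR all blocks together byte by byte. Applied to the data blocks of a
// stripe this yields the parity block; applied to the surviving blocks plus
// the parity block it yields the contents of the one missing block.
// Assumes every block has the same length.
Block xor_blocks(const std::vector<Block>& blocks) {
    Block result(blocks.empty() ? 0 : blocks.front().size(), 0);
    for (const Block& block : blocks) {
        for (std::size_t i = 0; i < result.size(); ++i) {
            result[i] ^= block[i];
        }
    }
    return result;
}
```

If a stripe holds data blocks d0, d1 and d2 with parity p = xor_blocks({d0, d1, d2}), then after losing d1 the call xor_blocks({d0, d2, p}) reproduces it, which is why a single disk failure can be tolerated.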

 

 

 

Keywords: Disk stripping, mirroring, RAID, RAID levels, Redundant disks, storage, stripping with parity

Posted by IT Virtualization - Anshul Malik | 0 comment(s)


June 05, 2008

Generally speaking, storage virtualization refers to providing a logical, abstracted view of physical storage devices. It provides a way for many users or applications to access storage without being concerned with where or how that storage is physically located or managed. It enables the physical storage in an environment to be shared across multiple application servers, and physical devices behind the virtualization layer to be viewed and managed as if they were one large storage pool with no physical boundaries.

Virtualizing storage networks enables two key additional capabilities:

  • The ability to mask or hide volumes from servers that are not authorized to access those volumes, providing an additional level of security.
  • The ability to change and grow volumes on the fly to meet the needs of individual servers.

Essentially, anything other than a locally attached disk drive might be viewed in this light. Typically, storage virtualization applies to larger SAN (storage area network) arrays, but it is just as accurately applied to the logical partitioning of a local desktop hard drive, redundant array of independent disks (RAID), volume management, virtual memory, file systems and virtual tape. A very simple example is folder redirection in Windows, which lets the information in a folder be stored on any network-accessible drive. Much more powerful (and more complex) approaches include SANs. Large enterprises have long benefited from SAN technologies, in which storage is uncoupled from servers and attached directly to the network. By sharing storage on the network, SANs enable highly scalable and flexible storage resource allocation, high efficiency backup solutions, and better storage utilization.

Keywords: networks, storage, Virtualization

Posted by IT Virtualization - roger | 0 comment(s)


May 27, 2008

Classful IP addressing defines 5 classes: A, B, C, D and E.

 

              byte 1          byte 2          byte 3          byte 4
         |<------------->|<------------->|<------------->|<------------->|
Class A  |0    Net ID    |                    Host ID                    |
Class B  |10          Net ID             |            Host ID            |
Class C  |110                 Net ID                     |    Host ID    |
Class D  |1110                    Multicast address                      |
Class E  |1111                  Reserved for future use                  |

 

Class A addresses are numerically the lowest; they use 1 byte to identify the class type and the net ID.

The number of net IDs possible in class A is 2^7, and the number of hosts possible is 2^24.

Similarly, the number of net IDs possible in class B is 2^14, and the number of hosts possible is 2^16.

The number of net IDs possible in class C is 2^21, and the number of hosts possible is 2^8.

We can represent these addresses as a string of 1s and 0s, like:

10000000 00001011 00000011 00011111

or we can use dotted decimal notation to represent the same address in decimal numbers, with each byte separated by a '.', i.e.:

128.11.3.31

Class ranges of Internet addresses

           From          To
Class A    0.0.0.0       127.255.255.255
Class B    128.0.0.0     191.255.255.255
Class C    192.0.0.0     223.255.255.255
Class D    224.0.0.0     239.255.255.255
Class E    240.0.0.0     255.255.255.255
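
The leading-bit patterns and the dotted decimal notation described above can be sketched in a few lines of C++. The function names address_class and to_dotted_decimal are hypothetical and exist only for this illustration.

```cpp
#include <cstdint>
#include <string>

// Classify an IPv4 address by the leading bits of its first byte,
// following the classful scheme described above.
char address_class(std::uint8_t first_byte) {
    if ((first_byte & 0x80) == 0x00) return 'A';  // 0xxxxxxx
    if ((first_byte & 0xC0) == 0x80) return 'B';  // 10xxxxxx
    if ((first_byte & 0xE0) == 0xC0) return 'C';  // 110xxxxx
    if ((first_byte & 0xF0) == 0xE0) return 'D';  // 1110xxxx
    return 'E';                                   // 1111xxxx
}

// Render a 32-bit address in dotted decimal notation, most significant
// byte first.
std::string to_dotted_decimal(std::uint32_t address) {
    return std::to_string((address >> 24) & 0xFF) + "." +
           std::to_string((address >> 16) & 0xFF) + "." +
           std::to_string((address >> 8) & 0xFF) + "." +
           std::to_string(address & 0xFF);
}
```

For the example address, to_dotted_decimal(0x800B031F) gives "128.11.3.31", and address_class(128) reports class B because the first byte starts with the bits 10.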

Keywords: class full addresses, class full addressing, classfull addresses, classfull addressing, ip addressing

Posted by Networking basics - Anshul Malik | 0 comment(s)


May 25, 2008

Processor Overview

The Opteron is AMD's x86 server processor line, and was the first processor to implement the AMD64 instruction set architecture (known generically as x86-64). It was released on April 22, 2003 with the SledgeHammer core (K8) and was intended to compete in the server market, particularly in the same segment as the Intel Xeon processor. Processors based on the AMD K10 microarchitecture (codenamed Barcelona) were announced on September 10, 2007 featuring a new quad-core configuration.

 

  • AMD Opteron
  • AMD Opteron 64
  • AMD Athlon 64 FX
  • AMD Turion
  • AMD Sempron




 

 

Keywords: AMD processor

Posted by Computer Architecture - nick | 0 comment(s)


May 23, 2008

Today I am going to give an overview of the AMD64 architecture. The session will be split across a few blog entries over a couple of days. I will cover the following topics:

+ Processor Overview

I. AMD64 Architecture Overview

   + Operating Modes

   + Register Set

   + Segmentation

   + Task Management

   + Interrupts and Exceptions

   + Demand Mode Paging

   + Instruction Set Extensions

   + x86 Virtualization Overview 

 II. CPU Microarchitecture

   + Processor Core Introduction and Terminology

   + Integer Pipeline

   + FPU Pipeline

   + Load/Store Unit

   + Caches

III. System Architecture

   + System Chip Components

   + Hypertransport

   + Configuration Space

   + PC Memory Technologies

   + System Power Up Process

   + APIC

   + Debug Registers

   + Performance Monitoring Registers

   + Machine Check Architecture

   + Power Management

 

Keep watching this blog for further explanations and details of the CPU.

 

 

Keywords: architecture, CPU, Debug, Memory, power management, Segmentation, Virtualization

Posted by Computer Architecture - nick | 0 comment(s)


May 17, 2008

Embedded software often runs on processors with limited computation power, thus optimizing the code becomes a necessity. In this article we will explore the following optimization techniques for C and C++ code developed for Real-time and Embedded Systems.
  1. Adjust structure sizes to power of two
  2. Place case labels in narrow range
  3. Place frequent case labels first
  4. Break big switch statements into nested switches
  5. Minimize local variables
  6. Declare local variables in the inner most scope
  7. Reduce the number of parameters
  8. Use references for parameter passing and return value for types bigger than 4 bytes
  9. Don’t define a return value if not used
  10. Consider locality of reference for code and data
  11. Prefer int over char and short
  12. Define lightweight constructors
  13. Prefer initialization over assignment
  14. Use constructor initialization lists
  15. Do not declare “just in case” virtual functions
  16. In-line 1 to 3 line functions

Adjust structure sizes to power of two

When arrays of structures are involved, the compiler performs a multiply by the structure size to perform the array indexing. If the structure size is a power of 2, an expensive multiply operation will be replaced by an inexpensive shift operation. Thus keeping structure sizes aligned to a power of 2 will improve performance in array indexing.
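
As a small sketch of this idea, the hypothetical Sample structure below carries 12 bytes of payload and is padded explicitly to 16 bytes; a compile-time check guards against later changes that would break the power-of-two size. Whether the compiler really emits a shift instead of a multiply depends on the compiler and target.

```cpp
#include <cstddef>
#include <cstdint>

// True when n is a non-zero power of two.
constexpr bool is_power_of_two(std::size_t n) {
    return n != 0 && (n & (n - 1)) == 0;
}

// Hypothetical record: 12 bytes of payload, padded explicitly to 16 bytes
// so that indexing an array of Sample can scale the index with a shift.
struct Sample {
    std::uint32_t id;
    std::uint32_t timestamp;
    std::uint32_t value;
    std::uint32_t reserved;  // explicit padding up to a power of two
};

// Fails to compile if a later change breaks the power-of-two size.
static_assert(is_power_of_two(sizeof(Sample)),
              "Sample size should stay a power of two");
```

The padding costs 4 bytes per element, so this is a trade of memory for indexing speed that is worth measuring on the actual target.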

Place case labels in narrow range

If the case labels are in a narrow range, the compiler does not generate an if-else-if cascade for the switch statement. Instead, it generates a jump table of case labels and manipulates the value of the switch to index the table. The generated code is faster than the if-else-if cascade that is produced when the case labels are far apart. Also, the performance of a jump table based switch statement is independent of the number of case entries in the switch statement.
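
A minimal sketch of a switch whose labels sit in a narrow range follows. The enumeration MessageType and the function message_priority are hypothetical, and whether a jump table is actually generated remains the compiler's decision.

```cpp
// Hypothetical message types numbered consecutively, so the case labels
// fall in a narrow range (0 to 3).
enum MessageType {
    MSG_SETUP = 0,
    MSG_CONNECT = 1,
    MSG_DATA = 2,
    MSG_RELEASE = 3
};

// Returns a hypothetical handling priority for each message type.
// With dense labels like these, a compiler can index a jump table
// instead of comparing against each label in turn.
int message_priority(int type) {
    switch (type) {
    case MSG_SETUP:   return 10;
    case MSG_CONNECT: return 20;
    case MSG_DATA:    return 30;
    case MSG_RELEASE: return 40;
    default:          return -1;  // unknown message type
    }
}
```

Had the labels been values such as 1, 500 and 70000, a table indexed by the switch value would be mostly empty, which is why compilers tend to fall back to comparisons in that case.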

Place frequent case labels first

If the case labels are placed far apart, the compiler will generate if-else-if cascaded code that compares against each case label in turn and jumps to the matching leg's action. By placing the frequent case labels first, you reduce the number of comparisons performed for frequently occurring scenarios. Typically this means that cases corresponding to the success of an operation should be placed before the failure-handling cases.

Break big switch statements into nested switches

The previous technique does not work for some compilers as they do not generate the cascade of if-else-if in the order specified in the switch statement. In such cases nested switch statements can be used to get the same effect.
To reduce the number of comparisons being performed, judiciously break big switch statements into nested switches. Put the frequently occurring case labels into one switch and keep the rest of the case labels in another switch, which forms the default leg of the first switch.
Splitting a Switch Statement

// This switch statement performs a switch on frequent messages
// and handles the infrequent messages with another switch
// statement in the default leg of the outer switch statement.
pMsg = ReceiveMessage();
switch (pMsg->type)
{
case FREQUENT_MSG1:
    handleFrequentMsg1();
    break;
case FREQUENT_MSG2:
    handleFrequentMsg2();
    break;
. . .
case FREQUENT_MSGn:
    handleFrequentMsgn();
    break;
default:
    // Nested switch statement for handling infrequent messages.
    switch (pMsg->type)
    {
    case INFREQUENT_MSG1:
        handleInfrequentMsg1();
        break;
    case INFREQUENT_MSG2:
        handleInfrequentMsg2();
        break;
    . . .
    case INFREQUENT_MSGm:
        handleInfrequentMsgm();
        break;
    }
}

Minimize local variables

If a function has only a few local variables, the compiler will be able to fit them into registers and so avoid frame pointer operations on local variables kept on the stack. This can result in considerable improvement for two reasons:

  • All local variables are in registers, which is faster than accessing them from memory.
  • If no local variables need to be saved on the stack, the compiler will not incur the overhead of setting up and restoring the frame pointer.

Declare local variables in the inner most scope

Do not declare all the local variables in the outermost function scope. You will get better performance if local variables are declared in the innermost scope. Consider the example below; here object a is needed only in the error case, so it should be declared only inside the error check. If it were declared in the outermost scope, every call to the function would incur the overhead of creating a (i.e. invoking the default constructor for a).
Local variable scope

int foo(char *pName)
{
    if (pName == NULL)
    {
        A a;
        ...
        return ERROR;
    }
    ...
    return SUCCESS;
}

Reduce the number of parameters

Function calls with a large number of parameters may be expensive, because every parameter is pushed onto the stack on each call. For the same reason, avoid passing complete structures as parameters; use pointers or references in such cases.

Use references for parameter passing and return value for types bigger than 4 bytes

Passing parameters by value results in the complete parameter being copied on to the stack. This is fine for regular types like integer, pointer etc. These types are generally restricted to four bytes. When passing bigger types, the cost of copying the object on the stack can be prohibitive. In case of classes there will be an additional overhead of invoking the constructor for the temporary copy that is created on the stack. When the function exits the destructor will also be invoked.

Thus it is efficient to pass references as parameters. This way you save the overhead of creating, copying and destroying a temporary object. This optimization can be performed easily, without a major impact on the code, by replacing pass by value parameters with const references. (It is important to pass const references so that a bug in the called function does not change the actual value of the parameter.)

Passing bigger objects as return values also has the same performance issues. A temporary return object is created in this case too.
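
The copy that pass by value incurs can be made visible with a small sketch. Packet is a hypothetical type whose copy constructor counts how often it runs; the two functions differ only in how they take their parameter.

```cpp
#include <cstddef>
#include <cstring>

// Hypothetical 256-byte type that counts how often it is copied.
struct Packet {
    static int copies;
    unsigned char payload[256];

    Packet() : payload() {}
    Packet(const Packet& other) {
        std::memcpy(payload, other.payload, sizeof(payload));
        ++copies;
    }
};
int Packet::copies = 0;

// Pass by value: the argument is copied into the parameter on every call.
std::size_t payload_size_by_value(Packet p) {
    return sizeof(p.payload);
}

// Pass by const reference: no copy is made and the callee cannot modify
// the caller's object.
std::size_t payload_size_by_reference(const Packet& p) {
    return sizeof(p.payload);
}
```

Calling payload_size_by_value with an existing Packet increments Packet::copies once per call, while payload_size_by_reference leaves the counter untouched.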

Don’t define a return value if not used

The called function does not “know” if the return value is being used. So, it will always pass the return value. This return value passing may be avoided by not defining a return value which is not being used.

Consider locality of reference for code and data

The processor keeps recently referenced data or code in cache so that the next reference can be served from cache. These cache references are faster. Hence it is recommended that code and data that are used together should actually be placed together physically. C++ encourages this through its class model: an object's data is kept in one place, and so is the class's code. When coding in C, the declaration order of related data and functions can be arranged so that closely coupled code and data are declared together.

Prefer int over char and short

With C and C++, prefer int over char and short. The main reason is that C and C++ perform arithmetic operations and parameter passing at integer level. If you have an integer value that can fit in a byte, you should still consider using an int to hold the number. If you use a char, the compiler will first convert the values into integers, perform the operations and then convert the result back to char.

 

Let's consider the following code, which presents two functions that perform the same operation with char and int.
Comparing char and int operations

char sum_char(char a, char b)
{
    char c;
    c = a + b;
    return c;
}

int sum_int(int a, int b)
{
    int c;
    c = a + b;
    return c;
}

A call to sum_char involves the following operations:

  1. Convert the second parameter into an int by sign extension (parameters are typically pushed in reverse order)
  2. Push the sign extended parameter on the stack as b.
  3. Convert the first parameter into an int by sign extension.
  4. Push the sign extended parameter on to the stack as a.
  5. The called function adds a and b
  6. The result is cast to a char.
  7. The result is stored in char c.
  8. c is again sign extended
  9. Sign extended c is copied into the return value register and function returns to caller.
  10. The caller now converts again from int to char.
  11. The result is stored.

A call to sum_int involves the following operations:

  1. Push int b on stack
  2. Push int a on stack
  3. Called function adds a and b
  4. Result is stored in int c
  5. c is copied into the return value register and function returns to caller.
  6. The caller stores the returned value.

Thus we can conclude that int should be used for all integer variables unless storage requirements force us to use a char or short. When char and short have to be used, consider the impact of byte alignment and ordering to see if you would really save space. (Many compilers pad structure elements to alignment boundaries, which can cancel out the saving.)
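
The promotion to int described above can be checked at compile time. The sketch below restates sum_char and sum_int with the conversion written out explicitly; it assumes a typical platform where int is wider than char and short, and the number of extra instructions actually generated depends on the compiler and target.

```cpp
#include <type_traits>

// The operands of + are promoted to int before the addition, so the
// result of adding two chars (or two shorts) already has type int.
static_assert(std::is_same<decltype(char(1) + char(2)), int>::value,
              "char + char yields int");
static_assert(std::is_same<decltype(short(1) + short(2)), int>::value,
              "short + short yields int");

// The narrowing back to char is an extra step, made explicit here.
char sum_char(char a, char b) {
    return static_cast<char>(a + b);
}

// With int operands no promotion or narrowing is needed.
int sum_int(int a, int b) {
    return a + b;
}
```

Both functions return 5 for the arguments 2 and 3; the difference lies only in the conversions around the addition.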

Define lightweight constructors

As far as possible, keep the constructor lightweight. The constructor is invoked for every object creation. Keep in mind that the compiler often creates temporary objects over and above the explicit object creations in your program. Thus optimizing the constructor might give you a big boost in performance. If you have an array of objects, the default constructor for the object should be optimized first, as it gets invoked for every object in the array.

Prefer initialization over assignment

Consider the following example of a complex number:

Initialization and assignment

void foo()
{
    Complex c;
    c = (Complex)5;
}

void foo_optimized()
{
    Complex c = 5;
}

In the function foo, the complex number c is first default-constructed and then overwritten by the assignment. In foo_optimized, c is initialized directly to the final value, thus saving a call to the default constructor of Complex.

Use constructor initialization lists

Use constructor initialization lists to initialize the embedded variables to their final values. Assignments within the constructor body result in lower performance, because the default constructor for the embedded objects has already been invoked by then. Using a constructor initialization list invokes the right constructor directly, thus saving the overhead of the default constructor invocation.
In the example given below, the optimized version of the Employee constructor saves the default constructor calls for the m_name and m_designation strings.
Constructor initialization lists

Employee::Employee(String name, String designation)
{
    m_name = name;
    m_designation = designation;
}
/* === Optimized Version === */
Employee::Employee(String name, String designation)
    : m_name(name), m_designation(designation)
{
}
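
The saved default constructor call can be observed with a self-contained sketch. Tracked is a hypothetical stand-in for a member such as a string; it counts default constructions so that the two constructor styles can be compared.

```cpp
// Hypothetical member type that counts its default constructions.
struct Tracked {
    static int default_constructions;
    int value;

    Tracked() : value(0) { ++default_constructions; }
    explicit Tracked(int v) : value(v) {}
};
int Tracked::default_constructions = 0;

// Assignment in the body: member is default-constructed first and then
// overwritten.
struct AssignInBody {
    Tracked member;
    explicit AssignInBody(int v) { member = Tracked(v); }
};

// Initialization list: member is constructed directly with its final
// value, so the default constructor never runs.
struct UseInitList {
    Tracked member;
    explicit UseInitList(int v) : member(v) {}
};
```

Constructing an AssignInBody raises Tracked::default_constructions by one, while constructing a UseInitList leaves it unchanged; both end up with the same member value.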

Do not declare “just in case” virtual functions

Virtual function calls are more expensive than regular function calls, so do not make functions virtual “just in case” somebody needs to override the default behavior. If the need arises, the developer can simply edit the base class header file to change the declaration to virtual.

In-line 1 to 3 line functions

Converting small functions (1 to 3 lines) to in-line can noticeably improve throughput. In-lining removes the overhead of a function call and the associated parameter passing. But using this technique for bigger functions can have a negative impact on performance due to the associated code bloat. Also keep in mind that making a method inline should not increase dependencies by requiring an explicit header file inclusion when you could have managed with just a forward reference in the non-inline version.

 

Keywords: c, C++, constructor, embedded, optimization, pointer, real time, variable, virtual functions

Posted by nick | 0 comment(s)


January 05, 2008

Keywords: education, elearning, future

Posted by nick | 0 comment(s)


January 04, 2008

Storage Components

Memory:
– Volatile: retains data only while it is powered
– Non-volatile: retains data even when it is not powered
– Primary: holds programs while they are running
– Secondary: stores data and programs

DRAM vs. magnetic disks
– Access time for DRAM: 40-80 nanoseconds
– Access time for disks: 5-15 milliseconds
– However, disk storage costs about 100 times less per megabyte

Keywords: DRAM, Memory

Posted by Computer Architecture - nick | 0 comment(s)