Abstract: Microcontrollers are the small scale embedded devices that serves in most WSN in processing and communicating the data acquired by sensors. The AVR family encloses classy devices under 8 bit microcontroller category. Light weight crypto algorithms are designed to incorporate rational security in embedded devices. This study aspires in improvising the performance of 64 bit light weight block cipher HIGHT through software optimizations with AVRGCC compiler of Atmel Studio open source tool under windows platform. A modification on HIGHT algorithm is also proposed in this study to enhance its security level by additional manipulation on plain text prior to processing by the original algorithm. The coding is tailored towards decline in cost (memory requirement) and lift in performance (execution cycles, throughput). The obtained results of original HIGHT implementation are compared against a similar study. The modified HIGHT algorithm suggested in this study brings out augmented security with trifling drop off in unique attributes of embedded devices, the cost and performance.
INTRODUCTION
Exchange of information between a sender and a receiver is no longer secured when communicated in an open medium. The possible intruders in the communication channel always works to diminish the confidentiality of the private data (Liu et al., 2011; Rabah, 2005a, b). Making the data invisible from the eyes of the intruders, are suggested by experts with diverse schemes of steganography (Hmood et al., 2010; Amirtharajan et al., 2012; Amirtharajan and Rayappan, 2013; Amirtharajan et al., 2013a-j; Praveenkumar et al., 2014a-l). Cryptographic techniques (Al-Somani et al., 2006, 2009; Jaberi et al., 2012; Salem et al., 2011) have been suggested to make the visible data in a shared channel secured from unintended receiver.
Crypto algorithms (Schneier, 2007) do substitution, transpositioning and transformation mechanisms involving a key in order to formulate the secret data in an unreadable form from the perspective of inadvertent users. The kind of key used divides it into two categories namely symmetric (private) (Rabah, 2005c; Abomhara et al., 2010; Muda et al., 2010) and asymmetric (public) (Mousa, 2005; Rabah, 2006; Wang et al., 2007) algorithms. Symmetric key algorithms that takes a sequence of bits as input and encrypts the plain text bit by bit is called stream ciphers whereas, when it takes a block of bytes as input and encrypts a block of data at a time is known as block ciphers (Cakiroglu, 2010). These block ciphers are used in many application areas like banking, etc. The key size and number of rounds are the chief factor deciding the security level of crypto algorithms (Zaidan et al., 2010).
Although, several crypto algorithms exists, the special kind Light Weight Cryptography (LWC) (Eisenbarth et al., 2007, 2012; Sadanandan and Mahalingam, 2009) suits well for the resource constrained devices (Rinne et al., 2007) serving the purpose of computational engine in WSNs. LWC is aimed at reduction in various factors based on the kind of target used for implementation. For fully customized hardware implementation on ASIC, the requirement on number of logic gates must be minimized. This sort of implementation provides greatest level of power and size reduction at the cost of rise in Non-Recurring Engineering (NRE) cost.
The arrival of Field Programmable Gate Arrays (FPGAs) (Rajagopalan et al., 2012; Yalla and Kaps, 2009) brings down the NRE cost with minimized power consumption and development time. Further cutback in cost can be achieved with semi custom devices like microcontrollers (Standaert et al., 2006). These devices come with two different architectures namely Complex Instruction Set Computers (CISC) and Reduced Instruction Set Computers (RISC). Both architectures contain less amount of memory storage and can be operated under optimal frequency ranges. Among them, RISC architectures are popular for its low power operation with high performance and throughput measured by Million Instructions Per Second (MIPS).
Atmega series microcontrollers from AVR family developed by ATMEL are the popular one widely used in Wireless Sensor Nodes (WSNs). WSNs enclose sensor devices and communication module integrated with computing facility provided by microcontrollers. These WSNs are deployed in several discrete points for data collection and they tend to communicate their data to a central node. The security, during this data communication, can be accomplished by cryptography algorithms. HIGHT, PRESENT and Humming Bird are some of the examples for light weight algorithms which are suitable for implementation with microcontrollers (Janakiraman et al., 2012). ATmega8 is a popular 8 bit RSIC architecture that contains on-chip data and program storage of 1 and 8 kB, respectively. It is capable of running at a maximum frequency of 16MHz at which it can provide a throughput upto 16 MIPS. Its on-chip peripheral modules include Analog to Digital Converter (ADC) with 10 bit resolution and serial communication devices (Twin Wire Interface (TWI), Serial Peripheral Interface (SPI) and Universal Synchronous Asynchronous Receiver Transmitter (USART)) that satisfy the requirement for WSN connectivity. The HIGHT (Hong et al., 2006) is a block cipher algorithm that takes 64 bit plaintext and 128 bit symmetric key as input and generates 64 bit cipher text as output. As the algorithm uses only 8 bit XOR, bit wise rotate and addition based on modulo 28 operations, the 8 bit microcontroller is the finest target for implementation.
This study provides a simple yet an ingenious way to encrypt information in under 8 bit microcontroller.
MATERIALS AND METHODS
The performance of various LWC algorithms on different microcontroller platforms and its implementation efficiency based on execution cycles and memory foot print have been analyzed by many researchers. Janakiraman et al. (2012) showed a method to randomize the sequence of sub keys selection with a 4 bit Linear Feedback Shift Register (LFSR) (Sundararaman and Upadhyay, 2011) used in the existing Humming bird algorithm (Janakiraman et al., 2014) to raise its security level. Humming bird is a 16 bit block cipher that uses a 256 bit symmetric key and a Substitution Box (S-box) to encrypt the data with 16 rounds. LFSR can be used to generate pseudo random numbers (Rajagopalan et al., 2014) based on the number of stages used.
The non-existence of S-box in HIGHT algorithm minimizes the claim on memory size of the microcontrollers. In this study, the HIGHT algorithm is implemented on AVR ATmega8 RISC microcontroller. The algorithm consists of 32 round functions with first round preceded by an initial transformation and last round succeeded by a final transformation. The 128 bit original key is used to produce eight; byte sized whitening keys in order to provide four different keys to be used by initial and final transformations each. Subsequently, four 8 bit sub keys are generated to satisfy the unique key requirement for each of the 32 rounds. All the keys, plain text and cipher text are stored in the internal SRAM of ATmega8 microcontroller.
The overall encryption process is shown in Fig. 1. The decryption process of HIGHT algorithm generates the whitening keys and sub keys using the procedure similar to the one used for encryption. The three transformation process namely initial transformation, round functions and final transformation uses the reverse procedure with same number of rounds as in encryption to obtain the original plain text as output.
Fig. 1: | Flow Chart of HIGHT algorithm |
METHODOLOGY
Key shuffle and plaintext pre processing: To improve the security level of the HIGHT algorithm, a 3 bit LFSR as shown in Fig. 2 is newly introduced in it. An n bit LFSR always produces results for 2n-1 stages. This 3 bit LFSR circuit generates a seven stage sequence in a pseudo random fashion based on its initial state (seed value) as mentioned in the Table 1.
The LFSR output is used by two new functions that are introduced to make intended modifications in the actual algorithm flow to provide additional level of security. The original HIGHT algorithm uses the first four out of the eight whitening keys (WK0-7) for the initial transformation and the last four for the final transformation. The modified HIGHT proposed in this study uses a function that shuffles the order of eight whitening keys as per the 7 stage sequence output of the 3 bit LFSR. A sample key shuffling process is shown as follows:
Fig. 2: | The 3 bit LFSR with EX-OR feedback |
Table 1: | Output sequence of 3 bit LFSR |
Table 2: | Plain text pre-process |
A simple manipulation on the plain text information is carried out in the modified HIGHT algorithm before the start of initial transformation. As this module processes the plain text prior to the beginning of plain text processing by the actual algorithm, the job of this module is termed as plain text pre-processing. Here, either one byte value modification or position swapping of two bytes of plaintext may occur with a probability of 0.5 based on the last stage value of 3 bit LFSR. On the remaining probability of 0.5, any change may not occur in plaintext. The plain text pre process operation is described in Table 2.
The pseudo code for the encryption of modified HIGHT algorithm is given as follows.
Pseudo code for modified HIGHT encryption |
The operations performed by the newly added functions whitening Key Shuffle and Plain Text Pre Process, the point of the call for these added functions along with their preceding and succeeding functions are mentioned in the pseudo code for encryption.
The coding for the HIGHT encryption/decryption with and without the proposed added functions is written in embedded C language for AVR ATmega8 microcontroller. The coding was simulated using the open source tool Atmel Studio 6 which has the integrated compilers AVR GCC and ARM GCC for AVR and ARM family of devices.
RESULTS AND DISCUSSION
All the results in terms of code size and performance analysis in terms of execution time are obtained from the debug options provided in the Atmel Studio. The raise in security level of the modified HIGHT algorithm in comparison with the original one is obvious due to the additional randomization done on whitening key sequence and plain text manipulation done prior to the initial round. In order to find the effectiveness of this modified algorithm implementation against the original one, the parameters code size, execution cycles and throughput are analyzed in this session. The optimization features of the AVR GCC compiler are used to find the impact of optimization on code size and execution time. The O3 level of optimization which can produce the fastest code at the cost of demanding more number of bytes in FLASH memory is analyzed and the values for encryption/decryption of HIGHT and Modified HIGHT are tabulated in Table 3. On the other side, Os level of optimization which is mainly to optimize for code size that also gives good optimization in all aspects. Table 4 gives this code optimization results.
The overhead in terms of FLASH memory bytes on the encryption and decryption process of the modified HIGHT algorithm when compared with original HIGHT algorithm is shown in Fig. 3 and 4. Although, around 25-30% of code rise is reported in modified HIGHT, the overall code memory occupied by these algorithms on the existing 8KB FLASH of ATmega8 is only less than 40% even in time optimized version that produced larger code.
Table 3: | HIGHT vs. Modified HIGHT-time optimized |
Table 4: | HIGHT vs. modified HIGHT-code optimized |
Fig. 3: | Memory overhead in modified HIGHT-time optimized |
Fig. 4: | Memory overhead in modified HIGHT-code optimized |
In contrast to the overhead raised in code size of the modified algorithm, Fig. 5 and 6 shows the fee paid in terms of timing overhead is only negligible for the raise in security with added functions of the modified algorithm.
To justify the efficiency of our C code and the advanced AVR GCC compiler of Atmel studio 6, the results of the original code implemented in this study are compared with the earlier results reported by Eisenbarth et al. (2012). Table 5 and 6 gives the performance comparison between the results by Eisenbarth et al. (2012) and the results obtained in this study for encryption and decryption algorithms respectively. In the Table 5 and 6 the highlighted rows of column one gives the results of algorithms implemented in this study. The time optimized execution cycle counts given in Table 3 are duplicated in column 3 of Table 5 and 6. The fourth column of the Table 5 and 6 gives the number cycles taken by the implementations to encrypt a bit of plaintext including the time for initialization and key generations.
Fig. 5: | Timing overhead in modified HIGHT-time optimized |
Fig. 6: | Timing overhead in modified HIGHT-code optimized |
Table 5: | Performance comparison-encryption |
Table 6: | Performance comparison-decryption |
Based on the execution cycles for HIGHT encryption and decryption reported by Eisenbarth et al. (2012), the throughput values are calculated assuming 16 MHz operating frequency and indicated in the last column of the Table 5 and 6. Throughput of our implementation is raised considerably in comparison with Eisenbarth et al. (2012). It is also proven from Fig. 7 and 8, the inclusion of new functions in the modified HIGHT algorithm does not make considerable dip in the throughput.
The sample values taken as 64 bit Plaintext and 128 bit symmetric key; the corresponding Ciphertext produced by both HIGHT and Modified HIGHT algorithm in ASCII and Hexadecimal forms are shown as follows.
Fig. 7: | Throughput comparison-encryption |
Fig. 8: | Throughput comparison-decryption |
Sample input | |
Plain text: | HIGHTenc [0x48 0x49 0x47 0x48 0x54 0x65 0x6E 0x63] |
Key: | KE4MODIFIEDhight [0x4B 0x45 0x34 0x4D 0x4F 0x44 0x49 0x46 0x49 0x45 0x44 0x68 0x69 0x67 0x68 0x74] |
Sample output-HIGHT | |
Cipher text: |
..OIí:Íâf [0xA8 0x4F 0x49 0xED 0x3A 0xCD 0xE2 0x83] |
Sample output-modified HIGHT |
|
Pre processed plain text: | cIGHTenH [0x63 0x49 0x47 0x48 0x54 0x65 0x6E 0x48] |
Cipher text: (Modified HIGHT) |
Œ..Bù = Ò [0x04 0x8C 0x0B 0xBC 0x42 0xF9 0x3D 0xD2] |
CONCLUSION
The above comparison charts proves that the embedded C code for HIGHT algorithm optimized with AVR GCC compiler of Atmel Studio 6 for ATmega8 can perform better than the prior implementation reported in Eisenbarth et al. (2012). This improvement in terms of execution cycles and throughput are achieved through time optimization resulted at the cost of memory size. In addition, the enhancement in security level through the additional 3 bit LFSR and other additional functionalities proposed in the modified HIGHT algorithm of this study produces performance metrics that are competing with the original algorithm.
ACKNOWLEDGMENT
The authors wish to acknowledge SASTRA University for providing infrastructural support to carry out this study.