Tuesday, June 27, 2006

Mind your struct's!

This should be elementary to seasoned C programmers, but one rule of thumb is that the fields of a struct should be arranged in descending order of type size. The compiler pads each field so that it starts at an address aligned for its type, so mixing small and large fields leaves holes. Ordering from largest to smallest typically leads to a more compact struct, which can be crucial on an embedded system with limited memory. To illustrate, struct_a below is similar to a struct I found in a library provided by a partner of my company. On the other hand, struct_b is the same struct but with its fields "properly ordered". The array arr_a also appears in similar fashion in the partner's library.

#include <stdio.h>

struct struct_a {
    short f1;
    char f2;
    int f3;
    char f4;
};

struct struct_b {
    int f3;
    short f1;
    char f2;
    char f4;
};

struct struct_a arr_a[32];
struct struct_b arr_b[32];

int main(void) {
    /* sizeof yields a size_t, so cast it to match the %d conversion */
    printf("sizeof(struct_a):%d\n", (int)sizeof(struct struct_a));
    printf("sizeof(struct_b):%d\n", (int)sizeof(struct struct_b));
    printf(" sizeof(arr_a):%d\n", (int)sizeof(arr_a));
    printf(" sizeof(arr_b):%d\n", (int)sizeof(arr_b));
    return 0;
}
Compiling the code above with the GNUARM toolchain and running it on the ARM simulator in GDB, we get this output:

sizeof(struct_a):12
sizeof(struct_b):8
sizeof(arr_a):384
sizeof(arr_b):256
Simply by reordering the fields, we save 4 bytes per struct and 128 bytes in the array above. Another example is given below. Again, struct_a is something found in the library and struct_b is an optimized version of the struct.
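A minimal sketch for checking where the padding goes in the first example, assuming a typical ABI where short is 2 bytes and int is 4 bytes, each aligned to its own size. The helper names padding_a, padding_b and hole_before_f3_in_a are made up for illustration; offsetof is the standard macro from stddef.h.

```c
#include <stddef.h>  /* offsetof, size_t */

struct struct_a {
    short f1;
    char  f2;
    int   f3;
    char  f4;
};

struct struct_b {
    int   f3;
    short f1;
    char  f2;
    char  f4;
};

/* Padding = total size minus the bytes the fields themselves need. */
size_t padding_a(void) {
    return sizeof(struct struct_a)
         - (sizeof(short) + sizeof(char) + sizeof(int) + sizeof(char));
}

size_t padding_b(void) {
    return sizeof(struct struct_b)
         - (sizeof(int) + sizeof(short) + sizeof(char) + sizeof(char));
}

/* Hole in front of the int in struct_a: f3's offset minus the end of f2. */
size_t hole_before_f3_in_a(void) {
    return offsetof(struct struct_a, f3)
         - (offsetof(struct struct_a, f2) + sizeof(char));
}
```

Under those assumptions, padding_a() should come out as 4 (one byte in front of f3 and three after f4 so that array elements stay aligned), while padding_b() should be 0.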

#include <stdio.h>

struct struct_a {
    short f1;
    int f2;
    char f3;
    char f4;
    short f5;
    char f6;
    short f7;
    short f8[3];
};

struct struct_b {
    int f2;
    short f8[3];
    short f1;
    short f7;
    short f5;
    char f3;
    char f4;
    char f6;
};

struct struct_a arr_a[32];
struct struct_b arr_b[32];

int main(void) {
    /* sizeof yields a size_t, so cast it to match the %d conversion */
    printf("sizeof(struct_a):%d\n", (int)sizeof(struct struct_a));
    printf("sizeof(struct_b):%d\n", (int)sizeof(struct struct_b));
    printf(" sizeof(arr_a):%d\n", (int)sizeof(arr_a));
    printf(" sizeof(arr_b):%d\n", (int)sizeof(arr_b));
    return 0;
}
The output of the program compiled with the GNUARM toolchain is given below.

sizeof(struct_a):24
sizeof(struct_b):20
sizeof(arr_a):768
sizeof(arr_b):640

Again, we save 4 bytes per struct and 128 bytes in the array. Note that struct_b above has a short array with 3 elements as a field. Why was it not made the first field of struct_b? After all, it is bigger than an int. It has something to do with int alignment on ARM. Try it out and see.
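One way to try it out is sketched below, again assuming 2-byte shorts and 4-byte ints aligned to their own size. struct_c is a hypothetical variant that is not in the partner's library: it is struct_b with the short array moved in front of the int. The helper gap_after_array_in_c is also made up for illustration.

```c
#include <stddef.h>  /* offsetof, size_t */

/* struct_b from the second example: int first, then the short array. */
struct struct_b {
    int   f2;
    short f8[3];
    short f1;
    short f7;
    short f5;
    char  f3;
    char  f4;
    char  f6;
};

/* Hypothetical variant: the 6-byte array placed ahead of the int. */
struct struct_c {
    short f8[3];
    int   f2;
    short f1;
    short f7;
    short f5;
    char  f3;
    char  f4;
    char  f6;
};

/* Bytes wasted between the end of f8 and the start of f2 in struct_c. */
size_t gap_after_array_in_c(void) {
    return offsetof(struct struct_c, f2) - sizeof(short[3]);
}
```

The array is 6 bytes long but only needs 2-byte alignment, whereas the int has to start on a 4-byte boundary. Putting the array first therefore leaves a 2-byte gap in front of f2, and with the tail padding struct_c should come out at 24 bytes, no better than the original struct_a.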

Friday, June 23, 2006

Using interrupts for super responsive tasks?

I read an interesting article at Embedded.com titled "A data-centric OS for MCUs using a real-time publisher-subscriber-mechanism: Part1". Typically, in an embedded system, you would use a preemptive OS which runs tasks based on task priorities on a predefined timeslice. This article, however, proposes a publish-and-subscribe task model using interrupts and interrupt priorities. So there is no OS in the traditional sense. Basically, each task is an interrupt handler.

A task can be a consumer or a producer. The producer task triggers an interrupt through software when an interesting event occurs. The consumer task subscribes to that event by registering an interrupt handler as the event callback function. The interrupt priorities decide which event handler is called first when several event handlers are eligible to run. Nested interrupts are also enabled.
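The model can be mimicked on a host PC with a rough sketch like the one below. This is not the article's code and it touches no real interrupt controller: a pending bitmask stands in for software-triggered interrupts, dispatch_one stands in for the controller picking a handler, and nesting (preemption) is not modeled at all. Every name here (subscribe, publish, dispatch_one, NUM_EVENTS, the two example handlers) is hypothetical, and so is the convention that a larger priority value is more urgent.

```c
#include <stddef.h>
#include <stdint.h>

#define NUM_EVENTS 8u   /* hypothetical number of event (interrupt) lines */

typedef void (*event_handler_t)(void);

struct subscription {
    event_handler_t handler;  /* consumer callback, NULL if nobody subscribed */
    uint8_t priority;         /* assumption: larger value = more urgent */
};

static struct subscription subscriptions[NUM_EVENTS];
static uint32_t pending_events;  /* bit n set = event n has been published */

/* Consumer side: register a handler for an event. Returns 0 on success. */
int subscribe(unsigned event, event_handler_t handler, uint8_t priority) {
    if (event >= NUM_EVENTS || handler == NULL)
        return -1;
    subscriptions[event].handler = handler;
    subscriptions[event].priority = priority;
    return 0;
}

/* Producer side: stands in for triggering the interrupt through software. */
void publish(unsigned event) {
    if (event < NUM_EVENTS)
        pending_events |= (uint32_t)1 << event;
}

/* Stands in for the interrupt controller: run the most urgent pending
   handler once. Returns the event number that ran, or -1 if none did. */
int dispatch_one(void) {
    int best = -1;
    for (unsigned e = 0; e < NUM_EVENTS; e++) {
        if (!(pending_events & ((uint32_t)1 << e)))
            continue;                      /* not published */
        if (subscriptions[e].handler == NULL)
            continue;                      /* no consumer registered */
        if (best < 0 || subscriptions[e].priority > subscriptions[best].priority)
            best = (int)e;
    }
    if (best < 0)
        return -1;
    pending_events &= ~((uint32_t)1 << best);  /* acknowledge the event */
    subscriptions[best].handler();
    return best;
}

/* Two example consumers that just count how often they ran. */
unsigned adc_ready_count;
unsigned button_count;
void on_adc_ready(void) { adc_ready_count++; }
void on_button(void)    { button_count++; }
```

If on_button subscribes to event 0 with priority 1 and on_adc_ready subscribes to event 1 with priority 5, publishing both and then calling dispatch_one twice runs on_adc_ready first and on_button second. On real hardware the interrupt controller would make that choice, and a more urgent handler could also preempt a running one.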

The interrupt controller we are using for our SoC supports up to 64 interrupt sources that can also be triggered programmatically. It also has 16 interrupt priorities, each with a programmable interrupt handler, to play with. It would be interesting to see how the technique suggested in the article could be mapped to it.

Tuesday, May 16, 2006

But it worked before...

I helped a colleague early last week. His PPP server running on VxWorks on our Carmen II board had been running fine a few weeks earlier. But when he started to work on it again a couple of weeks ago, it gave no response to PPP connection attempts from a Windows PC. My first instinct was to check the RS232 connection to the board, because the connection requires manual wiring to the board (the only DB-9 connector on the board has been allocated to the console). However, according to my colleague the connection worked, because his AT command processor running on the board received bytes from the PC. Moreover, the test program we had for testing the two serial ports seemed to transfer bytes just fine. So both of us went on to debug the PPP configuration. We went as far as taking our base BSP and including PPP in it. Still we could not get PPP to work. The following morning, I finally decided to look at the connection. Well, lo and behold, the connection was "half correct", which explains why the AT command processor could still receive bytes. I think you can guess what had happened and I will leave it at that.

Yesterday, I helped another colleague debug the board with the design she had downloaded into the FPGA. A "similar" design had worked before. By "working", we mean we could get our in-circuit emulator (ICE) to bring the ARM into background debug mode (BDM). She had been debugging her design for a number of days already. Initially, I thought the ICE used the DBGRQ and DBGACK signals to enter BDM. Apparently not. So there is no other way for the ICE to do that except through the JTAG serial protocol. With the help of a logic analyzer we watched the 5 JTAG signals with both the good and the bad designs on the board's FPGA. Immediately we saw that there was a problem with the nTRST signal. The signal was oscillating with the bad design whereas, with the good design, it started low and then went high, followed by some activity on the TMS, TDI and TDO signals. Looking at the board schematic, we could see that the nTRST signal from the JTAG connector goes first to the FPGA and then from there to the ARM. So there must have been a bad connection between these two points inside the FPGA. Sure enough, when she rechecked her UCF file for the design, the nTRST input into the FPGA was not connected to the reset controller module, which drives some other logic before going out to the ARM.

So the lesson is, I guess: before you go on debugging your design or code, check all of the manual connections you have made. Especially when it has worked before.

Monday, May 08, 2006

newlib printf through UART on GNU ARM toolchain

To make functions like printf and puts work on the GNUARM toolchain, the Angel SWI handler for the AngelSWI_Reason_Write command needs to be implemented. This command is indicated by the value 5 in register r0 when the Angel SWI call is made. Register r1 points to an array of three 32-bit values. The first is a file handle, which I ignore because I do not do any file operation other than writing to stdout. The second is a pointer to the array of chars to print. The third is the number of chars to print. Below is the code snippet for the SWI handler:
__swi_handler:
cmp r0,#5
bne .Ldone

/* skip file pointer and store char pointer in r0
and length into r1 */

ldr r0,[r1,#4]!
ldr r1,[r1,#4]

add r1,r1,r0 /* now r1 points to end of chars to print */

mov r2,#UARTA_BASE

.Lnext:
ldr r3,[r2,#LSR]
ands r3,r3,#LSR_THRE
beq .Lnext

ldrb r3,[r0],#1
str r3,[r2,#THR]

/* append CR if we see a NL */

cmp r3,#0x0A /* NL? */
moveq r3,#0x0D /* CR */
streq r3,[r2,#THR]

cmp r0,r1
blo .Lnext /* unsigned compare, since r0 and r1 hold addresses */
mov r0,#0
.Ldone:
movs pc,lr
I am being a little bit sloppy above by not checking that the SWI instruction that triggers the call does in fact have 0x123456 as its argument, which is required for Angel SWI. Of course, the ARM exception vectors need to be configured accordingly:
__vec_start__:
LDR PC, Reset_Addr
LDR PC, Undef_Addr
LDR PC, SWI_Addr
LDR PC, PAbt_Addr
LDR PC, DAbt_Addr
NOP
LDR PC, IRQ_Addr
LDR PC, FIQ_Addr

Reset_Addr: .word start
Undef_Addr: .word Undef_Handler
SWI_Addr: .word __swi_handler
PAbt_Addr: .word PAbt_Handler
DAbt_Addr: .word DAbt_Handler
.word 0
IRQ_Addr: .word IRQ_Handler
FIQ_Addr: .word FIQ_Handler
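For readers who prefer C, the logic of the write handler can be sketched roughly as below, together with the 0x123456 check that the assembly version skips. This is an illustration under assumptions, not the code running on the board: the names angel_write_args, angel_swi_write, is_angel_swi and the mock_uart_* helpers are all made up, and the mock buffer only stands in for polling LSR and storing to THR so that the sketch can run on a PC.

```c
#include <stddef.h>
#include <stdint.h>

#define ANGEL_SWI_ARM      0x123456u  /* SWI number for Angel calls in ARM state */
#define ANGEL_REASON_WRITE 5u         /* reason code passed in r0 */

/* The block that r1 points to. On 32-bit ARM each member is one 32-bit word. */
struct angel_write_args {
    uintptr_t   handle;  /* file handle, ignored here */
    const char *data;    /* chars to print */
    size_t      length;  /* number of chars */
};

/* Hypothetical output hook; on the target it would wait for THRE, then write THR. */
typedef void (*uart_putc_fn)(uint8_t c);

/* The low 24 bits of an ARM-state SWI instruction hold its argument. */
int is_angel_swi(uint32_t swi_instruction) {
    return (swi_instruction & 0x00FFFFFFu) == ANGEL_SWI_ARM;
}

/* Mirrors the assembly handler: any reason other than 5 is passed back
   unchanged, a write returns 0. Unlike the assembly loop, nothing is
   written when length is zero. */
unsigned angel_swi_write(unsigned reason,
                         const struct angel_write_args *args,
                         uart_putc_fn put) {
    if (reason != ANGEL_REASON_WRITE)
        return reason;
    for (size_t i = 0; i < args->length; i++) {
        uint8_t c = (uint8_t)args->data[i];
        put(c);
        if (c == 0x0A)      /* NL seen: append a CR, as the assembly does */
            put(0x0D);
    }
    return 0;
}

/* Host-side stand-in for the UART: collect the bytes in a buffer. */
char   mock_uart_output[64];
size_t mock_uart_length;
void mock_uart_putc(uint8_t c) {
    if (mock_uart_length < sizeof mock_uart_output)
        mock_uart_output[mock_uart_length++] = (char)c;
}
```

In a real handler the SWI instruction word would be fetched from the address just before the return address in lr and passed to is_angel_swi before doing anything else.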
And the UART needs to be configured properly before use. I have the code below in my crt0.S to initialize the UART. You need to program the DLL and DLH (Divisor Latch registers) according to your UART clock and the desired baud rate:

mov r0,#UARTA_BASE
mov r1,#0x07
str r1,[r0,#FCR] /* reset and enable FIFOs */
mov r1,#0x00
str r1,[r0,#IER] /* no interrupt */
mov r1,#0x80
str r1,[r0,#LCR] /* to program divisor */
mov r1,#0x41 /* change this if clock change! */
str r1,[r0,#DLL]
mov r1,#0x00
str r1,[r0,#DLH]
mov r1,#0x03
str r1,[r0,#LCR] /* 8 bit, 1 stop bit */
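The DLL and DLH values can be worked out with a small helper like the sketch below. It assumes the UART is 16550-compatible, which the register names suggest but which is not stated here, so that the baud rate is the UART clock divided by 16 times the divisor. The names uart_divisor and uart_divisor_for are hypothetical, and the clock and baud values in the usage note are only illustrative.

```c
#include <stdint.h>

/* The 16-bit divisor split into the two 8-bit divisor latch registers. */
struct uart_divisor {
    uint8_t dll;  /* low byte, goes into DLL */
    uint8_t dlh;  /* high byte, goes into DLH */
};

/* divisor = uart_clock_hz / (16 * baud), rounded to the nearest integer.
   Assumes baud is non-zero and the result fits in 16 bits. */
struct uart_divisor uart_divisor_for(uint32_t uart_clock_hz, uint32_t baud) {
    uint32_t denominator = 16u * baud;
    uint32_t divisor = (uart_clock_hz + denominator / 2u) / denominator;
    struct uart_divisor result;
    result.dll = (uint8_t)(divisor & 0xFFu);
    result.dlh = (uint8_t)((divisor >> 8) & 0xFFu);
    return result;
}
```

For example, with the classic 1.8432 MHz UART clock, 9600 baud gives a divisor of 12, so DLL would be 0x0C and DLH would be 0x00.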

Tuesday, May 02, 2006

Java SE for Embedded Use

I suppose Sun has now realized that there is a place for Java SE (Java Platform, Standard Edition), in addition to Java ME (Java Platform, Micro Edition), in the embedded market. They have just released an early access version of Java SE for PowerPC running Linux. See http://java.sun.com/j2se/embedded/. But to download it you need to answer a questionnaire first. I guess it is good to complete the questionnaire to tell Sun which embedded platforms we would like to see Java SE run on. (But the questionnaire does not list my country for me to pick!)

We are currently using ARM7 at work. Since full-fledged Linux does not run on our ARM7 due to the missing MMU, I guess we won't be seeing Java SE for ARM7 any time soon. Unless someone ports it to uClinux.

Friday, April 28, 2006

It's hard to pick a name!

It is hard to pick a good name for my blog!

I have been doing embedded software development on ARM processors at my new job for about a year now. I thought kARMa would be a good play on the ARM moniker. Only that name is taken already. So I have to settle for "Koda.Karma."

I would like to believe "koda" is a corruption of the English word "code" into the Malay language but I could not confirm this from the scarce Malay dictionaries on the Internet.

So there you have it. My first blog entry here and an attempt at a good-sounding blog name. Welcome to Koda.Karma!