Arithmetic and Data Types - Mixed Data Type Arithmetic

This section discusses the problems of evaluating expressions involving values of different data types. Before any expression involving a binary operator can be evaluated, the two operands must be of the same type; this often requires that one (or sometimes both) of the operand values be converted to a different type. The rules for such conversions depend on both the types of the operands and the particular operator. Since the C language has 45 operators and 12 different data types this seems a daunting task, suggesting that there are several thousand combinations to consider. Fortunately it is not that complicated, although the rules are far from simple. Some programming languages take the simple way out: they prohibit any expression involving values of different data types and provide special type conversion functions instead. This is called strong typing and such languages are called strongly typed. The C language is not strongly typed and instead infers the required type conversions from the context.

There are six basic methods of converting values from one type to another. In discussing the methods it is common to talk about the width of a data object; this is simply the number of bits of computer memory it occupies. The methods are:

  1. Sign Extension

    This technique is adopted when converting a signed object to a wider signed object, e.g. converting a short int to a long int. It preserves the numerical value by filling the extra leading bits with copies of the sign bit: 1's for a negative value, 0's otherwise.

  2. Zero Extension

    This is used when converting an unsigned object to a wider unsigned object. It works by simply prefixing the value with the relevant number of zeroes.

  3. Preserve low order data - truncate

    This is used when converting an object to a narrower form. Significant information may be lost.

  4. Preserve bit pattern

    This is used when converting between signed and unsigned objects of the same width.

  5. Internal conversion

    This uses special hardware to convert between floating point types and from integral to floating point types.

  6. Truncate at decimal point

    This is used to convert from floating point types to integral types. The fractional part is discarded, so it may involve loss of significant information.

The basic conversions listed above are those that take place on assignment. Some examples are shown in the following program.

#include <stdio.h>

int main(void)
{
        signed          short   int     ssi;
        signed          long    int     sli;
        unsigned        short   int     usi;
        unsigned        long    int     uli;
        ssi = -10;
        sli = ssi;      /* sign extension - sli should be -10 */
        printf("ssi = %8hd sli = %8ld\n",ssi,sli);
        usi = 40000U;  /* unsigned decimal constant */
        uli = usi;      /* zero extension - uli should be 40000 */
        printf("usi = %8hu uli = %8lu\n",usi,uli);
        uli = 0xabcdef12;       /* sets most bits ! */
        usi = uli;      /* will truncate - discard more sig bits */
        printf("usi = %8hx uli = %8lx\n",usi,uli);
        ssi = usi;      /* preserves bit pattern */
        printf("ssi = %8hd usi = %8hu\n",ssi,usi);
        ssi = -10;
        usi = ssi;      /* preserves bit pattern */
        printf("ssi = %8hd usi = %8hu\n",ssi,usi);
        return 0;
}

This produced the following output.

ssi =      -10 sli =      -10
usi =    40000 uli =    40000
usi =     ef12 uli = abcdef12
ssi =    -4334 usi =    61202
ssi =      -10 usi =    65526
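
It may be interesting to note that the difference between the pairs of values on the last two lines is 65536, i.e. 2 to the power 16, the number of distinct values a 16 bit object can hold. Conversion of an out-of-range value to an unsigned type is always well defined (the value is reduced modulo one more than the largest value the type can hold), whereas the result of converting an out-of-range value to a signed type is implementation-defined. The next program shows conversions to and from floating point types.
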
#include <stdio.h>

int main(void)
{
        double  x;
        int     i;
        i = 1400;
        x = i;  /* conversion from int to double */
        printf("x = %10.6e i = %d\n",x,i);
        x = 14.999;
        i = x;  /* conversion from double to int */
        printf("x = %10.6e i = %d\n",x,i);
        x = 1.0e+60;    /* a LARGE number */
        i = x;  /* won't fit - result is undefined in ANSI C */
        printf("x = %10.6e i = %d\n",x,i);
        return 0;
}

producing the output

x = 1.400000e+03 i = 1400
x = 1.499900e+01 i = 14
x = 1.000000e+60 i = 2147483647
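
This program was compiled and run on a SUN Sparc station. Note the loss of significant data (a polite way of saying the answer is wrong) in the final conversion. The value does not fit in an int and the ANSI C standard leaves the result of such a conversion undefined, so other systems may print something different.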

There is an extra complication concerning variables of type char. The conversion rules to be applied depend on whether the compiler regards char values as signed or unsigned, which is implementation dependent. Under the ANSI C standard a char value used in an expression is promoted to type int whenever an int can hold every value of the char type (the usual case) and to unsigned int otherwise. If char is signed the promotion is by sign extension, so a char with its most significant bit set becomes a negative int; if char is unsigned the promotion is by zero extension and the result is never negative. The resulting int may then be further converted to an unsigned int by bit pattern preservation. The following program shows what might happen.

#include <stdio.h>

int main(void)
{
	char	c;
	signed		int	si;
	unsigned	int	usi;
	c = 'a';	/* MS bit will be zero */
	si = c;		/* will give small +ve integer */
	usi = c;
	printf("   c = %c\n  si = %d\n usi = %u\n",c,si,usi);
	c = '\377';	/* set all bits to 1 */
	si = c;		/* sign extension makes negative if char is signed */
	usi = c;
	printf("  si = %d\n usi = %u\n",si,usi);
	return 0;
}

producing the output

   c = a
  si = 97
 usi = 97
  si = -1
 usi = 65535
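
The output shown above was produced using the SUN Sparc Station compiler. Identical output was produced by Turbo C, but the IBM 6150 gave the result

  si = 255
 usi = 255

for the final two lines. Note that 65535 is the value obtained where an unsigned int is 16 bits wide; where it is 32 bits wide the same conversion gives 4294967295.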

Clearly both the SUN Sparc Station and the Turbo C compilers regarded char as a signed data type, applying sign extension when assigning the signed char c to the signed int si. The conversion from signed char c to unsigned int usi is more interesting. This took place in two stages, the first being sign extension and the second being bit pattern preservation. On the IBM 6150 char is treated as an unsigned data type, so both assignments widen the value by zero extension. The following program with forced signing of char further clarifies the point.

#include <stdio.h>

int main(void)
{
	unsigned	char	uc;
	signed	char	sc;
	unsigned	int	ui;
	signed	int	si;
	uc = '\377';
	ui = uc;
	si = uc;
	printf("Conversion of unsigned char ui = %u si = %d\n",
				ui,si);
	sc = '\377';
	ui = sc;
	si = sc;
	printf("Conversion of signed char ui = %u si = %d\n",
				ui,si);
	return 0;
}

producing the output

Conversion of unsigned char ui = 255 si = 255
Conversion of signed char ui = 4294967295 si = -1

For the first line of output the variable "uc" is an unsigned char, and the conversion of its value to either the signed int si or the unsigned int ui is by zero extension, so the value 255 is preserved. For the second line of output the variable "sc" is a signed char, and the conversion of its value to the signed int si is a simple case of sign extension, whereas the conversion to the unsigned int ui is by sign extension followed by bit pattern preservation.

The distinction between signed and unsigned char data types only becomes significant when a char data type is used to hold a value with the most significant bit set. For the normal ASCII character set this will not happen, but if you are using extended ASCII character codes (e.g. the box drawing characters found on PCs) then the most significant bit will be set.

It is also quite common practice to use variables of type char to store small integer values.

And a final example.

#include <stdio.h>

int main(void)
{
        unsigned        char    uc = '\377';
                        char    c = '\377';
        signed          char    sc = '\377';
        int     v1,v2,v3;
        v1 = 20 + uc;   /* uc is zero extended to 255 */
        v2 = 20 + c;    /* default - depends on the compiler */
        v3 = 20 + sc;   /* sc is sign extended to -1 */
        printf("v1 = %d v2 = %d v3 = %d\n",v1,v2,v3);
        return 0;
}

producing the output

v1 = 275 v2 = 19 v3 = 19
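
The most significant point here is the value of "v1". The expression

20 + uc

involves a signed integer (20) and an unsigned char (uc). Under the ANSI C rules the unsigned char is first promoted to a signed int by zero extension, giving the value 255, and the + operation is then carried out on two signed integers, giving 275. (Some pre-ANSI compilers instead promoted the unsigned char to an unsigned int and converted 20 to unsigned as well; the result here is the same.) The values of "v2" and "v3" show that both c and sc were sign extended to -1 before the addition.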