This section discusses the problems of evaluating expressions involving values of different data types. Before any expression involving a binary operator can be evaluated, the two operands must be of the same type; this often requires that one (or sometimes both) of the operand values be converted to a different type. The rules for such conversions depend on both the types of the operands and the particular operator. Since the C language has 45 operators and 12 different data types, this seems a daunting task, suggesting that there are something like 24000 combinations to consider. Fortunately it is not that complicated, although the rules are far from simple. Some programming languages take the simple way out: they prohibit any expressions involving values of different data types and provide special type conversion functions instead. This is called strong typing and such languages are called strongly typed. The C language is not strongly typed and instead infers the required type conversions from the context.
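As a small illustration of a conversion being inferred from context, the following sketch divides two int values with and without an explicit cast. The function names are invented for this example and are not part of any library.

```c
/* Hypothetical helper: both operands are int, so the division is done
   in integer arithmetic and the fractional part is discarded. */
int int_quotient(int total, int count)
{
    return total / count;
}

/* Hypothetical helper: the cast makes one operand a double, so the
   compiler converts the other operand to double before dividing. */
double real_quotient(int total, int count)
{
    return (double)total / count;
}
```

For example, int_quotient(7, 2) gives 3 whereas real_quotient(7, 2) gives 3.5.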
There are six basic methods of converting values from one type to another. In discussing the methods it is common to talk about the width of a data object; this is simply the number of bits of computer memory it occupies. The methods are as follows.
Sign extension. This technique is adopted when converting a signed object to a wider signed object, e.g. converting a short int to a long int. It preserves the numerical value by filling the extra leading bits with copies of the sign bit: 1's for a negative value, 0's otherwise.
Zero extension. This is used when converting an unsigned object to a wider unsigned object. It works by simply prefixing the value with the relevant number of zeroes.
Preservation of low-order data (truncation). This is used when converting an object to a narrower form. The high-order bits are discarded, so significant information may be lost.
Bit pattern preservation. This is used when converting between signed and unsigned objects of the same width. The bits are unchanged but are interpreted differently.
Floating point conversion. This uses special hardware to convert between floating point types and from integral to floating point types.
Truncation at the decimal point. This is used to convert from floating point types to integral types. The fractional part is discarded, which may involve loss of significant information.
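The methods listed above can be sketched as one-line functions. This is only an illustration: it assumes a C99 compiler that provides the fixed-width types of <stdint.h> and a two's complement machine, and the function names are invented here. Narrowing to a signed type is left out because its result is implementation-defined.

```c
#include <stdint.h>

/* Sign extension: the numerical value is kept in the wider type. */
int32_t widen_signed(int16_t v)
{
    return v;
}

/* Zero extension: the wider value gains leading zero bits. */
uint32_t widen_unsigned(uint16_t v)
{
    return v;
}

/* Truncation: only the low-order 16 bits survive. */
uint16_t narrow_unsigned(uint32_t v)
{
    return (uint16_t)v;
}

/* Same width, signed to unsigned: the value is reduced modulo 65536,
   which on two's complement hardware leaves the bit pattern unchanged. */
uint16_t as_unsigned(int16_t v)
{
    return (uint16_t)v;
}

/* Integral to floating point: exact for every 32-bit integer when
   double is the usual 64-bit IEEE format (an assumption). */
double to_double(int32_t v)
{
    return v;
}

/* Floating point to integral: the fractional part is discarded. */
int32_t drop_fraction(double v)
{
    return (int32_t)v;
}
```

With these definitions narrow_unsigned(0xabcdef12) yields 0xef12 and as_unsigned(-10) yields 65526, matching the values that appear in the first example program.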
The basic conversions listed above are those that take place on assignment. Some examples are shown in the following program.
#include <stdio.h>

int main(void)
{
    signed short int ssi;
    signed long int sli;
    unsigned short int usi;
    unsigned long int uli;

    ssi = -10;
    sli = ssi;          /* sign extension - sli should be -10 */
    printf("ssi = %8hd sli = %8ld\n",ssi,sli);
    usi = 40000U;       /* unsigned decimal constant */
    uli = usi;          /* zero extension - uli should be 40000 */
    printf("usi = %8hu uli = %8lu\n",usi,uli);
    uli = 0xabcdef12;   /* sets most bits ! */
    usi = uli;          /* will truncate - discards the more significant bits */
    printf("usi = %8hx uli = %8lx\n",usi,uli);
    ssi = usi;          /* preserves bit pattern */
    printf("ssi = %8hd usi = %8hu\n",ssi,usi);
    ssi = -10;
    usi = ssi;          /* preserves bit pattern */
    printf("ssi = %8hd usi = %8hu\n",ssi,usi);
    return 0;
}
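This produced the following output.
ssi =      -10 sli =      -10
usi =    40000 uli =    40000
usi =     ef12 uli = abcdef12
ssi =    -4334 usi =    61202
ssi =      -10 usi =    65526
It may be interesting to note that the difference between the pairs of values on the last two lines is 65536, the number of distinct values a 16-bit short can hold. Conversion to an unsigned type is always well defined in C (the value is reduced modulo one more than the largest value of the target type), but the result of converting an out-of-range value to a signed type, as in the assignment of usi to ssi above, is implementation-defined. The next program shows conversions to and from floating point types.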
#include <stdio.h>

int main(void)
{
    double x;
    int i;

    i = 1445;
    x = i;              /* conversion from int to double */
    printf("x = %10.6e i = %d\n",x,i);
    x = 14.997;
    i = x;              /* conversion from double to int */
    printf("x = %10.6e i = %d\n",x,i);
    x = 1.0e+60;        /* a LARGE number */
    i = x;              /* won't fit - what happens ?? */
    printf("x = %10.6e i = %d\n",x,i);
    return 0;
}
producing the output
x = 1.445000e+03 i = 1445
x = 1.499700e+01 i = 14
x = 1.000000e+60 i = 2147483647
This program was compiled and run on a SUN Sparc station. The loss of significant data, a polite way of saying the answer is wrong, in the final conversion should be noted. The C standard leaves the behaviour undefined when the integral part of a floating point value cannot be represented in the target integer type, so other systems may give a different result.
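One way to avoid the out-of-range conversion seen in the last line of output is to test the value before assigning it. The following is a minimal sketch; the function name is invented for this example, and it assumes that type double can represent INT_MIN and INT_MAX exactly, which holds for a 32-bit int.

```c
#include <limits.h>

/* Hypothetical helper: stores the truncated value of x in *out and
   returns 1 when the result fits in an int, otherwise returns 0 and
   leaves *out untouched. A NaN fails both comparisons and is rejected. */
int double_to_int_checked(double x, int *out)
{
    if (x > (double)INT_MIN - 1.0 && x < (double)INT_MAX + 1.0) {
        *out = (int)x;      /* fractional part is discarded */
        return 1;
    }
    return 0;
}
```

A caller can then decide what to do with a value such as 1.0e+60 instead of silently storing a wrong answer.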
There is an extra complication concerning variables of type char. The conversion rules to be applied depend on whether the compiler regards char values as signed or unsigned, and this is implementation dependent. The ANSI C standard says that a char value used in an expression is promoted to type int provided an int can represent every value of type char (otherwise it is promoted to unsigned int). The numerical value is preserved, so the result for a character with the most significant bit set depends on whether the type char is signed or unsigned: a signed char gives a negative int and an unsigned char gives a positive one. Converting a negative result to an unsigned int then preserves the bit pattern. The following program shows what might happen.
#include <stdio.h>

int main(void)
{
    char c;
    signed int si;
    unsigned int usi;

    c = 'a';            /* MS bit will be zero */
    si = c;             /* will give small +ve integer */
    usi = c;
    printf(" c = %c\n si = %d\n usi = %u\n",c,si,usi);
    c = '\377';         /* set all bits to 1 */
    si = c;             /* sign extension makes negative if char is signed */
    usi = c;
    printf(" si = %d\n usi = %u\n",si,usi);
    return 0;
}
producing the output
 c = a
 si = 97
 usi = 97
 si = -1
 usi = 65535
The output shown above was produced using the SUN Sparc Station compiler. Identical output was produced by Turbo C, but the IBM 6150 gave the result
 si = 255
 usi = 255
for the final two lines.
Clearly both the SUN Sparc Station and the Turbo C compiler regarded char as a signed data type, applying sign extension when assigning the signed char c to the signed int si. The conversion from signed char c to unsigned int usi is more interesting. This took place in two stages, the first being sign extension and the second being bit pattern preservation. On the IBM 6150 char is treated as an unsigned data type, so both assignments use zero extension (the bit pattern is preserved and padded with leading zeroes). The following program with forced signing of char further clarifies the point.
#include <stdio.h>

int main(void)
{
    unsigned char uc;
    signed char sc;
    unsigned int ui;
    signed int si;

    uc = '\377';        /* all bits set - value 255 */
    ui = uc;
    si = uc;
    printf("Conversion of unsigned char ui = %u si = %d\n",
           ui,si);
    sc = '\377';        /* all bits set - value -1 */
    ui = sc;
    si = sc;
    printf("Conversion of signed char ui = %u si = %d\n",
           ui,si);
    return 0;
}
producing the output
Conversion of unsigned char ui = 255 si = 255
Conversion of signed char ui = 4294967295 si = -1
For the first line of output the variable uc is an unsigned char, and the conversion of its value to either the signed int si or the unsigned int ui is by zero extension (the bit pattern is preserved and padded with leading zeroes). For the second line of output the variable sc is a signed char; the conversion of its value to the signed int si is a simple case of sign extension, whereas the conversion to the unsigned int ui is by sign extension followed by bit pattern preservation.
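Rather than running experiments, a program can find out how its compiler treats plain char from the CHAR_MIN macro in the standard header <limits.h>. The sketch below also shows a cast that gives the same non-negative value on either kind of compiler. The function names are invented for this example.

```c
#include <limits.h>

/* Returns 1 when plain char is a signed type for this compiler and
   0 when it is unsigned. CHAR_MIN is 0 for an unsigned char type. */
int plain_char_is_signed(void)
{
    return CHAR_MIN < 0;
}

/* Converts a char to an int in the range 0..UCHAR_MAX regardless of
   how the compiler treats plain char. */
int char_as_unsigned_value(char c)
{
    return (unsigned char)c;
}
```

On a compiler with 8-bit characters, char_as_unsigned_value('\377') gives 255 whether or not plain char is signed.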
The distinction between signed and unsigned char data types only becomes significant when a char data type is used to hold a value with the most significant bit set. For the normal ASCII character set this will not happen, but if you are using extended ASCII character codes (e.g. the box drawing characters found on PCs) then the most significant bit will be set.
It is also quite common practice to use variables of type char to store small integer values.
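A place where the signedness of char matters in practice is the character classification functions of <ctype.h>, such as isalpha(). Their argument must be representable as an unsigned char (or be EOF), so a character with the most significant bit set should be cast before it is passed. The following sketch shows the usual idiom; the function name is invented for this example.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical helper: counts the alphabetic characters in a string.
   The cast to unsigned char keeps the argument of isalpha() in range
   even when plain char is signed and the top bit is set. */
size_t count_letters(const char *s)
{
    size_t n = 0;

    for (; *s != '\0'; s++) {
        if (isalpha((unsigned char)*s))
            n++;
    }
    return n;
}
```

Which characters count as alphabetic depends on the current locale; in the default "C" locale only the unaccented letters a-z and A-Z qualify.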
And a final example.
#include <stdio.h>

int main(void)
{
    unsigned char uc = '\377';
    char c = '\377';
    signed char sc = '\377';
    int v1,v2,v3;

    v1 = 20 + uc;       /* uc has the value 255 */
    v2 = 20 + c;        /* default - depends on the compiler */
    v3 = 20 + sc;       /* sc has the value -1 */
    printf("v1 = %d v2 = %d v3 = %d\n",v1,v2,v3);
    return 0;
}
producing the output
v1 = 275 v2 = 19 v3 = 19
The most significant point here is the value of v1. The expression 20 + uc involves a signed integer (20) and an unsigned char (uc). Before the + operation is executed, the unsigned char is promoted to an int by zero extension, giving the value 255, so the addition yields 275. For c and sc the promotion is by sign extension, giving -1 and hence 19; the value of v2 shows that this compiler treats plain char as signed.
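The following small probe is one way to see how a given compiler promotes an unsigned char operand. It is a sketch that assumes int is wider than char, and the function names are invented for this example. With the ANSI promotion to int the subtraction gives -1; if the operand were promoted to unsigned int the result would wrap round to a large positive value.

```c
/* Returns 1 when an unsigned char operand is promoted to (signed) int,
   0 when it is promoted to unsigned int. */
int uchar_promotes_to_int(void)
{
    unsigned char uc = 255;

    return (uc - 256) < 0;
}

/* The additions from the final example, wrapped as functions. */
int add_to_uchar(unsigned char uc)
{
    return 20 + uc;
}

int add_to_schar(signed char sc)
{
    return 20 + sc;
}
```

Here add_to_uchar(255) gives 275 and add_to_schar(-1) gives 19, in agreement with the output above.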