The C programming language does not, in fact, support a string data type; however, strings are so useful that there is an extensive set of library functions for manipulating them. These functions are declared in the standard header string.h. Three of the simplest functions are:
| Name | Function |
|---|---|
| strlen() | determine length of string |
| strcmp() | compare strings |
| strcpy() | copy a string |
The first function, strlen(), takes the start address of a string as its parameter and returns the number of characters in the string, not counting the terminating NUL.
The second function, strcmp(), takes the start addresses of the two strings as parameters and returns the value zero if the strings are equal. If the strings are unequal it returns a negative or positive value. The returned value is positive if the first string is greater than the second string and negative if it is less. In this context the relative value of strings refers to their relative values as determined by the host computer character set (or collating sequence).
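As a small sketch of these return values, the helper below reduces the result of strcmp() to -1, 0 or +1. The name compare_sign() is purely illustrative and is not a library function. Only the sign of the value returned by strcmp() is meaningful; its magnitude varies between implementations.

```c
#include <string.h>

/* compare_sign() is a hypothetical helper, not part of the C library.
   It reduces the result of strcmp() to -1, 0 or +1. */
int compare_sign(const char *a, const char *b)
{
    int r = strcmp(a, b);

    if (r < 0) return -1;   /* a is less than b */
    if (r > 0) return 1;    /* a is greater than b */
    return 0;               /* the strings are equal */
}
```

Assuming an ASCII character set, compare_sign("Monday", "Sunday") gives -1 because 'M' comes before 'S', and compare_sign("abc", "ab") gives 1 because the first string still has a character where the second has already ended.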
It is important to realise that you cannot compare two strings by simply comparing their start addresses, although this would be syntactically valid.
The third function, strcpy(), copies the string pointed to by the second parameter into the space pointed to by the first parameter. The entire string, including the terminating NUL, is copied and there is no check that the space indicated by the first parameter is big enough.
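Since strcpy() makes no check itself, the caller has to make one. The following is a minimal sketch of one way to do so, assuming the caller knows the size of the destination; checked_copy() and its parameter names are invented for this illustration and are not library functions.

```c
#include <stddef.h>
#include <string.h>

/* checked_copy() is a hypothetical helper, not a library function.
   It copies src into dst only if dst, which is dstsize characters
   long, can hold the whole string including the terminating NUL.
   Returns 1 on success and 0 if the string would not fit. */
int checked_copy(char *dst, size_t dstsize, const char *src)
{
    if (strlen(src) + 1 > dstsize)
        return 0;           /* no room: dst is left untouched */
    strcpy(dst, src);       /* safe: the length was checked above */
    return 1;
}
```

The "+ 1" accounts for the NUL, which strlen() does not count but strcpy() does copy.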
A simple example is in order. This program, stall3, has the opposite effect to the example given earlier.
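#include <stdio.h>
#include <stdlib.h>
#include <string.h>

int main(void)
{
    char *days[] = {
        "Sunday",
        "Monday",
        "Tuesday",
        "Wednesday",
        "Thursday",
        "Friday",
        "Saturday"
    };
    int i = 0;              /* index into days[], must start at zero */
    char inbuf[128];

    printf("Enter the name of a day of the week ");
    if(fgets(inbuf, sizeof inbuf, stdin) == NULL)
        exit(1);            /* no input available */
    inbuf[strcspn(inbuf, "\n")] = '\0';     /* remove the newline kept by fgets() */
    do
    {
        if(strcmp(days[i++],inbuf)==0)
        {
            printf("day number %d\n",i);
            exit(0);
        }
    } while(i<7);
    printf("Unrecognised day name\n");
    return 0;
}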
A typical dialogue is shown below.
$ stall3
Enter the name of a day of the week Tuesday
day number 3
$ stall3
Enter the name of a day of the week Bloomsday
Unrecognised day name
$ stall3
Enter the name of a day of the week Friday
day number 6
$
The program is totally unforgiving of any errors in the input layout such as leading and trailing spaces, entry all in lower case, or entry of abbreviations.
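One way to make the lookup more forgiving is to tidy the input before calling strcmp(). The sketch below is only an illustration: day_number() is an invented helper, and it assumes that ignoring case and surrounding white space is the behaviour wanted. It does not attempt to handle abbreviations.

```c
#include <ctype.h>
#include <string.h>

/* day_number() is a hypothetical helper: it returns 1..7 for a day
   name, ignoring case and surrounding white space, or 0 if the name
   is not recognised. */
int day_number(const char *input)
{
    static const char *days[] = {
        "sunday", "monday", "tuesday", "wednesday",
        "thursday", "friday", "saturday"
    };
    char word[16];          /* long enough for the longest day name */
    size_t len, i;

    while (isspace((unsigned char)*input))      /* skip leading space */
        input++;
    len = strlen(input);
    while (len > 0 && isspace((unsigned char)input[len - 1]))
        len--;                                  /* ignore trailing space */
    if (len == 0 || len >= sizeof word)
        return 0;                               /* empty or too long */

    for (i = 0; i < len; i++)                   /* make a lower case copy */
        word[i] = (char)tolower((unsigned char)input[i]);
    word[len] = '\0';

    for (i = 0; i < 7; i++)
        if (strcmp(days[i], word) == 0)
            return (int)i + 1;
    return 0;
}
```

The comparison itself is still done by strcmp(); the only change is that both sides have been reduced to the same layout and case first.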
To demonstrate the use of strlen(), here is a simple program, called stall4, that reads in a string and prints it out reversed, a tremendously useful thing to do. The repeated operation of this program is terminated by the user entering a string of length zero, i.e. hitting the RETURN key immediately after the program prompt.
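#include <stdio.h>
#include <string.h>

int main(void)
{
    char inbuf[128];    /* input buffer, fgets() will not overrun it */
    int slen;           /* holds length of string */

    while(1)
    {
        printf("Enter a string ");
        if(fgets(inbuf, sizeof inbuf, stdin) == NULL) break;   /* end of input */
        inbuf[strcspn(inbuf, "\n")] = '\0';     /* remove the newline */
        slen = (int)strlen(inbuf);              /* find length */
        if(slen == 0) break;                    /* termination condition */
        while(slen > 0)
        {
            slen--;
            printf("%c",*(inbuf+slen));
        }
        printf("\n");
    }
    return 0;
}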
The program operates by printing the characters one by one, starting with the last non-NUL character of the string. Notice that "slen" is decremented before each character is output. This is correct because the length returned by strlen() excludes the NUL, whereas the characters themselves are aggregate members 0 .... length-1. A typical dialogue is illustrated below.
$ stall4
Enter a string 1234
4321
Enter a string x
x
Enter a string abc def ghi
ihg fed cba
Enter a string
$
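The relationship between the value returned by strlen() and the indices 0 .... length-1 can also be seen in isolation. The function below is a sketch only, with the invented name reverse_in_place(); rather than printing the characters backwards it swaps them within the aggregate itself.

```c
#include <string.h>

/* reverse_in_place() is a hypothetical helper, not a library function.
   It reverses the characters of s, leaving the NUL where it is. */
void reverse_in_place(char *s)
{
    size_t len = strlen(s);     /* characters are s[0] .. s[len-1] */
    size_t i;

    for (i = 0; i < len / 2; i++) {
        char tmp = s[i];        /* swap the i-th character from each end */
        s[i] = s[len - 1 - i];
        s[len - 1 - i] = tmp;
    }
}
```

The last character is s[len - 1], never s[len], which is where the terminating NUL lives.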
Here is another version of the same program re-written using a more typical C programming style.
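#include <stdio.h>
#include <string.h>

int main(void)
{
    char inbuf[128];    /* input buffer, fgets() will not overrun it */
    int slen;           /* holds length of string */

    while(1)
    {
        printf("Enter a string ");
        if(fgets(inbuf, sizeof inbuf, stdin) == NULL) break;   /* end of input */
        inbuf[strcspn(inbuf, "\n")] = '\0';     /* remove the newline */
        if((slen = (int)strlen(inbuf)) == 0) break;
        while(slen--) printf("%c",*(inbuf+slen));
        printf("\n");
    }
    return 0;
}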
It illustrates the use of side-effects and address arithmetic and should be compared with the first version. The next program is designed to drive home the point about comparing strings as distinct from comparing their start addresses.
#include <stdio.h>
#include <string.h>

int main(void)
{
    char x[22],*y;      /* 21 characters plus the NUL */

    strcpy(x,"A Programming Example");
    y = x;
    /* First test - compare y with constant */
    if( y == "A Programming Example")
        printf("Equal 1\n");
    else
        printf("Unequal 1\n");
    /* Second test - compare using strcmp() */
    if(strcmp(x,"A Programming Example") == 0)
        printf("Equal 2\n");
    else
        printf("Unequal 2\n");
    /* Assign constant address and compare */
    y = "A Programming Example";
    if( y == "A Programming Example")
        printf("Equal 3\n");
    else
        printf("Unequal 3\n");
    return 0;
}
It produced the following output.
Unequal 1
Equal 2
Unequal 3
The first comparison compares the address held in the variable "y" with the address of the system place where the string constant "A Programming Example" is stored. Clearly the start address of the aggregate "x" is different from the address of the system place where the string constant "A Programming Example" is stored, since strcpy() has only copied the string.
The second test used strcmp() to compare the strings rather than their start addresses; the result is, not surprisingly, that the strings were, in fact, equal.
The final test looks rather surprising. A value has been assigned to "y" and "y" has then been immediately compared with that value and found to be different. The explanation is that the compiler has not been clever enough to spot the repeated use of the same string constant and has made multiple copies of this constant in memory. This underlines the fact that the actual value of a string constant is the address of the first character. Some compilers may be clever enough to avoid this problem. The ANSI standard does not specify any particular behaviour.
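The two kinds of comparison can be put side by side as a pair of small functions. This is a sketch for illustration only; same_text() and same_address() are invented names, not library functions.

```c
#include <string.h>

/* same_text() is a hypothetical helper: true when the two strings
   hold the same characters, wherever they happen to be stored. */
int same_text(const char *a, const char *b)
{
    return strcmp(a, b) == 0;
}

/* same_address() is a hypothetical helper: true only when both
   pointers refer to the very same location in memory. */
int same_address(const char *a, const char *b)
{
    return a == b;
}
```

For the aggregate "x" of the program above, same_text(x, "A Programming Example") is true whereas same_address(x, "A Programming Example") is false. When both arguments are string constants the result of same_address() depends on the compiler, which is why only the strcmp() form should be relied on.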
Finally, an example using strcpy(). This program, called stall5, twiddles the case of every character in the input string.
#include <stdio.h>
#include <string.h>

int main(void)
{
    char istr[128];     /* input buffer */
    char tstr[128];     /* translated string here */
    int i;
    int slen;           /* string length */

    while(1)
    {
        printf("Enter a string ");
        if(fgets(istr, sizeof istr, stdin) == NULL) break;  /* end of input */
        istr[strcspn(istr, "\n")] = '\0';       /* remove the newline */
        if((slen=(int)strlen(istr))==0) break;  /* terminate */
        strcpy(tstr,istr);                      /* make a copy */
        i = 0;
        while(i < slen)                         /* translate loop */
        {
            if( tstr[i] >= 'A' &&
                tstr[i] <= 'Z')                 /* upper case */
                tstr[i] += 'a'-'A';
            else if(tstr[i] >= 'a' &&
                tstr[i] <= 'z')                 /* lower case */
                tstr[i] += 'A'-'a';
            i++;                                /* to next character */
        }
        printf(" Original string = %s\n",istr);
        printf("Transformed string = %s\n",tstr);
    }
    return 0;
}
The following dialogue is typical.
$ stall5
Enter a string aBDefgXYZ
 Original string = aBDefgXYZ
Transformed string = AbdEFGxyz
Enter a string ab CD 123
 Original string = ab CD 123
Transformed string = AB cd 123
Enter a string :::x:::y:::Z:::
 Original string = :::x:::y:::Z:::
Transformed string = :::X:::Y:::z:::
Enter a string
The program has preserved the original string by copying it to a different memory area before manipulating it.
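The same copy-then-modify idea can be written as a function. The sketch below is an illustration under stated assumptions: swap_case_copy() is an invented name, and it uses the standard ctype.h classification functions instead of the 'a'-'A' arithmetic, so it does not depend on the letters being contiguous in the character set.

```c
#include <ctype.h>
#include <string.h>

/* swap_case_copy() is a hypothetical helper. dst must have room for
   strlen(src) + 1 characters; src itself is never modified. */
void swap_case_copy(char *dst, const char *src)
{
    size_t i;

    strcpy(dst, src);                       /* work on the copy only */
    for (i = 0; dst[i] != '\0'; i++) {
        unsigned char c = (unsigned char)dst[i];

        if (isupper(c))
            dst[i] = (char)tolower(c);      /* upper case to lower */
        else if (islower(c))
            dst[i] = (char)toupper(c);      /* lower case to upper */
    }
}
```

As with strcpy() itself, nothing here checks that the destination is big enough; that remains the caller's responsibility.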
It is important that there is somewhere to copy the string to. A common programming error is illustrated below. This variation on the previous program is called stall6.
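#include <stdio.h>
#include <string.h>

int main(void)
{
    char istr[128];
    char *tstr;         /* ERROR: never made to point anywhere */
    int i;
    int llen;

    while(1)
    {
        printf("Enter string ");
        if(fgets(istr, sizeof istr, stdin) == NULL) break;  /* end of input */
        istr[strcspn(istr, "\n")] = '\0';       /* remove the newline */
        if((llen=(int)strlen(istr))==0) break;
        strcpy(tstr,istr);                      /* copies to an unknown place */
        i = 0;
        do
        {
            if(tstr[i]>='A' && tstr[i]<='Z')
                tstr[i] += 'a'-'A';
            else if(tstr[i]>='a' && tstr[i]<='z')
                tstr[i] += 'A'-'a';
        } while(++i<llen);                      /* stop before the NUL */
        printf(" Original string = %s\n",istr);
        printf("Transformed string = %s\n",tstr);
    }
    return 0;
}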
This is what happened.
$ stall6
Enter string abcdefghjikl
Segmentation fault (core dumped)
The programmer has probably assumed that there really is such a data type as a string and that strcpy() provides the facility to assign strings. The failure of the program is not surprising once you think about the initial value of "tstr". The initial value of uninitialised variables was discussed earlier. Clearly, copying the input character string to whatever location "tstr" happened to point to has either overwritten something important or attempted to access a memory location not available to the program. Occasionally this error will not cause program failure, because "tstr" happens to point to somewhere relatively safe and the program has only been tested with strings that were not long enough to cause damage when copied to whatever place "tstr" pointed to.
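The simplest cure is to declare "tstr" as an aggregate, as stall5 did. Where the required size is not known in advance, another possibility is to obtain the space at run time before copying. The following is a sketch only, assuming the standard library function malloc() is acceptable here; copy_string() is an invented name.

```c
#include <stdlib.h>
#include <string.h>

/* copy_string() is a hypothetical helper: it obtains enough memory for
   a copy of src, including the NUL, and only then calls strcpy().
   It returns NULL if no memory is available. The caller is expected
   to release the copy with free() when it is no longer needed. */
char *copy_string(const char *src)
{
    char *dst = malloc(strlen(src) + 1);    /* +1 for the NUL */

    if (dst != NULL)
        strcpy(dst, src);                   /* dst now points somewhere valid */
    return dst;
}
```

In stall6 the line strcpy(tstr,istr) could then be replaced by tstr = copy_string(istr), followed by a check for NULL and a call of free(tstr) at the end of each pass through the loop.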