Things That Go Bump in the Night


Two Versions, Same Content

There are two versions of this document. Both contain the same technical descriptions.

  • The one you are reading is the all-in-one (single-page) version. For those wanting a hard-copy to read off-line, this version will support that.
  • The second version displays one topic at a time. For those wanting focus on a specific term without any distractions, you might prefer the “one at a time” version.

Welcome

Historical comments are included to help illuminate the whys and wherefores. While primarily focused on Unix-like realms such as Linux, most of what’s here applies to all contemporary operating systems.

Two source examples are provided, elevate.c and threads1000.c. Please copy and paste these into your own system for experimentation. They were verified on a Linux system. Building them elsewhere may require some adjustments.

In the text, you’ll find plenty of links between topics. For some readers, following your curiosity may be preferable to a straight-through approach.

There’s also an alphabetical index at the end.

Major Topics

Begin

Execution is a property of two things, a thread and an interrupt.

A Process, on the other hand, does not execute code. It is simply a box.

A process contains an Initial Thread that executes code beginning at _start and then main. Any thread may create Additional Threads. The additional threads will not execute main.

The initial thread’s stack is different from that of any additional threads.

Execution priority, round-robin (SCHED_RR), and first-in-first-out (SCHED_FIFO) are scheduling considerations.

SCHED_RR attempts to be fair through time-slicing, but rarely is.

SCHED_FIFO ignores time-slicing. Fair is not a consideration.

Returning from main will cause the process to exit().

Returning from one of the additional threads will cause that thread to pthread_exit().

When there are no threads left, the process will be removed.

Background

Process History

In 1969–1970, Ken Thompson and Dennis Ritchie invented the Unix (now UNIX) operating system.

In those early incarnations, a process would execute code. And it could create a clone of itself by doing a fork(). The clone would sometimes replace its Code, Data, and other segments by doing an exec() system call. This is how one program would start another, with fork() and then exec().

When there were multiple processes wanting to use the system, time-slicing was employed to let all of them get a portion of the computer’s time. Eventually, they would all finish their work.

Thus, the original Unix was a time-sharing operating system. That was its purpose, to make one expensive, large computer available to many users at the same time. It was a shared resource.

Thread History tells the next step in this developing story.

Thread History

In the 1960s and 70s, there was another category of computing, the real-time arena. In this group, computers didn’t have users. Instead, they were destined for what we now call embedded environments.

These computers and their software were integrated into larger machines. In some, the computer and its software were looking through video cameras at the passing terrain, and adjusting the tail fins of a guided missile thousands of miles into enemy territory. Autonomous is a good word for these. And when the on-board warhead was commanded to detonate, the computer and its software were destroyed along with the missile and its target.

Many years would pass before the conflicting needs of the time-sharing and real-time sectors of computing would come together. That was finally achieved in the POSIX pthreads model. But even then, there were, and still are, plenty of extraordinarily demanding settings that don’t use that model. The world of the Real-Time Operating System, the RTOS, continues to this day.

The RTOSes (almost but not quite universally) use the term task for what executes code (and the interrupt handlers). The RTOS tasks are the inspiration for the POSIX thread, also known as pthreads and found in the libpthread.so library.

While there are some significant differences in the various RTOSes available today, some things are constant. Tasks (or their equivalent) compete to use the processor based on their priority. At any given instant, the RTOS has taken steps to be sure the highest-priority ready task is the one selected for execution. And when that task surrenders the processor to wait for something, or if an interrupt or a clock tick makes a different and higher-priority task ready, the RTOS does a context switch to what is now the highest-priority ready task.

The hallmark of the RTOS environment is the ability to make these decisions, flawlessly, hundreds and even thousands of times per second.

To achieve this stellar performance, early machines ran with a minimum of overhead. There was no memory protection and very little error checking. Programs written by human beings had to be tested to perfection because bugs were fatal, not just to the execution of computer code, but to the hardware, the room, the terrain, the populations where failing missiles might crash. This is the realm of guided missiles, aircraft landing on automated controls in heavy fog, deep-space exploration, rovers that land on and explore the planet Mars, of cars and trucks that will–in spite of our trepidations–someday travel our public highways.

Over time as processor speeds have increased, hardware features offering memory protection and other error-detection mechanisms have become less rare. Some would say that, at least in some categories, they are now the norm. Bugs are more easily detected because the hardware, not just humans, can keep tabs on what the programs, the threads, are doing.

That “app” you run on your Android tablet is full of threads. At any given moment, there are dozens, often a hundred or more of them ready to service any tap you make on the screen, and to wake up unprovoked to check for email, a text message, or an incoming phone call.

Threads do the work. Not “apps”, not a process, but a thread. (Or an interrupt.)

Exception versus Interrupt History

In the early days of microprocessors, two companies–Intel and Motorola–were major competitors. What one did, the other would claim, “They did it wrong. We’re better. Buy our microprocessors so you’ll be right.”

One of the areas in which they differed was in defining what the words exception and interrupt mean.

In one corporation’s products, Intel to be precise, the concept of exceptions was the over-arching term. There were lots of exceptions that might happen inside a computer. Some of these were caused by the programs doing things, and some were caused by external hardware. That latter group, the one from I/O devices, was collectively called interrupts. They were a subset of the bigger topic of exceptions.

But the other corporation said, “No, they’ve got it wrong.” At Motorola, they said there were a lot of things that could interrupt a program’s execution. Some of them were because the program has done something. “We’ll call those kinds of things exceptions.” These program exceptions were one category within the larger group of things that would interrupt a program’s execution. The system clock and its I/O devices were another category of things that could cause other interrupts.

Thus, in the Intel camp, exceptions is the big classification with interrupts being a subset.

But in the Motorola camp, it’s reversed. There are lots of kinds of interrupts with only a few of them caused by programs, and those are called exceptions.

That difference in definitions persists today.

In this presentation, we take neither side. Instead, we use the term exception to refer to things a program might do, and an interrupt as belonging to the world of input, output, and the passage of time.

Herein, interrupt and exception are unrelated topics.

Getting Started

Process

A process is like a box.

It has an inside and an outside. It comes in various sizes but, once glued together, it houses a specific amount of space. Note also that if the lid of the box is closed, things inside stay inside, and things outside can’t get in.

Box

If we were inside an empty box, what would we see?

Six surfaces: a floor, a ceiling, and four walls.

And if someone’s been in there with a pen, pencil, or Magic Marker, there could be some graffiti.

A process also has six surfaces. They’re called segments. For discussion, we’ll map them to the inside of our box.

  1. Code - the floor under our feet
  2. Data - left-hand wall
  3. Libraries - the wall at the far side
  4. Library Data - the wall behind us
  5. Heap - right-hand wall
  6. Dynamic Mappings - up there: the ceiling

And someone’s definitely been in there because there’s writing on the walls, the floor, and even the ceiling of our process. It consists of outlines, most of them square or rectangular, and words written on the edge of each outline.

Segments

  1. If the floor of our process is the Code Segment, somewhere on it will be written main. And in other places on the floor will be the names of all the functions written for this application. (You’ll have to ask the application programmer what names he or she used. It’s entirely up to them.)
  2. The Data Segment is the left-hand wall. All our static data variables will be there. Some of them will be globals, which means any part of the program can refer to them by name. And some will be locals. To access them by name, you have to be standing in the right area of the floor.
  3. The entire right-hand wall is something called the heap, and part of it has a rectangular area marked-off in pencil. It’s labelled Stack, and inside that is another rectangle–really tiny–labelled argc.
  4. Straight across on the far wall are the libraries. They have executable code like the floor, but the application programmer didn’t write them. They’re mapped into the box because the programmer wants to use what they do, and because they also provide access to the operating system that’s somewhere outside of the box. There are some big divisions on the far wall: libc.so and libpthread.so are two of them. And inside each are smaller, marked-off areas like getenv() and mmap() in the first, and pthread_create() and pthread_exit() in the second.
  5. On the wall behind our back is the Library Data Segment. It has major divisions the same as libraries on the far wall, and minor chunks within those. And like the Data Segment to your left, the wall behind us contains static variables, but these are the domain of the libraries.
  6. Finally, there’s the ceiling. It’s for dynamic mappings. For many applications, it will remain unused.

Things That Move

A box may also contain things that move around. Maybe our box is the temporary home for a pet such as a cute little mouse with a fluffy grey coat, long, black whiskers, glistening eyes like tiny eight balls, and a serpentine, nearly hairless tail. In our box, he’s over there on the left, all curled up in the corner. It looks like he’s sleeping.

A process also has things that move around. That’s what a thread does. It moves, it wiggles, it looks at the insides of the box, it marks on the walls…

A process must have at least one because if it doesn’t, the operating system will throw it away. When a process is first created, it’ll have one thread. It’ll be over there on the left in the corner where it says _start. Like the mouse with his grey coat, black eyes, and hairless tail, the thread will also have some unique properties.

Old Thinking

If you think of a process as executing its code, STOP!

Processes do not execute code. They house it and contain a lot of things needed for execution, but they don’t execute code. Processes don’t do anything. They just take up space and have a lot of drawing and writing on their inside surfaces.

It’s the threads inside the process, like our pet mouse inside the box, that have the property of execution. They move. They do stuff. They get the work done. (And sometimes, they even make messes!)

Code

On the floor (our code segment) is the function called main. It’s near the back left corner, but out just a little bit because there’s something already there.

Exactly in the corner is our sleeping mouse. And exactly in the corner of our process is the initial thread, and it’s sitting at the beginning of a small chunk of code named _start.

When the mouse wakes up, it will start execution there.

Back in the “good old days,” that piece of code had to be positioned exactly in the corner because that’s where the operating system put the mouse: In the corner. (Contemporary operating systems look at the object file to figure out where _start is, but in most cases you’ll probably find it right at the beginning of the code segment.)

When that mouse, the initial thread, wakes up, one of the first things it will do is get some stack from the right-hand wall, look to the left at the argv and envp arrays in the Data Segment, put the count of arguments into argc in the call stack, and leap to (call) main.

Incidentally, _start has one thing more. Right after the call to main, the programmer who wrote it put another function call: to exit(). It’s put there so if the initial thread returns, the entire process will terminate.

Tip: If you want to get rid of your initial thread without returning from main, code a pthread_exit() for it to execute. As long as there are other threads within the process, it will continue. (As long as nobody calls exit(), of course.)

Data

To our left on the wall is the data segment. That’s where all of the static variables sit, each one with a little square or rectangle or other shape drawn around it in pencil. Some have the shape and size of a static integer variable. Others have a lot of subdivisions. These are each of the static array variables needed by the application. And some have odd, irregular shapes. These are the static struct variables.

Tip: There is another set of static variables inside the box. This other set belongs to (is defined and used by) the libraries. Their static data is the Library Data Segment and it’s on the wall behind you. To know what’s there, we’d need to look at the source code for each of the shared libraries needed by this process.

Libraries

On the far wall are some large marked-off areas. One of them says, libc.so. Another says libpthread.so, and so forth.

The libc.so library contains functions that connect our process to the operating system. Names such as open(), close(), read(), and write() are there. So are malloc() and the exit() function. Since the _start routine contains a call to it, we’re guaranteed that libc.so will be present.

This application program contains another one of the system-supplied shared libraries, libpthread.so. It is used by applications that want more than one of those things called threads. It contains a lot of functions including pthread_create() and pthread_exit().

Static data needed by each of the shared libraries is found on a different wall, the one behind you.

Remember, the wall to your left is your Data. But the wall behind you is not for you. It’s where the libraries keep their Library Data.

Library Data

The wall of the box directly behind you is for Library Data. Various parts of it are used (and declared) by each of the libraries that’s been linked into this application.

This is where you’ll find an array of file descriptors. File descriptors 0 (stdin), 1 (stdout), and 2 (stderr) are there in an array of static struct definitions. And depending on the implementation, the printf() function may have a static array for some of its work, as may other functions with similar requirements.

Heap and Stacks

The wall to your right has changed over the years. Back in ancient times when processes executed code, the stack and heap were located at opposite ends of the available memory space. The malloc() and free() library functions operated at one end while function calls and returns worked at the other. There were checks built in at various places to detect when the two ran over each other.

These days, malloc() and free() continue to allocate and free chunks of memory from the heap. But the stack is now in its own, segregated area, still on that same wall to your right, but in an area that won’t bump into the heap.

Some text books refer to the stack as part of the heap and, since they’re both on that same wall, that’s not completely wrong. It’s just not relevant any more.

Stack? Meh.

Heap? Meh.

They do their things without interfering with each other.

Heap

Each process contains a single, dynamically-sized heap of additional memory (right-hand wall in the box). That memory is malloc()’d and free()’d by each thread within the process.

Text books many decades ago would show the heap and stack growing toward each other. But that’s no longer correct. In contemporary systems with the potential for multiple threads in each process, there will be many stacks, one for each thread.

And to be 100% accurate, heap and stack memory is simply allocated from a common pool of available memory.

Stack

Each thread has a call/return stack over there on the right-hand wall. Automatic variables are allocated and destroyed during code execution. (Static variables are allocated in the Data Segment. They exist as long as the process continues to live.)

A thread’s stack is created through the pthread_create() call. Through its attributes argument (see pthread_attr_setstacksize()), the programmer can specify the size of the thread’s stack; if none is given, a system default is used. The size can be any number above the system minimum, but it must be sufficient for the thread’s needs. Coming up with that number can be mystifying. The usual approach is to be generous, say a megabyte [1024*1024], and see if the thread breaks. If it does, make it bigger; if not, try something smaller. (I try going up or down 10x and then fine tune by 2x.) Once a survivable size is determined, double it and consider the job done. That final doubling is to allow for unforeseen (untested) circumstances and future growth (enhancement by a different programmer).

Interesting Aside

In some implementations, when a thread terminates (via pthread_exit() directly or indirectly), its stack is also deallocated. In others, however, the stacks of dead threads persist for the life of the process.

dynamic mappings

Up on the ceiling of our process box, threads may request dynamic additions to their memory space. The mmap() system call is the most common (and flexible) mechanism for this. It is used to bring files into memory so the program can use a simple character pointer to access all of the file. It is also used to obtain access to other chunks of memory such as a block that is shared between two processes.

Execution

A thread executes code within a process. Nothing else.

A process is merely a container (a box) and nothing more.

Depending on the hardware, the operating system, and circumstances that vary from one instant to the next, multiple threads within a single process could be executing at the same instant, or in sequences that are difficult or impossible to predict. This makes communication and coordination between threads a challenging area. (Semaphores, message queues, and other inter-thread/process synchronization services become extremely useful in those situations.)

_start

This assembler-language routine initializes the stack pointer, counts the number of arguments in the argv array, and then passes that count, the pointer to the arguments array, and the pointer to the envp array to the main entry point. (If main returns, a call to exit() is present in _start at that point to terminate the process.)

In older systems, _start was always located at the beginning of the process address space so its location could be guaranteed. In contemporary systems, the application’s object file now has this information.

main

When execution of the initial thread reaches main, it receives four (not three) arguments.

  1. The first, argc, is for convenience. It is the number of arguments in the null-terminated argv array. Some programs use argc to iterate through the array while others walk through it until finding the null pointer.
  2. The argv parameter is a pointer to an array of pointers, each one pointing to a string, an argument, to be passed to the process for a specific purpose. For example, the compiler is an application program used to translate a source language such as C into machine language. The name of the C source file to be compiled is specified each time the compiler is executed by one of the argv strings. That’s how the compiler knows which source file is supposed to be translated.
  3. The envp is a pointer to another array (of pointers to strings). These are the environment variables. Since they are less commonly used, it is assumed that programmers searching for a given environment variable will either walk the array (watching for the null terminator), or use a function such as getenv() to acquire a specific one. The environment contains information about the person using the computer, the shell (/bin/sh, for example) being used by that person, in which file system directory that shell is currently residing, and what directories to search when a new command is typed. Many other things could be included in the environment in which an application program might (or might not) be interested.
  4. The fourth (and final) argument passed to main is always a null pointer. Like a vestigial tail, this is a leftover from ancient days. “In the beginning,” when a program (yes, this predates the concept of process) was launched, it was passed an indefinite number of arguments instead of today’s argc, argv, and envp values. Back in those early days, a program would walk through its arguments until finding the null pointer. That’s how it knew it’d reached the end of the list.

Today, of course, the initial thread calls main and we expect three useful arguments, argc, argv, and envp, but no more than that.

Finally, note that in some environments, that fourth parameter may now be used for something else. Good-bye to that vestigial tail.

argc

As tallied in _start, this is an integer count of the number of elements (pointers) in the argv array.

Note that argc is an automatic variable, and other than being on the stack of the initial thread, it exists nowhere else.

argv

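An array of pointers to strings, the argv array is constructed during the birth of the process before _start begins execution. In this model, the array is stored in the Data Segment as a static array because it lives for the whole life of the process. (On Linux, the kernel actually builds it at the top of the initial thread’s stack region; its lifetime is the same.)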

Note that the pointer to this array is an automatic variable whereas the array itself is static. Hence, one is on the stack (the argv argument to main), whereas the other, the array itself, is a static array in the Data Segment.

envp

This is another array of pointers to strings. As with the argv array, it is constructed before the process begins execution (at _start). It is stored in the Data Segment as another static array.

Again, as with the argv pointer, the envp pointer is an automatic variable on the stack when passed to main, whereas the array itself is a static array in the Data Segment built before _start ever saw the light of day.

libraries

libc.so

In this section:

  • getenv()
  • mmap()

The standard C library, libc.so, is included in (added to) every process. It contains wrappers for many of the calls to the operating system including fork(), exec(), exit(), and mmap(), plus utility functions such as getenv() and a great many others.

getenv()

This shared library function searches the static array of environment variables. It is used when something in a process needs to know one of the settings (in the environment).

mmap()

This system call can enable access to memory-mapped files as well as chunks of memory including RAM and other hardware spaces that may be shared, or excluded from sharing, with other processes.

libpthread.so

In this section:

  • pthread_create()
  • pthread_exit()

The pthreads shared library (the POSIX Threading Library) is required when a process needs additional threads. That library contains wrappers for the operating system services for pthread_create(), pthread_exit(), and other thread-related services.

pthread_create()

Threads are created using this call. Each of these additional threads is provided with a fixed-size stack, begins execution at a function named by one of the parameters, and receives other information through another.

pthread_exit()

The thread invoking this system call is terminated.

Note that in some systems, a thread’s stack may be de-allocated at this point. All of its automatic variables would disappear from existence at that point in time.

But in other systems, that memory is not de-allocated, so in theory, another thread could reference something on the now-dead thread’s stack.

This is considered bad style. Don’t do it.

When a thread goes away, you should consider it and all of its belongings as gone, gone, gone.

It’s worth adding that when the last thread leaves a process, the operating system invokes exit() on its behalf. All open files are closed, allocated memory is released, and the control structures used by the operating system for that process are destroyed.

Last one out? The operating system will turn off the lights.

Types of Variables

Automatic

Variables with this property exist within a stack, and only when the calling-tree that created them still persists.

Conversely, when a function returns, all of its automatic variables are de-allocated.

Note that de-allocated does not mean the same thing as being destroyed or made inaccessible. On the contrary, while an automatic variable may be de-allocated, the memory in which it resides is still there. And if that memory has not been changed by subsequent execution of another function, then the old value may, by pure happenstance, continue to remain. But it is not considered valid.

Generally speaking, generating a pointer to an automatic variable and passing it to another function that stores it in a static variable is suspect. If that static copy is later used but the original function has executed a return, the contents of the automatic are no longer valid but there’s no way for anyone to know. It’s a latent bug in the application and, like all bugs, it will eventually surface and lead to incorrect behaviour by one of the threads in the process.

Automatic variables are defined with local symbols only; in C, a global symbol always names a static variable.

Static

Variables with this property exist in the Data Segment for the lifetime of the process. They are readable and writable by all threads.

(The other kind of variables are called automatic. They come and go during execution.)

Static variables may be defined with either global or local symbols.

Example Static Variables

In this section:

  • static integer variable
  • static array
  • static struct

static integer variable

The size of a static integer depends on the machine architecture. Typically, it is 32 or 16 bits long, but some machines use only 8 bits, and in a few they will be 64 bits long.

In the stone age of computing, variables were even more variable in size. Twelve-bit integers are found in a few machines as are eighteen bits in others. And some analog-to-digital converters only provide ten bits of unsigned data that must be zero-padded in the missing significant bits when stored in larger integers.

static array

A static array is (several of) one kind of variable repeated over and over, and stored in the Data Segment.

Note that each item (element) could be a struct ( structure ). We would then have an array of structures, and if they are static, then they would appear in the Data Segment.

Conversely, elements in a structure could themselves be arrays, so we could have structures containing arrays, and arrays containing structures. Static puts them in the Data Segment, whereas automatic places them on the stack of a thread.

static struct

A complex structure containing multiple variables. As mentioned for the static array, there can be arrays of structures as well as structures containing arrays.

Global and Local Symbols

Write this down: “A symbol is not a variable.”

If you’re in the habit of saying “global variable” or “local variable,” please stop. It’s not the variable that is global or local. Only the name has that property, not the memory assigned to the variable.

Symbols (names) are known by the compiler. You specify them, and the compiler keeps track of where in your source code it mentions them. If you use that name wrong, you’ll get an error message from the compiler.

But once the code leaves the compiler, all bets are off.

Variables exist when your threads exist, either on a stack as an automatic variable, or in the Data Segment as a static variable. (Those names, the symbols by which we refer to variables, have been forgotten.)

At run-time, bugs in the software can damage memory locations in address spaces of the process. Any software, any thread can damage any of the writable memory. Everything in the Data and the Library Data Segment is fair game. And so is the stack of each and every thread within the process.

A buggy program can damage any variable, even those whose symbol names are unknown to it.

You’ve heard about loose cannons? They’re so-called because they can shoot and damage any and all things, regardless of what they’re called.

Bugs will damage local variables just as easily as global ones. And they’ll mess up the Library Data sometimes as well. And they’ll smudge, scribble, and eradicate stuff on the stack of your threads.

Global and local are no protection at run-time.

Only the compiler knows the difference, and he’s long gone when you run that program as a process.

Exceptions and Interrupts

Exception

An exception occurs when either a thread or an interrupt handler attempts something that is not permitted, or does something specifically intended to raise a condition and so result in an exception.

The exception will fall into either of two categories: Fatal or Non-Fatal.

Important Note

In some machine architectures, an interrupt from an input/output device may be termed an exception, whereas in other machines, the reverse is true. (Motorola and Intel are two of the disagreeing parties.)

Herein, the concepts of interrupt and exception are discussed as separate and unrelated phenomena.

For more information, please see Exception versus Interrupt History.

Fatal

When a fatal error occurs during the execution of one of the threads, the processor is unable to continue. For example, if a thread attempts to read or write an inaccessible memory location, the instruction has failed: it cannot be completed. Similarly, in some machines, attempting to divide by zero–a mathematically impossible operation–also causes an exception because the operation cannot be completed. There is no correct answer except “You can’t.”

In these cases, a fatal error is raised in the thread.

Note: Exception handling for the thread takes place at the same execution priority as normal. Nothing really changes in the thread’s world other than to figure out what to do next.

There are two possibilities: either the thread has provided an exception handler, or it has not.

The signal() system call is used to provide exception handlers within threads.

If the thread has provided an exception handler for the specific condition, the thread’s execution address is changed to that of the handler, and information about the offending (exception-causing) instruction is placed on the stack. The thread is then allowed to continue to execute (in the provided exception handler). Note that, as long as execution does not reach the close curly bracket (or otherwise return from the exception handler), the thread is allowed to continue.

If the thread has not provided an exception handler for this specific condition, the default behavior is to terminate the thread (on Linux and other POSIX systems, the default action ends the whole process). For most cases, this is desired because there’s probably something wrong with the program.

But not always. Sometimes, the capability of Surviving Fatal Errors is needed.

Incidentally, an exception in an interrupt handler is always fatal. There is no possible recovery. A system crash is the result. If you’re lucky, the computer will restart (reboot) and it won’t happen again.

There’s an interesting, and perhaps alarming, discussion of rebooting a commercial aircraft’s computer at https://aviation.stackexchange.com/questions/2072/is-rebooting-the-computer-normal-before-during-flights. (Don’t say I didn’t warn you.)

Surviving Fatal Errors

There is a way for threads to survive most fatal errors. The problem to be surmounted is the rule, “When a fatal error occurs, the signal handler may not return.”

Okay, so how do I get out of the exception handler without returning?

Remember setjmp() and longjmp() from school? This is what they’re for, getting out of one place and into another without calling or returning from the current execution.

Neat trick!

Three Steps

  1. In the thread’s initialization, the programmer should code a signal() and name the fatal exception to be caught and the address (name) of a signal handler for that specific situation.
  2. Then, just before attempting something that might cause the fatal exception, the programmer codes a setjmp(). In essence, the setjmp() says, “I’m going to jump off the bridge in a moment. If it kills me, please resurrect me back to this same spot, and when I get there–er, here–tell me I’ve been reborn.” This sets a mark in the code (and stack). Think of them as pointing to “here in my current reality.” Okay, the thread is ready. Let it jump off the bridge. If an exception does not occur, execution continues normally.
  3. But if an exception does happen, execution jumps to the signal handler. There, code a longjmp(). This restores the saved stack point and execution address. The thread, in essence, “jumps” back to the location of the setjmp(), and with a return value that indicates this “Oops, it happened!” condition. Suddenly, your thread is back on the bridge at the setjmp() but now with the knowledge that it did jump, and it did die.

Special Considerations

Two caveats are warranted.

  1. When a fatal error occurs and is “caught,” the system disarms (unregisters) the handler against a recurrence. A second instance of the same error, if the thread doesn’t re-register, will cause the thread to be terminated. But in the case where a thread might want to survive jumping off the bridge a second time, it must re-arm the signal() before each possible leap.
  2. POSIX says that efforts to survive SIGSEGV (Segmentation Violation) cannot be guaranteed. POSIX says this is because the “context” of the thread has been damaged. Taking that as the literal truth, it might be possible to sacrifice one context (one thread) and replace it with another. In a sense, that would be one way to “survive” fatal errors.

Non-Fatal

These exceptions are sometimes erroneously compared to interrupts in that the thread’s execution is unexpectedly yanked away.

But this is incorrect.

It is important to know that the signal handling code is executed as a normal part (and priority) of the thread. The transfer from normal code to signal handler is a “call” just like any other in the code except that you can’t see it.

signal()

This system call arms and disarms the reception of signals. The programmer provides the name of the signal to be handled and the name of the function to be executed. It is the original method and is now one of two ways of doing so; the other is sigaction().

FYI: Sending signals employs the kill() system call.

Something interesting in this regard is the Surviving Fatal Errors maneuver.

Interrupt

An interrupt tells the system that something of interest has taken place with some input/output device or that an increment of time has passed. The former typically means that a thread that has been waiting while reading or writing some data may now resume. And the latter (the clock tick) may mean that the current thread, if it is using the SCHED_RR policy, has completed its time-slice and it’s now time for some other thread to run.

In this presentation, the concepts of interrupt and exception are treated as separate and unrelated, regardless of how processor manufacturers may try to differentiate themselves.

For more information, please see the Exception versus Interrupt History.

Process System Calls

In this section:

  • fork()
  • exec()
  • exit()

fork()

Creates a clone of the current process including all of its allocated memory (and thread stacks), but with only a single thread of execution.

It is often used with exec(), allowing one program’s process to launch a different program. That new program will have its own process with a new initial thread that begins life at _start, which in turn calls its own main.

exec()

Replaces a process’s executable code, data, shared libraries, etc. with those from an object file. An initial thread is created and allowed to execute.

exit()

When invoked by any of the threads within a process, all registered exit handlers are invoked. Typically, most of the system libraries have termination handlers. The libc.so exit handler, for example, closes all open files and does a free() for all chunks of memory allocated by the library.

All threads within the process are then terminated, and memory belonging to the process is released back to the operating system.

The process is no more.

Kaput.

Finito.

The End.

Thread

Threads are all about execution.

When a process is born, the initial thread executes. It always begins at _start.

Important Note: The initial thread has some properties that are different from those of any additional threads.

All threads, like the mouse in our box, have properties. Each thread will have a unique ID so we can manipulate it, and two parameters that determine its scheduling: an execution priority and a scheduling policy. It will also have a stack, a place to temporarily store its personal errno, and a signal mask.

During the lifetime of the process, additional threads may be generated by existing ones. And while any thread is executing, it may suffer an exception.

The Thread History discusses how the idea for threads came from the real-time world.

Two examples, elevate.c and threads1000.c, are available to compile and run. (Verified on Linux. YMMV.)

Initial Thread

When a process is given a new body of code (by the action of exec() typically soon after a fork()), a single thread, the initial thread, is created. The address of execution for this thread is set to the address of _start. Because this is the initial thread, it is typically given a generous stack, and during its execution, it is allowed to grow it up to some limit. That thread may choose to spawn additional threads via the pthread_create() operation.

Note that if this initial thread invokes pthread_exit(), only the one (initial) thread is affected. The only exception is if it is the last (or only) thread in the process. In that case, the process is removed.

Also note that if any thread in the process invokes exit() then the entire process and all of its threads are terminated.

Additional Threads

Any thread within the process may create additional threads with the pthread_create() system call in libpthread.so. Each of these additional threads will be given a stack of the size stated in the arguments. These additional threads are not permitted to grow their stacks. Should they attempt to use more than is allocated, an exception is raised, and by default, the offending thread is terminated along with its process.

errno

The errno global, static variable, according to the textbooks, resides in the Data Segment. Any of the active threads may apparently reference it, cause it to change value, and then expect its contents to explain why some system call didn’t work.

But in a multi-threaded process, each thread could have a different value in errno. How is this possible?

To explain how one global variable can be made to work with multiple threads of execution, we need to consider two kinds of processors: those that execute one thread at a time, and those that truly do execute multiple threads at the same instant.

One at a Time Processors

When the processor is only capable of executing one thread at a time, the contents of the global errno variable are copied in and out of the thread’s control block. That is, when the operating system decides to run a given thread, it copies the thread’s private copy of its variable into the global errno. Then, the thread is allowed to execute.

The thread can then look at the global errno, make system calls, and if they indicate an error, check the global errno to find out what happened.

When the operating system decides to make that thread inactive–perhaps it is being preempted by a higher priority thread, or maybe it has just decided to sleep for a while–the OS copies the contents of the global errno variable back into the thread’s private copy for safekeeping.

True Concurrent Execution Processors

In processors that execute multiple threads concurrently, a different mechanism is used. In these, there will be no errno global, static variable. Instead, a compile-time macro will change all references to errno so that a privately-stored, thread-specific location is used instead. (It’s usually the same place each thread would use in the One at a Time Processors case previously discussed.)

Scheduling

Trivial example C code: elevate.c and threads1000.c

Fair or Prejudiced?

That’s the choice.

Should we be fair and let everyone’s computer program get done, or are some programs special?

The answer, of course, is it depends.

Is the computer running the check-out at a public library, or is it flying an airplane with 467 souls on board?

Scheduling policy tells the system how a thread, not the process, is to be treated.

If it’s going to be fair and let other threads time-share (via time-slicing) the computer, then it will tell the operating system it will use SCHED_RR, round-robin scheduling.

But if this thread is landing a Boeing 747–8F weighing a quarter of a million pounds on Chicago’s O’Hare 10L with a gusty crosswind and might need all of those 13,000 feet of runway, then little Bobby in 49E is going to have to wait to scan the bar code on the back of his checkout, “Knuffle Bunny.”

And that thread better be using SCHED_FIFO, running with superuser privilege, and at an elevated execution priority.

“Try it now, Bobby,” Mom says as the sixteen tires of the main gear screech in the center of the broad white stripes of 10L’s runway aim point. “Maybe the computer was busy.”

The video and audio system in a commercial airliner might be running on a Linux box. It serves the data streams from a common store, and we’d want the computer to be fair. Everyone gets access to the same material. As system designers, we’d just need to make sure the computer can sustain 467 simultaneous MP4 data streams on a fair and equal basis.

The programs in these kinds of computers typically use the non-privileged settings. When their pthread_create() is issued, it’ll say, “This thread will use round-robin (SCHED_RR) scheduling at the default, non-superuser priority.”

But the flight control system? Fair and equal? Or is it special?

Some computers do special work. They fly airplanes, steer guided missiles, drive autonomous vehicles on public highways, pump blood and regulate oxygenation as a donor heart is carefully guided into the opening in a patient’s chest.

In the early realm of real-time computing, several Real-Time Operating Systems (RTOSes) such as pSOS, VxWorks, Nucleus, and Integrity established what the pThreads standard adopted as the SCHED_FIFO policy with elevated execution priority. When the committee designing the pThreads standard came along, they knew about the RTOS realm, and wanted to be able to include their requirements.

Threads doing this kind of real-time work will, in the pThreads model, be created with superuser privilege. They truly are special. And in the pthread_create() call that starts their execution at some function–not main–the programmer would specify SCHED_FIFO and some non-zero, elevated execution priority, possibly the highest available (255 in some implementations). That way, when a sudden gust from the right shoves the airplane toward the grass beside the runway, the flight control system can shove the massive rudder to the right now, and adjust the ailerons to keep the right wing down but not so far as to drag the tip on the concrete. And, if that’s not enough, then it can decide to sound the abort klaxon and throttle up the engines because “We’re going around, baby!”

Choose one combination or the other:

  • SCHED_RR, lowest execution priority, don’t have to be superuser, and because it’s SCHED_RR then time-slicing will apply, or
  • SCHED_FIFO, an elevated execution priority, superuser privilege is required to use SCHED_FIFO, and it will ignore time-slicing.

The first lets the computer try to be fair when considering this thread.

The second is decidedly prejudiced. This thread is special.

Note that while it is technically possible to use SCHED_RR at an elevated, non-zero priority, it doesn’t make a lot of sense. That’s because as long as any thread at that priority is ready to run, the operating system is going to run them. And if there’s only one thread ready at that priority, it’ll use up its time, it’ll still be ready, and so the operating system will give it another slice, and another, and another. Time-slicing yourself accomplishes nothing. And if you and your buddies choose to time-slice amongst yourselves–all at the same, elevated priority–and amongst yourselves never relinquish the computer all at the same time, then nobody at a lower priority will ever get to run.

In the pecking order of deciding which thread gets to run, the pthreads standard dictates that a thread’s execution priority is the sole determinant. Later, and only when the system clock interrupts with a tick, will the scheduling policy (SCHED_FIFO versus SCHED_RR) be consulted. And when a SCHED_RR thread’s allotment for time-slicing is up but it’s still ready to run, then it will continue to run at the same elevated priority. The operating system is obligated to choose the highest-priority ready thread, and if it’s the same one, then so be it.

The operating system may adjust time-slicing amongst the peons to try and be fair. But a thread’s execution priority is inviolate. Only that thread or another in the same process can change it. And they almost never do.

“But that’s not fair!” You cry.

“Yes,” the RTOS world answers. “It’s not supposed to be.”

elevate.c

    /* elevate.c - Example use of POSIX priority */
    #include <stdio.h>
    #include <stdlib.h>	/* exit() */
    #include <unistd.h>	/* sleep() */
    #include <sched.h>
    int
    main(void)
    {
    	int	rc;
    	struct	sched_param	posix_param;
    	printf("Raising my priority into POSIX land.\n");
    	posix_param.sched_priority = 1;
    	rc = sched_setscheduler(0, SCHED_FIFO, &posix_param);
    	if (rc != 0) {
    		perror("sched_setscheduler() failed");
    		fprintf(stderr, "\tHint: You must be the root user.\n");
    		exit(1);
    	}
    /* WARNING
     *
     * This process is now running at an elevated POSIX priority.
     * If this process does printf's and you are running it in an XWindow,
     * you will not see the printf output until this process blocks which
     * then permits the XWindows subsystem (running as a normal process
     * in some Unixes) to update your display.
     *
     * Also note that this will affect mouse position updates on your
     * display if you modify the following code and put this process
     * into a hard loop. BEFORE DOING ANY SUCH MODIFICATION, make sure
     * you save any edits in progress, and then "sync" your system
     * because, if you've given the program no way out of a hard spin,
     * you won't be able to abort it (^C in an XWindow requires the
     * XWindow subsystems operation but if this program is in a hard
     * spin, it will preempt the XWindow subsystem and you won't be able
     * to cancel this program).
     *
     */
    	printf("Now using an elevated priority and SCHED_FIFO.\n");
    	printf("Sleeping for 10 seconds.\n");
    	sleep(10);
    	printf("Good-bye!\n");
    	exit(0);
    }

threads1000.c

    /* threads1000.c - Create MAX_THREADS (1000) and have them exist at the same time
     *
     * Build thusly: cc threads1000.c -l pthread -o threads1000
     */
    #include <unistd.h>
    #include <stdio.h>
    #include <stdint.h>	/* intptr_t */
    #include <string.h>	/* strerror() */
    #include <pthread.h>
    #define MAX_THREADS 1000
    int thread_counter; /* Must be a global variable so all threads can reference it by name */
    void * thread_code(void *); /* Threads execute this function (see below) */

    int main() {
    	pthread_t child_id;
    	int result;
    	for (thread_counter = 1; thread_counter <= MAX_THREADS; thread_counter++) {
    		/* Pass thread_counter (by value, carried in the pointer) to each thread as his "id" */
    		result = pthread_create(&child_id, NULL, &thread_code, (void *)(intptr_t)thread_counter);
    		if(result != 0) {
    			/* pthread_create() returns the error number; it does not set errno */
    			fprintf(stderr, "pthread_create(): %s\n", strerror(result));
    			break;
    		}
    	}
    	thread_counter--; /* Off by 1. Fix it. */
    	while(thread_counter > 0) {
    		/* See "Warning" below */
    		printf("Waiting for %d threads to go away.\n", thread_counter);
    		sleep(1);	/* Not very timely but it works */
    	}
    	printf("All threads are gone\n");
    	return 0;
    }
    void * thread_code(void * thread_id) {
    	printf("Thread %d is here and going to sleep for 10 seconds.\n", (int)(intptr_t) thread_id);
    	sleep(10);
    	printf("Thread %d going away now.\n", (int)(intptr_t) thread_id);
    	/* Warning: The following "thread_counter--" is somewhat dangerous.
    	 * If one thread were to preempt another in the midst of its decrement,
    	 * the count *could* get messed up. But as this program is written,
    	 * any such preemption is unlikely. For simplicity, we will run the risk.
    	 * (The *correct* solution would be to protect the updates of thread_counter
    	 * with a semaphore so that only one thread at a time is modifying it.)
    	 */
    	thread_counter--;
    	return NULL;	/* Returning from here is the same as pthread_exit() */
    }

Index

History

EDSkinner.net began in 2023. Fiction and non-fiction publications are included as well as (blog) posts and supplemental materials from flat5.net (2004-present).

© Copyright 2024 by E D Skinner, All rights reserved