Pointer
A pointer is a data type that is used to store a memory address, or a value of such a data type.
The meaning of a pointer
In most language implementations, a pointer is represented as the distance, or offset, from some fixed reference point in memory to the location to which it points. Many mainstream languages take this reference point to be the very start of the computer's memory, so most possible values for pointers will be inaccessible memory locations. Some languages even hide the existence of a reference point altogether; in such languages, a pointer's value cannot be determined, but sometimes you can still determine the difference between two pointers (an operation that uses one as the reference point and returns the distance to the other). In esoteric languages, the reference point normally corresponds to something that can be accessed, such as the first instruction of the program or the start of the data area; in such cases, pointers are normally small integers (and many languages with this sort of pointer don't make a distinction between pointers and integers). For instance, in Malbolge, a pointer value of 0 is the start of the shared code/data area.
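As a minimal sketch of the two views described above, the C fragment below models a toy memory in which a pointer is just a small integer offset from the start, and also shows the pointer-difference operation on native C pointers. The names memory, MEMORY_SIZE, store, load and distance are hypothetical and chosen only for illustration.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical toy memory: a "pointer" is an offset from memory[0]. */
enum { MEMORY_SIZE = 16 };
static unsigned char memory[MEMORY_SIZE];

/* Store a value through a small-integer pointer. */
void store(size_t pointer, unsigned char value)
{
    assert(pointer < MEMORY_SIZE);   /* offsets past the end are invalid */
    memory[pointer] = value;
}

/* Load the value that a small-integer pointer refers to. */
unsigned char load(size_t pointer)
{
    assert(pointer < MEMORY_SIZE);
    return memory[pointer];
}

/* Difference between two native pointers into the same array, in elements:
   `from` acts as the reference point and the result is the distance to `to`. */
ptrdiff_t distance(const unsigned char *from, const unsigned char *to)
{
    return to - from;
}
```

In the toy memory, pointer value 0 is the first accessible cell, much like the Malbolge example. With native C pointers, only the difference is meaningful to portable code, and only when both pointers refer into the same array.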
Data structures using pointers
In both mainstream and esoteric programming languages, pointers are often used to build data structures.
Arrays
In some languages, the easiest way to represent an array is to reserve a contiguous block of memory and store a pointer to its first element. Other elements of the array can be accessed by taking an offset from that pointer; for instance, if an array uses single-byte elements, its third element is pointed to by the base pointer plus 2 (arrays implemented this way are normally zero-based). This technique is commonly used to implement brainfuck in languages that have pointers: a pointer starts at the leftmost element of the tape, the < and > instructions decrement and increment that pointer, respectively, and the + and - instructions increment and decrement what it points to.
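The sketch below illustrates both ideas in C, assuming single-byte elements: reading an element through a base pointer plus an offset, and the four brainfuck data instructions expressed as operations on a tape pointer. The function names are hypothetical, and no bounds checking is done.

```c
#include <assert.h>
#include <stddef.h>

/* Read element `index` of a byte array, given only a pointer to its first
   element. *(base + index) means the same thing as base[index]. */
unsigned char element_at(const unsigned char *base, size_t index)
{
    assert(base != NULL);
    return *(base + index);
}

/* The four brainfuck data instructions. Each takes the address of the tape
   pointer so that < and > can move the pointer itself. */
void bf_left(unsigned char **p)  { --*p;  }   /* <  move pointer left   */
void bf_right(unsigned char **p) { ++*p;  }   /* >  move pointer right  */
void bf_plus(unsigned char **p)  { ++**p; }   /* +  increment the cell  */
void bf_minus(unsigned char **p) { --**p; }   /* -  decrement the cell  */
```

Because the cells are unsigned bytes, decrementing a cell that holds 0 wraps around to 255, which is one common brainfuck convention rather than a requirement of the language.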
Linked lists
A linked list consists of a chain of data structures (nodes); each node contains some data and a pointer to another node of the same type. Linked lists can implement stacks and queues without difficulty, which makes them useful for implementing many stack-based and queue-based languages. It is also much easier to insert an element into the middle of a linked list than into an array, because no existing elements have to be moved.
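A singly linked list used as a stack might look like the following C sketch. The struct and function names are hypothetical, the data is assumed to be a single int, and the caller is assumed never to pop from an empty stack.

```c
#include <assert.h>
#include <stdlib.h>

/* One link in the chain: some data plus a pointer to the next node. */
struct node {
    int value;
    struct node *next;
};

/* Push onto the stack: the new node becomes the head of the list.
   Returns 0 on success, -1 if memory could not be allocated. */
int push(struct node **head, int value)
{
    struct node *n = malloc(sizeof *n);
    if (n == NULL)
        return -1;
    n->value = value;
    n->next = *head;
    *head = n;
    return 0;
}

/* Pop from the stack; the stack must not be empty. */
int pop(struct node **head)
{
    assert(*head != NULL);
    struct node *n = *head;
    int value = n->value;
    *head = n->next;
    free(n);
    return value;
}

/* Insert a new node directly after `position`. Only two pointers change;
   no other element is moved. */
int insert_after(struct node *position, int value)
{
    return push(&position->next, value);
}
```

A queue can be built the same way by additionally keeping a pointer to the last node, so that elements are added at one end and removed from the other.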
Instruction pointers
Pointers can point to code as well as to data. In some languages, no distinction is made between code and data; in such cases, whether a pointer points to code or data is not a feature of the pointer at all, and a code or data interpretation is placed on it when it is used. In other languages, such as the non-esoteric language C, pointers to code and pointers to data are different data types, and they cannot portably be used in place of one another.
Nearly all programming language implementations have something that corresponds to an instruction pointer: a pointer to the currently executing instruction. In compiled languages, the instruction pointer will often not correspond to anything in the original program; in interpreted esoteric languages, however, it often points into the original program itself.
In FukYorBrane, each of the competing programs can compare its current data pointer to the other program's instruction pointer(s), to try to determine which parts of the other program are executing. This brings up an interesting point: in multithreaded programming languages, there can be more than one instruction pointer at a time. In such cases, all the pointed-to instructions may execute at once.
In most one-dimensional languages, the instruction that follows the current one normally executes next (this corresponds to incrementing the instruction pointer), but some commands, such as loops and branches, move the pointer in other ways. A GOTO instruction effectively assigns to the instruction pointer (and in some languages, GOTO is written as an assignment to the instruction pointer); some languages even have a computed GOTO, which assigns a non-constant value to the instruction pointer.
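To make the role of the instruction pointer concrete, here is a minimal C sketch of an interpreter loop for a made-up one-dimensional instruction set. The opcodes, the struct layout and the function name run are all hypothetical; the instruction pointer is modelled as an index into the program, ordinary instructions increment it, and the GOTO variants assign to it.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical instruction set operating on a single accumulator. */
enum opcode { OP_INC, OP_DEC, OP_GOTO, OP_GOTO_ACC, OP_JUMP_IF_ZERO, OP_HALT };

struct instruction {
    enum opcode op;
    size_t target;   /* only used by OP_GOTO and OP_JUMP_IF_ZERO */
};

/* Run `program` and return the final accumulator value. Execution stops at
   OP_HALT or when the instruction pointer leaves the program. */
int run(const struct instruction *program, size_t length, int acc)
{
    size_t ip = 0;   /* instruction pointer: offset from the program start */

    assert(program != NULL);
    while (ip < length) {
        const struct instruction *current = &program[ip];
        switch (current->op) {
        case OP_INC:  acc++; ip++; break;             /* fall through to next */
        case OP_DEC:  acc--; ip++; break;
        case OP_GOTO: ip = current->target; break;    /* assignment to the IP */
        case OP_GOTO_ACC: ip = (size_t)acc; break;    /* computed GOTO */
        case OP_JUMP_IF_ZERO:
            ip = (acc == 0) ? current->target : ip + 1;
            break;
        case OP_HALT: return acc;
        }
    }
    return acc;
}
```

For example, the four-instruction program JUMP_IF_ZERO 3, DEC, GOTO 0, HALT counts the accumulator down to zero, using GOTO to form the loop.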
In languages with two or more dimensions, and in a few one-dimensional languages such as REVERSE, the direction in which the instruction pointer is moving is just as important as the target of the pointer itself in determining the flow of the program; in Befunge, for instance, flow control is done by changing the direction in which the instruction pointer is moving. Note also that in multidimensional languages, a pointer has to contain an offset from its reference point in each dimension, not just in one. Zero-dimensional languages like NULL often don't have the concept of an instruction pointer at all.
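A two-dimensional instruction pointer can be sketched in C as a position plus a direction of travel. The struct and function names below are hypothetical; the four direction commands and the wraparound at the edges of the grid follow Befunge, but this is an illustration, not a Befunge implementation.

```c
#include <assert.h>

/* A two-dimensional instruction pointer. */
struct ip2d {
    int x, y;     /* offset from the top-left corner, one per dimension */
    int dx, dy;   /* direction in which the pointer is currently moving */
};

/* Apply a Befunge-style direction command. Any other character leaves the
   direction unchanged. */
void turn(struct ip2d *ip, char command)
{
    switch (command) {
    case '>': ip->dx = 1;  ip->dy = 0;  break;
    case '<': ip->dx = -1; ip->dy = 0;  break;
    case 'v': ip->dx = 0;  ip->dy = 1;  break;
    case '^': ip->dx = 0;  ip->dy = -1; break;
    default:  break;
    }
}

/* Advance one cell in the current direction, wrapping around the edges of a
   width x height grid. */
void step(struct ip2d *ip, int width, int height)
{
    assert(width > 0 && height > 0);
    ip->x = (ip->x + ip->dx + width) % width;
    ip->y = (ip->y + ip->dy + height) % height;
}
```

Flow control then consists of calling turn when the pointer lands on a direction command, while step is the same for every instruction.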
Data pointers
One concept that exists in many esoteric programming languages, but few mainstream programming languages, is that of a designated data pointer. In such languages, any accesses by the program to its data storage are done via the pointer; brainfuck and Malbolge are examples. The pointer normally starts in a standard location, and there are normally instructions provided to move it around the data storage (although in Malbolge, it keeps moving through the program at a steady rate unless an instruction is used to set it to a new location). These languages normally need no way of specifying what instructions operate on other than the location of the data pointer, and so don't have named variables or any similar concept.
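The following C sketch of a small brainfuck interpreter shows how a designated data pointer removes the need for operands: every instruction is a single character and acts on whatever cell the pointer currently addresses. It is a simplified illustration under several assumptions: the input command is omitted, the tape has a hypothetical fixed size of 256 cells, brackets are assumed to be balanced, and the program is assumed to keep the pointer within the tape.

```c
#include <assert.h>
#include <stddef.h>

enum { TAPE_SIZE = 256 };   /* hypothetical tape length */

/* Run a brainfuck program on a zeroed tape. Output bytes are stored in
   `out` (at most out_size of them); returns the number of bytes stored. */
size_t run_bf(const char *program, char *out, size_t out_size)
{
    unsigned char tape[TAPE_SIZE] = {0};
    unsigned char *dp = tape;    /* the designated data pointer */
    const char *ip = program;    /* the instruction pointer */
    size_t written = 0;

    assert(program != NULL && out != NULL);
    for (; *ip != '\0'; ip++) {
        switch (*ip) {
        case '>': dp++;  break;
        case '<': dp--;  break;
        case '+': ++*dp; break;
        case '-': --*dp; break;
        case '.':
            if (written < out_size)
                out[written++] = (char)*dp;
            break;
        case '[':                       /* skip the loop if the cell is 0 */
            if (*dp == 0) {
                int depth = 1;
                while (depth > 0) {
                    ip++;
                    if (*ip == '[') depth++;
                    else if (*ip == ']') depth--;
                }
            }
            break;
        case ']':                       /* repeat the loop if the cell is not 0 */
            if (*dp != 0) {
                int depth = 1;
                while (depth > 0) {
                    ip--;
                    if (*ip == ']') depth++;
                    else if (*ip == '[') depth--;
                }
            }
            break;
        default: break;                 /* other characters are comments */
        }
    }
    return written;
}
```

Note that the program text never names a variable or an address; the only way to reach a different cell is to move the data pointer there first.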
Pointer terminology
- Dereference: to return what a pointer points to; for instance, dereferencing a pointer to the first element of an array returns the value of that array's first element.
- Taking an address: taking the address of a variable returns a pointer to the memory in which that variable is stored.
- Instruction pointer: a pointer that points to the currently executing instruction of a program; abbreviated to IP.
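In C, the first two terms correspond to the unary * and & operators. The helper functions below are hypothetical wrappers that exist only to give each operation a name.

```c
#include <assert.h>
#include <stddef.h>

/* Dereference: return the value that `pointer` points to. */
int dereference(const int *pointer)
{
    assert(pointer != NULL);
    return *pointer;
}

/* Write through a pointer: the variable whose address was taken changes. */
void set_through(int *pointer, int value)
{
    assert(pointer != NULL);
    *pointer = value;
}
```

Given a variable declared as int x = 5, the expression &x takes its address, so set_through(&x, 6) changes x to 6. Passing an array to dereference returns its first element, because in C an array used in this position is converted to a pointer to its first element.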