This section gives a brief overview of the C++ features you will need for this course. It is not a complete introduction to C++, and it intentionally omits many advanced language features. Our goal is to give you the right mental model for reading and writing the code used in the labs and in the provided library.
C++ is often regarded as a scary language, because it contains many advanced features. While this is true, you do not need to know most of those features to write most code! In particular, in 15-210, we will not require any of the following C++ features:
You are assumed to already be comfortable with basic C-style syntax (variables, conditionals, loops, functions) from 15-122. This guide is meant as a quick start on extending your knowledge of C from 15-122 with the minimum amount of C++ needed to succeed in 15-210.
The most important idea to understand first about C++, because it differs from most other languages you may have learned (Java, Python, JavaScript), is that it uses value semantics. Variables store values, and assigning a variable or passing it to a function makes a copy by default.
int x = 5;
int y = x;
y = 7;
// x is still 5
This also applies to containers, which is even more important, since the copy-behavior of most containers is a deep copy.
std::vector<int> a{1, 2, 3};
std::vector<int> b = a;
b[0] = 10; // a is still {1, 2, 3}, while b is now {10, 2, 3}
This behavior is very different from languages like Python or Java, where assignment usually creates another reference to the same object. It is also different from SML, where values are immutable. In C++, values are mutable, but copying them produces an independent copy. For objects like containers, this means that the default behavior is expensive! Copying an entire container takes linear work since every element must be copied, while producing a reference is essentially free.
const
A reference is an alias for an existing object. Reference types are written by adding &
after the name of the type, so int& (usually pronounced "int ref" when spoken) is the type of
a reference to an integer.
int x = 5;
int& r = x; // r is a reference to the same object as x
r = 7; // x is now 7
& symbol, this is not in any way shape
or form related to C's address-of operator, which is applied to an object to obtain its address.
When written to the right of a typename, & denotes a reference type.
References have two main purposes:
std::vector<int> a{1, 2, 3};
const std::vector<int>& b = a;
b[0] = 10; // compilation error! b is declared const so it can not be mutated
In this example, the reference is const but the underlying
vector is still mutable. The reference can therefore see mutation:
std::vector<int> a{1, 2, 3};
const std::vector<int>& b = a;
a[0] = 10; // valid, and now b[0] == 10 since its a reference
For additional safety, const can be applied to object types, not
just reference types. This makes the object logically immutable.
const std::vector<int> a{1, 2, 3};
const std::vector<int>& b = a;
a[0] = 10; // compile error!
b[0] = 10; // compile error!
Function parameters also follow the same value category rules as variables. They can be value types, reference types, and can be const or not. Accepting a parameter by value results in a copy. This is okay for cheap types (like integers) but is not ideal for large types where deep copying them could be expensive, like most container types (vectors, sequences, etc).
void f(std::vector<int> v); // Copies v: takes linear work, bad
Function parameters can also be passed as mutable references. This avoids copies but allows the function to mutate the argument which makes it not functional style and harder to reason about. You also can not pass temporaries/literals as mutable references.
void f(std::vector<int>& v); // No copy, but mutable, dangerous
For the best of both worlds, a good default way to take parameters is as const references, since this both prevents expensive copies and prevents mutation.
void f(const std::vector<int>& v); // No copy, no mutation, nice
A function parameter of type const T& means "share this object read-only." In other
words, it is close to the semantics of SML and other functional languages, and therefore
very useful for functional-style code!
Most sequence parameters in this course are passed as constant references inside generic code to
avoid accidental mutation or accidental expensive copies.
std::vector<int>&&, but this is more advanced than what we need for this class.
auto and Generic Functions
The auto keyword asks the compiler to deduce a type automatically.
You may see it used for variable declarations:
auto x = 3; // int
auto y = 2.5; // double
This is not however the most important use of auto. In this course, auto is especially important for writing generic functions.
auto f(auto x) {
// works for any parameter type
}
This says that the function f can take an argument of any type. The type of the parameter s is deduced from the type of the
argument.
Importantly, generic function parameters follow the same value category rules
as ordinary function parameters (with one exception that we will mention below).
They can be accepted by value, mutable reference, or const reference.
auto f(auto x); // Makes a copy of the argument
auto f(auto& x); // (Potentially) mutable reference to the argument
auto f(const auto& x); // Immutable reference to the argument
You will see auto frequently in function parameters and return types throughout
the library. As a rule of thumb: if a generic function should not modify nor copy its argument,
it should take it as a const auto& parameter. Just like non-generic functions, this should be our default way of passing parameters.
f(auto& x) binds its parameter as a potentially mutable reference? Because that pattern can in fact bind to a const reference as well! If you call f(auto& x) using an argument of type const int, then auto will deduce the underlying type as const int, which materializes the generic parameter into const int&. In other words, auto is a placeholder for a type, and const is not just an annotation, its part of the type.
There is one final way of taking generic parameters, which you will not have to write but you may see frequently in the library signatures: auto&&. The full description and rules of what auto&& does are too complicated, but the summary is that auto&& binds to the best-matching value category given its argument. You should just think of auto&& as binds to literally anything.
auto&& exists, consider a generic function f that we want to accept any argument. What value category should we use if we don't know anything about the type(s) in advance?
f(auto x) won't work for uncopyable types, and copying is often undesirable due to efficiency anyway.f(auto& x) will not compile when called with a temporary (e.g., f(5)) since a reference cannot bind to a temporary.f(const auto&) won't accept types that require mutation to function. Unfortunately some such types exist.f(auto&& x) solves this problem! It binds to references or non-referencable objects (e.g., temporaries/literals) and it binds to const or non-const. It is the most generic parameter possible!
Lambda functions are a convenient syntax for defining function objects in C++. In other words, they are the embodiment of 15-150's motto that "functions are values!". They are heavily used in this course to express functional-style algorithms.
Syntactically, lambdas are defined by writing
square brackets [] followed by what looks like a usual function definition (a parameter list in brackets followed by a body).
auto square = [](int x) {
return x * x;
};
Lambdas are often passed directly as arguments to algorithms:
auto squares = par::map(a, [](int x) {
return x * x;
});
By default, lambdas are not allowed to access local variables from their enclosing
scope. To allow this, we must explicitly capture variables by listing them inside the
square brackets [].
int k = 3;
auto add_k = [k](int x) { // captures a copy of k
return x + k;
};
Lambda captures follow the same value/reference rules as ordinary variables.
By default, captures are copies. To capture a reference, add &.
int k = 3;
auto add_k = [&k](int x) { // captures k by reference
return x + k;
};
When capturing by reference, changes to the variable are visible both inside and outside the lambda.
int k = 3;
auto f = [&k](int x) { return x + k; };
k = 10;
int y = f(5); // y == 15
You can also use a default capture. Writing [=] captures everything by value,
and writing [&] captures everything by reference.
In this course, we will usually use [&] when writing temporary lambdas that
are immediately passed to higher-order functions (such as tabuluate, map, or reduce).
This avoids unnecessary copies and is safe in these cases.
The C++ standard library includes several lightweight "vocabulary types" that are used throughout the code in this course.
std::pairA pair stores a pair of values, which may be of different types.
std::pair<int, int> p{3, 4};
Pairs are often used to return two values from a function. The elements of a pair can be
read either from their member variables p.first and p.second or
by using a structured binding:
auto [x, y] = p; // x = 3 and y = 4
Structured bindings follow similar semantics to variable declarations: they can be declared as non-references (which will cause the right-hand side to be copy initialized), or as references or const references to the elements of the right-hand side.
std::pair<int, int> p{3, 4};
auto [x, y] = p; // copies
auto& [x, y] = p; // mutable reference
const auto& [x, y] = p; // immutable reference
std::tupleTuples generalize pairs to any number of elements and are commonly used with functions like
zip.
std::tuple<int, double, char> t{1, 2.5, 'a'};
auto [x, y, z] = t; // x = 1, y = 2.5, z = 'a'
std::optional
An std::optional<T> represents either "a value of type T"
or "no value." This is very useful as a return type when a function may or may not return a meaningful result.
std::optional<int> opt;
if (opt.has_value()) {
// value is present, use opt.value() to access it
some_function(opt.value());
}
We can write opt.has_value() in a conditional statement
to test whether they contain a value or not. To access the stored value, you can write
opt.value() or the more abbreviated syntax *opt (I prefer the
former because the latter makes it look like a pointer, but it's not!)
With these concepts in mind, you should be well equipped to read and write the C++ code used throughout this course.