Review of C++ Programming Basics

2025-04-27

This note is suitable for readers who have systematically studied C++ and wish to increase programming proficiency (i.e., reduce reliance on Copilot, search engines, and documentation) and review basic concepts. However, this article will also progressively explain difficult points in programming.

There are many topics that could be included but are not, such as inheritance, polymorphism, operator overloading, etc. However, the author currently has fewer opportunities to use them and can review them when needed. This article prioritizes reviewing more fundamental knowledge (such as references) and more commonly used tools (such as the Standard Template Library or STL).

C Language Programming Basics

C++ is an extension of the C language and inherits most of its syntax. Questions that arise during C++ programming may be legacy issues from C programming. We discuss several concepts that can easily cause difficulties, including constants (const), static, and pointers.

Before discussing, please ensure you are first familiar with how to deduce the meaning of types using the spiral rule.

Constant Pointers and Pointers to Constants

const T * p; // pointer to constant, you cannot modify the object being pointed to via p
T const * p; // same as above
T* const p; // constant pointer, once initialized, cannot be changed to point to a different memory location
T const * const p; // constant pointer to constant
const T* const p; // same as above

General pointers can be converted to constant pointers, but not vice versa.

void writeit(int * x);
void readit(const int * x);
int x = 3;
int * p = &x;
const int * cp = p;
writeit(cp); // Error
readit(cp);

Arrays and Pointers

Arrays and pointers are methods of pointing to storage space. The "two-dimensional" pointer structure has the following types, which can be determined by the spiral rule:

int a[5][5]; // 2d array of ints
int *b[5]; // array of (pointer to int)
int ** c; // pointer to (pointer to int)

Their semantics are different: a can be converted to a pointer to int int *p = a; b and c are both pointers to pointers. For a, b, c, the memory layout generated by the compiler is also different:

a: int[ a[0][1], ..., a[0][4], a[1][0], ... ,a[1][4], ..., a[4][4] ]
b: ptr[ b[0], ..., b[4] ]
         \-> int      \-> int
c: ptr 
     \-> ptr
           \-> int

Ultimately, multidimensional arrays are a notation in C language that makes it more convenient for the compiler and programmers. A multidimensional array of int is just a constant pointer to int. You can convert a multidimensional array to a pointer to T for use, but you cannot turn a pointer into any multidimensional array. Note the way array parameters are declared in the example below.

void use_array(int a[]){...}
void use_ptr(int *a){...}

int a[5];
int *b = a; // arr to ptr OK
int *c = new int[5];
use_array(c); // ptr to arr OK

void use_array_2d(int a[5][5]){...} // a[][5] also works (1)
void use_ptr_ptr(int **a){...}

int d[5][5];
use_array_2d(d);
use_ptr_ptr(d); // ERROR (different memory layout)
use_ptr((int*) d); // OK, same memory layout

int *e = new int[25];
use_array_2d( (int[5][5]) e ); // ERROR (compiler forbids)

Multidimensional arrays in C can be completely replaced by one-dimensional arrays and manual index calculations:

We can use arr+((i*j_dim)+j)*k_dim+k for &arr[i][j][k]
That's why we do not need to specify first dimension in (1)

Static

"Static" concerns lifecycle and scope:

The lifecycle of a static variable is from the start to the end of the entire program. If uninitialized, it is stored in the .bss segment; if initialized, it is stored in the .data segment. For example:

void f(){
  static int cnt = 5;
  cout << cnt++ << endl;
}
int main(){
  f(); f(); // prints 5, 6
}

The scope of static global variables and static functions is limited to the file in which they are defined.

References in C++

C++ introduced the reference mechanism.

Lvalues and Rvalues

Lvalues: Variables with addresses
Rvalues: Variables without addresses, such as literals and intermediate results

References

References, by default, are lvalue references and are an alias mechanism. Compared to pointers, their differences or advantages are:

  1. Must be initialized and cannot be null, making them safer

  2. Cannot be changed: Once bound to a variable, the object it points to cannot be changed.

int a = 5;
int &ref_a = a;
int const &ref_a = a; // constant reference is the same as reference to constant
int &const ref_a = a; // Error: no "constant reference" of this kind because every reference cannot be changed
int &ref_5 = 5; // Error: a Lvalue reference cannot refer to a Rvalue
int const &ref_5 = 5; // a constant Lvalue reference can refer to a Rvalue

General references can be converted to constant references, but not vice versa.

Rvalue References

Rvalue references are a new feature in C++11 and are aliases for temporary objects. Use the && symbol to reference rvalues. Rvalue references allow modification of temporary objects and are mainly used to implement move semantics and avoid unnecessary copies. For example:

int &&ref_5r = 5; // ref_5r is an rvalue reference bound to the literal 5
ref_5r = 6; // Valid, rvalue references can modify the object they are bound to

Rvalue references cannot be directly bound to lvalues, but std::move can be used to convert lvalues to rvalues. For example:

#include <utility>
int a = 5;
int &&ref_r = std::move(a); // Use std::move to convert lvalue a to an rvalue expression, allowing ref_r to bind to it

The situation where lvalue references bind to rvalues has already been seen: int const &ref_5 = 5;


The variable after `std::move` is undefined, its "value" has been taken by the move target. You can refer to the following content (from [here](https://blog.csdn.net/menghaocheng/article/details/102504602)).

```C++
int main() {
    string strA = "Hello";
    cout << strA << endl;
    string strB = move(strA);
    cout << strA << endl; // undefined, might print blank
    cout << strB << endl;
}

The new function signature for C++ 11 vector's pushback is void push_back (value_type&& val);, which uses rvalue references to reduce the cost of copying.

C++ Object-Oriented Programming Summary

In object-oriented programming, how to initialize and clean up is determined by the class designer; when to initialize and clean up is determined by the compiler. We control the former by designing constructors and destructors. We will first discuss these two, then discuss the new rules for the const and static modifiers in classes.

To avoid some common typos:

Constructors

Before executing the function body of the constructor, the class's member variables must be initialized first. If the member variables are in the initializer list, they are initialized accordingly; otherwise, their default constructor is called.

Default Constructor

A default constructor is a constructor that takes no parameters or has default arguments for every parameter. It is implicitly defined when the user does not define any constructor, otherwise, it can be manually requested by using A() = default.

Copy Constructor

The copy constructor, A(const A& src), is called in the following scenarios:

A a; A b(a);
A c = a;
void f(A a) // as argument
A f() // as return value

const A& src can bind to both lvalues and rvalues, but it incurs the cost of copying.

Move Constructor

The move constructor, A(A&& src), can directly utilize the heap memory of a temporary object that is about to be destructed. You can test the move constructor with the following example:

class T{
    public:
    int *buf;
    T(): buf(new int(0)) {}
    T(T&& t): buf(t.buf) {
      cout << "Test(Test&& t) called, buf=" << hex << buf << endl;
      t.buf=nullptr;
    }
};
T getT(){
    T t;
    return t; //calls T(T&& t)
}
int main(){
    T x = getT(); //calls T(T&& t)
}

The default compiler return value optimization will prevent the move constructor from being called, resulting in no output. You can disable it, for example, when compiling with g++: g++ move-sem.cpp -std=c++11 -fno-elide-constructors -o move-sem

Here's another example of using move to reduce the cost of copying:

template <class T>
swap (T& a, T& b){
  T tmp(std::move(a));
  a = std::move(b);
  b = std::move(tmp);
}

Destructor

Destructors are unique, have no return value, and take no parameters. After executing the function body of the destructor, the destructors of member variables are automatically called.

Constant Members

Constant data members can only be initialized in place or in the initializer list, and cannot be assigned in the constructor's function body.

Constant member functions cannot modify the state of the class. They are of the form type func(...) const {...}. If the object itself is a constant variable const A a;, it can only call constant member functions, which makes sense.

Static Members

Besides having a lifecycle from the start to the end of the program, static member variables static int x; and static member functions static int func() have the core feature of being shared by all objects of the class. Static function members cannot access non-static members because non-static members depend on specific objects.

Static variables are declared in the class definition but must be defined and initialized outside the class, for example:

class T{
  public:
    static int count; // Declaration
    T(){count++;}
};
int T::count = 0; // Definition and Initialization

Utility: C++ Standard Library

The C++ STL (Standard Template Library) is frequently used in actual programming, so this section is created as a memo to record its features and usage. The STL consists of many parts, including containers, algorithms, iterators, input and output, etc.

Containers

Pair and Tuple

template<class T1, class T2> struct pair{
  T1 first;
  T2 second;
};
auto p = std::make_pair("Foo", 1.0);
p.first; p.second;
std::tuple<int, char, std::string> get_tuple(){
   return {0, 'a', "Alpha"};
}
auto t = get_tuple();
std::get<0>(t); // same for 1,2
int e0; char e1; std::string e2;
std::tie(e0, e1, e2) = get_tuple();

Sequence Containers

Elements in sequence containers have an order. Containers like vector and list belong to sequence containers.

Associative Containers

Elements in associative containers need to be accessed according to their numerical order. However, for examples like set (unordered set) and map (associative array), they can also be iterated over for sequential access: Internally, elements are sorted by key, and the iterator returned iterates in order. The data structure used for their implementation is a binary search tree (specifically, a red-black tree).

If you don't need to access elements in order externally, you can use unordered_set and unordered_map, which are implemented internally using hash tables. Therefore, they (rather than set and map) correspond to Set and Dict in Python. Nevertheless, we can still use iterators to access them (the internal implementation might just traverse all the buckets).

set<int> members;
members.insert(x);
auto it = members.find(p);
if(it!=members.end()) members.erase(it);
members.count(5); // 0 or 1

map<string, int> s;
s["foo"] = 1;
cout << s["bar"] << endl; // outputs 0. When key not already exists, it gets inserted, and mapped to a default value of the mapped type.
// find, count, erase is the same as set

Iterators and Their Algorithms

Iterators are data types that inspect and traverse elements within a container. Iterators are a nested class of the container type (note the distinction from derived class / subclass).

class vector {
  class iterator {
    ... // iterator is a nested class
  }
};

int solve(vector<int>& nums){
  sort(nums.begin(), nums.end()); // left inclusive, right exclusive
  vector<int>::iterator bd = upper_bound(nums.begin(), nums.end(), val);
  int r = bd - nums.begin() - 1; // subtraction for iterators
  // above is same as distance(nums.begin(), bd) - 1
}

Note that iterators become invalid when the container is moved or changed, such as with vector's insert, erase, push_back. The iterator validity in the container's method documentation will indicate whether a method will invalidate iterators.

algorithm

The algorithm standard library contains algorithms for manipulating containers, with very commonly used ones including:

  1. std::sort, which sorts elements in a range in ascending order by default, with the default comp being std::less, meaning there is no case where the right is less than the left. If you need to sort in descending order?

    • std::sort(s.begin(), s.end(), std::greater());

    • std::sort(s.begin(), s.end(), [] (int a, int b){return a>b;} );

  2. std::lower_bound, which finds the leftmost element that is not less than value.

Assuming the array is sorted in ascending order:

const std::vector<int> v{1, 2, 4, 5, 5, 6};
int y = 5
auto i1 = std::lower_bound(v.begin(), v.end(), y); // leftmost element not less than y, i1=3. Therefore, i1-1 is the rightmost element less than y.
auto i2 = std::upper_bound(v.begin(), v.end(), y); // leftmost element greater than y, i2=5。Therefore, i2-1 is the rightmost element no greater than y.

Illustration:

 value
  /|\
   |                   eee
 y |     eeeeeeeeeeeeee|
   | eeee|             |
   |     |             |
   +-----+-------------+---->idx
    (lower_bd)    (upper_bd)

Assuming the array is sorted in descending order:

const std::vector<int> v{6, 5, 5, 4, 2, 1};
int y = 5
auto i1 = std::lower_bound(v.begin(), v.end(), y, std::greater<int>());
auto i2 = std::upper_bound(v.begin(), v.end(), y, std::greater<int>());

Illustration:

 value
  /|\
   | eeee
 y |     eeeeeeeeeeeeee
   |     |             eee
   |     |             |
   +-----+-------------+---->idx
    (lower_bd)    (upper_bd)

Regarding the meaning of ordered before and ordered after in the documentation:
If comp is std::less, then ordered before means less than, and ordered after means greater than.
If comp is std::greater, then ordered before means greater than, and ordered after means less than.