# SMP(Symmetric Multi-Processor) Primer for Android

## 0)Introduction

SMP stands for “Symmetric Multi-Processor”. It describes a CPU design with two or more cores. Until a few years ago, all Android devices were single-core.

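## 1)Theory

### 1.1)Memory consistency models

The model most programmers implicitly assume is sequential consistency, which can be described as follows:

• All memory operations appear to execute one at a time.
• All operations on a single processor appear to execute in the order described by that processor’s program.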

To illustrate these points it’s useful to consider small snippets of code, commonly referred to as litmus tests. These are assumed to execute in program order, that is, the order in which the instructions appear here is the order in which the CPU will execute them. We don’t want to consider instruction reordering performed by compilers just yet.

Here’s a simple example, with code running on two threads:

| Thread 1 | Thread 2 |
| --- | --- |
| A = 3 | reg0 = B |
| B = 5 | reg1 = A |

In this and all future litmus examples, memory locations are represented by capital letters (A, B, C) and CPU registers start with “reg”. All memory is initially zero. Instructions are executed from top to bottom. Here, thread 1 stores the value 3 at location A, and then the value 5 at location B. Thread 2 loads the value from location B into reg0, and then loads the value from location A into reg1. (Note that we’re writing in one order and reading in another.)

Thread 1 and thread 2 are assumed to execute on different CPU cores. You should always make this assumption when thinking about multi-threaded code.

Sequential consistency guarantees that, after both threads have finished executing, the registers will be in one of the following states:

| Registers | States |
| --- | --- |
| reg0=5, reg1=3 | possible (thread 1 ran first) |
| reg0=0, reg1=0 | possible (thread 2 ran first) |
| reg0=0, reg1=3 | possible (concurrent execution) |
| reg0=5, reg1=0 | never |

To get into a situation where we see B=5 before we see the store to A, either the reads or the writes would have to happen out of order. On a sequentially-consistent machine, that can’t happen.
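The list of outcomes above can be reproduced mechanically. The sketch below is not a real concurrency test: it assumes a sequentially consistent machine and simply enumerates every interleaving of the four operations that preserves each thread's program order. The class and method names (`LitmusInterleavings`, `outcomes`) are made up for illustration.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper: enumerates sequentially consistent executions of the litmus test.
class LitmusInterleavings {
    // Returns every reachable result, formatted like "reg0=5, reg1=3".
    static Set<String> outcomes() {
        Set<String> results = new TreeSet<>();
        explore(0, 0, new int[4], results);   // all memory and registers start at zero
        return results;
    }

    // state = {A, B, reg0, reg1}; t1 and t2 count the steps each thread has completed.
    private static void explore(int t1, int t2, int[] state, Set<String> results) {
        if (t1 == 2 && t2 == 2) {
            results.add("reg0=" + state[2] + ", reg1=" + state[3]);
            return;
        }
        if (t1 < 2) {                          // thread 1 takes its next step
            int[] next = state.clone();
            if (t1 == 0) next[0] = 3;          // A = 3
            else next[1] = 5;                  // B = 5
            explore(t1 + 1, t2, next, results);
        }
        if (t2 < 2) {                          // thread 2 takes its next step
            int[] next = state.clone();
            if (t2 == 0) next[2] = next[1];    // reg0 = B
            else next[3] = next[0];            // reg1 = A
            explore(t1, t2 + 1, next, results);
        }
    }

    public static void main(String[] args) {
        for (String outcome : outcomes()) {
            System.out.println(outcome);
        }
    }
}
```

Running it prints the three "possible" rows; `reg0=5, reg1=0` never appears, because no program-order-preserving interleaving can produce it.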

Most uni-processors, including x86 and ARM, are sequentially consistent. Most SMP systems, including x86 and ARM, are not.

## 2)Practice

### 2.2)What not to do in Java

#### 2.2.1)“synchronized” and “volatile” in Java

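The “synchronized” keyword provides the Java language’s built-in locking mechanism. Every object has an associated “monitor” that can be used to provide mutually exclusive access.

The implementation of a “synchronized” block has the same basic structure as a spin lock: it begins with an acquiring CAS and ends with a releasing store. This means that compilers and code optimizers are free to migrate code into a “synchronized” block. One practical consequence: you must not conclude that code inside a synchronized block happens after the code above it or before the code below it in a function. Going further, if a method has two synchronized blocks that lock the same object, and no operation in the intervening code is observable by another thread, the compiler may perform “lock coarsening” and combine them into a single block.

The effect of volatile accesses can be illustrated with an example. If thread 1 writes to a volatile field, and thread 2 subsequently reads from that same field, then thread 2 is guaranteed to see that write and all writes previously made by thread 1. More generally, the writes made by any thread up to the point where it writes the field will be visible to thread 2 when it does the read. In effect, writing to a volatile is like a monitor release, and reading from a volatile is like a monitor acquire.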
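As a minimal sketch of that release/acquire pairing, the example below hands a plain field from one thread to another through a volatile flag. The class, field, and method names are hypothetical and chosen only for illustration.

```java
// Illustrative sketch; the class and method names are hypothetical.
class VolatileHandoff {
    private int payload;              // ordinary (non-volatile) field
    private volatile boolean ready;   // the volatile field both threads touch

    void produce() {                  // runs in thread 1
        payload = 42;                 // plain write, made before...
        ready = true;                 // ...the volatile write (acts like a release)
    }

    int consume() {                   // runs in thread 2
        while (!ready) {              // volatile read (acts like an acquire)
            Thread.yield();
        }
        return payload;               // sees 42: it was written before ready
    }

    static int runOnce() {
        VolatileHandoff h = new VolatileHandoff();
        int[] seen = new int[1];
        Thread reader = new Thread(() -> seen[0] = h.consume());
        Thread writer = new Thread(h::produce);
        reader.start();
        writer.start();
        try {
            writer.join();
            reader.join();            // join() makes seen[0] visible to this thread
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println(runOnce());
    }
}
```

If `ready` were not volatile, the reader could spin forever or observe a stale `payload`; the guarantee comes from both threads using the same volatile field.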

#### 2.2.2)Examples

    class Counter {
        private int mValue;

        public int get() {
            return mValue;
        }
        public void incr() {
            mValue++;
        }
    }


The statement mValue++ actually performs three separate operations:

1. reg = mValue
2. reg = reg + 1
3. mValue = reg
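One way to keep another thread from slipping in between steps 1 and 3 is to make the whole read-modify-write atomic. The sketch below swaps the plain int for `java.util.concurrent.atomic.AtomicInteger`; declaring both methods `synchronized` would be another option. The `hammer` driver is a hypothetical helper added only to exercise the class.

```java
import java.util.concurrent.atomic.AtomicInteger;

class AtomicCounter {
    private final AtomicInteger mValue = new AtomicInteger();

    public int get() {
        return mValue.get();
    }

    public void incr() {
        mValue.incrementAndGet();    // load, add, and store as one atomic step
    }

    // Hypothetical driver: several threads increment the same counter.
    static int hammer(int threads, int incrementsPerThread) {
        AtomicCounter counter = new AtomicCounter();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++) {
                    counter.incr();
                }
            });
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return counter.get();
    }

    public static void main(String[] args) {
        System.out.println(hammer(4, 10_000));   // no increments are lost
    }
}
```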

    class MyGoodies {
        public int x, y;
    }
    class MyClass {
        static MyGoodies sGoodies;

        void initGoodies() {    // runs in thread 1
            MyGoodies goods = new MyGoodies();
            goods.x = 5;
            goods.y = 10;
            sGoodies = goods;
        }

        void useGoodies() {    // runs in thread 2
            if (sGoodies != null) {
                int i = sGoodies.x;    // could be 5 or 0
                ....
            }
        }
    }


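(Note that when sGoodies is declared volatile, only the reference itself is volatile; accesses to the fields inside it are not. The statement z = sGoodies.x performs a volatile load of MyClass.sGoodies followed by a non-volatile load of sGoodies.x. If you make a local reference MyGoodies localGoods = sGoodies, then z = localGoods.x performs no volatile loads.)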
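A sketch of the volatile variant, under the assumption that the fix is simply to declare the shared reference volatile. The holder class `GoodiesDemo` and its `runOnce` driver are hypothetical; the methods are made static here only to keep the example short.

```java
class MyGoodies {
    public int x, y;
}

// Hypothetical holder class, for illustration only.
class GoodiesDemo {
    static volatile MyGoodies sGoodies;        // only this reference is volatile

    static void initGoodies() {                // runs in thread 1
        MyGoodies goods = new MyGoodies();
        goods.x = 5;
        goods.y = 10;
        sGoodies = goods;                      // volatile store publishes x and y
    }

    static int useGoodies() {                  // runs in thread 2
        MyGoodies localGoods = sGoodies;       // volatile load
        while (localGoods == null) {
            Thread.yield();
            localGoods = sGoodies;             // volatile load
        }
        // Plain loads through the local reference; they follow the volatile load above.
        return localGoods.x + localGoods.y;
    }

    static int runOnce() {
        sGoodies = null;
        int[] sum = new int[1];
        Thread user = new Thread(() -> sum[0] = useGoodies());
        Thread init = new Thread(GoodiesDemo::initGoodies);
        user.start();
        init.start();
        try {
            init.join();
            user.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return sum[0];
    }

    public static void main(String[] args) {
        System.out.println(runOnce());         // x + y as seen by thread 2
    }
}
```

Copying the reference into `localGoods` also avoids reading `sGoodies` twice, which matters if another thread could replace it between reads.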

    class MyClass {
        private Helper helper = null;

        public Helper getHelper() {
            if (helper == null) {
                synchronized (this) {
                    if (helper == null) {
                        helper = new Helper();
                    }
                }
            }
            return helper;
        }
    }


    if (helper == null) {
        // acquire monitor using spinlock
        while (atomic_acquire_cas(&this.lock, 0, 1) != success)
            ;
        if (helper == null) {
            newHelper = malloc(sizeof(Helper));
            newHelper->x = 5;
            newHelper->y = 10;
            helper = newHelper;
        }
        atomic_release_store(&this.lock, 0);
    }


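• Do the simple thing and delete the outer check. This ensures that we never examine the value of helper outside a synchronized block.
• Declare helper volatile. With this one small change, the code in the preceding example works correctly on Java 1.5 and later.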
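The second fix can be sketched as follows. The `Helper` body (fields x and y, taken from the pseudocode), the class name `LazyHelperHolder`, and the `distinctHelpers` driver are assumptions added for illustration.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

class Helper {
    int x = 5, y = 10;
}

class LazyHelperHolder {
    private volatile Helper helper = null;    // the one-word fix: volatile

    public Helper getHelper() {
        if (helper == null) {                 // volatile read, no lock taken
            synchronized (this) {
                if (helper == null) {
                    helper = new Helper();    // volatile write publishes x and y
                }
            }
        }
        return helper;
    }

    // Hypothetical driver: counts the distinct Helper objects seen by several threads.
    static int distinctHelpers(int threads) {
        LazyHelperHolder holder = new LazyHelperHolder();
        Set<Helper> seen = ConcurrentHashMap.newKeySet();
        Thread[] workers = new Thread[threads];
        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> seen.add(holder.getHelper()));
            workers[i].start();
        }
        for (Thread worker : workers) {
            try {
                worker.join();
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        return seen.size();
    }

    public static void main(String[] args) {
        System.out.println(distinctHelpers(8));   // every thread got the same object
    }
}
```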

    class MyClass {
        int data1, data2;
        volatile int vol1, vol2;

        void setValues() {    // runs in thread 1
            data1 = 1;
            vol1 = 2;
            data2 = 3;
        }

        void useValues1() {    // runs in thread 2
            if (vol1 == 2) {
                int l1 = data1;    // okay
                int l2 = data2;    // wrong
            }
        }
        void useValues2() {    // runs in thread 2
            int dummy = vol2;
            int l1 = data1;    // wrong
            int l2 = data2;    // wrong
        }
    }


The code in useValues2() uses a second volatile field, vol2, in an attempt to force the VM to generate a memory barrier. This generally won’t work. To establish a proper “happens-before” relationship, both threads need to interact with the same volatile field. You would have to know that vol2 was set after data1/data2 in thread 1. (The fact that this doesn’t work is probably obvious from looking at the code; the caution here is against trying to cleverly “cause” a memory barrier instead of creating an ordered series of accesses.)

### 2.3)What to do

Be extremely circumspect with “volatile” in C/C++. It often indicates a concurrency problem waiting to happen.

In Java, the best answer is usually to use an appropriate utility class from the java.util.concurrent package. The code is well written and well tested on SMP.
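As one hedged illustration of leaning on java.util.concurrent, the sketch below counts words from several pool threads using `ConcurrentHashMap.merge`, which updates each key atomically. The class name, the pool size of 4, and the 30-second timeout are arbitrary choices for the example.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical example class.
class ConcurrentTally {
    // Counts how many times each word is reported, from several pool threads at once.
    static Map<String, Integer> tally(String[] words, int repeats) {
        Map<String, Integer> counts = new ConcurrentHashMap<>();
        ExecutorService pool = Executors.newFixedThreadPool(4);
        for (int i = 0; i < repeats; i++) {
            for (String word : words) {
                // merge() performs the read-modify-write atomically for this key.
                pool.execute(() -> counts.merge(word, 1, Integer::sum));
            }
        }
        pool.shutdown();                       // no new tasks; queued ones still run
        try {
            if (!pool.awaitTermination(30, TimeUnit.SECONDS)) {
                throw new IllegalStateException("tasks did not finish in time");
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return counts;
    }

    public static void main(String[] args) {
        System.out.println(tally(new String[] {"a", "b"}, 1000));
    }
}
```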

Perhaps the safest thing you can do is make your class immutable. Objects from classes like String and Integer hold data that cannot be changed once the object is created, avoiding all synchronization issues. The book Effective Java, 2nd Ed. has specific instructions in “Item 15: Minimize Mutability”. Note in particular the importance of declaring fields “final” (Bloch).

If neither of these options is viable, the Java “synchronized” statement should be used to guard any field that can be accessed by more than one thread. If mutexes won’t work for your situation, you should declare shared fields “volatile”, but you must take great care to understand the interactions between threads. The volatile declaration won’t save you from common concurrent programming mistakes, but it will help you avoid the mysterious failures associated with optimizing compilers and SMP mishaps.
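A minimal sketch of guarding shared fields with “synchronized”: two fields that must change together are only ever touched while holding the object's monitor. The account scenario and all names are invented for illustration.

```java
// Hypothetical example: both fields are guarded by the monitor of 'this'.
class GuardedAccounts {
    private int checking = 100;
    private int savings = 0;

    synchronized void transfer(int amount) {   // moves money from checking to savings
        checking -= amount;
        savings += amount;
    }

    synchronized int total() {                 // reads both fields under the same lock
        return checking + savings;
    }

    // Returns true if a concurrent observer never saw a half-finished transfer.
    static boolean totalsStayConsistent(int iterations) {
        GuardedAccounts accounts = new GuardedAccounts();
        boolean[] ok = {true};
        Thread mover = new Thread(() -> {
            for (int i = 0; i < iterations; i++) {
                accounts.transfer(i % 2 == 0 ? 1 : -1);
            }
        });
        Thread checker = new Thread(() -> {
            for (int i = 0; i < iterations; i++) {
                if (accounts.total() != 100) {
                    ok[0] = false;
                }
            }
        });
        mover.start();
        checker.start();
        try {
            mover.join();
            checker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return ok[0] && accounts.total() == 100;
    }

    public static void main(String[] args) {
        System.out.println(totalsStayConsistent(100_000));
    }
}
```

Note that the reader is synchronized too; locking only the writer would not give the observer a consistent view.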

The Java Memory Model guarantees that assignments to final fields are visible to all threads once the constructor has finished; this is what ensures proper synchronization of fields in immutable classes. This guarantee does not hold if a partially-constructed object is allowed to become visible to other threads, so it is necessary to follow safe construction practices (see Safe Construction Techniques in Java).
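A small sketch of an immutable class that follows these rules: every field is final, and the constructor never lets `this` escape. The class `ImmutablePoint` is hypothetical.

```java
// Hypothetical immutable value class.
final class ImmutablePoint {
    private final int x;   // final: visible to all threads once the constructor returns
    private final int y;

    ImmutablePoint(int x, int y) {
        // Safe construction: 'this' is not handed to any other object or thread here.
        this.x = x;
        this.y = y;
    }

    int x() { return x; }
    int y() { return y; }

    // "Mutation" returns a new object; the original never changes.
    ImmutablePoint withX(int newX) {
        return new ImmutablePoint(newX, y);
    }

    public static void main(String[] args) {
        ImmutablePoint p = new ImmutablePoint(5, 10);
        ImmutablePoint q = p.withX(7);
        System.out.println(p.x() + "," + p.y() + " -> " + q.x() + "," + q.y());
    }
}
```

Registering a listener or starting a thread from inside the constructor would be examples of letting `this` escape, which would void the final-field guarantee.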

#### 2.3.2)Synchronization primitive guarantees

The pthread library and VM make a couple of useful guarantees: all accesses previously performed by a thread that creates a new thread are observable by that new thread as soon as it starts, and all accesses performed by a thread that is exiting are observable when a join() on that thread returns. This means you don’t need any additional synchronization when preparing data for a new thread or examining the results of a joined thread.
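In Java terms, the start/join guarantee can be sketched like this: the input and result arrays are plain, non-volatile data, and the only synchronization is `start()` and `join()`. The class and method names are illustrative.

```java
// Hypothetical example class.
class StartJoinDemo {
    static int sumViaWorker(int[] input) {
        int[] result = new int[1];             // plain array: no volatile, no locks
        Thread worker = new Thread(() -> {
            int sum = 0;
            for (int value : input) {          // sees everything written before start()
                sum += value;
            }
            result[0] = sum;
        });
        worker.start();
        try {
            worker.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return result[0];                      // visible because join() has returned
    }

    public static void main(String[] args) {
        System.out.println(sumViaWorker(new int[] {1, 2, 3}));   // prints 6
    }
}
```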

Whether or not these guarantees apply to interactions with pooled threads depends on the thread pool implementation.

In C/C++, the pthread library guarantees that any accesses made by a thread before it unlocks a mutex will be observable by another thread after it locks that same mutex. It also guarantees that any accesses made before calling signal() or broadcast() on a condition variable will be observable by the woken thread.

Java language threads and monitors make similar guarantees for the comparable operations.
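The Java counterpart of a mutex plus condition variable is a monitor with `wait()` and `notifyAll()`. The one-slot mailbox below is a hypothetical sketch: the value written before `notifyAll()` is observable by the thread that wakes up and reacquires the monitor.

```java
// Hypothetical one-slot mailbox guarded by its own monitor.
class OneSlotMailbox {
    private Integer slot;                      // guarded by 'this'; null means empty

    synchronized void put(int value) {
        slot = value;                          // written while holding the monitor
        notifyAll();                           // comparable to broadcast() on a condition variable
    }

    synchronized int take() {
        while (slot == null) {                 // loop guards against spurious wakeups
            try {
                wait();                        // releases the monitor while waiting
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
                throw new IllegalStateException(e);
            }
        }
        int value = slot;
        slot = null;
        return value;
    }

    static int roundTrip(int value) {
        OneSlotMailbox box = new OneSlotMailbox();
        int[] received = new int[1];
        Thread consumer = new Thread(() -> received[0] = box.take());
        consumer.start();
        box.put(value);
        try {
            consumer.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return received[0];
    }

    public static void main(String[] args) {
        System.out.println(roundTrip(7));
    }
}
```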

#### 2.3.3)Upcoming changes to C/C++

The C and C++ language standards are evolving to include a sophisticated collection of atomic operations. A full matrix of calls for common data types is defined, with selectable memory barrier semantics (choose from relaxed, consume, acquire, release, acq_rel, seq_cst).

See the Further Reading section for pointers to the specifications.

## 3)Closing Notes

While this document does more than merely scratch the surface, it doesn’t manage more than a shallow gouge. This is a very broad and deep topic. Some areas for further exploration:

• Learn the definitions of happens-before, synchronizes-with, and other essential concepts from the Java Memory Model. (It’s hard to understand what “volatile” really means without getting into this.)
• Explore what compilers are and aren’t allowed to do when reordering code. (The JSR-133 spec has some great examples of legal transformations that lead to unexpected results.)
• Find out how to write immutable classes in Java and C++. (There’s more to it than just “don’t change anything after construction”.)
• Internalize the recommendations in the Concurrency section of Effective Java, 2nd Edition. (For example, you should avoid calling methods that are meant to be overridden while inside a synchronized block.)
• Understand what sorts of barriers you can use on x86 and ARM. (And other CPUs for that matter, for example Itanium’s acquire/release instruction modifiers.)
• Read through the java.util.concurrent and java.util.concurrent.atomic APIs to see what's available.
• Consider using concurrency annotations like @ThreadSafe and @GuardedBy (from net.jcip.annotations).

The Further Reading section in the appendix has links to documents and web sites that will better illuminate these topics.