Image positioning model
The Panorama Tools model for positioning images in a panorama assumes that all images are shot from a common viewpoint. A common viewpoint is the only way to avoid parallax between adjacent images, which can cause unrecoverable stitching errors. This point is determined by the lens geometry and is commonly called the no-parallax point.
To shoot a panorama, the camera can be rotated through three angles around this point: from side to side, up and down, and around the optical axis (like a steering wheel). The names used for these three angles in all panotools products are Yaw, Pitch and Roll. They are a special case of Euler angles, the so-called Tait–Bryan angles (both are described on Wikipedia). The same names, with the same meaning, are also used for the aircraft principal axes.
To account for small movements of the camera, this model has been extended with the translation parameters TrX, TrY and TrZ, which describe the movement of the camera in 3D (see Stitching a photo-mosaic for a more detailed description).
Images are positioned inside a virtual sphere, no matter what output projection is used. The center of the result canvas is always Yaw=0 and Pitch=0. Positive Yaw values mean the image is positioned to the right, negative values to the left. Positive Pitch is up, negative is down. Positive Roll values mean the image is rotated clockwise, negative values counterclockwise. The Yaw and Roll range is from -180° to +180°, with both 180° and -180° meaning the same position. The Pitch range is from -90° (Nadir) to +90° (Zenith). Yaw=0 is a vertical line through the canvas center. Pitch=0 is the equator of the virtual output sphere, a horizontal line through the canvas center. Roll=0 means the camera was exactly horizontal (for landscape-oriented images) or exactly vertical (for portrait-oriented images).
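The conventions above can be illustrated with a minimal sketch. It assumes a full 360° x 180° equirectangular output canvas with pixel coordinates that start at the top-left corner. The function names `wrap_angle`, `clamp_pitch` and `canvas_position` are hypothetical and are not part of any panotools program.

```python
def wrap_angle(angle_deg):
    """Wrap a Yaw or Roll angle into [-180, 180).

    +180 and -180 denote the same position, so +180 is returned as -180.
    """
    return (angle_deg + 180.0) % 360.0 - 180.0


def clamp_pitch(pitch_deg):
    """Limit Pitch to the range from -90 (Nadir) to +90 (Zenith)."""
    return max(-90.0, min(90.0, pitch_deg))


def canvas_position(yaw_deg, pitch_deg, width, height):
    """Map Yaw/Pitch to pixel coordinates on the canvas.

    Assumption: full 360 x 180 degree equirectangular canvas,
    x grows to the right, y grows downwards.
    """
    yaw = wrap_angle(yaw_deg)
    pitch = clamp_pitch(pitch_deg)
    x = (yaw / 360.0 + 0.5) * width      # positive Yaw moves right
    y = (0.5 - pitch / 180.0) * height   # positive Pitch moves up
    return x, y


# Yaw=0, Pitch=0 is the canvas center.
center = canvas_position(0.0, 0.0, 4000, 2000)
# Positive Yaw and Pitch: right of and above the center.
upper_right = canvas_position(90.0, 45.0, 4000, 2000)
```

Other output projections map the sphere to the canvas differently, but the meaning of the angles stays the same.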
The Yaw, Pitch and Roll values of a source image always refer to the optical axis. An image positioned at Yaw=0 and Pitch=0 has its optical axis in the canvas center. Please note that the actual source image center does not need to match the optical axis, because of Lens shift correction. Hence the boundaries of an image with Yaw=0 and Pitch=0 need not be centered on the result canvas. Other lens correction parameters don't affect image positioning, since they are applied symmetrically around the optical axis.
Relative image positions are usually obtained from control points in the optimization step of image alignment. Just as you need at least two pins to fix a printed image to a wall so that it cannot move, you need at least two control point pairs per image pair to fix their relative position. However, the distance between the control points might differ between the two images. It is the job of the optimizer to find the best approximation. See this post by Helmut Dersch: "Number of Control Points". More control points might be needed to optimize for lens distortion.
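The pin analogy can be sketched in a simplified planar setting. The example below is not the panotools optimizer, which works with Yaw, Pitch and Roll on the sphere. It is a least-squares fit of a 2D rotation and translation to control point pairs, and the names `fit_rigid_2d` and `residuals` are hypothetical. It shows that two pairs are enough to fix the relative position, and that a leftover error remains when the distance between the points differs in the two images.

```python
import math


def fit_rigid_2d(points_a, points_b):
    """Least-squares rotation angle (radians) and translation mapping A onto B."""
    n = len(points_a)
    if n < 2 or n != len(points_b):
        raise ValueError("need at least two control point pairs")
    # Centroids of both point sets.
    cax = sum(p[0] for p in points_a) / n
    cay = sum(p[1] for p in points_a) / n
    cbx = sum(p[0] for p in points_b) / n
    cby = sum(p[1] for p in points_b) / n
    # Accumulate dot and cross products of the centered points.
    s_dot = 0.0
    s_cross = 0.0
    for (xa, ya), (xb, yb) in zip(points_a, points_b):
        ua, va = xa - cax, ya - cay
        ub, vb = xb - cbx, yb - cby
        s_dot += ua * ub + va * vb
        s_cross += ua * vb - va * ub
    theta = math.atan2(s_cross, s_dot)
    c, s = math.cos(theta), math.sin(theta)
    # Translation that maps the rotated centroid of A onto the centroid of B.
    tx = cbx - (c * cax - s * cay)
    ty = cby - (s * cax + c * cay)
    return theta, tx, ty


def residuals(points_a, points_b, theta, tx, ty):
    """Distance between each transformed point of A and its partner in B."""
    c, s = math.cos(theta), math.sin(theta)
    return [
        math.hypot(c * xa - s * ya + tx - xb, s * xa + c * ya + ty - yb)
        for (xa, ya), (xb, yb) in zip(points_a, points_b)
    ]


# Two pairs with equal spacing in both images: an exact fit exists.
a = [(0.0, 0.0), (1.0, 0.0)]
b_exact = [(2.0, 3.0), (2.0, 4.0)]
fit_exact = fit_rigid_2d(a, b_exact)
err_exact = residuals(a, b_exact, *fit_exact)

# Spacing differs (1.0 versus 1.2): only a best approximation is possible.
b_stretched = [(2.0, 3.0), (2.0, 4.2)]
fit_stretched = fit_rigid_2d(a, b_stretched)
err_stretched = residuals(a, b_stretched, *fit_stretched)
```

In the second case the fit spreads the mismatch evenly, leaving the same residual at each control point.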