Quantisation errors can be minimised by keeping values large, so that as many bits as possible are used to represent them.
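There is a limit to how large numbers can be, determined by the precision (word length) of the hardware used for processing. If the maximum representable value is exceeded, the hardware may either overflow, so that the value wraps around to the opposite end of the range, or saturate, so that the value is clipped to the largest or smallest representable value.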
Saturation and overflow are both non-linear quantisation errors.
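The difference between the two behaviours can be illustrated with a minimal sketch. It assumes a 16-bit two's complement word, and the helper names `wrap16` and `saturate16` are hypothetical, chosen only for this illustration; real hardware does this in the arithmetic unit rather than in software.

```python
# Assumption: a 16-bit two's complement word (range -32768 .. 32767).
INT16_MIN, INT16_MAX = -32768, 32767


def wrap16(x: int) -> int:
    """Overflow: keep the low 16 bits and reinterpret them as a signed value."""
    return ((x + 32768) & 0xFFFF) - 32768


def saturate16(x: int) -> int:
    """Saturation: clip to the nearest representable 16-bit value."""
    return max(INT16_MIN, min(INT16_MAX, x))


total = 30000 + 10000       # 40000 does not fit in 16 bits
print(wrap16(total))        # -25536 (wrapped round to the negative end)
print(saturate16(total))    # 32767  (clipped at the positive limit)
```

In both cases the output is no longer proportional to the input, which is why both count as non-linear errors.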
Note that overflow, although it looks more drastic than saturation, may be preferred. It is a property of two's complement integer arithmetic that when a series of numbers is added together, the final result is correct so long as it lies within the range that can be represented, even if overflow occurs at intermediate stages.
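This property can be checked with a short sketch, again assuming a 16-bit word. The function names and the sample values are hypothetical, picked so that the running total leaves the representable range twice while the true sum (10000) fits.

```python
def wrap16(x: int) -> int:
    """Two's complement overflow for an assumed 16-bit word."""
    return ((x + 32768) & 0xFFFF) - 32768


def saturate16(x: int) -> int:
    """Saturating behaviour for the same assumed 16-bit word."""
    return max(-32768, min(32767, x))


def accumulate(values, limiter):
    """Add the values one at a time, applying `limiter` after every addition."""
    acc = 0
    for v in values:
        acc = limiter(acc + v)
    return acc


samples = [30000, 20000, -25000, -15000]   # true sum is 10000

# Wrapping: 30000 -> -15536 (overflow) -> 25000 (overflow back) -> 10000
print(accumulate(samples, wrap16))         # 10000

# Saturating: 30000 -> 32767 (clipped) -> 7767 -> -7233
print(accumulate(samples, saturate16))     # -7233
```

The wrapped accumulator recovers the correct answer because each overflow adds or subtracts exactly one full range (65536 here), and these cancel when the final result is in range. Once the saturating accumulator has clipped, the lost amount cannot be recovered.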
Overflow or saturation can be avoided by scaling the input to be small enough that overflow does not occur during the next stage of processing. There are two choices:
Scaling reduces the number of bits left to represent a signal (dividing down means some low bits are lost), so it increases quantisation errors.
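The loss of low bits can be seen in a small sketch. It assumes that scaling is done by a power of two, implemented as an arithmetic right shift; the function names and the sample value are hypothetical.

```python
def scale_down(x: int, shift: int) -> int:
    """Divide by 2**shift with an arithmetic right shift; the low bits are discarded."""
    return x >> shift


def roundtrip_error(x: int, shift: int) -> int:
    """Error left after scaling down and then scaling back up by the same factor."""
    restored = scale_down(x, shift) << shift
    return x - restored


sample = 12345
print(scale_down(sample, 2))        # 3086 (12345 / 4 = 3086.25, fraction lost)
print(roundtrip_error(sample, 2))   # 1
print(roundtrip_error(sample, 4))   # 9
```

Under these assumptions the error after a shift of `shift` bits can be as large as `2**shift - 1`, so each extra bit of headroom gained by scaling roughly doubles the worst-case quantisation error.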
Scaling requires an extra multiplier in the filter, which means more hardware:
Note that hardware with higher precision, or hardware that uses floating-point arithmetic, may not require scaling and so can implement filters with fewer operations.