They also use three internal buses - one for program and two for data (one per operand). This extends the standard Harvard architecture beyond the usual trick of simply adding a cache: the processor can fetch the instruction and both operands at the same time.
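To see why the extra data bus matters, here is a rough sketch of the memory traffic in a multiply-accumulate loop such as an FIR filter. It assumes an idealized model that is not taken from any Motorola data sheet: each filter tap needs one instruction fetch and two operand fetches, and each bus carries one access per cycle. The function names and the 64 tap filter are purely illustrative.

```python
from math import ceil

# Assumed workload: each multiply-accumulate tap needs one instruction
# and two operands (for example a sample and a coefficient).
INSTRUCTION_FETCHES_PER_TAP = 1
OPERAND_FETCHES_PER_TAP = 2


def cycles_per_tap(program_buses: int, data_buses: int) -> int:
    """Cycles needed to fetch everything one tap requires.

    Idealized assumption: every bus carries one access per cycle.
    program_buses == 0 means instructions share the data bus(es),
    as in a single-bus von Neumann machine.
    """
    if program_buses == 0:
        total = INSTRUCTION_FETCHES_PER_TAP + OPERAND_FETCHES_PER_TAP
        return ceil(total / data_buses)
    instruction_cycles = ceil(INSTRUCTION_FETCHES_PER_TAP / program_buses)
    operand_cycles = ceil(OPERAND_FETCHES_PER_TAP / data_buses)
    # Program and data fetches overlap, so the slower side sets the pace.
    return max(instruction_cycles, operand_cycles)


def fir_fetch_cycles(taps: int, program_buses: int, data_buses: int) -> int:
    """Total fetch cycles for a filter with the given number of taps."""
    return taps * cycles_per_tap(program_buses, data_buses)


if __name__ == "__main__":
    taps = 64  # hypothetical filter length
    layouts = [
        ("single shared bus", 0, 1),
        ("Harvard (1 program + 1 data)", 1, 1),
        ("three buses (1 program + 2 data)", 1, 2),
    ]
    for name, program_buses, data_buses in layouts:
        cycles = fir_fetch_cycles(taps, program_buses, data_buses)
        print(f"{name}: {cycles} cycles")
```

Under these assumptions the single shared bus needs three cycles per tap, a plain Harvard layout needs two, and the three bus layout needs one. Real devices add pipelining, caches and wait states, so this shows only the trend.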
Of course, the problem with 24 bit fixed point is its expense, which probably explains why Motorola later produced the cheap, 16 bit DSP56156 - which looks like a 16 bit variant of the DSP56002:
And of course there has to be a floating point variant - the DSP96002 looks like a floating point version of the DSP56002:
The DSP96002 supports multiprocessing with an additional 'global bus' which can connect to other DSP96002 processors. It also has a new DMA controller with its own bus.