When a system has several processes and each process has multiple threads, there are two levels of parallelism: processes and threads.
Scheduling in such systems differs substantially depending on whether user-level threads, kernel-level threads, or both are supported.
Consider user-level threads first. Since the kernel is not aware of the existence of threads, it operates as it always does, picking a process, say X, and giving X control for its quantum.
The thread scheduler inside X decides which thread to run, say X1. Since there are no clock interrupts to multiprogram threads, this thread may continue running as long as it wants to.
If it uses up the process's entire quantum, the kernel will select another process to run.
When process X finally runs again, thread X1 will resume running.
Thread X1 will continue to consume all of process X's time until it is finished. Other processes will not be affected by this behaviour. They will get whatever the scheduler considers their appropriate share, no matter what is going on inside process X.
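The behaviour described above can be sketched with a small simulation. This is only an illustration under stated assumptions: the function name `user_level_trace`, the extra thread X2, and the round-robin process queue are hypothetical and not taken from any real scheduler. It assumes the first thread of each process never yields, so the thread scheduler inside the process never gets a chance to pick another thread.

```python
from collections import deque

def user_level_trace(processes, quantum, total):
    """Simulate a kernel that schedules processes and knows nothing of threads.

    processes: dict mapping a process name to its list of thread names
    quantum:   msec the kernel grants a process per turn
    total:     msec of CPU time to simulate
    Assumption: the first thread of each process never yields.
    """
    proc_queue = deque(processes)          # the kernel sees only processes
    trace, clock = [], 0
    while clock < total:
        proc = proc_queue[0]
        proc_queue.rotate(-1)              # round-robin among processes
        run = min(quantum, total - clock)
        thread = processes[proc][0]        # the non-yielding thread keeps the CPU
        trace.append((proc, thread, run))
        clock += run
    return trace

print(user_level_trace({"X": ["X1", "X2"], "Y": ["Y1"]}, quantum=50, total=200))
# [('X', 'X1', 50), ('Y', 'Y1', 50), ('X', 'X1', 50), ('Y', 'Y1', 50)]
```

In this sketch X2 never runs, yet process Y still receives every second quantum, which matches the point that other processes are unaffected by what happens inside X.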
Now consider kernel-level threads. Here the kernel picks a particular thread to run. It does not have to take into account which process the thread belongs to, but it can if it wants to.
The thread is given a quantum and is forcibly suspended if it exceeds the quantum.
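With a 50 msec quantum and threads that block after 5 msec, the thread order for some period of 30 msec might be X1, Y1, X2, Y2, X3, Y3, something that is impossible with these parameters and user-level threads. With user-level threads the kernel would leave process X in control for its whole 50 msec quantum, so only threads of X could run during those 30 msec.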
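The 30 msec example can be reproduced with a minimal sketch. The function name `kernel_level_trace` and the interleaved initial ready queue are assumptions made for illustration. The sketch also simplifies by treating a blocked thread as ready again immediately and placing it at the tail of the queue.

```python
from collections import deque

def kernel_level_trace(ready_order, quantum, burst, total):
    """Simulate a kernel that schedules individual threads round-robin.

    ready_order: initial order of threads in the kernel's ready queue
    quantum:     msec a thread may run before it is forcibly suspended
    burst:       msec a thread runs before it blocks on its own
    total:       msec of CPU time to simulate
    Simplification: a thread that blocks rejoins the queue right away.
    """
    ready = deque(ready_order)
    trace, clock = [], 0
    while clock < total:
        thread = ready.popleft()
        # The thread stops when it blocks, when its quantum expires,
        # or when the simulated period ends, whichever comes first.
        run = min(burst, quantum, total - clock)
        trace.append(thread)
        clock += run
        ready.append(thread)
    return trace

order = ["X1", "Y1", "X2", "Y2", "X3", "Y3"]
print(kernel_level_trace(order, quantum=50, burst=5, total=30))
# ['X1', 'Y1', 'X2', 'Y2', 'X3', 'Y3']
```

Because the kernel chooses among all threads, threads of X and Y can alternate every 5 msec in this sketch, even though the quantum is 50 msec.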