Excessive copying of packets is the enemy of high-performance communication in multicomputer systems.
In the best case, there is one copy from random access memory (RAM) to the interface board at the source node, one copy from the source interface board to the destination interface board, and one copy from there to the destination RAM, for a total of three copies.
In many systems, however, it is worse. In particular, if the interface board is mapped into kernel virtual address space rather than user virtual address space, a user process can send a packet only by issuing a system call that traps to the kernel.
The kernel may then have to copy each packet into its own memory, both on output and on input.
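
The copy counts described above can be tallied with a short sketch. It is only an illustration under stated assumptions: the function name copy_path and the flag kernel_buffered are hypothetical, the transfer between the two interface boards is counted as one copy, and the kernel-buffered case assumes exactly one extra copy at each end (user buffer to kernel buffer on output, kernel buffer to user buffer on input).

```python
def copy_path(kernel_buffered: bool) -> list[str]:
    """Return the ordered copy steps for one packet, source to destination.

    kernel_buffered is a hypothetical flag: True models an interface board
    mapped into kernel virtual address space, with the kernel staging each
    packet in its own memory on output and on input.
    """
    steps = []
    if kernel_buffered:
        # The system call traps to the kernel, which stages the packet first.
        steps.append("source user buffer -> source kernel buffer")
        steps.append("source kernel buffer -> source interface board")
    else:
        steps.append("source RAM -> source interface board")

    # Transfer between the two interface boards (assumed to be one copy).
    steps.append("source interface board -> destination interface board")

    if kernel_buffered:
        steps.append("destination interface board -> destination kernel buffer")
        steps.append("destination kernel buffer -> destination user buffer")
    else:
        steps.append("destination interface board -> destination RAM")
    return steps


for kernel_buffered in (False, True):
    path = copy_path(kernel_buffered)
    print(f"kernel_buffered={kernel_buffered}: {len(path)} copies")
    for step in path:
        print("  " + step)
```

Under these assumptions the best case yields three copies and the kernel-buffered case yields five, since each kernel adds one copy to the path.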