matmul benchmark

Benchmarks several WebGPU matrix-multiplication (matmul) programs on 4096x4096 matrices.
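For a sense of scale, the sketch below works out how much arithmetic one 4096x4096 multiply involves and how a measured time could be converted to throughput. It assumes the usual 2·n³ operation count for a dense n x n matmul (one multiply and one add per inner-loop step). The function names `matmulFlops` and `gflopsPerSecond` are hypothetical and are not taken from the benchmark's code.

```typescript
// Hypothetical helpers for interpreting matmul timings; not part of the benchmark itself.

/** Floating-point operations in a dense n x n by n x n matmul (assumes 2 * n^3). */
function matmulFlops(n: number): number {
  return 2 * n * n * n;
}

/** Convert an elapsed time in milliseconds to GFLOP/s for an n x n matmul. */
function gflopsPerSecond(n: number, elapsedMs: number): number {
  if (elapsedMs <= 0) {
    throw new RangeError("elapsedMs must be positive");
  }
  const seconds = elapsedMs / 1000;
  return matmulFlops(n) / seconds / 1e9;
}

const N = 4096;
console.log(matmulFlops(N)); // 137438953472 (about 137 GFLOP per multiply)
```

Dividing by the measured time this way puts different programs on a common scale, so a faster kernel shows up as a higher GFLOP/s figure rather than just a smaller number of milliseconds.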