Other than the uop cache, cache lines are 64 bytes at every level, on all x86 CPUs from all vendors for the last couple of decades. (Some hardware prefetchers like to complete 128-byte-aligned pairs of cache lines, e.g. in Intel's L2, so you can get some false-sharing interference between separate cache lines, but it's not as bad as sharing the same line.)
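To make the 64-byte vs. 128-byte distinction concrete, here's a minimal sketch in C that classifies two addresses by whether they share a line or only an aligned pair of lines. The helper names (`same_block`, `same_line`, `adjacent_line_pair`) are made up for illustration, and the sizes are just the ones mentioned above, not something queried from the CPU.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed sizes, taken from the text above: 64-byte cache lines, and
   128-byte-aligned pairs that some prefetchers like to complete. */
enum { LINE_BYTES = 64, PAIR_BYTES = 128 };

/* True if both addresses fall in the same naturally aligned block of
   `block` bytes. `block` must be a power of two. */
static bool same_block(uintptr_t a, uintptr_t b, uintptr_t block)
{
    return (a & ~(block - 1)) == (b & ~(block - 1));
}

/* Same 64-byte line: the classic false-sharing case if two threads
   write to these addresses. */
static bool same_line(uintptr_t a, uintptr_t b)
{
    return same_block(a, b, LINE_BYTES);
}

/* Different lines, but the same 128-byte-aligned pair: the milder
   interference case. */
static bool adjacent_line_pair(uintptr_t a, uintptr_t b)
{
    return !same_line(a, b) && same_block(a, b, PAIR_BYTES);
}
```

For example, `0x1000` and `0x1040` are in different lines but the same 128-byte pair, while `0x1040` and `0x1080` are in neighbouring lines that belong to different pairs. If you want to rule out both effects for per-thread data, aligning each thread's data to 128 bytes is the conservative choice.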
Probably the OC-fetch spanning two L1i lines shows that a single uop-cache entry can contain uops that span a boundary between L1i lines. (Intel Sandybridge couldn't do this, and AFAIK that's still the case on current Intel: all uops in one way of the uop cache have to start in the same 32-byte chunk of machine code.)
The diagram shows that fetch from L1i (to feed the legacy decoders) is 32 bytes wide and can be misaligned by 16 bytes, but not by arbitrary amounts.
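As a rough illustration of "32 bytes wide, aligned to 16", here's a small C sketch of that fetch model. This is only my reading of the diagram, not documented hardware behaviour: the constants and the function names (`fetch_window_start`, `valid_window_start`, `window_spans_two_lines`) are assumptions for the example, and whether a window that crosses a 64-byte line can really be fetched in one cycle isn't something the diagram settles.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed model: a 32-byte fetch window that may start at any
   16-byte-aligned address, reading from 64-byte L1i lines. */
enum { FETCH_BYTES = 32, FETCH_ALIGN = 16, L1I_LINE_BYTES = 64 };

/* Start of the window that would be used to fetch the instruction at
   `addr` in this model: round down to a 16-byte boundary. */
static uint64_t fetch_window_start(uint64_t addr)
{
    return addr & ~(uint64_t)(FETCH_ALIGN - 1);
}

/* Only 16-byte-aligned starts are allowed, not arbitrary ones. */
static bool valid_window_start(uint64_t start)
{
    return start % FETCH_ALIGN == 0;
}

/* True if the 32-byte window beginning at `start` covers bytes from
   two different 64-byte L1i lines. */
static bool window_spans_two_lines(uint64_t start)
{
    uint64_t last = start + FETCH_BYTES - 1;
    return start / L1I_LINE_BYTES != last / L1I_LINE_BYTES;
}
```

With these numbers, a window starting at offset 0, 16, or 32 within a line stays inside that line. Only a start at offset 48 reaches into the next line.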