The results you found are very typical, and match what the flashbench tool referenced in the last sentence of the article finds on many media. The other interesting number is how many (4MB) segments can be written to alternately, which you can find out by running flashbench with varying values for NR, the number of segments kept open at the same time. With low numbers, it will be fast for all block sizes, while with large numbers of open segments, the time to write all segments is basically independent of the block size, because every write forces a garbage collection on one of the other open segments.
There is usually a very sharp contrast between the fast and slow results, e.g. five open segments being very fast but six already being very slow.
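
To illustrate why the transition is so abrupt, here is a toy model in Python. It is only a sketch under assumptions of my own: the controller is assumed to keep at most max_open segments open and to close the least recently written one when it needs room, which may not match any particular device. The function name and parameters are hypothetical and have nothing to do with flashbench itself.

```python
from collections import OrderedDict


def count_garbage_collections(nr_segments, max_open, rounds=100):
    """Count forced garbage collections in a toy flash controller model.

    Assumption (not taken from any real device): the controller keeps at
    most `max_open` segments open and closes the least recently written
    one, at the cost of one garbage collection, when it needs room.
    """
    open_segments = OrderedDict()  # segment index -> None, oldest first
    collections = 0
    for _ in range(rounds):
        # Write one block to each segment in turn (alternating pattern).
        for seg in range(nr_segments):
            if seg in open_segments:
                # Segment is still open: cheap write, just refresh its age.
                open_segments.move_to_end(seg)
                continue
            if len(open_segments) >= max_open:
                # No free slot: close the oldest open segment first.
                open_segments.popitem(last=False)
                collections += 1
            open_segments[seg] = None
    return collections


# Hypothetical device that can keep five segments open.
for nr in range(1, 9):
    print(nr, count_garbage_collections(nr, max_open=5))
```

In this model, anything up to five segments never triggers a collection, while with six segments almost every single write does, because the segment about to be written is always the one that was just closed. That matches the cliff described above, though real controllers can behave differently in detail.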