On MacOS, building and testing without openssl is much easier.
The tests should skip tests that fail because of missing openssl
instead of aborting.
Fixes https://github.com/rfjakob/gocryptfs/issues/123
Fixed by including the correct header. Should work on older openssl
versions as well.
Error was:
locking.go:21: undefined reference to `CRYPTO_set_locking_callback'
Due to RMW, we always need read permissions on the backing file. This is a
problem if the file permissions do not allow reading (i.e. 0200 permissions).
This patch works around that problem by chmod'ing the file, obtaining a fd,
and chmod'ing it back.
Test included.
Issue reported at: https://github.com/rfjakob/gocryptfs/issues/125
Currently neither gocryptfs nor go-fuse automatically call load_osxfuse
if the /dev/osxfuse* device(s) do not exist. At least tell the user
what to do.
See https://github.com/rfjakob/gocryptfs/issues/124 for user pain.
Previously we ran through the decryption steps even for an empty
ciphertext slice. The functions handle it correctly, but returning
early skips all the extra calls.
Speeds up the tar extract benchmark by about 4%.
go-fuse caps MaxWrite at MAX_KERNEL_WRITE anyway, and we
actually depend on this behavoir now as the byte pools
are sized according to MAX_KERNEL_WRITE.
So let's use MAX_KERNEL_WRITE explicitely.
We use two levels of buffers:
1) 4kiB+overhead for each ciphertext block
2) 128kiB+overhead for each FUSE write (32 ciphertext blocks)
This commit adds a sync.Pool for both levels.
The memory-efficiency for small writes could be improved,
as we now always use a 128kiB buffer.
128kiB = 32 x 4kiB pages is the maximum we get from the kernel. Splitting
up smaller writes is probably not worth it.
Parallelism is limited to two for now.
Spawn a worker goroutine that reads the next 512-byte block
while the current one is being drained.
This should help reduce waiting times when /dev/urandom is very
slow (like on Linux 3.16 kernels).
On my machine, reading 512-byte blocks from /dev/urandom
(same via getentropy syscall) is a lot faster in terms of
throughput:
Blocksize Throughput
16 28.18 MB/s
512 83.75 MB/s
For a single-threaded streaming write, this drops the CPU usage of
nonceGenerator.Get to almost 1/3:
flat flat% sum% cum cum%
Before 0 0% 95.08% 0.35s 2.92% github.com/rfjakob/gocryptfs/internal/cryptocore.(*nonceGenerator).Get
After 0.01s 0.092% 92.34% 0.13s 1.20% github.com/rfjakob/gocryptfs/internal/cryptocore.(*nonceGenerator).Get
This change makes the nonce reading single-threaded, which may
hurt massively-parallel writes.
This check would need locking to be multithreading-safe.
But as it is in the fastpath, just remove it.
rand.Read() already guarantees that the value is random.
We got another warning for force_other:
cli_args.go:26:45: don't use underscores in Go names; struct field force_owner should be forceOwner
Use a broader grep.
Before Go 1.5, GOMAXPROCS defaulted to 1, hence it made
sense to unconditionally increase it to 4.
But since Go 1.5, GOMAXPROCS defaults to the number of cores,
so don't keep it from increasing above 4.
Also, update the performance numbers.
Previously, it was at the go-fuse default of 64KiB. Getting
bigger writes should increase throughput somewhat.
Testing on tmpfs shows an improvement from 112MiB/s to 120MiB/s.