We use two levels of buffers:
1) 4kiB+overhead for each ciphertext block
2) 128kiB+overhead for each FUSE write (32 ciphertext blocks)
This commit adds a sync.Pool for both levels.
The memory efficiency of small writes could be improved, as we now
always use a 128kiB buffer.
128kiB = 32 x 4kiB pages is the maximum write size we get from the
kernel, and special-casing smaller writes is probably not worth it.
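A minimal sketch of what such two-level pooling can look like (the
package name, sizes and per-block overhead are made up for illustration,
this is not the actual gocryptfs code):

    package bufferpool

    import "sync"

    const (
        blockOverhead = 32                    // assumed nonce + auth tag overhead
        blockSize     = 4096 + blockOverhead  // level 1: one ciphertext block
        writeSize     = 32 * blockSize        // level 2: one full FUSE write
    )

    // newPool returns a sync.Pool that hands out byte slices of length n.
    func newPool(n int) *sync.Pool {
        return &sync.Pool{
            New: func() interface{} { return make([]byte, n) },
        }
    }

    var (
        blockPool = newPool(blockSize)
        writePool = newPool(writeSize)
    )

    // GetWriteBuf fetches a reusable buffer big enough for one FUSE write.
    func GetWriteBuf() []byte { return writePool.Get().([]byte) }

    // PutWriteBuf returns the buffer to the pool for reuse.
    func PutWriteBuf(b []byte) { writePool.Put(b) }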
Parallelism is limited to two for now.
Spawn a worker goroutine that reads the next 512-byte block
while the current one is being drained.
This should help reduce waiting times when /dev/urandom is very
slow (like on Linux 3.16 kernels).
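Roughly like this sketch (names are made up; the real code lives in
internal/cryptocore):

    package prefetch

    import "crypto/rand"

    const refillSize = 512 // the 512-byte block size mentioned above

    type prefetcher struct {
        ch chan []byte
    }

    func newPrefetcher() *prefetcher {
        // A buffer of one lets the worker prepare the next block while
        // the consumer drains the current one.
        p := &prefetcher{ch: make(chan []byte, 1)}
        go p.refillWorker()
        return p
    }

    // refillWorker keeps a 512-byte block of randomness ready at all times.
    func (p *prefetcher) refillWorker() {
        for {
            b := make([]byte, refillSize)
            if _, err := rand.Read(b); err != nil {
                panic("could not read random bytes: " + err.Error())
            }
            p.ch <- b // blocks until the consumer asks for more
        }
    }

    // next returns the next prefetched block.
    func (p *prefetcher) next() []byte {
        return <-p.ch
    }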
On my machine, reading 512-byte blocks from /dev/urandom (the same
applies when going through the getentropy syscall) gives much better
throughput:

  Blocksize  Throughput
  16         28.18 MB/s
  512        83.75 MB/s
For a single-threaded streaming write, this drops the CPU usage of
nonceGenerator.Get to almost 1/3:
        flat    flat%   sum%    cum    cum%
Before  0       0%      95.08%  0.35s  2.92%  github.com/rfjakob/gocryptfs/internal/cryptocore.(*nonceGenerator).Get
After   0.01s   0.092%  92.34%  0.13s  1.20%  github.com/rfjakob/gocryptfs/internal/cryptocore.(*nonceGenerator).Get
This change makes the nonce reading single-threaded, which may
hurt massively-parallel writes.
This check would need locking to be safe under concurrent use.
But as it sits in the fast path, just remove it instead.
rand.Read() already guarantees that the value is random.
We got another warning, this time for force_owner:
cli_args.go:26:45: don't use underscores in Go names; struct field force_owner should be forceOwner
Use a broader grep.
Before Go 1.5, GOMAXPROCS defaulted to 1, so it made sense to
unconditionally raise it to 4.
But since Go 1.5, GOMAXPROCS defaults to the number of CPU cores,
so don't cap it at 4 anymore; only bump it when the default is lower.
Also, update the performance numbers.
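A minimal sketch of the conditional bump (the helper name is made up):

    package main

    import "runtime"

    // maybeRaiseGOMAXPROCS raises GOMAXPROCS to 4 when the runtime
    // default is lower, but never lowers an already higher value.
    func maybeRaiseGOMAXPROCS() {
        // runtime.GOMAXPROCS(0) only queries the current setting.
        if runtime.GOMAXPROCS(0) < 4 {
            runtime.GOMAXPROCS(4)
        }
    }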
Previously, the maximum write size was at the go-fuse default of 64KiB.
Getting bigger writes should increase throughput somewhat.
Testing on tmpfs shows an improvement from 112MiB/s to 120MiB/s.
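For reference, this is roughly where the value would be set with go-fuse
(the helper name is made up; MaxWrite is the relevant fuse.MountOptions
field):

    package main

    import "github.com/hanwen/go-fuse/fuse"

    // buildMountOptions shows where the larger write size is configured.
    func buildMountOptions() *fuse.MountOptions {
        return &fuse.MountOptions{
            MaxWrite: 128 * 1024, // 128KiB instead of the 64KiB default
        }
    }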
Travis failed on Go 1.6.3 with this error:
internal/pathiv/pathiv_test.go:20: no args in Error call
This change should solve the problem and provide a better error
message on (real) test failures.
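Presumably the fix follows this pattern (the test body below is made up,
only the t.Error -> t.Errorf change matters):

    package pathiv_test

    import "testing"

    func TestDeriveSketch(t *testing.T) {
        got, want := "deadbeef", "deadbeef"
        if got != want {
            // Before: t.Error()  <- "no args in Error call" under vet
            // After: report the mismatching values in the failure message.
            t.Errorf("wrong derived value: got %q, want %q", got, want)
        }
    }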
We have accumulated so many options over time that they
no longer fit on the screen.
Display only a useful subset of options to the user unless
they pass "-hh".
With hard links, the path to a file is not unique. This means
that the ciphertext data would depend on which path is used to
access the file.
Fix that by storing the derived values when we encounter a hard-linked
file. This means that the first path wins.
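A hypothetical sketch of "first path wins" (types, names and locking are
illustrative, not the gocryptfs API):

    package hardlink

    import "sync"

    // devIno identifies a file independently of the path used to reach it.
    type devIno struct {
        dev uint64
        ino uint64
    }

    // derived stands in for whatever values are derived from the path.
    type derived struct {
        iv []byte
    }

    var (
        mu    sync.Mutex
        table = make(map[devIno]derived)
    )

    // get returns the values derived from the first path under which the
    // file was seen, deriving and storing them on first use.
    func get(key devIno, derive func() derived) derived {
        mu.Lock()
        defer mu.Unlock()
        if d, ok := table[key]; ok {
            return d // first path wins
        }
        d := derive()
        table[key] = d
        return d
    }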
Instead of redirecting stdout and stderr to /tmp/gocryptfs_paniclog,
where it is hard to find, redirect them to a newly spawned logger(1)
instance that forwards the messages to syslog.
See https://github.com/rfjakob/gocryptfs/issues/109 for an example
where the paniclog was lost due to a reboot.
Also, instead of closing stdin, redirect it to /dev/null, like most
daemons seem to do.
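A rough sketch of the redirection (error handling trimmed, the logger tag
is made up, and syscall.Dup2 is not available on every platform):

    package main

    import (
        "os"
        "os/exec"
        "syscall"
    )

    // redirectToSyslog forwards stdout/stderr to a logger(1) child and
    // points stdin at /dev/null instead of closing it.
    func redirectToSyslog() error {
        pr, pw, err := os.Pipe()
        if err != nil {
            return err
        }
        cmd := exec.Command("logger", "-t", "gocryptfs-paniclog")
        cmd.Stdin = pr
        if err := cmd.Start(); err != nil {
            return err
        }
        // Anything written to fd 1 or 2 from now on ends up in syslog.
        syscall.Dup2(int(pw.Fd()), 1)
        syscall.Dup2(int(pw.Fd()), 2)
        devnull, err := os.Open("/dev/null")
        if err != nil {
            return err
        }
        syscall.Dup2(int(devnull.Fd()), 0)
        return nil
    }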
This fixes a few issues I have found reviewing the code:
1) Limit the amount of data ReadLongName() will read. Previously,
you could drive gocryptfs into an out-of-memory condition by
symlinking gocryptfs.diriv to /dev/zero.
2) Handle the empty input case in unPad16() by returning an
error. Previously, it would panic with an out-of-bounds array
read. It is unclear to me if this could actually be triggered.
3) Reject empty names after base64-decoding in DecryptName().
An empty name crashes emeCipher.Decrypt().
It is unclear to me if B64.DecodeString() can actually return
a non-error empty result, but let's guard against it anyway.
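Sketches of the kind of guards added in 1) and 2) (limits, names and
error strings are illustrative, not the exact gocryptfs code):

    package namesec

    import (
        "errors"
        "io"
        "os"
    )

    // maxNameFileSize caps how many bytes we are willing to read, so a
    // symlink to /dev/zero cannot make us allocate unbounded memory.
    const maxNameFileSize = 4096

    func readBoundedFile(path string) ([]byte, error) {
        f, err := os.Open(path)
        if err != nil {
            return nil, err
        }
        defer f.Close()
        // Read at most maxNameFileSize+1 bytes so oversize files are detected.
        buf, err := io.ReadAll(io.LimitReader(f, maxNameFileSize+1))
        if err != nil {
            return nil, err
        }
        if len(buf) > maxNameFileSize {
            return nil, errors.New("file too big")
        }
        return buf, nil
    }

    // unPad16 strips PKCS#7-style padding and rejects empty or malformed
    // input with an error instead of panicking on an out-of-bounds read.
    func unPad16(data []byte) ([]byte, error) {
        if len(data) == 0 {
            return nil, errors.New("unPad16: empty input")
        }
        padLen := int(data[len(data)-1])
        if padLen == 0 || padLen > 16 || padLen > len(data) {
            return nil, errors.New("unPad16: invalid padding")
        }
        return data[:len(data)-padLen], nil
    }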
When a user descends into a deep directory hierarchy, we often
get a sequence like this from the kernel:
LOOKUP a
LOOKUP a/b
LOOKUP a/b/c
LOOKUP a/b/c/d
The diriv cache was not effective for this pattern, because it
was designed for this:
LOOKUP a/a
LOOKUP a/b
LOOKUP a/c
LOOKUP a/d
By also using the cached entry of the grandparent we can avoid lots
of diriv reads.
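Illustrative sketch of the extended lookup (the cache structure and
method are placeholders, not the actual gocryptfs code):

    package dirivcache

    import "path/filepath"

    // Cache maps a plaintext directory path to its cached DirIV.
    type Cache struct {
        entries map[string][]byte
    }

    // cachedAncestor returns the closest cached ancestor of dir, checking
    // the parent first and then the grandparent. Path translation can then
    // start from that ancestor instead of reading gocryptfs.diriv for
    // every component from the root.
    func (c *Cache) cachedAncestor(dir string) (ancestor string, iv []byte, ok bool) {
        parent := filepath.Dir(dir)
        if iv, ok := c.entries[parent]; ok {
            return parent, iv, true
        }
        grandparent := filepath.Dir(parent)
        if iv, ok := c.entries[grandparent]; ok {
            return grandparent, iv, true
        }
        return "", nil, false
    }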
This benchmark is against a large encrypted directory hosted on NFS:
Before:
  $ time ls -R nfs-backed-mount > /dev/null
  real    1m35.976s
  user    0m0.248s
  sys     0m0.281s

After:
  $ time ls -R nfs-backed-mount > /dev/null
  real    1m3.670s
  user    0m0.217s
  sys     0m0.403s