503 Commits

Author SHA1 Message Date
Jakob Unterwurzacher
538cae610c syscallcompat: Getdents: warn once if we get DT_UNKNOWN
...and if Getdents is not available at all.

Due to this warning I now know that SSHFS always returns DT_UNKNOWN:

    gocryptfs[8129]: Getdents: convertDType: received DT_UNKNOWN, falling back to Lstat

This behavior is confirmed at http://ahefner.livejournal.com/16875.html:

    "With sshfs, I finally found that obscure case. The dtype is always set to DT_UNKNOWN [...]"
2017-09-03 15:05:54 +02:00
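A rough sketch of the warn-once fallback from 538cae610c. The name convertDType is taken from the log line above; its signature, the constants, and the warnCount counter are assumptions made for illustration, not the actual gocryptfs code. The shift by 12 bits mirrors the kernel's DTTOIF conversion from d_type to file-type mode bits.

```go
package main

import (
	"fmt"
	"sync"
)

// d_type values as defined by Linux (subset).
const (
	dtUnknown = 0
	dtDir     = 4
	dtReg     = 8
	dtLnk     = 10
)

var (
	warnOnce  sync.Once
	warnCount int // only here so the demo can show the warning fired once
)

// convertDType maps a d_type value to file-type mode bits.
// ok=false means the caller has to fall back to Lstat.
// (Hypothetical signature, assumed for this sketch.)
func convertDType(dtype uint8) (mode uint32, ok bool) {
	if dtype == dtUnknown {
		warnOnce.Do(func() {
			warnCount++
			fmt.Println("Getdents: convertDType: received DT_UNKNOWN, falling back to Lstat")
		})
		return 0, false
	}
	// Same conversion as the kernel's DTTOIF macro.
	return uint32(dtype) << 12, true
}

func main() {
	for _, t := range []uint8{dtDir, dtUnknown, dtUnknown, dtReg} {
		mode, ok := convertDType(t)
		fmt.Printf("d_type=%d mode=%o ok=%v\n", t, mode, ok)
	}
	fmt.Println("warnings printed:", warnCount)
}
```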
Jakob Unterwurzacher
276567eb13 fusefrontend: use DirIVCache in OpenDir()
Previously, OpenDir() did not use the cache at all, missing
an opportunity to speed up repeated directory reads.
2017-09-03 13:59:53 +02:00
Jakob Unterwurzacher
7da0e97c8b dirivcache: add better function comments + a sanity check on Store()
The comments were unclear on whether relative or absolute paths
have to be passed.
2017-09-03 13:53:50 +02:00
Jakob Unterwurzacher
ed046aa359 Fix misspellings reported by goreportcard.com
https://goreportcard.com/report/github.com/rfjakob/gocryptfs#misspell
2017-08-21 21:06:05 +02:00
Jakob Unterwurzacher
312ea32bb7 cryptocore: add urandom + randprefetch benchmarks
The benchmark that supported the decision for 512-byte
prefetching previously lived outside the repo.

Let's add it where it belongs so it cannot get lost.
2017-08-16 18:33:00 +02:00
Jakob Unterwurzacher
989b880989 fusefrontend: use Getdents if available
Getdents avoids calling Lstat on each file.
2017-08-15 19:04:02 +02:00
Jakob Unterwurzacher
e50a6a57e5 syscallcompat: implement Getdents()
The Readdir function provided by os is inherently slow because
it calls Lstat on all files.

Getdents gives us all the information we need, but does not have
a proper wrapper in the stdlib.

Implement the "Getdents()" wrapper function that calls
syscall.Getdents() and parses the returned byte blob to a
fuse.DirEntry slice.
2017-08-15 19:03:57 +02:00
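To illustrate the parsing step described in e50a6a57e5, here is a minimal sketch that walks a byte blob laid out like Linux's struct linux_dirent64. It does not call syscall.Getdents() itself; it works on a synthetic buffer so it runs anywhere. The dirEntry type is a hypothetical stand-in for fuse.DirEntry, makeDirent exists only for the demo, and little-endian byte order is assumed.

```go
package main

import (
	"bytes"
	"encoding/binary"
	"fmt"
)

// dirEntry is a hypothetical stand-in for fuse.DirEntry.
type dirEntry struct {
	Ino  uint64
	Type uint8
	Name string
}

// struct linux_dirent64 layout:
//   d_ino uint64 | d_off int64 | d_reclen uint16 | d_type uint8 | d_name (NUL-terminated)
const nameOffset = 19

// parseDirents splits the blob into entries, validating each record length.
func parseDirents(buf []byte) ([]dirEntry, error) {
	var entries []dirEntry
	for len(buf) > 0 {
		if len(buf) < nameOffset {
			return nil, fmt.Errorf("truncated dirent header: %d bytes left", len(buf))
		}
		reclen := int(binary.LittleEndian.Uint16(buf[16:18]))
		if reclen < nameOffset || reclen > len(buf) {
			return nil, fmt.Errorf("invalid d_reclen %d", reclen)
		}
		name := buf[nameOffset:reclen]
		if i := bytes.IndexByte(name, 0); i >= 0 {
			name = name[:i] // cut at the NUL terminator
		}
		entries = append(entries, dirEntry{
			Ino:  binary.LittleEndian.Uint64(buf[0:8]),
			Type: buf[18],
			Name: string(name),
		})
		buf = buf[reclen:]
	}
	return entries, nil
}

// makeDirent builds one synthetic record (demo helper only).
func makeDirent(ino uint64, dtype uint8, name string) []byte {
	reclen := (nameOffset + len(name) + 1 + 7) &^ 7 // pad to 8-byte alignment
	rec := make([]byte, reclen)
	binary.LittleEndian.PutUint64(rec[0:8], ino)
	binary.LittleEndian.PutUint16(rec[16:18], uint16(reclen))
	rec[18] = dtype
	copy(rec[nameOffset:], name)
	return rec
}

func main() {
	blob := append(makeDirent(10, 4, "subdir"), makeDirent(11, 8, "file.txt")...)
	entries, err := parseDirents(blob)
	if err != nil {
		panic(err)
	}
	for _, e := range entries {
		fmt.Printf("%+v\n", e)
	}
}
```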
Jakob Unterwurzacher
0c520845f3 main: purge masterkey from memory as soon as possible
Remove the "Masterkey" field from fusefrontend.Args because it
should not be stored longer than necessary. Instead pass the
masterkey as a separate argument to the filesystem initializers.

Then overwrite it with zeros immediately so we don't have
to wait for garbage collection.

Note that the crypto implementation still stores at least a
masterkey-derived value, so this change makes it harder, but not
impossible, to extract the encryption keys from memory.

Suggested at https://github.com/rfjakob/gocryptfs/issues/137
2017-08-11 19:02:26 +02:00
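The idea behind 0c520845f3 can be sketched roughly as below: hand the key to the initializer as a plain argument, then zero the slice right away. The names wipe and initFilesystem are hypothetical and exist only in this example; the real initializers do far more than this placeholder.

```go
package main

import "fmt"

// wipe overwrites the key in place so the plaintext bytes do not
// linger in memory until the garbage collector reuses the slice.
func wipe(key []byte) {
	for i := range key {
		key[i] = 0
	}
}

// initFilesystem is a hypothetical placeholder for a filesystem
// initializer that receives the key as a separate argument.
func initFilesystem(masterkey []byte) int {
	return len(masterkey) // a real initializer would derive its keys here
}

func main() {
	masterkey := []byte{0xde, 0xad, 0xbe, 0xef}
	initFilesystem(masterkey)
	wipe(masterkey) // purge as soon as the initializer has returned
	fmt.Println(masterkey)
}
```

As the commit message notes, values derived from the key still live in memory, so this only narrows the window.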
Jakob Unterwurzacher
e80b5f2049 nametransform: extend diriv cache to 100 entries
* extend the diriv cache to 100 entries
* add special handling for the immutable root diriv

The better cache allows us to shed some complexity from the path
encryption logic (parent-of-parent check).

Mitigates https://github.com/rfjakob/gocryptfs/issues/127
2017-08-09 22:00:53 +02:00
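A minimal sketch of a bounded diriv cache with a special slot for the root directory, in the spirit of e80b5f2049. The type, the method names, and the eviction policy (drop everything once 100 entries are reached) are assumptions for illustration; the log does not say how the real cache evicts entries.

```go
package main

import "fmt"

const maxEntries = 100

// dirIVCache is a hypothetical cache type for this sketch.
type dirIVCache struct {
	rootIV []byte            // the root diriv is immutable, so keep it outside the map
	data   map[string][]byte // plaintext dir path -> diriv
}

// Store saves the IV for dir. The root directory is addressed as "".
func (c *dirIVCache) Store(dir string, iv []byte) {
	if dir == "" {
		c.rootIV = iv
		return
	}
	// Assumed eviction policy: start over with an empty map when full.
	if c.data == nil || len(c.data) >= maxEntries {
		c.data = make(map[string][]byte, maxEntries)
	}
	c.data[dir] = iv
}

// Lookup returns the cached IV for dir, if any.
func (c *dirIVCache) Lookup(dir string) ([]byte, bool) {
	if dir == "" {
		return c.rootIV, c.rootIV != nil
	}
	iv, ok := c.data[dir]
	return iv, ok
}

// fill stores n dummy entries (demo helper only).
func fill(c *dirIVCache, n int) {
	for i := 0; i < n; i++ {
		c.Store(fmt.Sprintf("dir%d", i), []byte{byte(i)})
	}
}

func main() {
	c := &dirIVCache{}
	c.Store("", []byte{42})
	fill(c, 250)
	_, rootOK := c.Lookup("")
	fmt.Println("entries:", len(c.data), "root cached:", rootOK)
}
```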
Jakob Unterwurzacher
75ec94a87a nametransform: add Dir() function
Dir is like filepath.Dir but returns "" instead of ".".
This was already implemented in fusefrontend_reverse as saneDir().

We will need it in nametransform for the improved diriv caching.
2017-08-06 23:14:39 +02:00
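The behavior described in 75ec94a87a can be sketched in a few lines. The lowercase name dir is used here to make clear this is an illustration, not the actual nametransform.Dir() source.

```go
package main

import (
	"fmt"
	"path/filepath"
)

// dir behaves like filepath.Dir but returns "" instead of ".",
// so the root of the mount is always represented as the empty string.
func dir(path string) string {
	d := filepath.Dir(path)
	if d == "." {
		return ""
	}
	return d
}

func main() {
	for _, p := range []string{"foo", "foo/bar", "foo/bar/baz"} {
		fmt.Printf("%q -> %q\n", p, dir(p))
	}
}
```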
Jakob Unterwurzacher
5190cc09bb nametransform: move diriv cache into its own package
Needs some space to grow.

renamed:    internal/nametransform/diriv_cache.go -> internal/nametransform/dirivcache/dirivcache.go
2017-08-06 21:59:15 +02:00
Jakob Unterwurzacher
32611ff97a nametransform: deduplicate code to encryptAndHashName()
This operation has been done three times by identical
sections of code. Create a function for it.
2017-08-06 21:23:42 +02:00
Jakob Unterwurzacher
d12aa57715 fusefrontend_reverse: fix ino collision between .name and .diriv files
A directory with a long name has two associated virtual files:
the .name file and the .diriv file.

These used to get the same inode number:

  $ ls -di1  * */*
             33313535 gocryptfs.longname.2togDFouca9mrTwtfF1RNW5DZRAQY8alaR7wO_Xd5Zw
  1000000000033313535 gocryptfs.longname.2togDFouca9mrTwtfF1RNW5DZRAQY8alaR7wO_Xd5Zw/gocryptfs.diriv
  1000000000033313535 gocryptfs.longname.2togDFouca9mrTwtfF1RNW5DZRAQY8alaR7wO_Xd5Zw.name

With this change we use another prefix (2 instead of 1) for .name files.

  $ ls -di1 * */*
             33313535 gocryptfs.longname.2togDFouca9mrTwtfF1RNW5DZRAQY8alaR7wO_Xd5Zw
  1000000000033313535 gocryptfs.longname.2togDFouca9mrTwtfF1RNW5DZRAQY8alaR7wO_Xd5Zw/gocryptfs.diriv
  2000000000033313535 gocryptfs.longname.2togDFouca9mrTwtfF1RNW5DZRAQY8alaR7wO_Xd5Zw.name
2017-07-29 16:15:49 +02:00
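Judging from the listings above, the virtual inode numbers look like the backing inode number plus a decimal prefix. The sketch below reproduces that arithmetic; the constant and function names are hypothetical, and the additive scheme is inferred from the output rather than taken from the code.

```go
package main

import "fmt"

// Prefixes inferred from the ls output above (names are hypothetical).
const (
	dirIVPrefix uint64 = 1000000000000000000 // gocryptfs.diriv files
	namePrefix  uint64 = 2000000000000000000 // .name files
)

// virtualIno derives an inode number for a virtual file from the
// inode number of the backing directory.
func virtualIno(prefix, base uint64) uint64 {
	return prefix + base
}

func main() {
	const dirIno = 33313535
	fmt.Println(virtualIno(dirIVPrefix, dirIno))
	fmt.Println(virtualIno(namePrefix, dirIno))
}
```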
Jakob Unterwurzacher
d5133ca5ac fusefrontend_reverse: return ENOENT for undecryptable names
This was working until DecryptName switched to returning
EBADMSG instead of EINVAL.

Add a test to catch the regression next time.
2017-07-27 20:31:22 +02:00
Jakob Unterwurzacher
ccf1a84e41 macos: make testing without openssl work properly
On MacOS, building and testing without openssl is much easier.
The test suite should skip tests that fail because openssl is
missing instead of aborting.

Fixes https://github.com/rfjakob/gocryptfs/issues/123
2017-07-14 23:22:15 +02:00
Jakob Unterwurzacher
61e964457d stupidgcm: fix openssl 1.1 build failure
Fixed by including the correct header. Should work on older openssl
versions as well.

Error was:
locking.go:21: undefined reference to `CRYPTO_set_locking_callback'
2017-07-14 20:44:07 +02:00
Jakob Unterwurzacher
3062de6187 fusefrontend: enable writing to write-only files
Due to RMW, we always need read permissions on the backing file. This is a
problem if the file permissions do not allow reading (i.e. 0200 permissions).
This patch works around that problem by chmod'ing the file, obtaining a fd,
and chmod'ing it back.

Test included.

Issue reported at: https://github.com/rfjakob/gocryptfs/issues/125
2017-07-11 23:19:58 +02:00
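The chmod workaround from 3062de6187 could look roughly like this. The function names openForRMW and demo are hypothetical, and the sketch ignores the short window in which the file is readable by its owner.

```go
package main

import (
	"fmt"
	"os"
)

// openForRMW opens path read-write even if the owner lacks read
// permission, by temporarily adding it and restoring the old mode.
func openForRMW(path string) (*os.File, error) {
	fi, err := os.Stat(path)
	if err != nil {
		return nil, err
	}
	perm := fi.Mode().Perm()
	if perm&0400 != 0 {
		// Already readable, no workaround needed.
		return os.OpenFile(path, os.O_RDWR, 0)
	}
	if err := os.Chmod(path, perm|0400); err != nil {
		return nil, err
	}
	f, openErr := os.OpenFile(path, os.O_RDWR, 0)
	// Restore the original permissions whether or not the open worked.
	if err := os.Chmod(path, perm); err != nil {
		if f != nil {
			f.Close()
		}
		return nil, err
	}
	return f, openErr
}

// demo creates a 0200 temp file, opens it, and reports the mode afterwards.
func demo() (os.FileMode, error) {
	tmp, err := os.CreateTemp("", "writeonly")
	if err != nil {
		return 0, err
	}
	path := tmp.Name()
	tmp.Close()
	defer os.Remove(path)
	if err := os.Chmod(path, 0200); err != nil {
		return 0, err
	}
	f, err := openForRMW(path)
	if err != nil {
		return 0, err
	}
	defer f.Close()
	fi, err := os.Stat(path)
	if err != nil {
		return 0, err
	}
	return fi.Mode().Perm(), nil
}

func main() {
	mode, err := demo()
	fmt.Printf("mode after open: %o, err: %v\n", mode, err)
}
```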
Jakob Unterwurzacher
b6bda01c33 contentenc: MergeBlocks: short-circuit the trivial case
Saves 3% for the tar extract benchmark because we skip the allocation.
2017-07-02 16:23:24 +02:00
Jakob Unterwurzacher
52ab0462a4 fusefrontend: doRead: skip decryption for an empty read
Previously we ran through the decryption steps even for an empty
ciphertext slice. The functions handle it correctly, but returning
early skips all the extra calls.

Speeds up the tar extract benchmark by about 4%.
2017-07-02 16:02:13 +02:00
Jakob Unterwurzacher
9f4bd76576 stupidgcm: add test for in-place Open
Adds a test for the optimization introduced in:

	stupidgcm: Open: if "dst" is big enough, use it as the output buffer
2017-07-01 09:56:05 +02:00
Jakob Unterwurzacher
12c0101a23 contentenc: add PReqPool and use it in DecryptBlocks
This gets us a massive speed boost in streaming reads.
2017-06-30 23:30:57 +02:00
Jakob Unterwurzacher
e4b5005bcc stupidgcm: Open: if "dst" is big enough, use it as the output buffer
This means we won't need any allocation for the plaintext.
2017-06-30 23:24:12 +02:00
Jakob Unterwurzacher
b2a23e94d1 fusefrontend: doRead: use CReqPool for ciphertext buffer
Easily saves lots of allocations.
2017-06-30 23:15:31 +02:00
Jakob Unterwurzacher
06398e82d9 fusefrontend: Read: use provided buffer
This will allow us to return internal buffers to a pool.
2017-06-30 23:11:38 +02:00
Jakob Unterwurzacher
80676c685f contentenc: add safer "bPool" pool variant; add pBlockPool
bPool verifies the lengths of slices going in and out.

Also, add a plaintext block pool - pBlockPool - and use
it for decryption.
2017-06-29 23:44:32 +02:00
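A length-checking pool along the lines of the "bPool" from 80676c685f might be sketched as follows. Only the name bPool comes from the commit message; the fields, constructor, and the putPanics demo helper are assumptions.

```go
package main

import (
	"fmt"
	"sync"
)

// bPool wraps sync.Pool and verifies that every slice going in or
// out has exactly the expected length.
type bPool struct {
	sync.Pool
	sliceLen int
}

func newBPool(sliceLen int) *bPool {
	return &bPool{
		Pool: sync.Pool{
			New: func() interface{} { return make([]byte, sliceLen) },
		},
		sliceLen: sliceLen,
	}
}

// Put returns a slice to the pool and panics on a length mismatch.
func (b *bPool) Put(s []byte) {
	if len(s) != b.sliceLen {
		panic(fmt.Sprintf("bPool.Put: got %d bytes, want %d", len(s), b.sliceLen))
	}
	b.Pool.Put(s)
}

// Get fetches a slice from the pool and panics on a length mismatch.
func (b *bPool) Get() []byte {
	s := b.Pool.Get().([]byte)
	if len(s) != b.sliceLen {
		panic(fmt.Sprintf("bPool.Get: got %d bytes, want %d", len(s), b.sliceLen))
	}
	return s
}

// putPanics reports whether Put rejects the slice (demo helper only).
func putPanics(p *bPool, s []byte) (panicked bool) {
	defer func() {
		if recover() != nil {
			panicked = true
		}
	}()
	p.Put(s)
	return false
}

func main() {
	pool := newBPool(4096)
	buf := pool.Get()
	fmt.Println("got", len(buf), "bytes")
	pool.Put(buf)
	fmt.Println("short slice rejected:", putPanics(pool, make([]byte, 10)))
}
```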
Jakob Unterwurzacher
0cc6f53496 stupidgcm: use "dst" as the output buffer if it is big enough
This saves an allocation of the ciphertext block.
2017-06-29 18:52:33 +02:00
Jakob Unterwurzacher
3c6fe98eb1 contentenc: use sync.Pool memory pools for encryption
We use two levels of buffers:

1) 4kiB+overhead for each ciphertext block
2) 128kiB+overhead for each FUSE write (32 ciphertext blocks)

This commit adds a sync.Pool for both levels.

The memory-efficiency for small writes could be improved,
as we now always use a 128kiB buffer.
2017-06-20 21:22:00 +02:00
Jakob Unterwurzacher
a4563e21ec main, syscallcompat: use Dup3 instead of Dup2
Dup2 is not implemented on linux/arm64.

Fixes https://github.com/rfjakob/gocryptfs/issues/121 .

Also adds cross-compilation to CI.
2017-06-18 15:43:22 +02:00
Jakob Unterwurzacher
e52594dae6 contentenc: parallelize encryption for 128kiB writes
128kiB = 32 x 4kiB pages is the maximum we get from the kernel. Splitting
up smaller writes is probably not worth it.

Parallelism is limited to two for now.
2017-06-11 21:56:16 +02:00
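The two-way split described in e52594dae6 can be sketched as below. Note that encryptBlock is a placeholder (a byte-wise XOR) and NOT encryption; it only stands in for the real per-block cipher call. The function names and the exact 32-block threshold check are assumptions for illustration.

```go
package main

import (
	"fmt"
	"sync"
)

// encryptBlock is a PLACEHOLDER transformation, not real encryption.
func encryptBlock(plain []byte) []byte {
	out := make([]byte, len(plain))
	for i, b := range plain {
		out[i] = b ^ 0xff
	}
	return out
}

// encryptBlocks processes full 32-block writes with a parallelism of
// two and everything smaller sequentially. Output order is preserved.
func encryptBlocks(blocks [][]byte) [][]byte {
	out := make([][]byte, len(blocks))
	encryptRange := func(lo, hi int) {
		for i := lo; i < hi; i++ {
			out[i] = encryptBlock(blocks[i])
		}
	}
	if len(blocks) < 32 {
		// Splitting up smaller writes is probably not worth it.
		encryptRange(0, len(blocks))
		return out
	}
	mid := len(blocks) / 2
	var wg sync.WaitGroup
	wg.Add(1)
	go func() {
		defer wg.Done()
		encryptRange(0, mid) // first half in a helper goroutine
	}()
	encryptRange(mid, len(blocks)) // second half in the calling goroutine
	wg.Wait()
	return out
}

func main() {
	blocks := make([][]byte, 32)
	for i := range blocks {
		blocks[i] = []byte{byte(i)}
	}
	out := encryptBlocks(blocks)
	fmt.Println(len(out), out[0], out[31])
}
```

The two halves write to disjoint indexes of the output slice, so no locking is needed.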
Jakob Unterwurzacher
9837cb0ddc cryptocore: prefetch nonces in the background
Spawn a worker goroutine that reads the next 512-byte block
while the current one is being drained.

This should help reduce waiting times when /dev/urandom is very
slow (like on Linux 3.16 kernels).
2017-06-11 21:29:50 +02:00
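A minimal sketch of background prefetching as described in 9837cb0ddc: a worker goroutine reads the next 512-byte block while the current one is handed out in small pieces. The type and method names are hypothetical, and the worker goroutine is never stopped, which is acceptable for a sketch only.

```go
package main

import (
	"bytes"
	"crypto/rand"
	"fmt"
	"sync"
)

const prefetchLen = 512

// prefetcher is a hypothetical nonce source backed by a worker goroutine.
type prefetcher struct {
	mu     sync.Mutex
	buf    bytes.Buffer
	refill chan []byte
}

func newPrefetcher() *prefetcher {
	p := &prefetcher{refill: make(chan []byte)}
	go func() {
		for {
			b := make([]byte, prefetchLen)
			if _, err := rand.Read(b); err != nil {
				panic(err)
			}
			// Unbuffered channel: the next block is ready and waits
			// here until the current one has been drained.
			p.refill <- b
		}
	}()
	return p
}

// Get returns n random bytes, n <= prefetchLen.
func (p *prefetcher) Get(n int) []byte {
	if n > prefetchLen {
		panic("request larger than the prefetch block")
	}
	p.mu.Lock()
	defer p.mu.Unlock()
	if p.buf.Len() < n {
		p.buf.Reset()
		p.buf.Write(<-p.refill)
	}
	out := make([]byte, n)
	if _, err := p.buf.Read(out); err != nil {
		panic(err)
	}
	return out
}

func main() {
	p := newPrefetcher()
	for i := 0; i < 3; i++ {
		fmt.Printf("%x\n", p.Get(16))
	}
}
```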
Jakob Unterwurzacher
80516ed335 cryptocore: prefetch nonces in 512-byte blocks
On my machine, reading 512-byte blocks from /dev/urandom
(same via getentropy syscall) is a lot faster in terms of
throughput:

Blocksize    Throughput
 16          28.18 MB/s
512          83.75 MB/s

For a single-threaded streaming write, this drops the CPU usage of
nonceGenerator.Get to almost 1/3:

        flat  flat%   sum%        cum   cum%
Before     0     0% 95.08%      0.35s  2.92%  github.com/rfjakob/gocryptfs/internal/cryptocore.(*nonceGenerator).Get
After  0.01s 0.092% 92.34%      0.13s  1.20%  github.com/rfjakob/gocryptfs/internal/cryptocore.(*nonceGenerator).Get

This change makes the nonce reading single-threaded, which may
hurt massively-parallel writes.
2017-06-09 22:05:14 +02:00
Charles Duffy
da1bd74246 Fix missing Owner coercion for already-open files (#117)
2017-06-09 22:04:56 +02:00
Jakob Unterwurzacher
d2be22a07f cryptocore: remove lastNonce check
This check would need locking to be multithreading-safe.
But as it is in the fastpath, just remove it.
rand.Read() already guarantees that the value is random.
2017-06-07 23:08:43 +02:00
Jakob Unterwurzacher
294628b384 contentenc: move EncryptBlocks() loop into its own functions
This allows easy parallelization in the future.
2017-06-07 22:09:15 +02:00
Jakob Unterwurzacher
71978ec88a Add "-trace" flag (record execution trace)
Uses the runtime/trace functionality.

TODO: add to man page.
2017-06-07 22:09:06 +02:00
Jakob Unterwurzacher
a24faa3ba5 fusefrontend: write: consolidate and move encryption to contentenc
Collect all the plaintext and pass everything to contentenc in
one call.

This will allow easier parallelization of the encryption.

https://github.com/rfjakob/gocryptfs/issues/116
2017-06-01 22:19:27 +02:00
Jakob Unterwurzacher
f44902aaae Fix two comments
One out-of-date and the other with a typo.
2017-06-01 18:53:57 +02:00
Charles Duffy
cf1ded5236 Implement force_owner option to display ownership as a specific user.
2017-06-01 00:26:17 +02:00
Jakob Unterwurzacher
fc2a5f5ab0 pathiv: fix test failure on Go 1.6
Travis failed on Go 1.6.3 with this error:

	internal/pathiv/pathiv_test.go:20: no args in Error call

This change should solve the problem and provides a better error
message on (real) test failure.
2017-05-31 08:21:36 +02:00
Jakob Unterwurzacher
9a217ce786 pathiv: move block IV algorithm into this package
This was implemented in fusefrontend_reverse, but we need it
in fusefrontend as well. Move the algorithm into pathiv.BlockIV().
2017-05-30 17:04:46 +02:00
Jakob Unterwurzacher
d202a456f5 pathiv: move derivedIVContainer into the package
...under the new name "FileIVs".

This will also be used by forward mode.
2017-05-30 17:04:46 +02:00
Jakob Unterwurzacher
857507e8b1 fusefrontend_reverse: move pathiv to its own package
We will also need it in forward mode.
2017-05-30 17:04:46 +02:00
Jakob Unterwurzacher
d6ef283c3f cryptocore: improve comments and add tests for hkdfDerive
These should make it easier to re-implement the key derivation
that was enabled with the "HKDF" feature flag.
2017-05-27 14:41:20 +02:00
Jakob Unterwurzacher
9ecf2d1a3f fusefrontend_reverse: store derived values for hard-linked files
With hard links, the path to a file is not unique. This means
that the ciphertext data depends on the path that is used to access
the files.

Fix that by storing the derived values when we encounter a hard-linked
file. This means that the first path wins.
2017-05-25 21:33:16 +02:00
Jakob Unterwurzacher
9a3f9350fe nametransform: reject all-zero dirIV
This should never happen in normal operation and is a sign of
data corruption. Catch it early.
2017-05-25 14:21:55 +02:00
Jakob Unterwurzacher
2ce269ec63 contentenc: reject all-zero file ID
This should never happen in normal operation and is a sign of
data corruption. Catch it early.
2017-05-25 14:20:27 +02:00
Jakob Unterwurzacher
c0e411f81d contentenc: better error reporting in ParseHeader
Log the message ourselves and return EINVAL.

Before:

	gocryptfs[26962]: go-fuse: can't convert error type: ParseHeader: invalid version: got 0, want 2

After:

	gocryptfs[617]: ParseHeader: invalid version: want 2, got 0. Returning EINVAL.
2017-05-25 14:18:44 +02:00
Jakob Unterwurzacher
e827763f2e nametransform: harden name decryption against invalid input
This fixes a few issues I have found reviewing the code:

1) Limit the amount of data ReadLongName() will read. Previously,
you could send gocryptfs into out-of-memory by symlinking
gocryptfs.diriv to /dev/zero.

2) Handle the empty input case in unPad16() by returning an
error. Previously, it would panic with an out-of-bounds array
read. It is unclear to me if this could actually be triggered.

3) Reject empty names after base64-decoding in DecryptName().
An empty name crashes emeCipher.Decrypt().
It is unclear to me if B64.DecodeString() can actually return
a non-error empty result, but let's guard against it anyway.
2017-05-23 21:26:38 +02:00
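Point 2 above (rejecting empty input instead of panicking) can be illustrated with a sketch of a 16-byte unpadding function. The name unPad16 comes from the commit message; the padding scheme assumed here (each pad byte holds the pad length, PKCS#7 style) and the pad16 demo helper are assumptions, not the actual gocryptfs source.

```go
package main

import (
	"errors"
	"fmt"
)

// pad16 pads to a multiple of 16 bytes; every pad byte holds the pad
// length (assumed scheme, demo helper only).
func pad16(orig []byte) []byte {
	padLen := 16 - len(orig)%16
	out := append([]byte{}, orig...)
	for i := 0; i < padLen; i++ {
		out = append(out, byte(padLen))
	}
	return out
}

// unPad16 strips the padding and returns an error on any malformed
// input instead of indexing out of bounds.
func unPad16(padded []byte) ([]byte, error) {
	n := len(padded)
	if n == 0 {
		return nil, errors.New("unPad16: empty input")
	}
	if n%16 != 0 {
		return nil, errors.New("unPad16: length is not a multiple of 16")
	}
	padLen := int(padded[n-1])
	if padLen == 0 || padLen > 16 {
		return nil, errors.New("unPad16: invalid padding length")
	}
	for _, b := range padded[n-padLen:] {
		if int(b) != padLen {
			return nil, errors.New("unPad16: corrupt padding bytes")
		}
	}
	return padded[:n-padLen], nil
}

func main() {
	out, err := unPad16(pad16([]byte("abc")))
	fmt.Printf("%q %v\n", out, err)
	_, err = unPad16(nil)
	fmt.Println(err)
}
```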
Jakob Unterwurzacher
508fd9e1d6 main: downgrade panic log create failure from fatal error to warning
Exiting with a fatal error just pushes users to use "-nosyslog",
which is even worse than not having a paniclog.
2017-05-23 18:01:21 +02:00
Jakob Unterwurzacher
245b84c887 nametransform: diriv cache: fall back to the grandparent
When a user calls into a deep directory hierarchy, we often
get a sequence like this from the kernel:

LOOKUP a
LOOKUP a/b
LOOKUP a/b/c
LOOKUP a/b/c/d

The diriv cache was not effective for this pattern, because it
was designed for this:

LOOKUP a/a
LOOKUP a/b
LOOKUP a/c
LOOKUP a/d

By also using the cached entry of the grandparent we can avoid lots
of diriv reads.

This benchmark is against a large encrypted directory hosted on NFS:

Before:

  $ time ls -R nfs-backed-mount > /dev/null
  real	1m35.976s
  user	0m0.248s
  sys	0m0.281s

After:

  $ time ls -R nfs-backed-mount > /dev/null
  real	1m3.670s
  user	0m0.217s
  sys 	0m0.403s
2017-05-22 22:36:54 +02:00