MacOS creates lots of these files, and if the directory is otherwise
empty, we would throw an IO error to the unsuspecting user.
With this patch, we log a warning, but otherwise pretend we did not
see it.
Mitigates https://github.com/rfjakob/gocryptfs/issues/140
...and if Getdents is not available at all.
Due to this warning I now know that SSHFS always returns DT_UNKNOWN:
gocryptfs[8129]: Getdents: convertDType: received DT_UNKNOWN, falling back to Lstat
This behavoir is confirmed at http://ahefner.livejournal.com/16875.html:
"With sshfs, I finally found that obscure case. The dtype is always set to DT_UNKNOWN [...]"
The benchmark that supported the decision for 512-byte
prefetching previously lived outside the repo.
Let's add it where it belongs so it cannot get lost.
The Readdir function provided by os is inherently slow because
it calls Lstat on all files.
Getdents gives us all the information we need, but does not have
a proper wrapper in the stdlib.
Implement the "Getdents()" wrapper function that calls
syscall.Getdents() and parses the returned byte blob to a
fuse.DirEntry slice.
Remove the "Masterkey" field from fusefrontend.Args because it
should not be stored longer than neccessary. Instead pass the
masterkey as a separate argument to the filesystem initializers.
Then overwrite it with zeros immediately so we don't have
to wait for garbage collection.
Note that the crypto implementation still stores at least a
masterkey-derived value, so this change makes it harder, but not
impossible, to extract the encryption keys from memory.
Suggested at https://github.com/rfjakob/gocryptfs/issues/137
* extend the diriv cache to 100 entries
* add special handling for the immutable root diriv
The better cache allows to shed some complexity from the path
encryption logic (parent-of-parent check).
Mitigates https://github.com/rfjakob/gocryptfs/issues/127
Dir is like filepath.Dir but returns "" instead of ".".
This was already implemented in fusefrontend_reverse as saneDir().
We will need it in nametransform for the improved diriv caching.
A directory with a long name has two associated virtual files:
the .name file and the .diriv files.
These used to get the same inode number:
$ ls -di1 * */*
33313535 gocryptfs.longname.2togDFouca9mrTwtfF1RNW5DZRAQY8alaR7wO_Xd5Zw
1000000000033313535 gocryptfs.longname.2togDFouca9mrTwtfF1RNW5DZRAQY8alaR7wO_Xd5Zw/gocryptfs.diriv
1000000000033313535 gocryptfs.longname.2togDFouca9mrTwtfF1RNW5DZRAQY8alaR7wO_Xd5Zw.name
With this change we use another prefix (2 instead of 1) for .name files.
$ ls -di1 * */*
33313535 gocryptfs.longname.2togDFouca9mrTwtfF1RNW5DZRAQY8alaR7wO_Xd5Zw
1000000000033313535 gocryptfs.longname.2togDFouca9mrTwtfF1RNW5DZRAQY8alaR7wO_Xd5Zw/gocryptfs.diriv
2000000000033313535 gocryptfs.longname.2togDFouca9mrTwtfF1RNW5DZRAQY8alaR7wO_Xd5Zw.name
On MacOS, building and testing without openssl is much easier.
The tests should skip tests that fail because of missing openssl
instead of aborting.
Fixes https://github.com/rfjakob/gocryptfs/issues/123
Fixed by including the correct header. Should work on older openssl
versions as well.
Error was:
locking.go:21: undefined reference to `CRYPTO_set_locking_callback'
Due to RMW, we always need read permissions on the backing file. This is a
problem if the file permissions do not allow reading (i.e. 0200 permissions).
This patch works around that problem by chmod'ing the file, obtaining a fd,
and chmod'ing it back.
Test included.
Issue reported at: https://github.com/rfjakob/gocryptfs/issues/125
Previously we ran through the decryption steps even for an empty
ciphertext slice. The functions handle it correctly, but returning
early skips all the extra calls.
Speeds up the tar extract benchmark by about 4%.
We use two levels of buffers:
1) 4kiB+overhead for each ciphertext block
2) 128kiB+overhead for each FUSE write (32 ciphertext blocks)
This commit adds a sync.Pool for both levels.
The memory-efficiency for small writes could be improved,
as we now always use a 128kiB buffer.
128kiB = 32 x 4kiB pages is the maximum we get from the kernel. Splitting
up smaller writes is probably not worth it.
Parallelism is limited to two for now.
Spawn a worker goroutine that reads the next 512-byte block
while the current one is being drained.
This should help reduce waiting times when /dev/urandom is very
slow (like on Linux 3.16 kernels).
On my machine, reading 512-byte blocks from /dev/urandom
(same via getentropy syscall) is a lot faster in terms of
throughput:
Blocksize Throughput
16 28.18 MB/s
512 83.75 MB/s
For a single-threaded streaming write, this drops the CPU usage of
nonceGenerator.Get to almost 1/3:
flat flat% sum% cum cum%
Before 0 0% 95.08% 0.35s 2.92% github.com/rfjakob/gocryptfs/internal/cryptocore.(*nonceGenerator).Get
After 0.01s 0.092% 92.34% 0.13s 1.20% github.com/rfjakob/gocryptfs/internal/cryptocore.(*nonceGenerator).Get
This change makes the nonce reading single-threaded, which may
hurt massively-parallel writes.
This check would need locking to be multithreading-safe.
But as it is in the fastpath, just remove it.
rand.Read() already guarantees that the value is random.
Travis failed on Go 1.6.3 with this error:
internal/pathiv/pathiv_test.go:20: no args in Error call
This change should solve the problem and provides a better error
message on (real) test failure.
With hard links, the path to a file is not unique. This means
that the ciphertext data depends on the path that is used to access
the files.
Fix that by storing the derived values when we encounter a hard-linked
file. This means that the first path wins.
This fixes a few issues I have found reviewing the code:
1) Limit the amount of data ReadLongName() will read. Previously,
you could send gocryptfs into out-of-memory by symlinking
gocryptfs.diriv to /dev/zero.
2) Handle the empty input case in unPad16() by returning an
error. Previously, it would panic with an out-of-bounds array
read. It is unclear to me if this could actually be triggered.
3) Reject empty names after base64-decoding in DecryptName().
An empty name crashes emeCipher.Decrypt().
It is unclear to me if B64.DecodeString() can actually return
a non-error empty result, but let's guard against it anyway.
When a user calls into a deep directory hierarchy, we often
get a sequence like this from the kernel:
LOOKUP a
LOOKUP a/b
LOOKUP a/b/c
LOOKUP a/b/c/d
The diriv cache was not effective for this pattern, because it
was designed for this:
LOOKUP a/a
LOOKUP a/b
LOOKUP a/c
LOOKUP a/d
By also using the cached entry of the grandparent we can avoid lots
of diriv reads.
This benchmark is against a large encrypted directory hosted on NFS:
Before:
$ time ls -R nfs-backed-mount > /dev/null
real 1m35.976s
user 0m0.248s
sys 0m0.281s
After:
$ time ls -R nfs-backed-mount > /dev/null
real 1m3.670s
user 0m0.217s
sys 0m0.403s
This commit defines all exit codes in one place in the exitcodes
package.
Also, it adds a test to verify the exit code on incorrect
password, which is what SiriKali cares about the most.
Fixes https://github.com/rfjakob/gocryptfs/issues/77 .
Misspell Finds commonly misspelled English words
gocryptfs/internal/configfile/scrypt.go
Line 41: warning: "paramter" is a misspelling of "parameter" (misspell)
gocryptfs/internal/ctlsock/ctlsock_serve.go
Line 1: warning: "implementes" is a misspelling of "implements" (misspell)
gocryptfs/tests/test_helpers/helpers.go
Line 27: warning: "compatability" is a misspelling of "compatibility" (misspell)
This can happen during normal operation, and is harmless since
14038a1644
"fusefrontend: readFileID: reject files that consist only of a header"
causes dormant header-only files to be rewritten on the next write.
We do not have to track the writeOnly status because the kernel
will not forward read requests on a write-only FD to us anyway.
I have verified this behavoir manually on a 4.10.8 kernel and also
added a testcase.
As reported at https://github.com/rfjakob/gocryptfs/issues/105 ,
the "ioutil.WriteFile(file, iv, 0400)" call causes "permissions denied"
errors on an NFSv4 setup.
"strace"ing diriv creation and gocryptfs.conf creation shows this:
conf (works on the user's NFSv4 mount):
openat(AT_FDCWD, "/tmp/a/gocryptfs.conf.tmp", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0400) = 3
diriv (fails):
openat(AT_FDCWD, "/tmp/a/gocryptfs.diriv", O_WRONLY|O_CREAT|O_TRUNC|O_CLOEXEC, 0400) = 3
This patch creates the diriv file with the same flags that are used for
creating the conf:
openat(AT_FDCWD, "/tmp/a/gocryptfs.diriv", O_WRONLY|O_CREAT|O_EXCL|O_CLOEXEC, 0400) = 3
Closes https://github.com/rfjakob/gocryptfs/issues/105
Force decode of encrypted files even if the integrity check fails, instead of
failing with an IO error. Warning messages are still printed to syslog if corrupted
files are encountered.
It can be useful to recover files from disks with bad sectors or other corrupted
media.
Closes https://github.com/rfjakob/gocryptfs/pull/102 .
go-fuse has added a new method to the nodefs.File interface that
caused this build error:
internal/fusefrontend/file.go:75: cannot use file literal (type *file) as type nodefs.File in return argument:
*file does not implement nodefs.File (missing Flock method)
Fixes https://github.com/rfjakob/gocryptfs/issues/104 and
prevents the problem from happening again.
The volatile inode numbers that we used before cause "find" to complain and error out.
Virtual inode numbers are derived from their parent file inode number by adding 10^19,
which is hopefully large enough no never cause problems in practice.
If the backing directory contains inode numbers higher than that, stat() on these files
will return EOVERFLOW.
Example directory lising after this change:
$ ls -i
926473 gocryptfs.conf
1000000000000926466 gocryptfs.diriv
944878 gocryptfs.longname.hmZojMqC6ns47eyVxLlH2ailKjN9bxfosi3C-FR8mjA
1000000000000944878 gocryptfs.longname.hmZojMqC6ns47eyVxLlH2ailKjN9bxfosi3C-FR8mjA.name
934408 Tdfbf02CKsTaGVYnAsSypA
This PR addresses the Issue #95, about "Confusing file owner for
longname files in reverse mode".
It affects only the reverse mode, and introduces two
modifications:
1) The "gocryptfs.longname.XXXX.name" files are assigned the owner and
group of the underlying plaintext file. Therefore it is consistent
with the file "gocryptfs.longname.XXXX" that has the encrypted
contents of the plaintext file.
2) The two virtual files mentioned above are given -r--r--r--
permissions. This is consistent with the behavior described in
function Access in internal/fusefrontend_reverse/rfs.go where all
virtual files are always readable. Behavior also observed in point
c) in #95 .
Issue #95 URL: https://github.com/rfjakob/gocryptfs/issues/95
Pull request URL: https://github.com/rfjakob/gocryptfs/pull/97
Due to kernel readahead, we usually get multiple read requests
at the same time. These get submitted to the backing storage in
random order, which is a problem if seeking is very expensive.
Details: https://github.com/rfjakob/gocryptfs/issues/92
...if doWrite() can do it for us. This avoids the situation
that the file only consists of a file header when calling
doWrite.
A later patch will check for this condition and warn about it,
as with this change it should no longer occour in normal operation.
If you truncate a ciphertext file to 19 bytes, you could get the
impression that the plaintext is 18446744073709551585 bytes long,
as reported by "ls -l".
Fix it by clamping the value to zero.
Prior to this commit, gocryptfs's reverse mode did not report correct
directory entry sizes for symbolic links, where the dentry size needs to
be the same as the length of a string containing the target path.
This commit corrects this issue and adds a test case to verify the
correctness of the implementation.
This issue was discovered during the use of a strict file copying program
on a reverse-mounted gocryptfs file system.
Raw64 is supported (but was disabled by default) since gocryptfs
v1.2. However, the implementation was buggy because it forgot
about long names and symlinks.
Disable it for now by default and enable it later, together
with HKDF.
...but keep it disabled by default for new filesystems.
We are still missing an example filesystem and CLI arguments
to explicitely enable and disable it.
As we have dropped Go 1.4 compatibility already, and will add
a new feature flag for gocryptfs v1.3 anyway, this is a good
time to enable Raw64 as well.
There is no security reason for doing this, but it will allow
to consolidate the code once we drop compatibility with gocryptfs v1.2
(and earlier) filesystems.
There are two independent backends, one for name encryption,
the other one, AEAD, for file content.
"BackendTypeEnum" only applies to AEAD (file content), so make that
clear in the name.
Version 1.1 of the EME package (github.com/rfjakob/eme) added
a more convenient interface. Use it.
Note that you have to upgrade your EME package (go get -u)!
This really only handles scrypt and no other key-derivation functions.
Renaming the files prevents confusion once we introduce HKDF.
renamed: internal/configfile/kdf.go -> internal/configfile/scrypt.go
renamed: internal/configfile/kdf_test.go -> internal/configfile/scrypt_test.go
A crypto benchmark mode like "openssl speed".
Example run:
$ ./gocryptfs -speed
AES-GCM-256-OpenSSL 180.89 MB/s (selected in auto mode)
AES-GCM-256-Go 48.19 MB/s
AES-SIV-512-Go 37.40 MB/s
This used to hang at 100% CPU:
cat /dev/zero | gocryptfs -init a
...and would ultimately send the box into out-of-memory.
The number 1000 is chosen arbitrarily and seems big enough
given that the password must be one line.
Suggested by @mhogomchungu in https://github.com/rfjakob/gocryptfs/issues/77 .
From the comment:
// CheckTrailingGarbage tries to read one byte from stdin and exits with a
// fatal error if the read returns any data.
// This is meant to be called after reading the password, when there is no more
// data expected. This helps to catch problems with third-party tools that
// interface with gocryptfs.
Preallocation is very slow on hdds that run btrfs. Give the
user the option to disable it. This greatly speeds up small file
operations but reduces the robustness against out-of-space errors.
Also add the option to the man page.
More info: https://github.com/rfjakob/gocryptfs/issues/63
This could have caused spurious ENOENT errors.
That it did not cause these errors all the time is interesting
and probably because an earlier readdir would place the entry
in the cache. This masks the bug.
$ golint ./... | grep -v underscore | grep -v ALL_CAPS
internal/fusefrontend_reverse/rfs.go:52:36: exported func NewFS returns unexported type *fusefrontend_reverse.reverseFS, which can be annoying to use
internal/nametransform/raw64_go1.5.go:10:2: exported const HaveRaw64 should have comment (or a comment on this block) or be unexported
The Back In Time backup tool (https://github.com/bit-team/backintime)
wants to write directly into the ciphertext dir.
This may cause the cached directory IV to become out-of-date.
Having an expiry time limits the inconstency to one second, like
attr_timeout does for the kernel getattr cache.
Running xfstests generic/075 on tmpfs often triggered a panic
for what seems to be a tmpfs bug.
Quoting from the email to lkml,
http://www.spinics.net/lists/kernel/msg2370127.html :
tmpfs seems to be incorrectly returning 0-bytes when reading from
a file that is concurrently being truncated.
Stat() calls are expensive on NFS as they need a full network
round-trip. We detect when a write immediately follows the
last one and skip the Stat in this case because the write
cannot create a file hole.
On my (slow) NAS, this takes the write speed from 24MB/s to
41MB/s.
Test that we get the right timestamp when extracting a tarball.
Also simplify the workaround in doTestUtimesNano() and fix the
fact that it was running no test at all.
This can happen during normal operation when the directory has
been deleted concurrently. But it can also mean that the
gocryptfs.diriv is missing due to an error, so log the event
at "info" level.
AES-SIV uses 1/2 of the key for authentication, 1/2 for
encryption, so we need a 64-byte key for AES-256. Derive
it from the master key by hashing it with SHA-512.
On a CPU without AES-NI:
$ go test -bench .
Benchmark4kEncStupidGCM-2 50000 24155 ns/op 169.57 MB/s
Benchmark4kEncGoGCM-2 20000 93965 ns/op 43.59 MB/s
Benchmark4kEncGCMSIV-2 500 2576193 ns/op 1.59 MB/s
The last patch added functionality for generating gocryptfs.longname.*
files, this patch adds support for mapping them back to the full
filenames.
Note that resolving a long name needs a full readdir. A cache
will be implemented later on to improve performance.