/g/ - Technology

install openbsd


What hashing algorithm to use? Nanonymous No.1861 [D][U][F][S][L][A][C] >>1862 >>1863 >>3764
File: 8767f86c59ab4281eeab30365ac59be274ea4b0ebcc3b3f226b3ec310875e3e4.png (dl) (11.12 KiB)

I'm writing a program that stores files addressed by their hash digest, now I'm wondering what hashing algorithm to use.

I was inclined to go with BLAKE2b because of its speed and security, but it isn't used much and I'd rather use something more common so I can possibly deduplicate without having to hash the same file twice.

I'm guessing SHA256 is the most commonly used (8chan and Nanochan use it) and still secure enough (though not against length extension attacks).
But it is also twice as slow as BLAKE2b... So what should I do? Use something that almost no one uses but is more secure and faster, or use something that is less secure and slower but more common, so I can deduplicate more often without having to hash a file twice?

Or should I simply hash the file multiple times... (which kind of defeats the performance gains of BLAKE2b)
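
One way to soften the "hash it multiple times" cost is to read the file once and feed every chunk to several hashers. A minimal sketch, assuming Python's standard hashlib; the names multi_digest and read_chunks are made up for illustration:

```python
import hashlib

def read_chunks(path, chunk_size=1 << 20):
    """Yield a file's contents in fixed-size chunks (hypothetical helper)."""
    with open(path, "rb") as f:
        while True:
            chunk = f.read(chunk_size)
            if not chunk:
                break
            yield chunk

def multi_digest(chunks, algorithms=("blake2b", "sha256")):
    """Feed every chunk to each hasher so the data is only read once."""
    hashers = {name: getattr(hashlib, name)() for name in algorithms}
    for chunk in chunks:
        for h in hashers.values():
            h.update(chunk)
    return {name: h.hexdigest() for name, h in hashers.items()}
```

Usage would be something like multi_digest(read_chunks("some_file")). The CPU cost of both algorithms is still paid, only the read is shared.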

Sorry, I'm having trouble making decisions.

Nanonymous No.1862 [D] >>3763

>>1861
If you have a 64-bit processor, you should use SHA-512 because it is faster on 64-bit. Otherwise, use SHA-256.
You don't need to worry about security at all, unless you are going to allow people to upload files. If you don't need others to have write access, just use MD5. The chance of accidental collisions is still extremely small.

Nanonymous No.1863 [D] >>1864

>>1861
Keep in mind that hashing a file is generally IO-bound, unless you are using a slow processor and/or a fast SSD.
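
A rough way to check this on your own machine is to time the reads and the hash updates separately. This is only a sketch: time_read_and_hash is a made-up name, and the OS page cache can make reads look much faster than the disk really is on a second run.

```python
import hashlib
import time

def time_read_and_hash(path, chunk_size=1 << 20):
    """Return (seconds spent reading, seconds spent hashing, hex digest)."""
    read_s = 0.0
    hash_s = 0.0
    h = hashlib.sha256()
    with open(path, "rb") as f:
        while True:
            t0 = time.perf_counter()
            chunk = f.read(chunk_size)
            t1 = time.perf_counter()
            read_s += t1 - t0
            if not chunk:
                break
            h.update(chunk)
            hash_s += time.perf_counter() - t1
    return read_s, hash_s, h.hexdigest()
```

If read_s dominates, the choice of algorithm barely changes the wall-clock time.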

OP No.1864 [D] >>1866 >>3764

Here are results of a random benchmark (lower is better):
>BLAKE2b = [0.24815844299882883, 0.24455917099840008, 0.24289618100010557]
>SHA512 = [0.4054747469999711, 0.400344555004267, 0.40310188900184585]
>SHA256 = [0.5319015080021927, 0.527558287998545, 0.5294597870015423]
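
For anyone who wants to reproduce numbers like these, here is a sketch of how such a benchmark could be run with timeit. The buffer size and repeat counts are assumptions, not the settings behind the results above.

```python
import hashlib
import timeit

def bench(names=("blake2b", "sha512", "sha256"), size=1 << 20, repeat=3, number=5):
    """Time each hashlib algorithm over an in-memory buffer (no disk IO)."""
    data = bytes(size)  # zero-filled buffer; the content does not affect speed
    results = {}
    for name in names:
        fn = getattr(hashlib, name)
        # Each entry is the total time in seconds for `number` hashes
        results[name] = timeit.repeat(
            lambda: fn(data).digest(), repeat=repeat, number=number
        )
    return results

if __name__ == "__main__":
    for name, times in bench().items():
        print(name, "=", times)
```

Results depend heavily on the CPU and on how the Python build was compiled, so treat them as relative only.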

IPFS uses SHA256 by default as well, I think. Anyway, if I were to use BLAKE2b, would it be recommended to simply truncate the hash? I know BLAKE2b has an option to produce smaller digests, but the result is entirely different from the larger hash.

>>1863
True, it certainly is not the bottleneck.

Nanonymous No.1866 [D] >>1867

>>1864
Whatever the fuck you do, stick with it. Don't be like IPFS and have multiple different hashing algos. That's total cancer and the main reason why I couldn't be bothered with the piece of shitware after using it for a while.
If you want future-proofing, just use blake2b and FUCKING STICK WITH IT.

Nanonymous No.1867 [D] >>3764

>>1866
>FUCKING STICK WITH IT.
Agreed. I hate what IPFS does, it's retarded.

The reason I want to simply truncate BLAKE2b hashes is that no matter how long you want your digest to be, you'll still be able to deduplicate files:
>da634c195b5050d8038ce1b83da6d382
>da634c195b5050d8038ce1b83da6d382882ce5914245ec866395a508ef972e39
>da634c195b5050d8038ce1b83da6d382882ce5914245ec866395a508ef972e39a6c39bcd60100e78311f401e9b8cdef7f0b920849e7f0201d83d44e3d76fee09
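
The difference between truncating and asking BLAKE2b for a shorter digest can be seen with hashlib. A small sketch, where the input bytes are just placeholder data:

```python
import hashlib

data = b"example file contents"  # placeholder input

# Full 64-byte BLAKE2b digest (128 hex characters)
full = hashlib.blake2b(data).hexdigest()

# Truncating keeps a prefix of the full digest
truncated = full[:64]

# Asking for a 32-byte digest changes the hash parameters,
# so the output is unrelated to the full digest
short_param = hashlib.blake2b(data, digest_size=32).hexdigest()

print(full.startswith(truncated))  # truncation is prefix-compatible
print(truncated == short_param)    # digest_size output does not match
```

So only plain truncation gives digests of different lengths that still match on their common prefix.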


Nanonymous No.1868 [D] >>1869 >>1870

Use blake2b. SHA-2 is pointless now that we have SHA-3.

Nanonymous No.1869 [D] >>3762

>>1868
And confusing
>SHA3-256
>SHA2-256

Nanonymous No.1870 [D] >>1871 >>3762

>>1868
On the topic of using outdated crypto: when hakase switched from a hashing function to a key derivation function, he chose bcrypt. While bcrypt is still strong, it's fairly outdated. In fact, bcrypt is almost as old as me. Currently the best KDF to use is argon2id.

Nanonymous No.1871 [D] >>1872

>>1870
I've been wanting to use Argon2 for some projects but the thing that prevented me from using it was lack of libraries for Python (I really need to invest my time in other languages)

Nanonymous No.1872 [D]

>>1871
Last time I looked anyway:
https://github.com/search?l=Python&q=Argon2&type=Repositories

Nanonymous No.1908 [D] >>1909

I'd use something simple like CRC32. As a bonus, it's super fast on amd64 CPUs since they have a dedicated instruction.

Also, your stuff sounds like git.

Nanonymous No.1909 [D] >>1910

>>1908
Terrible idea. It's trivial to generate arbitrary files with a particular CRC. A 32-bit checksum is also very likely to collide by pure chance.
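
The "pure chance" part can be estimated with the usual birthday approximation. A sketch, assuming digests behave like uniformly random values; collision_probability is a made-up helper name:

```python
import math

def collision_probability(n_items, bits):
    """Approximate chance that at least two of n_items random digests collide.

    Uses the birthday approximation p = 1 - exp(-n(n-1) / (2 * 2**bits)).
    """
    space = 2.0 ** bits
    return -math.expm1(-n_items * (n_items - 1) / (2.0 * space))

if __name__ == "__main__":
    for n in (1_000, 77_163, 1_000_000):
        print(n, "files, 32-bit:", collision_probability(n, 32))
    print("1e9 files, 128-bit:", collision_probability(10**9, 128))
```

With 32 bits the odds reach roughly even at around 77,000 files, while a 128-bit digest stays negligible even at a billion files.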

Nanonymous No.1910 [D][U][F]
File: 8105aae2a2f0be2839f98aa6da7f38a83e2b94bfca43aed3a66bc55ef75abf76.png (dl) (2.40 MiB)

>>1909
This. Also blake2 is already fast.

Anyway, I have a few encrypted disks using sha-256. Should I re-encrypt? Can length extension attacks affect me? Would that mean that if it's, hmm, breakable in 30 days, then in the worst case in 60? I used a few times the default number of iterations.

Nanonymous No.1914 [D] >>1915

>encrypted disks with sha-256
You can't encrypt something with sha-256 because it is a hashing function.

Nanonymous No.1915 [D] >>1921 >>3764

>>1914
With LUKS/cryptsetup, AES-256, but you know what I meant.
It should compare the password I input with the hash, right? If the hash is known, I am undefended, right? That's the problem: I couldn't find out how exactly the whole process works.

Nanonymous No.1921 [D][U][F]
File: 083c04e02c3050bdb6b2dac4e4bbe8c81d7f1f25aff42b44956b861283a85d2c.pdf (dl) (1.35 MiB)

>>1915
>It should compare password I input with the hash, right?
Not exactly. Here, read this research paper and maybe you'll understand how it works better.

Nanonymous No.3762 [D]

>>1869
How is that confusing? It's the least confusing naming standard you could have in computer science: algo name and bit length.

>>1870
AFAIK, bcrypt is still secure, properly tested by the industry, and adopted on a lot of platforms. Newer doesn't mean less likely to be broken.

Nanonymous No.3763 [D]

>>1862
crc32

Nanonymous No.3764 [D]

>>1861
>still secure enough (though not against length extension attacks)
that's not a vuln, that's someone not understanding how to use a cryptographic hashing algorithm. also, length extension attacks do not apply to deduplication
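
For context, the misuse in question is authenticating a message as SHA-256(secret + message); the standard construction for that job is HMAC. A sketch with Python's hmac module, where naive_tag, hmac_tag and verify are illustrative names:

```python
import hashlib
import hmac

def naive_tag(secret, message):
    """The misuse: H(secret || message) can be length-extended with SHA-256."""
    return hashlib.sha256(secret + message).hexdigest()

def hmac_tag(secret, message):
    """HMAC-SHA256 is not affected by length extension."""
    return hmac.new(secret, message, hashlib.sha256).hexdigest()

def verify(secret, message, tag):
    """Constant-time comparison of the expected and supplied tags."""
    return hmac.compare_digest(hmac_tag(secret, message), tag)
```

None of this matters for deduplication, where there is no secret and the digest is only used as a content address.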
>>1864
>(lower is better)
thank you for this clickbait
>>1867
>wants to truncate hashes
typical nigger bullshit. if you want guarantees you don't fuck with the hash. if you're engineering properly to Zooko's triangle, a hash being 'too big' to write down does not matter
>>1915
>With LUKS/cryptsetup, AES-256, but you know what I meant.
No, we fucking don't.

Nanonymous No.3912 [D] >>3914 >>3915

bcrypt: cracking one rn at a blistering 86 hashes per second (rx 580). This card would do 200k sha256 a sec, easy.

Nanonymous No.3914 [D]

>>3912
Retard, he isn't hashing passwords. Different use cases favor different hashing algorithms. A purposefully slow hashing algorithm is not fit for what OP wants (and needs).

Nanonymous No.3915 [D]

>>3912
>86 hashes per second
>blistering
lmfao it's gonna take you 900000 years to get the password