Skip to content

Add SHA-512 Hashing Sink #113

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
terrorbyte opened this issue Apr 26, 2022 · 13 comments · Fixed by #215
Closed

Add SHA-512 Hashing Sink #113

terrorbyte opened this issue Apr 26, 2022 · 13 comments · Fixed by #215

Comments

@terrorbyte
Copy link
Contributor

Use Case

Currently SHA-256 is the only supported hashing algorithm (and there is some engineering design decisions in #39), this PR creates a SHA-512 sink that works identically the the current SHA-256 design but just with the crypto/sha512 library imported and the same functions applied.

Tests

As per the contributor guidelines I ran also created all the tests and test data. Here is the following output:

$ go test -v | grep -i sha512
=== RUN   TestSHA512Sums_OutputsCorrectHashForEachSpecifiedFile
=== PAUSE TestSHA512Sums_OutputsCorrectHashForEachSpecifiedFile
=== RUN   TestSHA512Sum_OutputsCorrectHash
=== PAUSE TestSHA512Sum_OutputsCorrectHash
=== CONT  TestSHA512Sum_OutputsCorrectHash
=== RUN   TestSHA512Sum_OutputsCorrectHash/for_no_data
=== RUN   TestSHA512Sum_OutputsCorrectHash/for_short_string
=== RUN   TestSHA512Sum_OutputsCorrectHash/for_string_containing_newline
--- PASS: TestSHA512Sum_OutputsCorrectHash (0.00s)
    --- PASS: TestSHA512Sum_OutputsCorrectHash/for_no_data (0.00s)
    --- PASS: TestSHA512Sum_OutputsCorrectHash/for_short_string (0.00s)
    --- PASS: TestSHA512Sum_OutputsCorrectHash/for_string_containing_newline (0.00s)
=== CONT  TestSHA512Sums_OutputsCorrectHashForEachSpecifiedFile
--- PASS: TestSHA512Sums_OutputsCorrectHashForEachSpecifiedFile (0.00s)
=== RUN   ExamplePipe_SHA512Sum
--- PASS: ExamplePipe_SHA512Sum (0.00s)
=== RUN   ExamplePipe_SHA512Sums
--- PASS: ExamplePipe_SHA512Sums (0.00s)

One note, in the original patchset I added a testdata/sha512Sum.input.txt file that has the same content as testdata/sha256Sum.input.txt. This was to meet the contribution guidelines, but I think that it could be beneficial to generalize the test data to work with arbitrary hashing functions (ie something like testdata/hashSum.input.txt).

Documentation

I updated the docs as well to match the SHA-256 functions. I am happy to add some more README examples for both the functions if need be.

@terrorbyte
Copy link
Contributor Author

Opened PR as #114

@bitfield
Copy link
Owner

Thanks @terrorbyte! This looks very good.

One thing I don't see here, despite your heading, is a use case. Is there some real-world user problem that needs to be solved specifically using SHA-512 hashing, rather than just "some arbitrary reasonably secure hash"? If so, we can mention it in the documentation and explain when to use this.

@terrorbyte
Copy link
Contributor Author

For the first bit, here are some very common use case scenarios where I run into SHA-512:

As for personal use cases, I do a lot of automation and scripting in my penetration testing work which can often include checksums for software downloads, custom protocols for package signatures, TLS handshakes, database hashes of clients and people I am testing (given, this is bad practice), and even simple integrity checks.

It is certainly "less common" and often times has little benefit to most cryptographic operations, but it's something that I end up running into regularly.

@bitfield
Copy link
Owner

That's great—could you give an example of the kind of program that you'd like to write with script that needs a SHA-512 hash?

It strikes me that perhaps it would be most flexible to simply provide a Hash method, that defaults to SHA-256 (as perhaps the best trade-off between speed and security, for now), but can take an optional hash.Hasher as argument. So:

script.Echo("data").Hash().Stdout()

but also:

script.Echo("secret password").Hash(sha512.New()).Stdout()

@terrorbyte
Copy link
Contributor Author

terrorbyte commented Apr 27, 2022

Here is a quick hacky example of one such common workflow that I have to do using a curl | sha512 file validation check, a la https://www.elastic.co/blog/sha512-checksums-for-elastic-stack-artifacts (though the links are broken there, use https://www.elastic.co/downloads/kibana for the actual link examples):

package main

import (
        "fmt"
        "io"
        "net/http"
        "os"
        "strings"

        "github.com/bitfield/script"
)

func main() {
        fullURLFile := "https://artifacts.elastic.co/downloads/kibana/kibana-8.1.3-amd64.deb"

        file, err := os.Create("kibana.deb")
        if err != nil {
                fmt.Printf(err.Error(), os.Stderr)
                os.Exit(1)
        }
        client := http.Client{}
        resp, err := client.Get(fullURLFile)
        if err != nil {
                fmt.Printf(err.Error(), os.Stderr)
                os.Exit(1)
        }
        _, err = io.Copy(file, resp.Body)
        file.Close()
        resp.Body.Close()
        respSha, err := client.Get(fullURLFile + ".sha512")
        if err != nil {
                fmt.Printf(err.Error(), os.Stderr)
                os.Exit(1)
        }
        shaSumFile, err := io.ReadAll(respSha.Body)
        if err != nil {
                fmt.Printf(err.Error(), os.Stderr)
                os.Exit(1)
        }
        respSha.Body.Close()
        shaSum := strings.Split(string(shaSumFile), " ")[0]
        fmt.Println("> " + shaSum)

        script.Echo("kibana.deb").SHA512Sums().Stdout()
}

This should output:

> e8a31d895a9f34478b82755d494dc69a77cb311e203f2ba04db18bc48967e52e457a1814855bd39996b42c4b2e9c12d7164ffadee2635941735ec229dfbe9b70
e8a31d895a9f34478b82755d494dc69a77cb311e203f2ba04db18bc48967e52e457a1814855bd39996b42c4b2e9c12d7164ffadee2635941735ec229dfbe9b70

@terrorbyte
Copy link
Contributor Author

As for the API design part, I am of two minds in this regard.

First, I do foresee it becoming pretty common place to utilize other generic hashing algorithms and utilizing the generic Hash with hash.Hasher might be more flexible in real life situations, for example I have a private fork that hash support for sha1, md5, sha384, md4, and some other lesser known algorithms that I use in real life due to my offensive security work but are not "preferable" to expose to users for safety reasons, but it would have been very nice to be able to just throw a hash.Hasher type at this instead of building tests and calls for each. The issue with this comes up when you start getting to some more of the modern hashing algorithm choices such as bcrypt (https://pkg.go.dev/golang.org/x/crypto@v0.0.0-20220411220226-7b82a4e95df4/bcrypt) or scrypt (https://pkg.go.dev/golang.org/x/crypto@v0.0.0-20220411220226-7b82a4e95df4/scrypt), which utilize tunable options and do not return a hash.Hasher type. That also being said, I don't expect these will ever hit stdlib and will be relegated to x/crypto so might be considered to be "non-goals" fitting the stdlib usage requirements.

Second, I don't think having a "default" hash actually makes logical sense in the context of hashing algorithms, because you are rarely using them in optionally and are explicitly dictated which ones to use. For example I'm given specific hashes for checksums and applying the concept of "hashing" to it could mean many different things. If anything, it might be a bit more cumbersome, but the algorithm structure used now might be in line with real use case vs the proposed "optional argument". I'd rather see ${ARBITRARY_ALGORITHM}Sum() or Hash(${ARBITRARY_ALGORITHM}.New()) than making assumptions that could have security impact.

@No-one-important
Copy link
Contributor

While sha-256 is considered secure unless data is salted most passwords can be cracked easily using online tools such as dcode.fr

@terrorbyte
Copy link
Contributor Author

terrorbyte commented May 12, 2022

No one indicated using any of these for password hashes, and if they do they are not reading modern guidance from the last 20 years. Hashing algorithms are used in many more places than just password hashes, please take a moment to read my examples.

For the record, I build password cracking machines and regularly compete in competitions. From a cracking perspective there is only fractional differences between SHA-256 and SHA-512 algorithms cracking speeds in the real world. If you are using any SHA algorithm for password storage without many rounds, salt, and cost factors you are doing something horribly wrong. Modern crypt(5) fully deprecates SHA-512 and SHA-256 even with 5000 rounds for a reason, and I don't foresee people using thousands of pipes of .SHA512Sum()

@mahadzaryab1
Copy link
Contributor

@bitfield What's the status of this? Can I pick it up?

@bitfield
Copy link
Owner

Sure! The consensus API proposal seems to be something like:

  • HashSums is a filter that takes a hash.Hasher and produces hashes of the files named on each line
  • Hash is a sink that takes a hash.Hasher and produces a hash of the pipe's data

@mahadzaryab1
Copy link
Contributor

@bitfield I've got a PR ready for review at #215.

@bitfield
Copy link
Owner

@terrorbyte what do you think of the changes in #215? Do they solve the problem for you?

@terrorbyte
Copy link
Contributor Author

I think that using the hash.Hash interface is much smarter and allows for some definition of parameterized hashing. I'd sign off on this in general, but wonder if we should provide an example of how to define a hash.Hash for something like argon2i to have an example of how you could use parameterized hashing algorithms. It's also probably better to leave that open ended for specific user use cases.

Good work @mahadzaryab1

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants