Discovering the Truth Behind an invalid_file_path Error
Mar 28, 2019
6 mins
Full stack developer @ WTTJ
In the Elixir application we use on our platform, we rely on the Elixir library arc to manage file uploads (similar to CarrierWave in the Rails community). An interesting feature of this library is its ability to upload a file from a remote URL: a basic use case is a user providing a URL in a form rather than uploading a file. Obviously, the endpoint must be public and reachable.
Recently, we noticed a lot of invalid changeset errors occurring with the message `invalid_file_path` during this operation (changesets allow systems to filter, cast, validate, and define constraints when manipulating structs). The weird thing is that the requested files come from our own content delivery network (CDN) and can be fetched directly via the browser.
Understanding the issue
First, we decided to reproduce the error locally using our own code. Unfortunately, we got exactly the same error message, which didn't help us understand what was happening at all.
To get a more detailed error description, our second idea was to run a code snippet taken directly from the arc library. The relevant part is the code that downloads the file from our remote server before storing it locally or remotely (on Amazon S3, for example).
Here is a simplified version of the arc library code:
```elixir
# https://github.com/stavro/arc/blob/v0.11.0/lib/arc/file.ex
url = "https://cdn.example.com/images/avatar.jpg"

options = [
  follow_redirect: true,
  recv_timeout: Application.get_env(:arc, :recv_timeout, 5_000),
  connect_timeout: Application.get_env(:arc, :connect_timeout, 10_000),
  timeout: Application.get_env(:arc, :timeout, 10_000),
  max_retries: Application.get_env(:arc, :max_retries, 3),
  backoff_factor: Application.get_env(:arc, :backoff_factor, 1000),
  backoff_max: Application.get_env(:arc, :backoff_max, 30_000)
]

:hackney.get(url, [], "", options)
```
After running this snippet we got the following error:
```
[info] ['TLS', 32, 'client', 58, 32, 73, 110, 32, 115, 116, 97, 116, 101, 32, 'certify', 32, 'at ssl_handshake.erl:1335 generated CLIENT ALERT: Fatal - Handshake Failure - {bad_cert,invalid_key_usage}', 10]

{:error, {:tls_alert, 'handshake failure'}}
```
From this message we can easily tell that we had an SSL problem: the handshake can't be completed. The library can't fetch the file to store it, and so returns an `invalid_file_path` error message.
Quick fixes
After multiple searches on the Internet, we surmised the issue was due to a problem with either the protocol version (TLS/SSL) or the Server Name Indication (SNI), an extension of the TLS protocol by which the client indicates the hostname it is attempting to connect to at the start of the handshake. Many posts suggest fixing the issue by providing `ssl_options` for the hackney requests (or an SSL option for HTTPoison).
We tried two fixes by providing additional options for the initial snippet, the first time by forcing the protocol version to `tlsv1.2`, and the second time by providing the `server_name_indication`, as seen below:
```elixir
url = "https://cdn.example.com/images/avatar.jpg"

options = [
  # ...
  ssl_options: [versions: [:"tlsv1.2"]]
  # OR
  # ssl_options: [server_name_indication: 'cdn.example.com']
]

:hackney.get(url, [], "", options)
```
Both solutions gave us the following successful response:
```
{:ok, 200, [...], #Reference<0.3402715975.1686634497.65800>}
```
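For completeness, the last element of that tuple is hackney's client reference; the response body is fetched lazily from it. A minimal sketch of consuming the response, reusing the `url` and `options` from the snippet above (an illustration, not the exact arc code):

```elixir
case :hackney.get(url, [], "", options) do
  {:ok, 200, _headers, client_ref} ->
    # The payload is retrieved lazily via the client reference.
    {:ok, body} = :hackney.body(client_ref)
    {:ok, body}

  {:ok, status, _headers, _client_ref} ->
    {:error, {:unexpected_status, status}}

  {:error, reason} ->
    {:error, reason}
end
```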
Before we submitted any fixes to either our app or third-party library, we were interested in discovering why an up-to-date library like hackney needs basic SSL options to fix the handshake.
Digging deeper into hackney
Hackney is an HTTP client written in Erlang and used by many HTTP wrapper libraries, such as HTTPoison or Tesla, in the Elixir world.
Regarding arc, its developers decided to use hackney directly, as seen in the snippet above.
Check the default SSL connect options on hackney
First of all, we wanted to look at the default SSL options used in hackney to perform the HTTPS connect.
```erlang
%% https://github.com/benoitc/hackney/blob/1.15.0/src/hackney_connect.erl#L314
ssl_opts(Host, Options) ->
  case proplists:get_value(ssl_options, Options) of
    undefined -> ssl_opts_1(Host, Options);
    [] -> ssl_opts_1(Host, Options);
    SSLOpts -> SSLOpts
  end.
```
The first thing we noted, which is crucial, is that if we provide any SSL options in a hackney call, the library's default options are overridden, not merged.
The default options are outlined below:
```erlang
%% https://github.com/benoitc/hackney/blob/1.15.0/src/hackney_connect.erl#L324
ssl_opts_1(Host, Options) ->
  Insecure = proplists:get_value(insecure, Options, false),
  CACerts = certifi:cacerts(),
  case Insecure of
    true ->
      [{verify, verify_none}];
    false ->
      VerifyFun = {fun ssl_verify_hostname:verify_fun/3,
                   [{check_hostname, Host}]},
      [{verify, verify_peer},
       {depth, 99},
       {cacerts, CACerts},
       {partial_chain, fun partial_chain/1},
       {verify_fun, VerifyFun}]
  end.
```
By default, hackney performs certificate verification (against erlang-certifi, the Mozilla Certification Authorities (CA) bundle for Erlang) when connecting over HTTPS.
We were therefore able to understand quite quickly that, by providing `ssl_options`, this verification is skipped entirely. It seems to be a bad idea to provide any custom `ssl_options`, or at least partial ones.
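If custom `ssl_options` really are needed, one way to avoid silently disabling verification is to restate hackney's defaults alongside the custom entry. A sketch under that assumption, written against the `certifi` and `ssl_verify_fun` packages hackney itself depends on (note that it cannot restore hackney's `partial_chain/1` callback, which is private to the library):

```elixir
url = "https://cdn.example.com/images/avatar.jpg"
host = 'cdn.example.com'

ssl_options = [
  # Restate the verification defaults from ssl_opts_1/2...
  verify: :verify_peer,
  depth: 99,
  cacerts: :certifi.cacerts(),
  verify_fun: {&:ssl_verify_hostname.verify_fun/3, [check_hostname: host]},
  # ...then append the custom entry that motivated the override.
  server_name_indication: host
]

:hackney.get(url, [], "", ssl_options: ssl_options)
```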
Check the default protocol versions on hackney
```erlang
%% https://github.com/benoitc/hackney/blob/1.15.0/src/hackney_ssl.erl#L62
connect(Host, Port, Opts, Timeout)
    when is_list(Host), is_integer(Port),
         (Timeout =:= infinity orelse is_integer(Timeout)) ->
  BaseOpts = [binary, {active, false}, {packet, raw},
              {secure_renegotiate, true},
              {reuse_sessions, true},
              {honor_cipher_order, true},
              {versions, ['tlsv1.2', 'tlsv1.1', tlsv1, sslv3]},
              {ciphers, ciphers()}],
  Opts1 = hackney_util:merge_opts(BaseOpts, Opts),
  Host1 = parse_address(Host),
  %% connect
  ssl:connect(Host1, Port, Opts1, Timeout).
```
By default, hackney sets the protocol versions to `['tlsv1.2', 'tlsv1.1', tlsv1, sslv3]`, so the SSL module from Erlang tries `tlsv1.2` first and, if it can't connect, falls back to `tlsv1.1`, and so on. So if our server is set up to accept `tlsv1.2`, our first quick fix above is useless, because it's already the default option in hackney.
Our problem therefore looked to be with the certificate verification. By carrying out tests using `curl` (and other tools), we were able to confirm that those tools verify the certificate without any issue.
```
$ curl https://cdn.example.com --verbose
* Rebuilt URL to: https://cdn.example.com/
*   Trying XX.XXX.XX.XXX...
* TCP_NODELAY set
* Connected to cdn.example.com (XX.XXX.XX.XXX) port 443 (#0)
* ALPN, offering h2
* ALPN, offering http/1.1
* Cipher selection: ALL:!EXPORT:!EXPORT40:!EXPORT56:!aNULL:!LOW:!RC4:@STRENGTH
* successfully set certificate verify locations:
*   CAfile: /etc/ssl/cert.pem
    CApath: none
* TLSv1.2 (OUT), TLS handshake, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Server hello (2):
* TLSv1.2 (IN), TLS handshake, Certificate (11):
* TLSv1.2 (IN), TLS handshake, Server key exchange (12):
* TLSv1.2 (IN), TLS handshake, Server finished (14):
* TLSv1.2 (OUT), TLS handshake, Client key exchange (16):
* TLSv1.2 (OUT), TLS change cipher, Client hello (1):
* TLSv1.2 (OUT), TLS handshake, Finished (20):
* TLSv1.2 (IN), TLS change cipher, Client hello (1):
* TLSv1.2 (IN), TLS handshake, Finished (20):
* SSL connection using TLSv1.2 / ECDHE-RSA-AES128-GCM-SHA256
* ALPN, server accepted to use h2
* Server certificate:
*  subject: OU=Domain Control Validated; CN=*.example.com
*  start date: Jul 20 15:56:38 2016 GMT
*  expire date: Jul 20 15:56:38 2019 GMT
*  subjectAltName: host "cdn.example.com" matched cert's "*.example.com"
*  issuer: C=US; ST=Arizona; L=Scottsdale; O=Starfield Technologies, Inc.; OU=http://certs.starfieldtech.com/repository/; CN=Starfield Secure Certificate Authority - G2
*  SSL certificate verify ok.
* Using HTTP2, server supports multi-use
* Connection state changed (HTTP/2 confirmed)
* Copying HTTP/2 data in stream buffer to connection buffer after upgrade: len=0
* Using Stream ID: 1 (easy handle 0x7f8230802a00)
> GET / HTTP/2
> Host: cdn.example.com
> User-Agent: curl/7.54.0
> Accept: */*
```
The log above shows us that curl is able to connect using the TLSv1.2 protocol and to verify the certificate chain using `CAfile: /etc/ssl/cert.pem`.
So, what’s the problem?
Certificate chain verification
Here we can inspect the certificate chain using an `openssl` command, as below:
```
$ openssl s_client -connect cdn.example.com:443 -servername cdn.example.com </dev/null
...
Certificate chain
 0 s:/OU=Domain Control Validated/CN=*.example.com
   i:/C=US/ST=Arizona/L=Scottsdale/O=Starfield Technologies, Inc./OU=http://certs.starfieldtech.com/repository/CN=Starfield Secure Certificate Authority - G2
 1 s:/OU=Domain Control Validated/CN=*.example.com
   i:/C=US/ST=Arizona/L=Scottsdale/O=Starfield Technologies, Inc./OU=http://certs.starfieldtech.com/repository/CN=Starfield Secure Certificate Authority - G2
 2 s:/C=US/ST=Arizona/L=Scottsdale/O=Starfield Technologies, Inc./OU=http://certs.starfieldtech.com/repository/CN=Starfield Secure Certificate Authority - G2
   i:/C=US/ST=Arizona/L=Scottsdale/O=Starfield Technologies, Inc./CN=Starfield Root Certificate Authority - G2
 3 s:/C=US/ST=Arizona/L=Scottsdale/O=Starfield Technologies, Inc./CN=Starfield Root Certificate Authority - G2
   i:/C=US/O=Starfield Technologies, Inc./OU=Starfield Class 2 Certification Authority
 4 s:/C=US/O=Starfield Technologies, Inc./OU=Starfield Class 2 Certification Authority
   i:/C=US/O=Starfield Technologies, Inc./OU=Starfield Class 2 Certification Authority
...
```
The `openssl` output shows us that the certificate chain is not perfect: the first and second links are identical. The hackney implementation of CA verification is stricter than others, and the chain must be perfect to be traversed correctly.
So, here is the `invalid_key_usage`: it looks like we had a problem with the way we were serving our certificate. Our CDN is served by Amazon CloudFront and the SSL configuration by AWS Certificate Manager (ACM). To configure a certificate, ACM requires three separate inputs: the certificate body, the certificate private key, and the certificate chain. For Nginx users, the certificate body and certificate chain are concatenated into a single file.
Once we had reimported our certificate correctly, the error disappeared, without the need for any code fix.
Conclusion
Sometimes a basic error can signal more important problems. The Internet contains a lot of resources to fix most issues, but it's really important to understand why a fix works. Applying a quick fix without knowing what it does can easily lead to a bigger issue.
This article is part of Behind the Code, the media for developers, by developers.
Illustration by Blok