Idea For A VPN Service

Tuesday, August 31 2021

Recently I’ve been thinking about starting a VPN service. This service has some interesting requirements that I have never seen a VPN service do before, so I’d like to put down my thoughts as to what might be sensible for a centralized yet encrypted* VPN service.

I would license all the code and scripts under the AGPLv3. This creates an environment where I could allow my company to use this code, and any other company for that matter. However, no company would be allowed to take it into their own hands and use it without contributing back to the project.

E2EE VPN

I want this service in many ways to be on par with ProtonMail: end-to-end encrypted (E2EE), and with a focus in data security for the user of the service.

Full encryption, so that even me, the writer and the deployer of the service, cannot view any information about the user: this is the utmost security. The bad news is that this is very hard to do in a convenient way. I’ve decided for now that the best thing to do is to target the Linux nerd. Target the user who is familiar with these advanced security practices, then make them available to the general public as the layers on top of the robust security are refined.

Why?

End-to-end encryption is necessary in a country like Canada, where I may be sent a subpoena to provide customer data. This is the case especially in the Five Eyes anglophone group of countries, who essentially spy on each others’ citizens for eachother. In essence, any data in the hand of one government agency in the “Eyes Countries” may be shared between the Five, Nine, and 14 Eyes countries.

I am not against government surveillance in principle. In theory, the government should be finding bad guys: pedophiles, sex trafficking rings and drug cartels. In practice, the U.S. government especially, uses its authority to spy on its own citizens who are simply minding their own business. ~~Bulk data collection~~ mass surveillance is not a freedom respecting characteristic of modern western democracies. I do run the risk of not being able to help much in the case of a genuine warrant against a genuine, evil criminal. That is the risk of privacy.

That said, let’s see what can be built that can do these 2 things:

Maximize privacy for the user.
Allow for (optional) monetization, depending on the provider. This is in some contradiction to premise 1.

What We Need

A VPN service needs access to some basic information:

Service discontinue time (the amount of time until the customer must renew).
Active connections (a number which can not be exceeded by an individual user).

The client needs access to some information from the server as well:

A list of VPNs able to be connected to (with filters).
For every VPN:
1. IP Address.
2. Maximum bandwidth.
3. Number of connected users or connection saturation percentage.
4. Supported protocols.

Can we do this in a end-to-end encrypted fashion? I’m honestly not sure. But here are my ideas so far as to how some of these functions might work.

How To Do It

“Usernames”

There will be one button to create your account: “Generate username” The username, or unique identifier for a user will be generated for them by a random generator. I plan to generate a username from a list of Base 64 characters; it will be a guaranteed length of 16. This gives a total of: 79228162514264337593543950336 or $7.9 \times 10^{28}$ posibilities. This is sufficient for a username.

The other option is to use a standard “username” field that uses a modern hash function like SHA512 to store it in the database. This is less secure as it is vulnerable to a brute-force attack of finding users, but this is also a very easy attack to defend against, i.e. IP banning after 10-ish tries of not finding a username.

A non-unique, universal salt will also be used on each username before storing it in the database to make it more secure. This decreases the possibility of an advanced attacker being able to find usernames in a leaked database using rainbow tables. That said, the fact that it is a fixed salt makes it much more vulnerable to an attack. Although it would be known only by the server machine, it would still be somewhat of a vulnerability. The operator may also store the salt in an encrypted password store of their own in case the server is erased, broken into, etc. It would be fairly easy, if they have access to the active salt, to migrate to a new salt every few days/months, or perhaps every time a server upgrade/maintenance happens. This does run the possibility of larger issues if the server is shut down or hangs during a migration and needs to be restarted. Many users may end up with accounts they cannot access without manual cleanup.

In the end, the application would need a backup of this salt, otherwise login times would become linear to the number of users as the database checks every user’s salt to see if it matches the hash made with the username input. Note that the database does not store the salt, so finding it will be very hard, even in the case of a leaked database.

So, here’s the overview: The username will be generated, then stored after being salted and hashed. The salt will be a fixed or rolling salt across all usernames to avoid linear scaling of searching for a user. The server will only see the username once, when sending it to the user for them to save for the first time; there will be no database entry with the original username in it.

This does mean that if the username is lost, the account is lost too. There is no way to recover the account. Again, this is ok for now, as my target audience is advanced Linux and privacy enthusiasts.

“Passwords”

There are a few options for passwords/secret keys.

I think the best is to treat it similarly to the username is above, except it will not be generated for you. When a new account is generated, you will be taken to a password reset screen where you will set your password to whatever your want, using your own secure system to handle it. This is ideal for Linux and tech enthusiasts as they generally already have a password management system setup.

This will also be salted, with its own unique salt, then hashed and stored alongside the username.

Active Time Remaining

It is easy and ideal to have a field connected to a user with their expiry date for their account. When a payment is made, this date will be increased by the number of days, hours and minutes proportional to the payment received.

For example: if a “month” (30 days) costs ten dollars, then a payment of fifteen dollars would add 45 days to an account. So essentially 33 cents per day, $\frac{10}{30 \times 24}=0.0138$ dollars per hour, or $\frac{10}{30 \times 24 \times 60}=0.00023\overline{148}$ dollars per minute. This is the second biggest threat to the users’ data privacy, as this, by definition, cannot be encrypted as my server needs access to this data to decide whether a user should be allowed to: view a list of VPN nodes available to them or connect to a VPN. The best I can think of in this case is:

Use a system similar to the username: use a common salt and hash algorithm to store them in the database.
Use full-disk and full-database encryption to keep the data secure to outside attackers.

This is not a fantastic solution, and still has the threat of a service provider snooping in on the database. The truth is: a service provider has root access to any machine it hosts. This necessitates that the physical infrastructure hosting the central database server must by physically owned and operated by the VPN operator and not any third party. In addition, it means top security root passwords, tamper resistant cases (in the case of a co-hosting or server room environment), sensors to indicate it has been opened or touched. If you thought this was bad, wait until the next part.

Active Connections

In order to stop a user from simply using the entire bandwidth of all the VPN nodes available to them, there must be a way to know how many active connections the user has. This is by far the biggest issue in terms of user privacy. There are a few options here:

Do not have a limit on the number of connections a user may have. This is dangerous from a DDoS (distributed denial-of-service) perspective. This also makes the VPN provider vulnerable to be used as a DDoS distribution method by putting all their traffic through the VPN provider, and them not having any logs—the bad guys could use the distributed nature of VPN nodes to attack whoever they see fit. This is not a viable option.
Have a list of connected users sent to the central server every 15 to 30 seconds. This is fairly efficient, but more privacy invasive.
When a user connects, log an explicit “connect” message. When a user disconnects, send an explicit “disconnect”. Have the VPN server report an implicit “disconnect” after an amount of time, say 15 minutes, then send an implicit “connect” message once traffic continues. This is all in RAM under temporary storage and is lost upon restart of the server.

The best method (used currently by Mullvad VPN) is number 3.

Panel

The admin panel will have some broad info about the nodes:

Active connections
Server load (held and reported every minute by the nodes themselves. Not sure how to do this yet.)
Location
IP Address
Failed connections in last X amount of time (i.e. invalid credentials)
Physical server status (i.e. owned by the hoster vs. contracted out to another hosting company in the area)

This panel would also have options to stop, start or soft stop the VPN service on each node for maintenance. A soft stop will stop new connections and remove it from the list of available servers for the end-user. Users will disconnect whenever they feel like it—eventually winding down to zero connections. This allows maintenance without service disruption.

I’m not sure how to do this securely. Best I can think of right now is have an admin login, then have the server have a key in each node machine. This completely compromises the SSH key system though. Now every node is secured with nothing but a password. Maybe the console will require connecting to a local instance on a machine through an encrypted connection which will require a key. Even then, that does make every machine vulnerable to one point of failure (the key to connect to the local instance).

Another way to approach this, security-wise is to make a shell script (or locally running flask app) which reads info about the servers from a sqlite database. Then, it uses the local computer to connect to the servers—assuming the local machine has all the keys necessary to do so.

This fixes one problem and creates another. It fixes the single point of failure in the cloud. This massively reduces the attack surface to intentionally stealing physical hardware from trusted parties, or software-hacking these same trusted people. But, if the key is lost by the host… The entire service is kaput. No maintenance may be performed, no checks, bans, addition of servers can be done whatsoever. This also increases the possibility of sloppy security from trusted parties. Perhaps a trusted member leaves his laptop unattended for a few minutes and a hacker is able to steal the simple key file. He’s in!!! This is very unlikely, I must say, but it comes down to: should I trust people or machines more to keep the data secure. Depending on the person, I might trust them more.

Conclusion

With all of these ideas in mind, I have realized how difficult it really is to make a VPN service. Boy do they deserve every dollar they get! If you don’t have a VPN, get one. Doesn’t really matter which one, unless you’re a nerd—for your average person you can just pick whatever the best deal is at the time and you’re off to the races.

Anyway, I think I’ve rambled on long enough about VPNs and my crazy ideas, so I’m going to leave this one for now.

Happy VPN hacking :D