Table of contents
- Apply Restrictive File Permissions
- Avoid Dangerous File Parsing and Object Serialization Libraries
- Python Pipes to Avoid Shells
- Unvalidated URL redirect
- Validate Certificates on HTTPS Connections to Avoid Man-in-the-Middle Attacks
- Create, Use, and Remove Temporary Files Securely
- Restrict Path Access to Prevent Path Traversal
- Correct
- Use Subprocess Securely
- Parameterize Database Queries
- Protect Sensitive Data in Config Files from Disclosure
- Use Secure Channels for Transmitting Data
- Escape User Input to Prevent XSS Attacks
- Resources
Attacking OpenStack, an open-source cloud computing platform, involves exploiting vulnerabilities in its components and configuration to gain unauthorized access or disrupt services. One common attack vector is through the OpenStack Dashboard (Horizon), which can be susceptible to cross-site scripting (XSS) and cross-site request forgery (CSRF) attacks if not properly secured. Attackers can also target the API endpoints, which are often exposed to the internet. By exploiting insecure API calls or leveraging weak authentication mechanisms, attackers can manipulate cloud resources, potentially leading to data breaches or service interruptions. Furthermore, insecure configurations, such as using default passwords or failing to apply security patches, provide additional opportunities for exploitation.
Another critical aspect of attacking OpenStack is leveraging vulnerabilities within its network and storage components. OpenStack's Neutron, responsible for networking, can be targeted through attacks on the underlying network infrastructure, such as VLAN hopping or manipulating network policies to intercept or reroute traffic. Similarly, the Cinder and Swift services, which handle block and object storage respectively, can be exploited if they are misconfigured, allowing attackers to access or corrupt stored data. Privilege escalation attacks are also a significant threat, where attackers can exploit vulnerabilities in the underlying hypervisor or the Keystone identity service to gain higher-level access, thereby compromising the entire cloud environment. Ensuring comprehensive security measures, regular updates, and stringent access controls are crucial to
Apply Restrictive File Permissions
To ensure the security of files containing sensitive information, such as configuration files with passwords, it is crucial to apply restrictive file permissions. This prevents unauthorized access and potential exploitation through information disclosure or code execution. Files should be created with permissions that limit access to only the owning user or service and group, avoiding any world/other access.
Incorrect Example
Consider a configuration file for a service named "secureserv" stored in secureserv.conf
. If the file permissions are set as follows:
ls -l secureserv.conf
-rw-rw-rw- 1 secureserv secureserv 6710 Feb 17 22:00 secureserv.conf
The file permissions are set to 666
, allowing read and write access to everyone on the system. This is insecure because it exposes the sensitive information contained within the configuration file to all users.
Writing Files with Python
When writing files in a Unix-like operating system using Python, the file permissions are influenced by the system's umask setting. By default, this may be too permissive for sensitive data:
with open('testfile.txt', 'w') as fout:
fout.write("secrets!")
Checking the permissions of the created file:
ls -l testfile.txt
-rw-r--r-- 1 user staff 4 Feb 19 10:59 testfile.txt
The file is readable by others, which is not secure for sensitive information.
Correct Example
To correct this, set restrictive permissions to ensure only the owning user can read and write to the file. Update the file permissions as follows:
chmod 0600 secureserv.conf
ls -l secureserv.conf
-rw------- 1 secureserv secureserv 6710 Feb 17 22:00 secureserv.conf
Secure File Creation in Python
You can securely create a file in Python with appropriate permissions using os.open
with specific flags:
import os
flags = os.O_WRONLY | os.O_CREAT | os.O_EXCL
with os.fdopen(os.open('testfile.txt', flags, 0o600), 'w') as fout:
fout.write("secrets!")
Verify Ownership and Group
It is essential to verify the ownership and group of the file, ensuring only the necessary users have access. Follow the principle of least privilege: if group access is not needed, do not grant it. This approach minimizes the risk of unauthorized access and enhances the overall security of sensitive data files.
Avoid Dangerous File Parsing and Object Serialization Libraries
Many common libraries used for reading configuration files and deserializing objects can be dangerous because they may allow execution of arbitrary code. Libraries such as PyYAML and pickle do not provide strong separation of data and code by default, enabling code to be embedded inside the input.
Often, inputs to these libraries are untrusted or only partially trusted, coming from configuration files or provided via REST APIs. For instance, YAML is frequently used for configuration files but can also contain embedded Python code, which could provide an attacker with a method to execute code.
Many, but not all, of these libraries offer safe interfaces that disable features enabling code execution. It is crucial to use the safe functions to load input. The most obvious function to use is often not the safe one, so always check the documentation for libraries not covered here.
Common libraries used to load data in Python programs include YAML, pickle, and eval. While PyYAML offers a safe way to load data, pickle and eval do not.
Module | Problem | Use | Avoid |
PyYAML | Allows creating arbitrary Python objects. | yaml.safe _load | yaml.load |
pickle | Allows creating arbitrary Python objects. | Do not use | pickle.load , pickle.loads |
cPickle | Allows creating arbitrary Python objects. | Do not use | cPickle.load , cPickle.loads |
eval | Runs all input as Python code | Do not use | eval |
exec | Runs all input as Python code (Python 3.x) | Do not use | exec |
Incorrect Usage
yaml.load
is the obvious function to use but it is dangerous:
import yaml
import pickle
conf_str = '''
!!python/object:__main__.AttackerObj
key: 'value'
'''
conf = yaml.load(conf_str)
Using pickle
or cPickle
with untrusted input can result in arbitrary code execution:
import pickle
import cPickle
user_input = "cos\nsystem\n(S'cat /etc/passwd'\ntR.'\ntR."
cPickle.loads(user_input) # results in code execution
pickle.loads(user_input) # results in code execution
Similarly, eval
and exec
are difficult to use safely with input from an untrusted source:
user_input = "os.system('cat /etc/passwd')"
eval(user_input) # execute python expressions
user_input = "import os; os.system('cat /etc/passwd')"
exec(user_input) # execute _any_ python code
Correct Usage
Here we use PyYAML's safe YAML loading function:
import yaml
conf_str = '''
- key: 'value'
- key: 'value'
'''
conf = yaml.safe_load(conf_str)
There is no safe alternative for pickle.load
. It is generally advisable to avoid using pickle
for serialization of data objects altogether, especially when dealing with untrusted sources.
Python Pipes to Avoid Shells
You should take a look at the shell injection document before this one.
A lot of the time, our codebase uses shell=True
because it’s convenient. The shell provides the ability to pipe things around without buffering them in memory and allows a malicious user to chain additional commands after a legitimate command is run.
Incorrect
Here is a simple function that uses curl
to grab a page from a website and pipe it directly to the wordcount
program to tell us how many lines there are in the HTML source code.
import subprocess
def count_lines(website):
return subprocess.check_output('curl %s | wc -l' % website, shell=True)
# >>> count_lines('www.google.com')
# '7\n'
(That output is correct, by the way - the Google HTML source does have 7 lines.)
The function is insecure because it uses shell=True
, which allows shell injection. A user who instructs your code to fetch the website ; rm -rf /
can do terrible things to what used to be your machine.
If we convert the function to use shell=False
, it doesn’t work.
def count_lines(website):
args = ['curl', website, '|', 'wc', '-l']
return subprocess.check_output(args, shell=False)
# >>> count_lines('www.google.com')
# curl: (6) Could not resolve host: |
# curl: (6) Could not resolve host: wc
# Traceback (most recent call last):
# File "<stdin>", line 3, in <module>
# File "/usr/lib/python2.7/subprocess.py", line 573, in check_output
# raise CalledProcessError(retcode, cmd, output=output)
# subprocess.CalledProcessError: Command
# '['curl', 'www.google.com', '|', 'wc', '-l']' returned non-zero exit status 6
The pipe doesn’t mean anything special when shell=False
, so curl
tries to download the website called |
. This does not fix the issue; it causes it to be more broken than before.
If we can’t rely on pipes with shell=False
, how should we do this?
Correct
def count_lines(website):
args = ['curl', website]
args2 = ['wc', '-l']
process_curl = subprocess.Popen(args, stdout=subprocess.PIPE, shell=False)
process_wc = subprocess.Popen(args2, stdin=process_curl.stdout, stdout=subprocess.PIPE, shell=False)
# Allow process_curl to receive a SIGPIPE if process_wc exits.
process_curl.stdout.close()
return process_wc.communicate()[0]
# >>> count_lines('www.google.com')
# '7\n'
Rather than calling a single shell process that runs each of our programs, we run them separately and connect stdout
from curl
to stdin
for wc
. We specify stdout=subprocess.PIPE
, which tells subprocess to send that output to the respective file handler.
Treat pipes like file descriptors (you can actually use FDs if you want); they may block on reading and writing if nothing is connected to the other end. That’s why we use communicate()
, which reads until EOF on the output and then waits for the process to terminate. You should generally avoid reading and writing to pipes directly unless you really know what you’re doing - it’s easy to work yourself into a situation that can deadlock.
Note that communicate()
buffers the result in memory - if that’s not what you want, use a file descriptor for stdout
to pipe that output into a file.
Unvalidated URL redirect
It is common for web forms to redirect to a different page upon successful submission of the form data. This is often done using a next
or return
parameter in the HTTP request. Any HTTP parameter can be controlled by the user and could be abused by attackers to redirect a user to a malicious site.
This is commonly used in phishing attacks. For example, an attacker could redirect a user from a legitimate login form to a fake, attacker-controlled login form. If the page looks enough like the target site and tricks the user into believing they mistyped their password, the attacker can convince the user to re-enter their credentials and send them to the attacker.
Here is an example of a malicious redirect URL:
https://good.com/login.php?next=http://bad.com/phonylogin.php
To counter this type of attack, all URLs must be validated before being used to redirect the user. This should ensure the redirect will take the user to a page within your site.
Incorrect
This example just processes the next
argument with no validation:
import os
from flask import Flask, redirect, request
app = Flask(__name__)
@app.route('/')
def example_redirect():
return redirect(request.args.get('next'))
Correct
The following is an example using the Flask web framework. It checks that the URL the user is being redirected to originates from the same host as the host serving the content.
from flask import request, redirect
from urllib.parse import urlparse, urljoin
def is_safe_redirect_url(target):
host_url = urlparse(request.host_url)
redirect_url = urlparse(urljoin(request.host_url, target))
return redirect_url.scheme in ('http', 'https') and host_url.netloc == redirect_url.netloc
def get_safe_redirect():
url = request.args.get('next')
if url and is_safe_redirect_url(url):
return url
url = request.referrer
if url and is_safe_redirect_url(url):
return url
return '/'
@app.route('/')
def example_redirect():
return redirect(get_safe_redirect())
The is_safe_redirect_url
function checks that the scheme of the redirect URL is either http
or https
and that the netloc (network location part) of the redirect URL matches the host URL. This ensures that the user is only redirected to URLs within the same site.
The Django framework contains a django.utils.http.is
_safe_url
function that can be used to validate redirects without implementing a custom version. This built-in function provides a secure and efficient way to handle URL redirects, ensuring that users are only redirected to safe and trusted URLs within the same domain.
Validate Certificates on HTTPS Connections to Avoid Man-in-the-Middle Attacks
When developing a module that makes secure HTTPS connections, it is crucial to use a library that verifies certificates. Many such libraries provide an option to ignore certificate verification failures, but these options should be exposed to the deployer to choose their level of risk.
Although this guideline specifically mentions HTTPS, verifying the identity of the hosts you are connecting to applies to most protocols (SSH, LDAPS, etc.).
Incorrect
import requests
requests.get('https://www.openstack.org/', verify=False)
The example above uses verify=False
to bypass the check of the certificate received against those in the CA trust store.
It is important to note that modules such as httplib
within the Python standard library did not verify certificate chains until it was fixed in the 2.7.9 release. For more specifics about the modules affected, refer to CVE-2014-9365.
Correct
import requests
requests.get('https://www.openstack.org/', verify='/path/to/ca_cert.pem')
The example above uses the CA certificate file to verify that the certificate received is from a trusted authority.
Create, Use, and Remove Temporary Files Securely
Creating temporary files safely is more complicated than it seems due to potential security vulnerabilities. It's crucial to use the correct library functions to avoid issues like TOCTOU (Time of Check to Time of Use) attacks.
Python Temporary File Handling
Use | Avoid |
tempfile.TemporaryFile | tempfile.mktemp |
tempfile.NamedTemporaryFile | open |
tempfile.SpooledTemporaryFile | tempfile.mkstemp |
tempfile.mkdtemp |
tempfile.TemporaryFile
should be used whenever possible as it hides the file and cleans up the file automatically.
Incorrect
Creating temporary files with predictable paths leaves them open to TOCTOU attacks:
import os
import tempfile
# This will most certainly put you at risk
tmp = os.path.join(tempfile.gettempdir(), "filename")
if not os.path.exists(tmp):
with open(tmp, "w") as file:
file.write("defaults")
Using tempfile.mktemp
is also insecure:
import tempfile
open(tempfile.mktemp(), "w")
Correct
Use the secure methods provided by the tempfile
module:
import tempfile
# Use the TemporaryFile context manager for easy clean-up
with tempfile.TemporaryFile() as tmp:
tmp.write(b'stuff')
# Clean up a NamedTemporaryFile on your own
tmp = tempfile.NamedTemporaryFile(delete=True)
try:
tmp.write(b'stuff')
finally:
tmp.close()
# Handle opening the file yourself
fd, path = tempfile.mkstemp()
try:
with os.fdopen(fd, 'w') as tmp:
tmp.write('stuff')
finally:
os.remove(path)
Restrict Path Access to Prevent Path Traversal
A path traversal attack occurs when an attacker supplies input used to construct a file path, allowing access to unintended files. Mitigating these attacks involves restricting file system access and using a restricted file permission profile.
Incorrect
An example of a path traversal vulnerability in a web application:
import os
from flask import Flask, request, send_file
app = Flask(__name__)
@app.route('/')
def cat_picture():
image_name = request.args.get('image_name')
if not image_name:
return 404
return send_file(os.path.join(os.getcwd(), image_name))
if __name__ == '__main__':
app.run(debug=True)
An attacker could exploit this by making a request like:
curl http://example.com/?image_name=../../../../../../../../etc/passwd
Correct
Use the following method to restrict access to files within a specific directory:
import os
def is_safe_path(basedir, path, follow_symlinks=True):
if follow_symlinks:
matchpath = os.path.realpath(path)
else:
matchpath = os.path.abspath(path)
return basedir == os.path.commonpath([basedir, matchpath])
def main(args):
for arg in args:
if is_safe_path(os.getcwd(), arg):
print(f"safe: {arg}")
else:
print(f"unsafe: {arg}")
if __name__ == "__main__":
import sys
main(sys.argv[1:])
Another approach is to use an indirect mapping between a unique identifier and a file path:
localfiles = {
"01": "/var/www/img/001.png",
"02": "/var/www/img/002.png",
"03": "/var/www/img/003.png",
}
def get_file(file_id):
return open(localfiles[file_id])
Use Subprocess Securely
Running commands via subprocess is common but can be insecure if not handled correctly.
Incorrect
def ping(myserver):
return subprocess.check_output(f'ping -c 1 {myserver}', shell=True)
# >>> ping('8.8.8.8')
# '64 bytes from 8.8.8.8: icmp_seq=1 ttl=58 time=5.82 ms\n'
This code is vulnerable to shell injection:
ping('8.8.8.8; rm -rf /')
Correct
def ping(myserver):
args = ['ping', '-c', '1', myserver]
return subprocess.check_output(args, shell=False)
By passing a list of strings, the shell does not process user input as commands:
ping('8.8.8.8; rm -rf /')
# ping: unknown host 8.8.8.8; rm -rf /
Using subprocess correctly ensures user-provided input is interpreted safely.
Parameterize Database Queries
Often we write code that interacts with a database using parameters provided by the application’s users. These parameters include credentials, resource identifiers, and other user-supplied data.
Care must be taken when dynamically creating database queries to prevent them from being subverted by user-supplied malicious input. This is generally referred to as SQL injection (SQLi). SQL injection works because the user input changes the logic of the SQL query, resulting in behavior that is not intended by the application developer.
The results of a successful SQL injection attack can include disclosure of sensitive information such as user passwords, modification or deletion of data, and gaining execution privileges, which would allow an attacker to run arbitrary commands on the database server.
SQL injection can typically be mitigated by using some combination of prepared statements, stored procedures, and escaping of user-supplied input. Most secure web applications will use all three, and we have described their use below.
SQLAlchemy
Incorrect
This example uses the built-in parameter substitution mechanism %
to insert a value into the query string. This will perform an unsafe literal insertion and not provide any escaping.
import sqlalchemy
connection = engine.connect()
myvar = 'jsmith' # our intended usage
myvar = 'jsmith or 1=1' # this will return all users
myvar = 'jsmith; DROP TABLE users' # this drops (removes) the users table
query = "select username from users where username = %s" % myvar
result = connection.execute(query)
for row in result:
print("username:", row['username'])
connection.close()
Correct
This example uses SQLAlchemy’s built-in parameter substitution mechanism to safely replace the :name
variable with a provided value.
import sqlalchemy
connection = engine.connect()
myvar = 'jsmith' # our intended usage
myvar = 'jsmith or 1=1' # only matches this odd username
query = "select username from users where username = :name"
result = connection.execute(query, name=myvar)
for row in result:
print("username:", row['username'])
connection.close()
MySQL
Incorrect
Without using any escaping mechanism, potentially unsafe queries can be created.
import MySQLdb
query = "select username from users where username = '%s'" % name
con = MySQLdb.connect('localhost', 'testuser', 'test623', 'testdb')
with con:
cur = con.cursor()
cur.execute(query)
Better
In this example, the query is created using Python's standard, unsafe %
operator. MySQL’s escape_string
method is used to perform escaping on the user input string prior to inclusion in the string.
import MySQLdb
query = "select username from users where username = '%s'" % MySQLdb.escape_string(name)
con = MySQLdb.connect('localhost', 'testuser', 'test623', 'testdb')
with con:
cur = con.cursor()
cur.execute(query)
Correct
The correct way to do this using a parameterized query might look like the following:
import MySQLdb
query = "select username from users where username = %s"
con = MySQLdb.connect('localhost', 'testuser', 'test623', 'testdb')
with con:
cur = con.cursor()
cur.execute(query, (username_value,))
This works because the logic of the query is compiled before the user input is considered.
PostgreSQL (Psycopg2)
Incorrect
This example uses Python’s unsafe default parameter substitution mechanism to build a query string. This will not perform any escaping, unlike the correct example below. The string is processed and passed as a single parameter to execute
.
import psycopg2
conn = psycopg2.connect("dbname=test user=postgres")
cur = conn.cursor()
cur.execute("select username from users where username = '%s'" % name)
Correct
This example uses Psycopg2’s parameter substitution mechanism to build a query string. Despite the use of %
to indicate the substitution token, it is not the same as Python’s built-in string operator %
. Note the value(s) are passed as parameters to execute
separately.
import psycopg2
conn = psycopg2.connect("dbname=test user=postgres")
cur = conn.cursor()
cur.execute("select username from users where username = %s", (name,))
Protect Sensitive Data in Config Files from Disclosure
It is preferable to avoid storing sensitive information in configuration files, but there are occasions where this is unavoidable. For those situations, oslo.config
provides a useful mechanism by which those sensitive pieces of information can be sanitized and protected.
In order to trigger this sanitization, a secret=True
flag must be added to the cfg.StrOpt()
function when registering the Oslo configuration.
An example of this practice is provided below.
Incorrect
In the example below, the password secrets!
will be loaded through the cfg.StrOpt()
function that could otherwise be logged and disclosed to anyone with access to the log file (legitimate or not).
cfg.StrOpt('password',
help='Password of the host.')
Correct
A correct code example:
cfg.StrOpt('password',
help='Password of the host.',
secret=True)
Consequences
If sensitive information options are logged without being marked as secret, that sensitive information would be exposed whenever the logger debug flag is activated.
Example Log Entries
2015-02-18 20:46:48.928 25351 DEBUG nova.openstack.common.service [-] Full set of CONF: _wait_for_exit_or_signal /usr/lib/python2.7/dist-packages/nova/openstack/common/service.py:166
2015-02-18 20:46:48.937 25351 DEBUG nova.openstack.common.service [-] Configuration options gathered from: log_opt_values /usr/lib/python2.7/dist-packages/oslo/config/cfg.py:1982
2015-02-18 20:46:51.482 25351 DEBUG nova.openstack.common.service [-] host.ip = 127.0.0.1 log_opt_values /usr/lib/python2.7/dist-packages/oslo/config/cfg.py:2002
2015-02-18 20:46:51.491 25351 DEBUG nova.openstack.common.service [-] host.port = 443 log_opt_values /usr/lib/python2.7/dist-packages/oslo/config/cfg.py:2002
2015-02-18 20:46:51.502 25351 DEBUG nova.openstack.common.service [-] host.username = root log_opt_values /usr/lib/python2.7/dist-packages/oslo/config/cfg.py:2002
2015-02-18 20:46:51.486 25351 DEBUG nova.openstack.common.service [-] host.password = secrets! log_opt_values /usr/lib/python2.7/dist-packages/oslo/config/cfg.py:2002
Use Secure Channels for Transmitting Data
Data in transit over networks should be protected wherever possible. Although some data may not appear to have strong confidentiality or integrity requirements, it is best practice to secure it.
When building any application that communicates over a network, we have to assume that we do not control the network that the data travels over. We should consider that the network may have hostile actors who will attempt to view or change the data that we are transmitting.
Clear Example
OpenStack API calls often contain credentials or tokens that are very sensitive. If they are sent in plaintext, they may be modified or stolen.
It is very important that API calls are protected from malicious third parties viewing them or tampering with their content - even for communications between services on an internal network.
Less Obvious Example
Consider a server process that reports the current number of stars in the sky and sends the data over the network to clients using a simple webpage. There is no strong confidentiality requirement for this; the data is not secret. However, integrity is important. An attacker on the network could alter the communications going from the server to clients and inject malicious traffic such as browser exploits into the HTTP stream, thus compromising vulnerable clients.
Incorrect
cfg.StrOpt('protocol',
default='http',
help='Default protocol to use when connecting to glance.')
Correct
cfg.StrOpt('protocol',
default='https',
help='Default protocol to use when connecting to glance.')
Escape User Input to Prevent XSS Attacks
These days, almost every service we create has some form of web interface, be it for administration, monitoring, or for the core functionality of the service. These interfaces are becoming ever more complex and dynamic, and increasingly interactive. There is a risk, however, when increasing the interactivity of these web services, that we inadvertently allow a user to supply data which can corrupt or disrupt the normal running of that service.
Cross-Site Scripting (XSS) is a class of vulnerability whereby an attacker is able to present active web content to a web service, which is subsequently echoed back to a user and executed by the browser. This content can be as seemingly benign as an embarrassing image or text, or as malign as browser-based exploits intended to steal and utilize the user’s web session, or even compromise the user’s web browser and take control of their client system.
There are three main classes of XSS issues: Persistent, Reflected, and DOM-Based. Persistent XSS issues are those where user input is stored by the server, either in a database or on the filesystem. This data is subsequently retrieved and rendered into the web page as if it were legitimate content. Reflected XSS issues are where user input is presented to the server and then immediately reflected back to the client. This data is never stored by the server. DOM-Based XSS issues occur entirely within the browser.
Incorrect
In the below example, user-supplied data from the query string is written directly into the HTML document and is presented to the user.
def index(request):
name = request.GET.get('name')
return f"<html><body>Hello {name}!</body></html>"
Correct
In the below example, the same user-supplied data is escaped before being written into the document, thus preventing the user from adding HTML elements to the document.
from html import escape
def index(request):
name = request.GET.get('name')
safe_name = escape(name)
return f"<html><body>Hello {safe_name}!</body></html>"
In addition to the use of escaping, frameworks and templating engines can also be useful in preventing XSS attacks.