Django attempts to support as many features as possible on all database backends. However, databases differ from one another, and we have had to decide which features to support so that they work safely across all backends.
This section describes how Django works with the various databases. Of course, it is not a replacement for your database's own documentation.
Persistent connections improve performance by avoiding the cost of opening a new database connection on every request. The CONN_MAX_AGE setting defines the maximum lifetime of a connection. It can be set independently for each database.
The default value is 0, which means the connection is closed at the end of each request. This preserves backward compatibility. To enable persistent connections, set CONN_MAX_AGE to a positive number of seconds. For connections that never expire, set it to None.
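As a minimal sketch of the setting described above, the fragment below configures two databases with different connection lifetimes. The alias reports, the database names, and the 600-second value are illustrative assumptions, not recommendations.

```python
# settings.py (illustrative fragment; aliases, names and values are assumptions)
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'main_db',
        # Reuse each connection for up to 10 minutes.
        'CONN_MAX_AGE': 600,
    },
    'reports': {
        'ENGINE': 'django.db.backends.mysql',
        'NAME': 'reports_db',
        # Rarely used database: close the connection after every request.
        'CONN_MAX_AGE': 0,
    },
}
```

Setting the value to None instead of a number would keep the connection open for as long as it remains usable.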
Django opens a connection to the database when it first makes a database query. It keeps this connection open and reuses it in subsequent requests. Django closes the connection once CONN_MAX_AGE has elapsed or when the connection is no longer usable.
More precisely, Django automatically opens a connection to the database whenever it needs one and none is open, either because this is the first connection or because the previous one was closed.
At the beginning of each request, Django closes the connection if it has reached its maximum age. If your database terminates connections after some time, set CONN_MAX_AGE to a lower value so that Django does not attempt to use a connection that the server has already closed. (This problem can only arise on low-traffic sites.)
At the end of each request, Django closes the connection if it has reached its maximum age or if it is in an unrecoverable error state. If a database error occurred while processing the request, Django checks whether the connection still works and closes it if it does not. Thus a database error affects at most one request; subsequent requests get a fresh connection.
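The lifetime rules above can be illustrated with a small toy model. This is only a sketch of the described behavior, not Django's implementation: the class name, its attributes, and the injectable clock are all assumptions made for the example.

```python
import time


class ConnectionLifetimeModel:
    """Toy model of the connection lifetime rules (not Django's real code)."""

    def __init__(self, conn_max_age, now=time.time):
        self.conn_max_age = conn_max_age  # seconds, 0, or None (never expires)
        self._now = now                   # injectable clock, handy for testing
        self.is_open = False
        self.close_at = None
        self.errors_occurred = False
        self.usable = True                # stand-in for "does a test query work?"

    def ensure_connection(self):
        # Open a connection only when one is needed and none is open.
        if not self.is_open:
            self.is_open = True
            self.errors_occurred = False
            if self.conn_max_age is None:
                self.close_at = None
            else:
                self.close_at = self._now() + self.conn_max_age

    def close_if_unusable_or_obsolete(self):
        # Intended to run at the beginning and at the end of each request.
        if not self.is_open:
            return
        if self.errors_occurred:
            if self.usable:
                self.errors_occurred = False
            else:
                self.is_open = False
                return
        if self.close_at is not None and self._now() >= self.close_at:
            self.is_open = False
```

With conn_max_age set to 0 the deadline is reached immediately, so the connection is closed at the end of the request that opened it; with None the deadline check never fires.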
Since each thread maintains its own connection, your database must support at least as many simultaneous connections as you have threads.
Sometimes most of your views will not access a given database, for example because it belongs to an external system or because of caching. In such cases, set CONN_MAX_AGE to a low value or even 0, so that a rarely used connection is not kept open. This reduces the number of simultaneous connections to that database.
The development server creates a new thread for each request it handles, so there is no point in using persistent connections during development.
When Django connects to the database, it sets various connection parameters according to the backend in use. With persistent connections, this setup is no longer repeated on every request. If you modify connection parameters such as the transaction isolation level or the time zone, you should either restore the defaults at the end of each request, set the desired values at the beginning of each request, or disable persistent connections.
Django assumes that all databases use UTF-8 encoding. Using another encoding may lead to unexpected behavior, such as "value too long" errors for data that is valid in Django. See the database-specific notes below for how to set up each database.
Django supports PostgreSQL 9.0 and higher. It requires psycopg2 2.4.5 or higher (or 2.5+ if you want to use django.contrib.postgres).
If you are on Windows, you can use the unofficial Windows builds of psycopg2.
Django needs the following parameters for its database connections:
default_transaction_isolation: 'read committed' by default, or the value set in the connection options (see below),
timezone: 'UTC' when USE_TZ is True, otherwise the value of TIME_ZONE.
If these parameters already have the correct values, Django will not set them for every new connection, which improves performance slightly. You can configure them directly in postgresql.conf or per database user with ALTER ROLE.
Django works fine without this optimization, but each new connection will run some additional queries to set these parameters.
Like PostgreSQL itself, Django defaults to the READ COMMITTED transaction isolation level. If you need a higher isolation level such as REPEATABLE READ or SERIALIZABLE, set it in the OPTIONS part of your database configuration in DATABASES:
    import psycopg2.extensions

    DATABASES = {
        # ...
        'OPTIONS': {
            'isolation_level': psycopg2.extensions.ISOLATION_LEVEL_SERIALIZABLE,
        },
    }
Note
Under higher isolation levels, your application should be prepared to handle exceptions raised on serialization failures. This option is intended for advanced users.
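One common way to cope with serialization failures is to retry the whole unit of work. The helper below is a minimal, framework-independent sketch of that idea. The names run_with_retry and SerializationFailure are hypothetical; in a real project you would pass the exception class your driver actually raises, and the operation would normally wrap a complete transaction.

```python
import time


class SerializationFailure(Exception):
    """Stand-in for the error a database raises on a serialization conflict."""


def run_with_retry(operation, retry_on, attempts=3, delay=0.0):
    """Call operation(); retry if it raises retry_on, up to `attempts` tries."""
    for attempt in range(1, attempts + 1):
        try:
            return operation()
        except retry_on:
            if attempt == attempts:
                raise  # give up and let the caller see the last failure
            time.sleep(delay)
```

The retried operation must be safe to run more than once, since a failed attempt is rolled back and then started again from the beginning.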
When specifying db_index=True on your model fields, Django typically outputs a single CREATE INDEX statement. However, if the database type for the field is either varchar or text (e.g., used by CharField, FileField, and TextField), then Django will create an additional index that uses an appropriate PostgreSQL operator class for the column. The extra index is necessary to correctly perform lookups that use the LIKE operator in their SQL, as is done with the contains and startswith lookup types.
Django supports MySQL 5.5 and higher.
Django’s inspectdb feature uses the information_schema database, which contains detailed data on all database schemas.
Django expects the database to support Unicode (UTF-8 encoding) and delegates to it the task of enforcing transactions and referential integrity. It is important to be aware of the fact that the two latter ones aren’t actually enforced by MySQL when using the MyISAM storage engine, see the next section.
MySQL has several storage engines. You can change the default storage engine in the server configuration.
Until MySQL 5.5.4, the default engine was MyISAM [1]. The main drawbacks of MyISAM are that it doesn’t support transactions or enforce foreign-key constraints. On the plus side, it was the only engine that supported full-text indexing and searching until MySQL 5.6.4.
Since MySQL 5.5.5, the default storage engine is InnoDB. This engine is fully transactional and supports foreign key references. It’s probably the best choice at this point. However, note that the InnoDB autoincrement counter is lost on a MySQL restart because it does not remember the AUTO_INCREMENT value, instead recreating it as “max(id)+1”. This may result in an inadvertent reuse of AutoField values.
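The reuse of AutoField values mentioned above can be illustrated with a toy simulation. This models only the "max(id)+1" rule as described here; the class and method names are invented for the example and it does not talk to MySQL.

```python
class AutoIncrementModel:
    """Toy model of an in-memory counter that is rebuilt as max(id)+1."""

    def __init__(self):
        self.rows = set()
        self.next_id = 1

    def insert(self):
        new_id = self.next_id
        self.rows.add(new_id)
        self.next_id += 1
        return new_id

    def delete(self, row_id):
        self.rows.discard(row_id)

    def restart(self):
        # The counter is not persisted; it is recomputed from existing rows.
        self.next_id = max(self.rows, default=0) + 1


table = AutoIncrementModel()
first, second, third = table.insert(), table.insert(), table.insert()
table.delete(third)
table.restart()
reused = table.insert()  # receives the id of the deleted row again
```

Without the restart, the next insert would have received a fresh value, so the reuse only shows up when the highest rows were deleted before the server restarted.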
If you upgrade an existing project to MySQL 5.5.5 and subsequently add some tables, ensure that your tables are using the same storage engine (i.e. MyISAM vs. InnoDB). Specifically, if tables that have a ForeignKey between them use different storage engines, you may see an error like the following when running migrate:
    _mysql_exceptions.OperationalError: (
        1005, "Can't create table '\\db_name\\.#sql-4a8_ab' (errno: 150)"
    )
[1] Unless this was changed by the packager of your MySQL package. We’ve had reports that the Windows Community Server installer sets up InnoDB as the default storage engine, for example.
The Python Database API is described in PEP 249. MySQL has three prominent drivers that implement this API: MySQLdb, mysqlclient, and MySQL Connector/Python.
All these drivers are thread-safe and provide connection pooling. MySQLdb is the only one not supporting Python 3 currently.
In addition to a DB API driver, Django needs an adapter to access the database drivers from its ORM. Django provides an adapter for MySQLdb/mysqlclient while MySQL Connector/Python includes its own.
Django requires MySQLdb version 1.2.1p2 or later.
At the time of writing, the latest release of MySQLdb (1.2.5) doesn’t support Python 3. In order to use MySQLdb under Python 3, you’ll have to install mysqlclient instead.
Note
There are known issues with the way MySQLdb converts date strings into datetime objects. Specifically, date strings with value 0000-00-00 are valid for MySQL but will be converted into None by MySQLdb.
This means you should be careful while using loaddata and dumpdata with rows that may have 0000-00-00 values, as they will be converted to None.
Django requires mysqlclient 1.3.3 or later. Note that Python 3.2 is not supported. Except for the Python 3.3+ support, mysqlclient should mostly behave the same as MySQLdb.
MySQL Connector/Python is available from the download page. The Django adapter is available in versions 1.1.X and later. It may not support the most recent releases of Django.
If you plan on using Django’s timezone support, use mysql_tzinfo_to_sql to load time zone tables into the MySQL database. This needs to be done just once for your MySQL server, not per database.
You can create your database using the command-line tools and this SQL:
CREATE DATABASE <dbname> CHARACTER SET utf8;
This ensures all tables and columns will use UTF-8 by default.
The collation setting for a column controls the order in which data is sorted as well as what strings compare as equal. It can be set on a database-wide level and also per-table and per-column. This is documented thoroughly in the MySQL documentation. In all cases, you set the collation by directly manipulating the database tables; Django doesn’t provide a way to set this on the model definition.
By default, with a UTF-8 database, MySQL will use the utf8_general_ci collation. This results in all string equality comparisons being done in a case-insensitive manner. That is, "Fred" and "freD" are considered equal at the database level. If you have a unique constraint on a field, it would be illegal to try to insert both "aa" and "AA" into the same column, since they compare as equal (and, hence, non-unique) with the default collation.
In many cases, this default will not be a problem. However, if you really want case-sensitive comparisons on a particular column or table, you would change the column or table to use the utf8_bin collation. The main thing to be aware of in this case is that if you are using MySQLdb 1.2.2, the database backend in Django will then return bytestrings (instead of unicode strings) for any character fields it receives from the database. This is a strong variation from Django’s normal practice of always returning unicode strings. It is up to you, the developer, to handle the fact that you will receive bytestrings if you configure your table(s) to use utf8_bin collation. Django itself should mostly work smoothly with such columns (except for the contrib.sessions Session and contrib.admin LogEntry tables described below), but your code must be prepared to call django.utils.encoding.smart_text() at times if it really wants to work with consistent data. Django will not do this for you: the database backend layer and the model population layer are separated internally, so the database layer doesn’t know it needs to make this conversion in this one particular case.
If you’re using MySQLdb 1.2.1p2, Django’s standard CharField class will return unicode strings even with utf8_bin collation. However, TextField fields will be returned as an array.array instance (from Python’s standard array module). There isn’t a lot Django can do about that, since, again, the information needed to make the necessary conversions isn’t available when the data is read in from the database. This problem was fixed in MySQLdb 1.2.2, so if you want to use TextField with utf8_bin collation, upgrading to version 1.2.2 and then dealing with the bytestrings (which shouldn’t be too difficult) as described above is the recommended solution.
Should you decide to use utf8_bin collation for some of your tables with MySQLdb 1.2.1p2 or 1.2.2, you should still use utf8_general_ci (the default) collation for the django.contrib.sessions.models.Session table (usually called django_session) and the django.contrib.admin.models.LogEntry table (usually called django_admin_log). Those are the two standard tables that use TextField internally.
Please note that according to MySQL Unicode Character Sets, comparisons for the utf8_general_ci collation are faster, but slightly less correct, than comparisons for utf8_unicode_ci. If this is acceptable for your application, you should use utf8_general_ci because it is faster. If this is not acceptable (for example, if you require German dictionary order), use utf8_unicode_ci because it is more accurate.
Warning
Model formsets validate unique fields in a case-sensitive manner. Thus when using a case-insensitive collation, a formset with unique field values that differ only by case will pass validation, but upon calling save(), an IntegrityError will be raised.
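If you want to catch such collisions before calling save(), one possible approach is to check the submitted values yourself. The helper below is a hypothetical sketch that groups values differing only by case. It uses str.lower() as a rough approximation of a case-insensitive collation, which is an assumption: real collations may also treat other characters, such as accented letters, as equal.

```python
from collections import defaultdict


def case_insensitive_duplicates(values):
    """Return groups of values that are equal when letter case is ignored."""
    groups = defaultdict(list)
    for value in values:
        groups[value.lower()].append(value)
    return [group for group in groups.values() if len(group) > 1]
```

A non-empty result for a unique field would suggest that the database is likely to reject the batch even though formset validation passed.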
Refer to the settings documentation.
Connection settings are used in this order: OPTIONS, then NAME, USER, PASSWORD, HOST and PORT, then MySQL option files.
In other words, if you set the name of the database in OPTIONS, this will take precedence over NAME, which would override anything in a MySQL option file.
Here’s a sample configuration which uses a MySQL option file:
    # settings.py
    DATABASES = {
        'default': {
            'ENGINE': 'django.db.backends.mysql',
            'OPTIONS': {
                'read_default_file': '/path/to/my.cnf',
            },
        }
    }

    # my.cnf
    [client]
    database = NAME
    user = USER
    password = PASSWORD
    default-character-set = utf8
Several other MySQLdb connection options may be useful, such as ssl, init_command, and sql_mode. Consult the MySQLdb documentation for more details.
When Django generates the schema, it doesn’t specify a storage engine, so tables will be created with whatever default storage engine your database server is configured for. The easiest solution is to set your database server’s default storage engine to the desired engine.
If you’re using a hosting service and can’t change your server’s default storage engine, you have a couple of options.
After the tables are created, execute an ALTER TABLE statement to convert a table to a new storage engine (such as InnoDB):
ALTER TABLE <tablename> ENGINE=INNODB;
This can be tedious if you have a lot of tables.
Another option is to use the init_command option for MySQLdb prior to creating your tables:
    'OPTIONS': {
        'init_command': 'SET storage_engine=INNODB',
    }
This sets the default storage engine upon connecting to the database. After your tables have been created, you should remove this option as it adds a query that is only needed during table creation to each database connection.
There are known issues in even the latest versions of MySQL that can cause the case of a table name to be altered when certain SQL statements are executed under certain conditions. It is recommended that you use lowercase table names, if possible, to avoid any problems that might arise from this behavior. Django uses lowercase table names when it auto-generates table names from models, so this is mainly a consideration if you are overriding the table name via the db_table parameter.
Both the Django ORM and MySQL (when using the InnoDB storage engine) support database savepoints.
If you use the MyISAM storage engine, please be aware that you will receive database-generated errors if you try to use the savepoint-related methods of the transactions API. The reason is that detecting the storage engine of a MySQL database/table is an expensive operation, so it was decided that it isn’t worth dynamically converting these methods into no-ops based on the results of such detection.
Any fields that are stored with VARCHAR column types have their max_length restricted to 255 characters if you are using unique=True for the field. This affects CharField, SlugField and CommaSeparatedIntegerField.
MySQL 5.6.4 and later can store fractional seconds, provided that the column definition includes a fractional indication (e.g. DATETIME(6)). Earlier versions do not support them at all. In addition, versions of MySQLdb older than 1.2.5 have a bug that also prevents the use of fractional seconds with MySQL.
Django will not upgrade existing columns to include fractional seconds if the database server supports it. If you want to enable them on an existing database, it’s up to you to either manually update the column on the target database, by executing a command like:
ALTER TABLE `your_table` MODIFY `your_datetime_column` DATETIME(6)
or use a RunSQL operation in a data migration.
Previously, Django truncated fractional seconds from datetime and time values when using the MySQL backend. Now it lets the database decide whether it should drop that part of the value or not. By default, new DateTimeField or TimeField columns are now created with fractional seconds support on MySQL 5.6.4 or later with either mysqlclient or MySQLdb 1.2.5 or later.
If you are using a legacy database that contains TIMESTAMP columns, you must set USE_TZ = False to avoid data corruption. inspectdb maps these columns to DateTimeField and if you enable timezone support, both MySQL and Django will attempt to convert the values from UTC to local time.
MySQL does not support the NOWAIT option to the SELECT ... FOR UPDATE statement. If select_for_update() is used with nowait=True then a DatabaseError will be raised.
When performing a query on a string type, but with an integer value, MySQL will coerce the types of all values in the table to an integer before performing the comparison. If your table contains the values 'abc', 'def' and you query for WHERE mycolumn=0, both rows will match. Similarly, WHERE mycolumn=1 will match the value '1abc', because MySQL converts a string using its leading numeric prefix. Therefore, string type fields included in Django will always cast the value to a string before using it in a query.
If you implement custom model fields that inherit from Field directly, are overriding get_prep_value(), or use extra() or raw(), you should ensure that you perform the appropriate typecasting.
SQLite provides an excellent development alternative for applications that are predominantly read-only or require a smaller installation footprint. As with all database servers, though, there are some differences that are specific to SQLite that you should be aware of.
For all SQLite versions, there is some slightly counter-intuitive behavior when attempting to match some types of strings. These are triggered when using the iexact or contains filters in Querysets. The behavior splits into two cases:
1. For substring matching, all matches are done case-insensitively. That is a filter such as filter(name__contains="aa") will match a name of "Aabb".
2. For strings containing characters outside the ASCII range, all exact string matches are performed case-sensitively, even when the case-insensitive options are passed into the query. So the iexact filter will behave exactly the same as the exact filter in these cases.
Some possible workarounds for this are documented at sqlite.org, but they aren’t utilized by the default SQLite backend in Django, as incorporating them would be fairly difficult to do robustly. Thus, Django exposes the default SQLite behavior and you should be aware of this when doing case-insensitive or substring filtering.
SQLite 3.6.23.1 and older contains a bug when handling query parameters in a CASE expression that contains an ELSE and arithmetic.
SQLite 3.6.23.1 was released in March 2010, and most current binary distributions for different platforms include a newer version of SQLite, with the notable exception of the Python 2.7 installers for Windows.
As of this writing, the latest release for Windows - Python 2.7.9 - includes SQLite 3.6.21. You can install pysqlite2 or replace sqlite3.dll (by default installed in C:\Python27\DLLs) with a newer version from http://www.sqlite.org/ to remedy this issue.
Django will use a pysqlite2 module in preference to sqlite3 as shipped with the Python standard library if it finds one is available.
This provides the ability to upgrade both the DB-API 2.0 interface and SQLite 3 itself to versions newer than the ones included with your particular Python binary distribution, if needed.
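The preference described above amounts to an import fallback. The sketch below shows that pattern and uses it to check whether the SQLite library in use is newer than 3.6.23.1. The function name is hypothetical, and the snippet illustrates the idea rather than reproducing Django's backend code.

```python
# Prefer a separately installed pysqlite2; fall back to the standard library.
try:
    from pysqlite2 import dbapi2 as Database
except ImportError:
    from sqlite3 import dbapi2 as Database


def sqlite_is_newer_than_3_6_23_1():
    """Report whether the linked SQLite library is newer than 3.6.23.1."""
    return Database.sqlite_version_info > (3, 6, 23, 1)
```

Database.sqlite_version holds the same information as a string, which can be convenient for logging.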
SQLite is meant to be a lightweight database, and thus can’t support a high level of concurrency. OperationalError: database is locked errors indicate that your application is experiencing more concurrency than SQLite can handle in its default configuration. This error means that one thread or process has an exclusive lock on the database connection and another thread timed out waiting for the lock to be released.
Python’s SQLite wrapper has a default timeout value that determines how long the second thread is allowed to wait on the lock before it times out and raises the OperationalError: database is locked error.
If you’re getting this error, you can solve it by:
Switching to another database backend. At a certain point SQLite becomes too “lite” for real-world applications, and these sorts of concurrency errors indicate you’ve reached that point.
Rewriting your code to reduce concurrency and ensure that database transactions are short-lived.
Increasing the default timeout value by setting the timeout database option:
    'OPTIONS': {
        # ...
        'timeout': 20,
        # ...
    }
This will simply make SQLite wait a bit longer before throwing “database is locked” errors; it won’t really do anything to solve them.
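The locking behavior described in this section can be reproduced with Python's standard sqlite3 module alone. In the sketch below one connection holds an exclusive lock while a second one, given a deliberately short timeout, tries to write. The function name, table name, and timeout value are assumptions made for the demonstration.

```python
import os
import sqlite3
import tempfile


def provoke_lock_error(timeout=0.1):
    """Return the error message raised when a writer waits too long for a lock."""
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, 'demo.sqlite3')
        # isolation_level=None lets us issue BEGIN/ROLLBACK explicitly.
        holder = sqlite3.connect(path, isolation_level=None)
        waiter = sqlite3.connect(path, timeout=timeout, isolation_level=None)
        try:
            holder.execute('CREATE TABLE demo (x INTEGER)')
            holder.execute('BEGIN EXCLUSIVE')  # take and keep the lock
            try:
                waiter.execute('INSERT INTO demo VALUES (1)')
            except sqlite3.OperationalError as exc:
                return str(exc)
            return None
        finally:
            holder.close()
            waiter.close()
```

Raising the timeout only lengthens the wait before the error appears; the second writer still cannot proceed while the first transaction stays open.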
SQLite does not support the SELECT ... FOR UPDATE syntax. Calling it will have no effect.
For most backends, raw queries (Manager.raw() or cursor.execute()) can use the “pyformat” parameter style, where placeholders in the query are given as '%(name)s' and the parameters are passed as a dictionary rather than a list. SQLite does not support this.
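If you have queries written in the named style, one workaround is to convert them to positional placeholders before executing them. The helper below is a hypothetical sketch of that conversion. It assumes simple placeholders of the form %(name)s and does not handle escaped percent signs or other edge cases.

```python
import re

# Matches placeholders such as %(name)s.
_NAMED_PLACEHOLDER = re.compile(r'%\((\w+)\)s')


def to_positional(sql, params):
    """Rewrite %(name)s placeholders as %s and return (sql, ordered_params)."""
    names = _NAMED_PLACEHOLDER.findall(sql)
    positional_sql = _NAMED_PLACEHOLDER.sub('%s', sql)
    return positional_sql, [params[name] for name in names]
```

The resulting pair could then be passed to Manager.raw() or cursor.execute() as a query string and a list of parameters.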
sqlite3 does not provide a way to retrieve the SQL after quoting and substituting the parameters. Instead, the SQL in connection.queries is rebuilt with a simple string interpolation. It may be incorrect. Make sure you add quotes where necessary before copying a query into an SQLite shell.
Django supports Oracle Database Server versions 11.1 and higher. Version 4.3.1 or higher of the cx_Oracle Python driver is required, although we recommend version 5.1.3 or later as these versions support Python 3.
Note that due to a Unicode-corruption bug in cx_Oracle 5.0, that version of the driver should not be used with Django; cx_Oracle 5.0.1 resolved this issue, so if you’d like to use a more recent cx_Oracle, use version 5.0.1.
cx_Oracle 5.0.1 or greater can optionally be compiled with the WITH_UNICODE environment variable. This is recommended but not required.
In order for the python manage.py migrate command to work, your Oracle database user must have privileges to run the following commands: CREATE TABLE, CREATE SEQUENCE, CREATE PROCEDURE and CREATE TRIGGER.
To run a project’s test suite, the user usually needs these additional privileges:
Note that, while the RESOURCE role has the required CREATE TABLE, CREATE SEQUENCE, CREATE PROCEDURE and CREATE TRIGGER privileges, and a user granted RESOURCE WITH ADMIN OPTION can grant RESOURCE, such a user cannot grant the individual privileges (e.g. CREATE TABLE), and thus RESOURCE WITH ADMIN OPTION is not usually sufficient for running tests.
Some test suites also create views; to run these, the user also needs the CREATE VIEW WITH ADMIN OPTION privilege. In particular, this is needed for Django’s own test suite.
Prior to Django 1.8, the test user was granted the CONNECT and RESOURCE roles, so the extra privileges required for running the test suite were different.
All of these privileges are included in the DBA role, which is appropriate for use on a private developer’s database.
The Oracle database backend uses the SYS.DBMS_LOB package, so your user will require execute permissions on it. It’s normally accessible to all users by default, but in case it is not, you’ll need to grant permissions like so:
GRANT EXECUTE ON SYS.DBMS_LOB TO user;
To connect using the service name of your Oracle database, your settings.py file should look something like this:
    DATABASES = {
        'default': {
            'ENGINE': 'django.db.backends.oracle',
            'NAME': 'xe',
            'USER': 'a_user',
            'PASSWORD': 'a_password',
            'HOST': '',
            'PORT': '',
        }
    }
In this case, you should leave both HOST and PORT empty. However, if you don’t use a tnsnames.ora file or a similar naming method and want to connect using the SID (“xe” in this example), then fill in both HOST and PORT like so:
    DATABASES = {
        'default': {
            'ENGINE': 'django.db.backends.oracle',
            'NAME': 'xe',
            'USER': 'a_user',
            'PASSWORD': 'a_password',
            'HOST': 'dbprod01ned.mycompany.com',
            'PORT': '1540',
        }
    }
You should either supply both HOST and PORT, or leave both as empty strings. Django will use a different connect descriptor depending on that choice.
If you plan to run Django in a multithreaded environment (e.g. Apache using the default MPM module on any modern operating system), then you must set the threaded option of your Oracle database configuration to True:
    'OPTIONS': {
        'threaded': True,
    },
Failure to do this may result in crashes and other odd behavior.
By default, the Oracle backend uses a RETURNING INTO clause to efficiently retrieve the value of an AutoField when inserting new rows. This behavior may result in a DatabaseError in certain unusual setups, such as when inserting into a remote table, or into a view with an INSTEAD OF trigger. The RETURNING INTO clause can be disabled by setting the use_returning_into option of the database configuration to False:
    'OPTIONS': {
        'use_returning_into': False,
    },
In this case, the Oracle backend will use a separate SELECT query to retrieve AutoField values.
Oracle imposes a name length limit of 30 characters. To accommodate this, the backend truncates database identifiers to fit, replacing the final four characters of the truncated name with a repeatable MD5 hash value. Additionally, the backend turns database identifiers to all-uppercase.
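The transformation described above can be sketched in a few lines. This is an approximation written for illustration, not the backend's actual code: the function name and the choice of the first four hexadecimal digits of the MD5 digest are assumptions.

```python
import hashlib


def oracle_identifier(name, max_length=30, hash_len=4):
    """Shorten an over-long identifier with a hash suffix, then uppercase it."""
    if len(name) > max_length:
        digest = hashlib.md5(name.encode('utf-8')).hexdigest()[:hash_len]
        # Keep the leading characters and replace the tail with the hash.
        name = name[:max_length - hash_len] + digest
    return name.upper()
```

Because the suffix is derived from the full original name, the same long name always maps to the same shortened identifier, while two long names that share a prefix are unlikely to collide.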
To prevent these transformations (this is usually required only when dealing with legacy databases or accessing tables which belong to other users), use a quoted name as the value for db_table:
    from django.db import models

    class LegacyModel(models.Model):
        class Meta:
            db_table = '"name_left_in_lowercase"'

    class ForeignModel(models.Model):
        class Meta:
            db_table = '"OTHER_USER"."NAME_ONLY_SEEMS_OVER_30"'
Quoted names can also be used with Django’s other supported database backends; except for Oracle, however, the quotes have no effect.
When running migrate, an ORA-06552 error may be encountered if certain Oracle keywords are used as the name of a model field or the value of a db_column option. Django quotes all identifiers used in queries to prevent most such problems, but this error can still occur when an Oracle datatype is used as a column name. In particular, take care to avoid using the names date, timestamp, number or float as a field name.
Django generally prefers to use the empty string ('') rather than NULL, but Oracle treats both identically. To get around this, the Oracle backend ignores an explicit null option on fields that have the empty string as a possible value and generates DDL as if null=True. When fetching from the database, it is assumed that a NULL value in one of these fields really means the empty string, and the data is silently converted to reflect this assumption.
The Oracle backend stores TextFields as NCLOB columns. Oracle imposes some limitations on the usage of such LOB columns in general:
In addition to the officially supported databases, there are backends provided by 3rd parties that allow you to use other databases with Django:
The Django versions and ORM features supported by these unofficial backends vary considerably. Queries regarding the specific capabilities of these unofficial backends, along with any support queries, should be directed to the support channels provided by each 3rd party project.
Jun 02, 2016