Database string types comparison: VARCHAR(), VARCHAR(n), TEXT
=> here 20.01.22 varchar
MySql
VARCHAR:
=> As the name suggests, varchar means character data that is varying.
# Also known as Variable Character, it is an indeterminate length string data type.
# It can hold numbers, letters and special characters.
#
MsSql 2008+ can store up to 8000 characters as the maximum length of the string using varchar data type.
#
SQL varchar usually holds 1 byte per character and 2 more bytes for the length information.
# It is recommended to use varchar as the data type when columns have variable length and
the actual data is way less than the given capacity.
VARCHAR(max):
=> How
SQL varchar(max) is different from varchar(n)?
# There are times where
SQL developers (including myself) usually define varchar datatype without a length,
and subsequently, are failed to insert string records in the
SQL table.
This is because
SQL Server allocates 1 character space as the default value to the varchar column
that is defined without any length.
# In practical scenarios, varchar(n) is used to store variable length value as a string, here ‘n’ denotes the string length
in bytes and it can go up to 8000 characters (
MsSql).
# Example
CREATE TABLE Demovarchar
(
Id int NOT NULL IDENTITY(1,1),
LastName varchar(10),
FirstName varchar(10),
Gender varchar,
DepartmentName varchar(20),
Age int
)
INSERT INTO Demovarchar VALUES('Gilbert', 'Kevin','M','Tool Design',33) => OK
INSERT INTO Demovarchar VALUES('Newton Hamilton', 'Isaac','M','Design Head',69) => ERROR
=> VARCHAR( N ) vs. VARCHAR( MAX ):
=> Difference between the varchar(max) and varchar(n) data type [in
MsSql ]
varchar(max) varchar(n)
# store up to 2 GB of data # store up to 8000 bytes data
# <= 8000 bytes it uses allocation
unit IN_ROW_Data. # stores data in the standard data page
> 8000 bytes it uses LOB_Data page and
stores its pointer in the IN_ROW_Data page
# cannot create an index on the key column # can create an index on this data type
of the varchar(max) data type
# cannot compress the LOB data # can compress data for this data type
# Data retrieval and updation on the LOB data is slow
We do not face such issue in the varchar(n) data type
=>
MsSql 2005+ got around this limitation of 8KB storage size and provided a workaround with varchar(max).
It is a non-
Unicode large variable-length character data type and can store a maximum of 2^31-1 bytes (2 GB)
of non-
Unicode characters.
CREATE TABLE Demovarcharmax
(
ID INT IDENTITY(1, 1) ,
StringCol VARCHAR(MAX)
)
INSERT INTO Demovarcharmax(StringCol) VALUES(REPLICATE(CAST('B' AS VARCHAR(MAX)), 15000))
SELECT Id, StringCol,len(StringCol) AS LengthOfString FROM Demovarcharmax
=> One limitation of using varchar(max) is we cannot create an index that has a varchar(max) as a key column,
instead, it is advisable to do a Full-text index on that column.
=> Do NOT consider it as the varchar with default value = 1.
=>
MsSql 2019+, announced support for UTF-8 character encoding to the existing data types (varchar and char).
Reduction in storage and performance improvements
UTF-8 support for varchar data type provides substantial storage savings depending on the character set in use.
For eg, using an UTF-8 enabled collation, changing the column data type from nvarchar(20) to varchar(20) offers a
significant drop in storage requirements since nvarchar(20) requires 40 bytes for storage and varchar(20) needs 20.
=> Impact on string length of
SQL varchar with CAST and CONVERT functions
SQL Server stores long string data in the commonly used varchar data type and it becomes helpful to know the
expected and maximum lengths of the strings to display the results in the UI.
=>
MsSql takes 30 as the default length for
SQL Varchar (with unspecified varchar length) in the
SQL Server
when it is used with CAST and CONVERT functions.
In our case, even though the length of the string was 52, it returned 30 as the length as shown in the last result output.
=> One important point to note here is that when an unspecified length varchar field is created,
the default length of such field is 1 (shown in red color below).
When varchar length is unspecified and is used with CAST or CONVERT functions, the CAST or CONVERT returns
n=30 as the default string length of this conversion (marked in blue color below).
=> Insert 10,000 records, to check the data insertion time. Y
Employee_varchar_2000 insertion time 0.08 Seconds
Employee_varchar_4500 insertion time 0.19 Seconds
Employee_varchar_8000 insertion time 0.31 Seconds
Employee_varchar_Max insertion time 2.72 Seconds
Sqlite:
=> What is the maximum size of a VARCHAR ?
# SQLite does not enforce the length of a VARCHAR.
# You can declare a VARCHAR(10) and SQLite will be happy to store a 500-million character string there.
And it will keep all 500-million characters intact.
# Your content is never truncated.
# SQLite understands the column type of "VARCHAR(N)" to be the same as "TEXT", regardless of the value of N.
=> TEXT. The value is a text string, stored using the database encoding (UTF-8, UTF-16BE or UTF-16LE).
# If the declared type of the column contains any of the strings "CHAR", "CLOB", or "TEXT" then that column
has TEXT affinity.
# Notice that the type VARCHAR contains the string "CHAR" and is thus assigned TEXT affinity.
=> Affinity Name Examples
# The following table shows how many common datatype names from more traditional
SQL implementations are
converted into affinities by the five rules of the previous section.
# This table shows only a small subset of the datatype names that SQLite will accept.
# Note that numeric arguments in parentheses that following the type name (ex: "VARCHAR(255)") are ignored by
SQLite - SQLite does not impose any length restrictions
(other than the large global SQLITE_MAX_LENGTH limit) on the length of strings, BLOBs or numeric values.
=> Example Typenames
CREATE TABLE Statement or CAST Expression Resulting Affinity Rule Used To Determine Affinity
CHARACTER(20) TEXT 2
VARCHAR(255)
VARYING CHARACTER(255)
NCHAR(55)
NATIVE CHARACTER(70)
NVARCHAR(100)
TEXT
CLOB