String Functions

TiDB supports most of the string functions available in MySQL 5.7, some of the string functions available in MySQL 8.0, and some of the functions available in Oracle 21.

For comparisons between functions and syntax of Oracle and TiDB, see Comparisons between Functions and Syntax of Oracle and TiDB.

Supported functions

NameDescription
ASCII()Return numeric value of left-most character
BIN()Return a string containing binary representation of a number
BIT_LENGTH()Return length of argument in bits
CHAR()Return the character for each integer passed
CHAR_LENGTH()Return number of characters in argument
CHARACTER_LENGTH()Synonym for CHAR_LENGTH()
CONCAT()Return concatenated string
CONCAT_WS()Return concatenate with separator
ELT()Return string at index number
EXPORT_SET()Return a string such that for every bit set in the value bits, you get an on string and for every unset bit, you get an off string
FIELD()Return the index (position) of the first argument in the subsequent arguments
FIND_IN_SET()Return the index position of the first argument within the second argument
FORMAT()Return a number formatted to specified number of decimal places
FROM_BASE64()Decode to a base-64 string and return result
HEX()Return a hexadecimal representation of a decimal or string value
INSERT()Insert a substring at the specified position up to the specified number of characters
INSTR()Return the index of the first occurrence of substring
LCASE()Synonym for LOWER()
LEFT()Return the leftmost number of characters as specified
LENGTH()Return the length of a string in bytes
LIKESimple pattern matching
LOCATE()Return the position of the first occurrence of substring
LOWER()Return the argument in lowercase
LPAD()Return the string argument, left-padded with the specified string
LTRIM()Remove leading spaces
MAKE_SET()Return a set of comma-separated strings that have the corresponding bit in bits set
MID()Return a substring starting from the specified position
NOT LIKENegation of simple pattern matching
NOT REGEXPNegation of REGEXP
OCT()Return a string containing octal representation of a number
OCTET_LENGTH()Synonym for LENGTH()
ORD()Return character code for leftmost character of the argument
POSITION()Synonym for LOCATE()
QUOTE()Escape the argument for use in an SQL statement
REGEXPPattern matching using regular expressions
REGEXP_INSTR()Return the starting index of the substring that matches the regular expression (Partly compatible with MySQL. For more details, see Regular expression compatibility with MySQL)
REGEXP_LIKE()Whether the string matches the regular expression (Partly compatible with MySQL. For more details, see Regular expression compatibility with MySQL)
REGEXP_REPLACE()Replace substrings that match the regular expression (Partly compatible with MySQL. For more details, see Regular expression compatibility with MySQL)
REGEXP_SUBSTR()Return the substring that matches the regular expression (Partly compatible with MySQL. For more details, see Regular expression compatibility with MySQL)
REPEAT()Repeat a string the specified number of times
REPLACE()Replace occurrences of a specified string
REVERSE()Reverse the characters in a string
RIGHT()Return the specified rightmost number of characters
RLIKESynonym for REGEXP
RPAD()Append string the specified number of times
RTRIM()Remove trailing spaces
SPACE()Return a string of the specified number of spaces
STRCMP()Compare two strings
SUBSTR()Return the substring as specified
SUBSTRING()Return the substring as specified
SUBSTRING_INDEX()Return a substring from a string before the specified number of occurrences of the delimiter
TO_BASE64()Return the argument converted to a base-64 string
TRANSLATE()Replace all occurrences of characters by other characters in a string. It does not treat empty strings as NULL as Oracle does.
TRIM()Remove leading and trailing spaces
UCASE()Synonym for UPPER()
UNHEX()Return a string containing hex representation of a number
UPPER()Convert to uppercase
WEIGHT_STRING()Return the weight string for the input string

Unsupported functions

  • LOAD_FILE()
  • MATCH()
  • SOUNDEX()

Regular expression compatibility with MySQL

The following sections describe the regular expression compatibility with MySQL, including REGEXP_INSTR(), REGEXP_LIKE(), REGEXP_REPLACE(), and REGEXP_SUBSTR().

Syntax compatibility

MySQL implements regular expression using International Components for Unicode (ICU), and TiDB uses RE2. To learn the syntax differences between the two libraries, you can refer to the ICU documentation and RE2 Syntax.

match_type compatibility

The value options of match_type between TiDB and MySQL are:

  • Value options in TiDB are "c", "i", "m", and "s", and value options in MySQL are "c", "i", "m", "n", and "u".

  • The "s" in TiDB corresponds to "n" in MySQL. When "s" is set in TiDB, the . character also matches line terminators (\n).

    For example, the SELECT REGEXP_LIKE(a, b, "n") FROM t1 in MySQL is the same as the SELECT REGEXP_LIKE(a, b, "s") FROM t1 in TiDB.

  • TiDB does not support "u", which means Unix-only line endings in MySQL.

Data type compatibility

The difference between TiDB and MySQL support for the binary string type:

  • MySQL does not support binary strings in regular expression functions since 8.0.22. For more details, refer to MySQL documentation. But in practice, regular functions can work in MySQL when all parameters or return types are binary strings. Otherwise, an error will be reported.
  • Currently, TiDB prohibits using binary strings and an error will be reported under any circumstances.

Other compatibility

  • The behavior of replacing empty strings in TiDB is different from MySQL. Taking REGEXP_REPLACE("", "^$", "123") as an example:

    • MySQL does not replace the empty string and returns "" as the result.
    • TiDB replaces the empty string and returns "123" as the result.
  • The keyword used for capturing groups in TiDB is different from MySQL. MySQL uses $ as the keyword, while TiDB uses \\ as the keyword. In addition, TiDB only supports capturing groups numbered from 0 to 9.

    For example, the following SQL statement returns ab in TiDB:

    SELECT REGEXP_REPLACE('abcd','(.*)(.{2})$','\\1') AS s;

Was this page helpful?