YouTip LogoYouTip

Mysql Handling Duplicates

# MySQL Handling Duplicates Some MySQL tables may contain duplicate records. In some cases, we allow duplicate data to exist, but sometimes we also need to delete this duplicate data. In this chapter, we will introduce how to prevent duplicate data in tables and how to delete duplicate data from tables. * * * ## Preventing Duplicate Records in a Table You can set specified fields in a MySQL table as **PRIMARY KEY** or **UNIQUE** index to ensure data uniqueness. Let's try an example: The following table has no index or primary key, so it allows multiple duplicate records. ```sql CREATE TABLE person_tbl ( first_name CHAR(20), last_name CHAR(20), sex CHAR(10) ); If you want to ensure that the `first_name` and `last_name` fields in the table cannot have duplicate data, you can set a composite primary key to enforce data uniqueness. If you set a composite primary key, the default value for that key cannot be NULL and should be set to NOT NULL. As shown below: ```sql CREATE TABLE person_tbl ( first_name CHAR(20) NOT NULL, last_name CHAR(20) NOT NULL, sex CHAR(10), PRIMARY KEY (last_name, first_name) ); If we set a UNIQUE index, then when inserting duplicate data, the SQL statement will fail to execute and throw an error. The difference between `INSERT IGNORE INTO` and `INSERT INTO` is that `INSERT IGNORE INTO` will ignore data that already exists in the database. If there is no data in the database, it inserts new data; if there is data, it skips that data. This way, existing data in the database is preserved, achieving the purpose of inserting data into gaps. The following example uses `INSERT IGNORE INTO`. After execution, it will not throw an error and will not insert duplicate data into the table: ```sql mysql> INSERT IGNORE INTO person_tbl (last_name, first_name) -> VALUES( 'Jay', 'Thomas'); Query OK, 1 row affected (0.00 sec) mysql> INSERT IGNORE INTO person_tbl (last_name, first_name) -> VALUES( 'Jay', 'Thomas'); Query OK, 0 rows affected (0.00 sec) `INSERT IGNORE INTO` does not return an error when inserting duplicate data after setting record uniqueness; it only returns a warning. `REPLACE INTO`, if a record with the same primary or unique key exists, deletes it first and then inserts the new record. Another way to set data uniqueness is to add a UNIQUE index, as shown below: ```sql CREATE TABLE person_tbl ( first_name CHAR(20) NOT NULL, last_name CHAR(20) NOT NULL, sex CHAR(10), UNIQUE (last_name, first_name) ); * * * ## Counting Duplicate Records Below, we will count the number of duplicate records for `first_name` and `last_name` in the table: ```sql mysql> SELECT COUNT(*) as repetitions, last_name, first_name -> FROM person_tbl -> GROUP BY last_name, first_name -> HAVING repetitions > 1; The above query will return the number of duplicate records in the `person_tbl` table. Generally, to query duplicate values, perform the following steps: * Determine which column contains values that may be duplicated. * Use `COUNT(*)` in the column selection list for those columns. * List the columns in the `GROUP BY` clause. * Set the `HAVING` clause to have a repetition count greater than 1. * * * ## Filtering Duplicate Data If you need to read non-duplicate data, you can use the `DISTINCT` keyword in the `SELECT` statement to filter duplicate data. ```sql mysql> SELECT DISTINCT last_name, first_name -> FROM person_tbl; You can also use `GROUP BY` to read non-duplicate data from the table: ```sql mysql> SELECT last_name, first_name -> FROM person_tbl -> GROUP BY (last_name, first_name); * * * ## Deleting Duplicate Data If you want to delete duplicate data from a table, you can use the following SQL statements: ```sql mysql> CREATE TABLE tmp SELECT last_name, first_name, sex FROM person_tbl GROUP BY (last_name, first_name, sex); mysql> DROP TABLE person_tbl; mysql> ALTER TABLE tmp RENAME TO person_tbl; Of course, you can also use a simpler method by adding an INDEX and PRIMARY KEY to the table to delete duplicate records. The method is as follows: ```sql mysql> ALTER IGNORE TABLE person_tbl -> ADD PRIMARY KEY (last_name, first_name);
← Mysql Sql InjectionMysql Using Sequences β†’